ox/smol_arxiv
Copyright © 2026 Oxen Labs, Inc., All Rights Reserved
smol_arxiv
public
About
A small subset of arXiv (1,000 papers), provided as a demo for PDF rendering.
4 commits, 1 contributor, 2 downloads, 3.3 GB total
Name              Type     Size     Last commit                         Updated
tex               -        67.6 MB  adding dataset chunks               2 years ago
pdfs              -        3.2 GB   adding dataset chunks               2 years ago
chunks.jsonl      tabular  68.3 MB  adding dataset chunks               2 years ago
chunk_dataset.py  text     5 kB     adding dataset chunks               2 years ago
README.md         text     127 B    adding readme and schema metadata   2 years ago
dataset.jsonl     tabular  1.6 MB   adding smol dataset                 2 years ago
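The two `.jsonl` files listed above are tabular files in JSON Lines form, which normally means one JSON object per line. The listing does not show their schema, so the sketch below makes no assumption about field names: it only streams records and reports which keys appear. The helper names `iter_jsonl` and `field_names` are hypothetical, and the code assumes the file has already been downloaded to a local path.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


def field_names(path: Path, limit: int = 100) -> list[str]:
    """Collect the keys seen in the first `limit` records, in first-seen order.

    Assumes each record is a JSON object; the real schema is not shown
    on the repository page, so nothing here depends on specific fields.
    """
    seen: dict[str, None] = {}
    for index, record in enumerate(iter_jsonl(path)):
        if index >= limit:
            break
        seen.update(dict.fromkeys(record))
    return list(seen)
```

Calling `field_names(Path("dataset.jsonl"))` on a local copy is one way to inspect the columns before loading the larger `chunks.jsonl`. Streaming line by line keeps memory use flat regardless of file size.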
0 stars
Repository contents
binary: 50.2%
text: 49.7%
tabular: < 1%
Contributors
@ox
1 branch