Hacker News
weberer
on Feb 7, 2025
on:
Meta torrented & seeded 81.7 TB dataset containing...
The article says they got datasets from Anna's Archive. It was most likely the scihub/libgen torrent collection, which is 96.0 TB right now and contains 92,872,581 files. That's about 1 megabyte per file on average.
https://annas-archive.org/datasets
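For anyone who wants to check the per-file estimate, here is a quick sketch of the arithmetic. It uses only the two figures quoted above and assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the function name is just illustrative.

```python
def average_file_size_mb(total_tb: float, file_count: int) -> float:
    """Average file size in MB, assuming decimal units (1 TB = 10**12 bytes)."""
    total_bytes = total_tb * 10**12
    return total_bytes / file_count / 10**6


# Figures as quoted in the comment: 96.0 TB across 92,872,581 files.
avg_mb = average_file_size_mb(96.0, 92_872_581)
print(f"{avg_mb:.2f} MB per file")  # prints "1.03 MB per file"
```

If the 96.0 TB figure is actually in binary units (TiB), the average comes out somewhat higher, but it is still on the order of 1 MB per file.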
southernplaces7
on Feb 9, 2025
Where does one find these torrent datasets? Did they download the books in bits and pieces or as a single huge multi-TB file?