
The Dataset Collection

The Dataset Collection consists of large data archives from both sites and individuals.




9,244 results


Academic Data and Datasets
Collection: 0 items, 280 views
A collection of datasets and data related to academic issues.
Academic Torrents
Collection by ACADEMICTORRENTS.COM: 2,264 items, 608,462 views
Welcome to Academic Torrents! Making 14.15TB of research data available. We've designed a distributed system for sharing enormous datasets - for researchers, by researchers. The result is a scalable, secure, and fault-tolerant repository for data, with blazing fast download speeds.
C.elegans behavioural database
Collection: 0 items, 0 views
This experiment is part of the C.elegans behavioural database. For more information and the complete collection of experiments visit http://movement.openworm.org
Dumps of DISCOGS.ORG Metadata (2008-Present)
Collection by DISCOGS.ORG: 145 items, 5,114 views
This is an unofficial mirror of the DISCOGS.ORG data collection, which is located at http://www.discogs.com/data/ . Discogs, short for discographies, is a website and database of information about audio recordings, including commercial releases, promotional releases, and bootleg or off-label releases. The Discogs servers, currently hosted under the domain name discogs.com, are owned by Zink Media, Inc., and are located in Portland, Oregon, USA. Discogs is one of the largest online databases of...
Item in The Dataset Collection (data): 2 views, 3 favorites, 0 comments
Archive of GameFAQs text files downloaded from the following Gopher site: gopher://gopher.endangeredsoft.org/1/gamefaqs-archive/ See also this related site grab: https://archive.org/details/Gamespot_Gamefaqs_TXTs
Topic: txt gamefaq gamesfaq
Item in The Dataset Collection (texts) by /u/prograc: 153 views, 20 favorites, 0 comments
More info here: https://web.archive.org/web/20200707075947/https://www.reddit.com/r/DataHoarder/comments/ftsdbs/gamespot_txt_gamefaqs_full_archive_32320/
Topics: txt, gamefaq, gamesfaq
Harvard Dataverse
Collection: 1 item, 171 views
Imageboard Datasets
Collection: 0 items, 129 views
A collection of datasets arranged around imageboards.
Internet Census 2012
Collection by Anonymous: 15 items, 2,839 views
Abstract: While playing around with the Nmap Scripting Engine (NSE) we discovered an amazing number of open embedded devices on the Internet. Many of them are based on Linux and allow login to standard BusyBox with empty or default credentials. We used these devices to build a distributed port scanner to scan all IPv4 addresses. These scans include service probes for the most common ports, ICMP ping, reverse DNS and SYN scans. We analyzed some of the data to get an estimation of the IP address...
MusicBrainz Data Dumps
Collection: 851 items, 7,480 views
The MusicBrainz Database is built on the PostgreSQL relational database engine and contains all of MusicBrainz' music metadata. This data includes information about artists, release groups, releases, recordings, works, and labels, as well as the many relationships between them. The database also contains a full history of all the changes that the MusicBrainz community has made to the data. Core data Artists Name, sort name, IPI, aliases, type, begin and end dates, disambiguation comment, MBID...
NIH Data Commons
Collection: 10 items, 1,406 views
The Data Commons Pilot Phase Consortium (DCPPC) is an NIH project to tackle the challenges of data-driven and data-intensive biomedical research: the data sets are too large to download; there's minimal interoperability between and across data set providers; and local compute capacity often is too limited to meet dynamic research needs. These challenges are preventing biomedical data from reaching its full potential in basic research, clinical, and translational medicine. DCPPC aims to improve this...
OpenStreetMap datasets
Collection by OpenStreetMap contributors: 4,636 items, 26,907 views
OpenStreetMap (OSM) is a collaborative project to create a free editable map of the world. What is available? Planet.osm in XML format (current and full history), dumped weekly; Planet.osm in the custom Protocolbuffer Binary Format (PBF) (current and full history), dumped weekly; metadata of all changes (changesets) in XML format, dumped weekly; all discussions in XML format, dumped weekly; user-contributed notes, dumped daily. How do I search this collection? The items in this collection are...
Topics: openstreetmap, osm, maps, data, mapping, map, dumps
Screenshot Compilations
Collection: 0 items, 65 views
Compilations of screenshots generated automatically or semi-automatically.
Unsorted Datasets
Collection: 125 items, 13,608 views
YFCC Datasets
Collection: 0 items, 17 views
Part of an August 2021 download of roughly 40% of the Flickr images referenced in the YFCC100M dataset.