Wolfram Research

Sample Data: Abalone Measurements

Predict the age of abalone from physical measurements

Sample Data: Spam Email

Dataset of email statistics for the classification of spam email

FashionMNIST

A small MNIST-like fashion product image dataset

FER-2013

The Facial Expression Recognition 2013 (FER-2013) Dataset

Sample Data: Car Evaluation

Predicting car acceptability by attribute

Wind Speed Measurements

Average daily wind speed at 12 meteorological stations in the Republic of Ireland, 1961-1978

MNIST

Database of handwritten digits commonly used for training image processing systems

The Time Machine

Plaintext for H. G. Wells' "The Time Machine"

Sample Data: Gene Sequences

Splice-junction gene sequences for primate DNA

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Sample Data: UCI Letter

Letter recognition dataset

SQuAD v1.1

A dataset for question answering and reading comprehension from a set of Wikipedia articles

SQuAD v1.1 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

Sample Data: Satellite

Classify the type of land surface of a scene photographed by the Landsat MSS satellite given four digital images of the scene taken in different spectral bands

The 20-Task bAbI Question-Answering Dataset v1.2

A dataset for question answering and text understanding in both Hindi and English

SQuAD v2.0 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

Sample Data: Movie Review Sentence Polarity

Movie review data

Audio Cats and Dogs

Dataset consisting of recordings of cats and dogs

SQuAD v2.0

A dataset for question answering and reading comprehension from a set of Wikipedia articles

Sample Data: Wine Quality

Quality of white wines given the physical properties of the wines

Sample Data: Mushroom Classification

Determine whether a mushroom is edible based on physical characteristics

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

CIFAR-100

CIFAR-100 computer-vision training dataset

CIFAR-10

CIFAR-10 computer-vision training dataset

Sample Data: Ceramic Strength

Effect of machining factors on the strength of ceramics

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

Sample Data: Boston Homes

Housing values in suburbs of Boston

Sample Data: Titanic Survival

Classify whether a passenger on board the maiden voyage of the RMS Titanic in 1912 survived given their age, sex and class

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Japanese-English Legal Parallel Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

GDB-9 Database

Database of molecular quantum calculations

Sample Data: Fisher's Irises

Fisher's iris data

Clinical Concepts from Massive Sources of Medical Data

A dataset of medical concepts

NYC Emergency Response Incidents

NYC Open Data makes the wealth of public data generated by various New York City agencies and other City organizations available for public use. This catalog offers access to a repository of government-produced, machi...

Raw Data For The Long Term Selection Experiment For Oil And Protein In Corn

Raw data for each ear analyzed in each year of the Illinois long-term selection experiment for oil and protein in corn (1896-2004)