Wolfram Research

FashionMNIST

A small MNIST-like fashion product image dataset

FER-2013

The Facial Expression Recognition 2013 (FER-2013) Dataset

Sample Data: Abalone Measurements

Predict the age of abalone from physical measurements

Sample Data: Spam Email

Dataset of email statistics for the classification of spam email

MNIST

Database of handwritten digits commonly used for training image processing systems

Sample Data: Car Evaluation

Predict car acceptability from the car's attributes

Wind Speed Measurements

Average daily wind speed at 12 meteorological stations in the Republic of Ireland 1961-1978

Sample Data: UCI Letter

Letter recognition dataset

Sample Data: Gene Sequences

Splice-junction Gene Sequences for Primate DNA

Sample Data: Satellite

Classify the type of land surface of a scene photographed by the Landsat MSS satellite given four digital images of the scene taken in different spectral bands

SQuAD v1.1

A dataset for question answering and reading comprehension from a set of Wikipedia articles

SQuAD v1.1 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

CIFAR-100

CIFAR-100 computer-vision training dataset

CIFAR-10

CIFAR-10 computer-vision training dataset

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

The 20-Task bAbI Question-Answering Dataset v1.2

A dataset for question answering and text understanding in both Hindi and English

SQuAD v2.0 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

Sample Data: Movie Review Sentence Polarity

Movie review sentences labeled by sentiment polarity

Audio Cats and Dogs

Dataset consisting of recordings of cats and dogs

SQuAD v2.0

A dataset for question answering and reading comprehension from a set of Wikipedia articles

Sample Data: Wine Quality

Quality of white wines given the physical properties of the wines

Sample Data: Mushroom Classification

Determine whether a mushroom is edible based on physical characteristics

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

Sample Data: Boston Homes

Home values for 506 Boston suburbs with potentially influential factors

Sample Data: Titanic Survival

Classify whether a passenger on board the maiden voyage of the RMS Titanic in 1912 survived given their age, sex and class

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Japanese-English Legal Parallel Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Sample Data: Fisher's Irises

Fisher's iris data

Clinical Concepts from Massive Sources of Medical Data

A dataset of medical concepts