Wolfram Research

CIFAR-100

CIFAR-100 computer-vision training dataset

CIFAR-10

CIFAR-10 computer-vision training dataset

MNIST

Database of handwritten digits commonly used for training image processing systems

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Sample Data: Boston Homes

Home values for 506 Boston suburbs, with potentially influential factors

Sample Data: Mushroom Classification

Determine whether a mushroom is edible based on physical characteristics

Sample Data: Fisher's Irises

Fisher's iris data: sepal and petal measurements for three iris species

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

SQuAD v1.1

A dataset for question answering and reading comprehension from a set of Wikipedia articles

Sample Data: Movie Review Sentence Polarity

Movie review sentences labeled with positive or negative sentiment

SQuAD v2.0

A dataset for question answering and reading comprehension from a set of Wikipedia articles

FashionMNIST

A small MNIST-like fashion product image dataset

Sample Data: UCI Letter

Letter recognition dataset

SQuAD v1.1 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

SQuAD v2.0 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

Sample Data: Titanic Survival

Classify whether a passenger on board the maiden voyage of the RMS Titanic in 1912 survived given their age, sex and class

The 20-Task bAbI Question-Answering Dataset v1.2

A dataset for question answering and text understanding in both Hindi and English