Wolfram Research

CIFAR-100

CIFAR-100 computer-vision training dataset

CIFAR-10

CIFAR-10 computer-vision training dataset

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

Sample Data: Mushroom Classification

Determine whether a mushroom is edible based on physical characteristics

Sample Data: Movie Review Sentence Polarity

Movie review sentences labeled by sentiment polarity (positive or negative)

Sample Data: Boston Homes

Housing values in suburbs of Boston

Sample Data: Fisher's Irises

Fisher's iris data: sepal and petal measurements for three species of iris

Europarl English-Spanish Machine Translation Dataset V7

A parallel English-Spanish corpus for machine translation, drawn from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel English-Italian corpus for machine translation, drawn from the proceedings of the European Parliament

Europarl English-French Machine Translation Dataset V7

A parallel English-French corpus for machine translation, drawn from the proceedings of the European Parliament

Europarl English-German Machine Translation Dataset V7

A parallel English-German corpus for machine translation, drawn from the proceedings of the European Parliament

MNIST

Database of handwritten digits commonly used for training image-processing systems

FashionMNIST

A small MNIST-like fashion product image dataset

Sample Data: UCI Letter

Letter recognition dataset

Sample Data: Titanic Survival

Classify whether a passenger on board the maiden voyage of the RMS Titanic in 1912 survived, given their age, sex and class

SQuAD v1.1

A dataset for question answering and reading comprehension from a set of Wikipedia articles

SQuAD v2.0

A dataset for question answering and reading comprehension from a set of Wikipedia articles

SQuAD v2.0 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

SQuAD v1.1 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

The 20-Task bAbI Question-Answering Dataset v1.2

A dataset for question answering and text understanding in both Hindi and English