Wolfram Computation Meets Knowledge

Category

Machine Learning

26 items

MNIST

Database of handwritten digits commonly used for training image processing systems

CIFAR-10

CIFAR-10 computer-vision training dataset

CIFAR-100

CIFAR-100 computer-vision training dataset

Sample Data: Fisher's Irises

Fisher's iris data

Audio Cats and Dogs

Dataset consisting of recordings of cats and dogs

Wind Speed Measurements

Average daily wind speed at 12 meteorological stations in the Republic of Ireland, 1961-1978

FashionMNIST

A small MNIST-like fashion product image dataset

Sample Data: Car Evaluation

Predicting car acceptability by attribute

Sample Data: Gene Sequences

Splice-junction Gene Sequences for Primate DNA

Sample Data: Abalone Measurements

Predict the age of abalone from physical measurements

Sample Data: Titanic Survival

Classify whether a passenger on board the maiden voyage of the RMS Titanic in 1912 survived, given their age, sex and class

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Sample Data: Spam Email

Dataset of email statistics for the classification of spam email

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

The 20-Task bAbI Question-Answering Dataset v1.2

A dataset for question answering and text understanding in both Hindi and English

SQuAD v1.1 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

SQuAD v1.1

A dataset for question answering and reading comprehension from a set of Wikipedia articles

Clinical Concepts from Massive Sources of Medical Data

A dataset of medical concepts

Japanese-English Legal Parallel Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

SQuAD v2.0

A dataset for question answering and reading comprehension from a set of Wikipedia articles

SQuAD v2.0 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension