Wolfram Computation Meets Knowledge

Category

Machine Learning

24 items

MNIST

Database of handwritten digits commonly used for training image processing systems

CIFAR-10

CIFAR-10 computer-vision training dataset

CIFAR-100

CIFAR-100 computer-vision training dataset

Sample Data: Fisher's Irises

Fisher's iris data

Audio Cats and Dogs

Dataset consisting of recordings of cats and dogs

Sample Data: Abalone Measurements

Predict the age of abalone from physical measurements

Sample Data: Gene Sequences

Splice-junction gene sequences for primate DNA

Wind Speed Measurements

Average daily wind speed at 12 meteorological stations in the Republic of Ireland 1961-1978

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Sample Data: Titanic Survival

Classify whether a passenger on board the maiden voyage of the RMS Titanic in 1912 survived given their age, sex and class

Sample Data: Spam Email

Dataset of email statistics for the classification of spam email

Sample Data: Car Evaluation

Predict car acceptability from its attributes

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

The 20-Task bAbI Question-Answering Dataset v1.2

A dataset for question answering and text understanding in both Hindi and English

SQuAD v1.1 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

SQuAD v1.1

A dataset for question answering and reading comprehension from a set of Wikipedia articles

Clinical Concepts from Massive Sources of Medical Data

A dataset of medical concepts

FashionMNIST

A small MNIST-like fashion product image dataset

Japanese-English Legal Parallel Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques