The 20-Task bAbI Question-Answering Dataset v1.2

A dataset for question answering and text understanding in both Hindi and English

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

FashionMNIST

A small MNIST-like fashion product image dataset

United States Supreme Court Decisions 1946-present

Datasets relating to Supreme Court cases from 1946 to present

Audio Cats and Dogs

Dataset consisting of recordings of cats and dogs

Periodic Groundwater Level Measurements

Dataset of seasonal and long-term groundwater level measurements in groundwater basins in California

USB Device Vendors and Devices

A dataset of the vendors and devices in the Linux usb.ids file

CIFAR-100

CIFAR-100 computer-vision training dataset

CIFAR-10

CIFAR-10 computer-vision training dataset

Human Cell Counts

Dataset of the total number of cells in the organs and systems of the adult human body

Clinical Concepts from Massive Sources of Medical Data

A dataset of medical concepts

US Coal Fields

Dataset of coal fields in Alaska and the conterminous United States

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Enron Email Network

Graph of the Enron email communication network within a dataset of around half a million emails

Worldwide Complexity Institutes

Dataset of known complexity institutes worldwide

California Crop Mapping

Dataset of agricultural land use and irrigated acres in California

Nuclear Latency Dataset

Facility-specific information on sensitive nuclear plants constructed from 1939 to 2012

Video Games Until April 2017

Dataset from the Internet Games Database API as of April 2017

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

World Atlas of Language Structures

Dataset of structural properties of languages

Sample Data: Solar System Planets and Moons

Sample dataset containing the mass and radius of planets and moons in the Solar System

Path of the Total Solar Eclipse of August 21st, 2017

Dataset of the path of the total solar eclipse of August 21st, 2017

Peace Corps Volunteer Demographics (FY 2016)

Dataset of the demographics of Peace Corps volunteers in the 2016 fiscal year

Paul Revere's Social Network in Colonial Boston

Dataset of associations among political groups in colonial Boston, 1762-1775

Minimal Inequivalent Square Tilings

A dataset of images and constraints for the minimal inequivalent square tilings, along with the allowed tiles that generate the tiling

UFO Sightings 2015

Dataset of UFO sightings in the United States in 2015

SQuAD v1.1

A dataset for question answering and reading comprehension from a set of Wikipedia articles

US Fatal Injuries 1999-2014

Dataset of deaths and crude rates of fatal injuries in the United States from 1999 to 2014

Global Events of Organized Violence

Georeferenced dataset of individual events of organized violence from the Uppsala Conflict Data Program

SQuAD v1.1 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

Sample Data: Spam Email

Dataset of email statistics for the classification of spam email

Solutions to Examples of Post’s Correspondence Problem

A dataset of instances and solutions (if they exist) for Post’s correspondence problem

Atlantic Hurricane Data 1851-2017

A modification of the NOAA HURDAT2 dataset on Atlantic hurricanes to facilitate use with the Wolfram Language

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Atlanta Police Department Crime Data (2009-2017)

Dataset of crime data reported by the Atlanta Police from 2009 to May 17, 2017

Executions in the United States

A dataset about executions (the death penalty) in the United States since the 1976 Supreme Court decision in Gregg v. Georgia (428 U.S. 153)

Japanese-English Legal Parallel Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Sample Data: Gene Sequences

Splice-junction gene sequences for primate DNA

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Irish-Viking Networks in 'Cogadh Gaedhel re Gallaibh'

Graph datasets for Irish and Viking character relationships in the medieval Irish text 'Cogadh Gaedhel re Gallaibh' ('The War of the Gaedhil with the Gaill')

Transcriptional Regulation Network of Escherichia coli

Dataset of the transcriptional regulation network of Escherichia coli

Gridded World Population Density

UN-adjusted gridded world population density for the years 2000, 2005, 2010, and 2015

California Urban Water Supplier Monitoring Reports

Monthly reports of the larger urban water suppliers in California on water production and conservation activities, from the State of California's Drinking Water Information Clearinghouse (DRINC)