Wolfram Computation Meets Knowledge

SQuAD v2.0

A dataset for question answering and reading comprehension from a set of Wikipedia articles

SQuAD v1.1

A dataset for question answering and reading comprehension from a set of Wikipedia articles

SQuAD v2.0 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

The 20-Task bAbI Question-Answering Dataset v1.2

A dataset for question answering and text understanding in both Hindi and English

SQuAD v1.1 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

FashionMNIST

A small MNIST-like fashion product image dataset

Global Events of Organized Violence

Georeferenced dataset of individual events of organized violence from the Uppsala Conflict Data Program

United States Supreme Court Decisions 1946-present

Datasets relating to Supreme Court cases from 1946 to present

Periodic Groundwater Level Measurements

Dataset of seasonal and long-term groundwater level measurements in groundwater basins in California

Audio Cats and Dogs

Dataset consisting of recordings of cats and dogs

Nuclear Latency Dataset

Facility-specific information on sensitive nuclear plants constructed from 1939 to 2012

Clinical Concepts from Massive Sources of Medical Data

A dataset of medical concepts

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Enron Email Network

Graph of the Enron email communication network within a dataset of around half a million emails

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Worldwide Complexity Institutes

Dataset of known complexity institutes worldwide

California Crop Mapping

Dataset of agricultural land use and irrigated acres in California

CIFAR-100

CIFAR-100 computer-vision training dataset

Minimal Inequivalent Square Tilings

A dataset of images and constraints for the minimal inequivalent square tilings, along with the allowed tiles that generate the tiling

CIFAR-10

CIFAR-10 computer-vision training dataset

Sample Data: Anscombe Regression Lines

Anscombe's four regression line datasets

Atlanta Police Department Crime Data (2009-2017)

Dataset of crime data reported by the Atlanta Police from 2009 to May 17, 2017

SNAP Retailers

A comprehensive list of all retailers accepting SNAP payments in the U.S.

UFO Sightings 2015

Dataset of UFO sightings in the United States in 2015

Video Games Until April 2017

Dataset from the Internet Games Database API as of April 2017

Sample Data: Solar System Planets and Moons

Sample dataset containing the mass and radius of planets and moons in the Solar System

UPS Facilities

Information about U.S. UPS facilities, including exact locations

Indian Reservations

Geographical descriptions and locations of Indian Reservations throughout the U.S.

US Coal Fields

This dataset represents coal fields in Alaska and the conterminous United States.

Washington, D.C. Metro Bus Stops

Regional bus stops in the Washington, D.C. area

Sample Data: Spam Email

Dataset of email statistics for the classification of spam email

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Human Cell Counts

Dataset of the total number of cells in the organs and systems of the adult human body

USB Device Vendors and Devices

A dataset of the vendors and devices in the Linux usb.ids file

Peace Corps Volunteer Demographics (FY 2016)

Dataset of the demographics of Peace Corps volunteers in the 2016 fiscal year

2015 Chicago Marathon Data

2015 Chicago Marathon participant data

Japanese-English Legal Parallel Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Western Europe Grape Harvest

Western Europe 650-year grape harvest data from 1354 to 2007

Near-Earth Comets

J2000 heliocentric ecliptic orbital elements of 160 Near-Earth Comets

State of the Union Addresses

Complete text of State of the Union addresses from 1790 to 2018

Fog Frequency by WMO Station

Average annual and monthly number of days with fog for 22 nations

US Public Housing Authorities 2016

2016 HUD public housing authority data

Mister Rogers' Sweater Colors

Colors of sweaters worn by Fred Rogers on episodes of Mister Rogers' Neighborhood

Sample Data: Gene Sequences

Splice-junction Gene Sequences for Primate DNA

U.S. Baby Names By State

A comprehensive list of frequencies of baby names in the U.S., listed by year and state, since 1910

Solutions to Examples of Post’s Correspondence Problem

A dataset of instances and solutions (if they exist) for Post’s correspondence problem

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

U.S. Farmers Markets

A comprehensive directory of U.S. farmers markets

Washington, D.C. Metro Stations

Regional Metro stations in Washington, D.C.

HCAHPS Patient Care Survey

Responses from a standardized survey on the quality of American hospital care

Dust Frequency by WMO Station

Average annual and monthly number of days with dusty weather for Iran, Jordan, and Saudi Arabia

Thunder Frequency by WMO Station

Average annual and monthly number of days with thunder

Executions in the United States

A dataset about executions (the death penalty) in the United States since the 1976 Supreme Court decision in Gregg v. Georgia (428 U.S. 153)

U.S. Baby Name Trends By State

A comprehensive list of time series for baby names in the U.S., listed by state, since 1910

Rain Frequency by WMO Station

Average annual and monthly number of days with rain for 28 nations

Snowfall Frequency by WMO Station

Average annual and monthly number of days with snowfall for 9 nations

Solid Waste Landfill Facilities

Solid Waste Landfill Facilities: US Territories

Transcriptional Regulation Network of Escherichia coli

Dataset of the transcriptional regulation network of Escherichia coli

Stack Overflow Survey 2016

Results from Stack Overflow's 2016 Developer Survey

FAA Wildlife Strikes

All reports of birds and other wildlife striking aircraft in the U.S. since 1990

Fireballs and Bolides

Data on several of the brightest fireballs and bolides detected from 2009 to 2015 by U.S. Government sensors

Path of the Total Solar Eclipse of August 21st, 2017

Dataset of the path of the total solar eclipse of August 21st, 2017

Timeline of Systematic Data & Computable Knowledge

Dataset of nearly 200 notable events in the history of computable knowledge

Overcast Frequency by WMO Station

Average annual and monthly number of days without sunshine

New Orleans Slave Sales 1856-1861

Slave sales recorded by the New Orleans register of conveyance, October 1856 to August 1861

US State Fairgrounds

State fairgrounds data for the United States

Meteorite Landings

A collection of known meteorite landings

Spacecraft Materials Outgassing Data

Data on outgassing of materials intended for spacecraft

Gridded World Population Density

UN-adjusted gridded world population density for the years 2000, 2005, 2010, and 2015

Sample Data: Pacific Walrus Haulouts

Congregations of Pacific walruses off the coast of U.S. and Russia, 1852-2016

Commuter-Adjusted Daytime Population by U.S. County

2006-2010 data on daytime commuting patterns in the U.S.

Equity in U.S. GED Programs

A report on the equity of school districts and their GED programs across the U.S. during the 2013-14 school year

USDA Aggregate Tenant Data on Active Properties

Aggregate tenant data for use by the USDA

California Urban Water Supplier Monitoring Reports

Monthly reports of the larger urban water suppliers in California on water production and conservation activities, from the State of California's Drinking Water Information Clearinghouse (DRINC)

Atlantic Hurricane Data 1851-2017

A modification of the NOAA "Hurdat2" Dataset on Atlantic Hurricanes to facilitate use with the Wolfram Language

Food Access Research Atlas

The USDA's comprehensive report on food accessibility in America

World Atlas of Language Structures

Dataset of structural properties of languages

Commuter-Adjusted Daytime Population by U.S. Place

2006-2010 data on daytime commuting patterns in the U.S.

Natural Amenities by U.S. County

A 1999 study measuring desirable natural characteristics in U.S. counties

US Fatal Injuries 1999-2014

Dataset of deaths and crude rates of fatal injuries in the United States from 1999 to 2014

Irish-Viking Networks in 'Cogadh Gaedhel re Gallaibh'

Graph datasets for Irish and Viking character relationships in the medieval Irish text 'Cogadh Gaedhel re Gallaibh' ('The War of the Gaedhil with the Gaill')

Public Housing Developments 2015

2015 HUD public housing development data

MLS Players' Salaries

Major League Soccer players' salaries from 2007 to 2016

Paul Revere's Social Network in Colonial Boston

Dataset of associations among political groups in colonial Boston, 1762-1775

Global Landslide Catalog

Data on known landslides in North and South America, compiled since 2007

IMF World Economic Outlook

Selected macroeconomic data series from the International Monetary Fund

Health Nutrition and Population Statistics

Health, nutrition and population statistics from 1960 to 2015

U.S. State Fairgrounds

Locations for United States State and Regional Fairs

Mammals in MZNA-VERT

Data on small mammals obtained from the analysis of barn owl pellets

Infectious Diseases by Country 2009-2014

Reported cases of selected contagious diseases by country from 2009 to 2014

Video Games

Dataset from the Internet Games Database API as of June 2018