Wolfram Research

Nuclear Latency Dataset

Facility-specific information on sensitive nuclear plants constructed from 1939 to 2012

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

Global Events of Organized Violence

Georeferenced dataset of individual events of organized violence from the Uppsala Conflict Data Program

Head Start Locations

Full list of currently active Head Start Program locations

FashionMNIST

A small MNIST-like fashion product image dataset

Hadley Center Central England Temperature (HadCET) Dataset

The CET dataset is the longest instrumental record of temperature in the world

The 20-Task bAbI Question-Answering Dataset v1.2

A dataset for question answering and text understanding in both Hindi and English

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

United States Supreme Court Decisions 1946-present

Datasets relating to Supreme Court cases from 1946 to present

Sample Data: UCI Letter

Letter recognition dataset

Clinical Concepts from Massive Sources of Medical Data

A dataset of medical concepts

Audio Cats and Dogs

Dataset consisting of recordings of cats and dogs

Worldwide Complexity Institutes

Dataset of known complexity institutes worldwide

SQuAD v1.1

A dataset for question answering and reading comprehension from a set of Wikipedia articles

SQuAD v2.0

A dataset for question answering and reading comprehension from a set of Wikipedia articles

CIFAR-100

CIFAR-100 computer-vision training dataset

CIFAR-10

CIFAR-10 computer-vision training dataset

California Crop Mapping

Dataset of agricultural land use and irrigated acres in California

Atlanta Police Department Crime Data (2009-2017)

Dataset of crime data reported by the Atlanta Police from 2009 to May 17, 2017

Periodic Groundwater Level Measurements

Dataset of seasonal and long-term groundwater level measurements in groundwater basins in California

Sample Data: Spam Email

Dataset of email statistics for the classification of spam email

Minimal Inequivalent Square Tilings

A dataset of images and constraints for the minimal inequivalent square tilings, along with the allowed tiles that generate the tiling

Enron Email Network

Graph of the Enron email communication network within a dataset of around half a million emails

SQuAD v2.0 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

SQuAD v1.1 Tokens Generated with WL

A list of isolated words and symbols from the SQuAD dataset, which consists of a set of Wikipedia articles labeled for question answering and reading comprehension

UPS Facilities

This dataset represents UPS facilities

World Atlas of Language Structures

Dataset of structural properties of languages

Indian Reservations

This dataset represents Indian Reservations

City of Champaign Street Signs

The various street signs of Champaign, IL. Data on signs include ownership, size, type, etc.

Transcriptional Regulation Network of Escherichia coli

Dataset of the transcriptional regulation network of Escherichia coli

UFO Sightings 2015

Dataset of UFO sightings in the United States in 2015

Washington, D.C. Metro Bus Stops

The District provides a large quantity of government information available to the public. The Open Data Catalog provides hundreds of District government datasets, available as raw downloads in a variety of formats, an...

USB Device Vendors and Devices

A dataset of the vendors and devices in the Linux usb.ids file

Solutions to Examples of Post’s Correspondence Problem

A dataset of instances and solutions (if they exist) for Post’s correspondence problem

FER-2013

The Facial Expression Recognition 2013 (FER-2013) Dataset

Peace Corps Volunteer Demographics (FY 2016)

Dataset of the demographics of Peace Corps volunteers in the 2016 fiscal year

US Coal Fields

This dataset represents coal fields in Alaska and the conterminous United States.

Video Games Until April 2017

Dataset from the Internet Games Database API as of April 2017

Timeline of Systematic Data & Computable Knowledge

Dataset of nearly 200 notable events in the history of computable knowledge

Video Games

Dataset from the Internet Games Database API as of June 2018

Irish-Viking Networks in 'Cogadh Gaedhel re Gallaibh'

Graph datasets for Irish and Viking character relationships in the medieval Irish text 'Cogadh Gaedhel re Gallaibh' ('The War of the Gaedhil with the Gaill')

Sample Data: Solar System Planets and Moons

Sample dataset containing the mass and radius of planets and moons in the Solar System

Paul Revere's Social Network in Colonial Boston

Dataset of associations among political groups in colonial Boston, 1762-1775

Path of the Total Solar Eclipse of August 21st, 2017

Dataset of the path of the total solar eclipse of August 21st, 2017

Sea Level and Temperatures Over the Last 40 Million Years

Dataset of eustatic sea level and temperatures over the last 40 million years

US Fatal Injuries 1999-2014

Dataset of deaths and crude rates of fatal injuries in the United States from 1999 to 2014

Atlantic Hurricane Data 1851-2017

A modification of the NOAA "Hurdat2" Dataset on Atlantic Hurricanes to facilitate use with the Wolfram Language

Australian Rules Football - 2018 Final Team Rankings

This dataset consists of the final team standings in the Australian Rules Football league for the 2018 season

Mammals in MZNA-VERT

Data on small mammals obtained from the analysis of barn owl pellets

Paleoclimate Data Records Derived from the Vostok Ice Core

Datasets of historical CO2 concentration and temperature records derived from the air and isotopes trapped in the ice core

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Japanese-English Legal Parallel Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Orbital Variations and Insolation Database

Dataset of insolation values at different latitudes from 5000000 cal yr BP (-4998050 CE) to 0 cal yr BP (1950 CE)

Executions in the United States

A dataset about executions (the death penalty) in the United States since the 1976 Supreme Court decision in Gregg v. Georgia (428 U.S. 153)

SNAP Retailers

A comprehensive list of all retailers accepting SNAP payments in the U.S.

National Science Foundation Grants - 2015

Data on National Science Foundation grants and associated investigators and institutions awarded in the year 2015

Washington, D.C. Metro Stations

The District provides a large quantity of government information available to the public. The Open Data Catalog provides hundreds of District government datasets, available as raw downloads in a variety of formats, an...

Sample Data: Anscombe Regression Lines

Anscombe's 4 regression line data

US Public Housing Authorities 2016

2016 HUD public housing authority data

2015 Chicago Marathon Data

2015 Chicago Marathon participant data

Mister Rogers' Sweater Colors

Colors of sweaters worn by Fred Rogers on episodes of Mister Rogers' Neighborhood

U.S. Baby Names By State

A comprehensive list of frequencies of baby names in the U.S., listed by year and state, since 1910

U.S. Baby Name Trends By State

A comprehensive list of time series for baby names in the U.S., listed by state, since 1910

Public Housing Developments 2015

HUD's PD&R (Office of Policy Development and Research) is responsible for maintaining current information on housing needs, market conditions, and existing programs, as well as conducting research on priority housing ...

Near-Earth Comets

J2000 heliocentric ecliptic orbital elements of 160 Near-Earth Comets

Dust Frequency by WMO Station

Average annual and monthly number of days with dusty weather for Iran, Jordan, and Saudi Arabia

State of the Union Addresses

Corpus of all the State of the Union addresses from 1790 to 2019.

USDA Aggregate Tenant Data on Active Properties

Demographics and information on the USDA's active tenant properties

HCAHPS Patient Care Survey

Responses from a standardized survey on the quality of American hospital care

Spacecraft Materials Outgassing Data

Data was obtained at the Goddard Space Flight Center (GSFC), utilizing equipment developed at Stanford Research Institute (SRI) under contract to the Jet Propulsion Laboratory (JPL).

Sample Data: Pacific Walrus Haulouts

Congregations of Pacific walruses off the coasts of the U.S. and Russia, 1852-2016

Fog Frequency by WMO Station

Average annual and monthly number of days with fog for 22 nations

Rain Frequency by WMO Station

Average annual and monthly number of days with rain for 28 nations

Snowfall Frequency by WMO Station

Average annual and monthly number of days with snowfall for 9 nations

U.S. Farmers Markets

A comprehensive directory of U.S. farmers markets

FAA Wildlife Strikes

All reports of birds and other wildlife striking aircraft in the U.S. since 1990

Sample Data: Mushroom Classification

Determine whether a mushroom is edible based on physical characteristics

Thunder Frequency by WMO Station

Average annual and monthly number of days with thunder

Global Landslide Catalog

The Global Landslide Catalog considers all types of mass movements triggered by rainfall, which have been reported in the media, disaster databases, scientific reports, or other sources.

Overcast Frequency by WMO Station

Average annual and monthly number of days without sunshine

New Orleans Slave Sales 1856-1861

Slave sales recorded by the New Orleans register of conveyance, October 1856 to August 1861

Health Nutrition and Population Statistics

The World Bank Group (WBG) is a family of five international organizations that make leveraged loans to developing countries. It is the largest and most famous development bank in the world and is an observer at the U...

Sample Data: Movie Review Sentence Polarity

Movie review data

Stack Overflow Survey 2016

Results from Stack Overflow's 2016 Developer Survey

Equity in U.S. GED Programs

A report on the equity of school districts and their GED programs across the U.S. during the 2013-14 school year

Fireballs and Bolides

Data on several of the brightest fireballs and bolides that were detected from 2009 to 2015 by U.S. Government sensors

Meteorite Landings

This comprehensive data set from The Meteoritical Society contains information on all of the known meteorite landings.

MLS Players' Salaries

The Major League Soccer Players Union serves as the exclusive collective bargaining representative for all current players in Major League Soccer. Formed in April 2003, the Union ensures protection of the rights of al...

Commuter-Adjusted Daytime Population by U.S. County

2006-2010 data on daytime commuting patterns in the U.S.

California Urban Water Supplier Monitoring Reports

Monthly reports of the larger urban water suppliers in California on water production and conservation activities, from the State of California's Drinking Water Information Clearinghouse (DRINC)

Sample Data: Gene Sequences

Splice-junction Gene Sequences for Primate DNA

IMF World Economic Outlook

Selected macroeconomic data series from the International Monetary Fund

Natural Amenities by U.S. County

A 1999 study measuring desirable natural characteristics in U.S. counties

Solid Waste Landfill Facilities

Oak Ridge National Laboratory is the largest US Department of Energy science and energy laboratory, conducting basic and applied research to deliver transformative solutions to compelling problems in energy and security.

Commuter-Adjusted Daytime Population by U.S. Place

2006-2010 data on daytime commuting patterns in the U.S.

US State Fairgrounds

The National Geospatial-Intelligence Agency (NGA) delivers geospatial intelligence that provides a decisive advantage to policymakers, warfighters, intelligence professionals and first responders.

Food Access Research Atlas

The USDA's comprehensive report on food accessibility in America

Western Europe Grape Harvest

Western Europe 650-year grape harvest data from 1354 to 2007

Infectious Diseases by Country 2009-2014

Number of cases of selected contagious or infectious diseases, such as mumps and rubella, reported to the World Health Organization by country from 2009 to 2014, including suspected and newly reported cases.

Gridded World Population Density

UN-adjusted gridded world population density for the years 2000, 2005, 2010, and 2015

U.S. State Fairgrounds

Locations for United States State and Regional Fairs

Sample Data: Satellite

Classify the type of land surface of a scene photographed by the Landsat MSS satellite given four digital images of the scene taken in different spectral bands