World Atlas of Language Structures

Dataset of structural properties of languages

Swadesh Lists

Word lists for common concepts in nearly 1200 languages

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

1911 Encyclopedia Britannica

Plain text of the complete Encyclopedia Britannica, Eleventh Edition (1910-11)

Japanese-English Legal Parallel Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

FDIC Institution EntityStore

A Wolfram Language EntityStore with selected data on FDIC-insured institutions

Minecraft Block Types

A Wolfram Language EntityStore with IDs and sample images for 150+ types of Minecraft blocks

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

Clinical Concepts from Massive Sources of Medical Data

A dataset of medical concepts