Wolfram Research

Category: Language

11 items

Spoken Digit Commands

A dataset consisting of recordings of spoken digits

1911 Encyclopedia Britannica

Plaintext of the complete Encyclopedia Britannica Eleventh Edition (1910-11)

Europarl English-German Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-French Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Spanish Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Europarl English-Italian Machine Translation Dataset V7

A parallel corpus for machine translation from the proceedings of the European Parliament

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems

Japanese-English Subtitle Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques

Swadesh Lists

Word lists for common concepts in nearly 1,200 languages

World Atlas of Language Structures

Dataset of structural properties of languages

Japanese-English Legal Parallel Corpus

A parallel corpus for machine translation systems, information extraction and other language processing techniques