Swadesh Lists

Word lists for common concepts in nearly 1200 languages

Sample Data: Spam Email

Email statistics for classifying messages as spam

Sample Audio: Apollo 11 One Small Step

Sample recording of Neil Armstrong's first words from the surface of the Moon

SQuAD v1.1 Tokens Generated with WL

A list of individual words and symbols from the SQuAD dataset, a set of Wikipedia articles labeled for question answering and reading comprehension

Kyoto Free Translation Task Data

A parallel corpus for the evaluation and development of Japanese-English machine translation systems