A dataset for question answering and text understanding in both Hindi and English

bAbI-QA is a dataset for question answering and text understanding. It consists of a set of contexts, each accompanied by multiple question-answer pairs based on that context. The dataset is available in both English and Hindi and is divided into 20 tasks:

Task 1: Single Supporting Fact

Task 2: Two Supporting Facts

Task 3: Three Supporting Facts

Task 4: Two Argument Relations

Task 5: Three Argument Relations

Task 6: Yes/No Questions

Task 7: Counting

Task 8: Lists/Sets

Task 9: Simple Negation

Task 10: Indefinite Knowledge

Task 11: Basic Coreference

Task 12: Conjunction

Task 13: Compound Coreference

Task 14: Time Reasoning

Task 15: Basic Deduction

Task 16: Basic Induction

Task 17: Positional Reasoning

Task 18: Size Reasoning

Task 19: Path Finding

Task 20: Agent’s Motivations

The "ContentElements" field offers three options: "TrainingData", "TestData" and "Dataset". The first two provide rapid access to data formatted for common training tasks; both are extracted from the English 1k version.

The full dataset, "Dataset", contains more information, including the Hindi version of the data.
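
The content elements described above could be accessed along the following lines. This is a minimal sketch, assuming the resource name "bAbI-QA-v12" used in the example on this page; the variable names train and test are illustrative only. Because the structure of the returned expressions is not described here, the sketch only inspects it rather than assuming particular keys or columns.

```wolfram
(* List the content elements offered by the resource;
   expected to include "TrainingData", "TestData" and "Dataset" *)
ResourceObject["bAbI-QA-v12"]["ContentElements"]

(* Load the ready-made training and test data (English, 1k version) *)
train = ResourceData["bAbI-QA-v12", "TrainingData"];
test = ResourceData["bAbI-QA-v12", "TestData"];

(* Show an abbreviated view rather than printing the whole expression *)
Short[train, 5]
```

The trailing semicolons suppress output while loading, and Short keeps the inspection step compact.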


Basic Examples

Retrieve the ResourceObject:

ResourceObject["bAbI-QA-v12"]

View the data:

ResourceData["bAbI-QA-v12", "Dataset"]

License Information

Creative Commons Attribution 3.0 Unported (CC BY 3.0)

Source Metadata