A large-scale dataset of 44k natural language processing problems, inspired by the original Winograd Schema Challenge design

Examples

Basic Examples (3) 

Get the WinoGrande dataset:

In[1]:=
data = ResourceData["WinoGrande"]
Out[1]=

A sample row:

In[2]:=
RandomChoice[Normal[data]]
Out[2]=

Number of items in the dataset:

In[3]:=
Length[data]
Out[3]=

Get a random WinoGrande problem:

In[4]:=
q = Normal[RandomChoice[ResourceData["WinoGrande"]]]
Out[4]=

Test an LLM with the problem:

In[5]:=
LLMFunction[
  "In the sentence below, which option does _ correspond to? Reply only with one of the specified options and nothing else.

`Sentence`

Available options:
`Options`"][q]
Out[5]=

Verify:

In[6]:=
q["Answer"]
Out[6]=

Get a random sample of problems:

In[7]:=
problems = RandomSample[ResourceData["WinoGrande"], 10] // Normal;

Each sentence contains a "_", a blank that is meant to be filled in:

In[8]:=
Lookup[problems, "Sentence"]
Out[8]=

Each problem gives a set of multiple choice options:

In[9]:=
Lookup[problems, "Options"]
Out[9]=

The correct answer:

In[10]:=
Lookup[problems, "Answer"]
Out[10]=

Scope & Additional Elements (2) 

Get a larger version of the WinoGrande dataset:

In[11]:=
data = ResourceData["WinoGrande", "TrainingDatasetExtraLarge"]
Out[11]=
In[12]:=
Length[data]
Out[12]=

Get a test version of the WinoGrande dataset:

In[13]:=
ResourceData["WinoGrande", "TestDataset"]
Out[13]=

Analysis (5) 

Get a random sample of WinoGrande questions:

In[14]:=
q = Normal[RandomSample[ResourceData["WinoGrande"], 100]];
In[15]:=
text = StringTemplate[
    "In the sentence below, which option does _ correspond to? Reply only with one of the specified options and nothing else.

`Sentence`

Available options:
`Options`"] /@ q;
In[16]:=
correct = q[[All, "Answer"]]
Out[16]=

Check results using an older LLM:

In[17]:=
answers1 = LLMSynthesize[#, LLMEvaluator -> <|"Model" -> {"OpenAI", "gpt-3.5-turbo"}|>] & /@ text;
In[18]:=
MapThread[SameQ, {answers1, correct}] // Counts
Out[18]=

Compare with a more modern model:

In[19]:=
answers2 = LLMSynthesize[#, LLMEvaluator -> <|"Model" -> {"OpenAI", "gpt-4o"}|>] & /@ text
Out[19]=

Much better performance:

In[20]:=
MapThread[SameQ, {answers2, correct}] // Counts
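
The counts above can also be summarized as accuracy fractions for a direct comparison. A minimal sketch, reusing the answers1, answers2 and correct lists defined in the preceding cells (the helper name accuracy is illustrative, not part of the dataset):

```wl
accuracy[answers_] := N[Count[MapThread[SameQ, {answers, correct}], True]/Length[correct]];
{accuracy[answers1], accuracy[answers2]}
```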
Out[20]=

View a table comparing a sample of results:

In[21]:=
Style[TableForm[
   RandomSample[
    Transpose[{
      MapThread[If[#1 === #2, "✅ " <> #1, "❌ " <> #1] &, {answers1, correct}],
      MapThread[If[#1 === #2, "✅ " <> #1, "❌ " <> #1] &, {answers2, correct}],
      q[[All, "Sentence"]]}],
    10],
   TableHeadings -> {None, {"GPT-3.5", "GPT-4o", "Sentence"}}],
  "Text", FontSize -> 12]
Out[21]=

Wolfram Research, "WinoGrande" from the Wolfram Data Repository (2024)  

License Information

CC-BY

Data Resource History

Source Metadata

Publisher Information