Wolfram Data Repository
Immediate Computable Access to Curated Contributed Data
A dataset consisting of recordings of spoken digits
The dataset contains 10,000 training and 1,000 test recordings in 10 classes, one for each spoken digit from 0 to 9, from a total of 997 speakers. It is a subset of the Speech Commands Dataset v0.01 released by Google, selected so that no speaker appears in both the training and test sets.
Retrieve a sample of the training dataset:
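The input for this step is not preserved as text here. A minimal sketch of such a call, assuming the resource exposes a content element named "TrainingData" and using an arbitrary sample size of 5 (both are assumptions, not confirmed by this page):

```wolfram
(* Assumption: "TrainingData" is the content element holding the training examples *)
trainingData = ResourceData["Spoken Digit Commands", "TrainingData"];

(* Draw 5 examples at random, without replacement *)
trainSample = RandomSample[trainingData, 5]
```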
Retrieve a sample of the test dataset:
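A comparable sketch for the test split, under the assumption that the corresponding content element is named "TestData" (the sample size of 5 is again arbitrary):

```wolfram
(* Assumption: "TestData" is the content element holding the test examples *)
testData = ResourceData["Spoken Digit Commands", "TestData"];

(* Draw 5 examples at random, without replacement *)
testSample = RandomSample[testData, 5]
```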
Select an Audio object from the dataset:
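One way this selection might be written, assuming each example is stored as a rule of the form audio -> digit (an assumption about the data layout, as is the "TrainingData" element name):

```wolfram
(* Assumption: "TrainingData" is a list of rules audio -> digit *)
trainingData = ResourceData["Spoken Digit Commands", "TrainingData"];

(* Pick one example at random and keep the left-hand side of the rule *)
example = RandomChoice[trainingData];
audio = First[example]
```

If the examples turn out to be stored in a different shape, only the extraction step (First) would need to change.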
Visualize the waveform:
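A sketch using the built-in AudioPlot; the way the Audio object is obtained here (a "TrainingData" element of audio -> digit rules) is an assumption rather than something this page confirms:

```wolfram
(* Assumption: examples are rules audio -> digit in the "TrainingData" element *)
audio = First[RandomChoice[ResourceData["Spoken Digit Commands", "TrainingData"]]];

(* Plot amplitude against time *)
AudioPlot[audio]
```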
Visualize the spectrum:
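The function used on the original page is not recoverable; Periodogram, which plots the power spectrum of a signal and accepts Audio objects, is one plausible choice. The data access below rests on the same assumed "TrainingData" layout of audio -> digit rules:

```wolfram
(* Assumption: examples are rules audio -> digit in the "TrainingData" element *)
audio = First[RandomChoice[ResourceData["Spoken Digit Commands", "TrainingData"]]];

(* Plot the power spectrum of the recording *)
Periodogram[audio]
```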
Visualize the spectrogram:
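A sketch using the built-in Spectrogram with default settings; as before, the "TrainingData" element name and the audio -> digit rule layout are assumptions:

```wolfram
(* Assumption: examples are rules audio -> digit in the "TrainingData" element *)
audio = First[RandomChoice[ResourceData["Spoken Digit Commands", "TrainingData"]]];

(* Plot frequency content over time *)
Spectrogram[audio]
```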
Wolfram Research, "Spoken Digit Commands" from the Wolfram Data Repository (2018)
Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)