Syllabified English Words


More than 76,000 English words with syllable breaks

Details

Keys are the words; values are lists of the form {ortho, phon}:
ortho: the word as spelled in English, with "•" at each syllable break
phon: the phonetic representation, with "•" at each syllable break
Syllabification follows the Maximal Onset Principle (co•nnect) except where pronunciation is strongly influenced by morphology (dis•arm).
Phonetic transcription follows the style of WordData[w,"PhoneticForm"], in which primary and secondary stress markers ( ˈ and ˌ ) occur immediately before the nucleus of the syllable.
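
As a rough sketch of how one record can be taken apart, assuming the data loads as an association of word to {ortho, phon} as described above (the variable names here are illustrative and not part of the resource):

```wolfram
(* Assumption: ResourceData returns an Association of word -> {ortho, phon} *)
data = ResourceData["Syllabified English Words"];

(* Unpack the two strings stored for a single word *)
{ortho, phon} = data["connect"];

(* Split each string at the syllable-break marker "•" (\[Bullet]) *)
orthoSyllables = StringSplit[ortho, "\[Bullet]"];
phonSyllables = StringSplit[phon, "\[Bullet]"];

(* Indices of the syllables that carry the primary stress marker "ˈ" *)
primaryStress = Flatten[Position[StringContainsQ[phonSyllables, "ˈ"], True]]
```

Because the stress marker sits before the nucleus rather than at the start of the syllable, testing with StringContainsQ is safer here than testing the first character of each syllable.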

Examples

Basic Examples (3) 

Retrieve a subset of the data:

In[1]:=
ResourceData["Syllabified English Words"][[1 ;; 10]]
Out[1]=

Find the length of the data:

In[2]:=
Length[ResourceData["Syllabified English Words"]]
Out[2]=

Retrieve one record from the data:

In[3]:=
ResourceData["Syllabified English Words"]["connect"]
Out[3]=

Scope & Additional Elements (2) 

Check how many words also belong to WordData:

In[4]:=
Intersection[WordData[], Keys[ResourceData["Syllabified English Words"]]] // Length
Out[4]=

Count the number of unique syllables regardless of stress:

In[5]:=
StringSplit[StringDelete[#[[2]], "ˈ" | "ˌ"], "\[Bullet]"] & /@ Values[ResourceData["Syllabified English Words"]] // Flatten // DeleteDuplicates // Length
Out[5]=

Visualizations (2) 

Visualize the syllable counts of all words in the data:

In[6]:=
sylCts = Length[StringSplit[#[[1]], "\[Bullet]"]] & /@ Values[ResourceData["Syllabified English Words"]];
Histogram[sylCts]
Out[7]=

Visualize the phoneme counts of all unique syllables in the data:

In[8]:=
syls = StringSplit[StringDelete[#[[2]], "ˈ" | "ˌ"], "\[Bullet]"] & /@ Values[ResourceData["Syllabified English Words"]] // Flatten // DeleteDuplicates;
Histogram[StringLength[syls]]
Out[9]=

Mark Greenberg, "Syllabified English Words" from the Wolfram Data Repository (2026)  
