Sample Tabular Data: New Mothers

Information on new mothers from 1978 through 1988 from NLSY79

Details

This data set contains information on 927 first-born children whose mothers chose to breast feed them and for whom all the variables of interest are complete. The sample was restricted to children born after 1978 whose gestational age was between 20 and 45 weeks. The response variable is the duration of breast feeding in weeks, paired with an indicator of whether breast feeding was completed (i.e., the infant was weaned). Explanatory variables for breast-feeding duration include race of the mother; poverty status indicator; smoking status of the mother; alcohol-drinking status of the mother; age of the mother at the child's birth; year of the child's birth; education of the mother (in years); and a lack-of-prenatal-care indicator (1 if the mother sought prenatal care after the third month or never sought prenatal care, 0 if the mother sought prenatal care in the first three months of pregnancy). The data were collected as part of the National Longitudinal Survey of Youth (NLSY79).

Examples

Basic Examples (4) 

Retrieve the data:

In[1]:=
ResourceData["Sample Tabular Data: New Mothers"]
Out[1]=

Data dimensions:

In[2]:=
Dimensions[ResourceData["Sample Tabular Data: New Mothers"]]
Out[2]=

Column keys and types:

In[3]:=
ResourceData["Sample Tabular Data: New Mothers", "ColumnKeys"]
Out[3]=
In[4]:=
ResourceData["Sample Tabular Data: New Mothers", "ColumnTypes"]
Out[4]=

Column descriptions:

In[5]:=
ResourceData["Sample Tabular Data: New Mothers", "ColumnDescriptions"] // Dataset
Out[5]=

Tabular structure of the data:

In[6]:=
TabularStructure[ResourceData["Sample Tabular Data: New Mothers"]]
Out[6]=

Scope & Additional Elements (3) 

Histogram of the breast feeding durations:

In[7]:=
Histogram[ResourceData["Sample Tabular Data: New Mothers"] -> "duration", AxesLabel -> Automatic]
Out[7]=

PieChart of the race distribution:

In[8]:=
rc = AggregateRows[ResourceData["Sample Tabular Data: New Mothers"], "count" -> Function[Length[#race]], "race"]
Out[8]=
In[9]:=
PieChart[Normal[rc[All, "count"]], ChartLabels -> (Style[#, Bold, 12] & /@ Normal[rc[All, "race"]]), PlotTheme -> "Business", ImageSize -> Small]
Out[9]=

Convert "birthyear" column to a date format:

In[10]:=
TransformColumns[ResourceData["Sample Tabular Data: New Mothers"], "birthyear" -> Function[DateObject[{1900 + #birthyear}, "Year"]]]
Out[10]=
In[11]:=
ColumnTypes[%]["birthyear"]
Out[11]=

Visualizations (2) 

Compute mean breast feeding duration grouped by mother's age:

In[12]:=
AggregateRows[ResourceData["Sample Tabular Data: New Mothers"], "MeanDuration" -> Function[Mean[#duration]], "motherage"]
Out[12]=

Sort by the mother's age:

In[13]:=
mda = Sort[%, "motherage"]
Out[13]=

Visualize the mean breast feeding duration as a function of mother's age:

In[14]:=
ListLinePlot[mda -> {"motherage", "MeanDuration"}, AxesLabel -> Automatic]
Out[14]=

Compute mean breast feeding duration grouped by year of birth of child and alcohol use:

In[15]:=
PivotTable[ResourceData["Sample Tabular Data: New Mothers"], "MeanDuration" -> Function[Mean[#duration]], "birthyear", "alcohol"]
Out[15]=

Sort by the birth year:

In[16]:=
mdb = Sort[%, "birthyear"]
Out[16]=

Visualize the mean breast feeding durations as a function of child birth year and alcohol use:

In[17]:=
ListLinePlot[
 mdb -> {{"birthyear", ExtendedKey["MeanDuration", True]}, {"birthyear", ExtendedKey["MeanDuration", False]}}, AxesLabel -> {"year", Automatic}, PlotLegends -> {"alcohol use", "no alcohol use"}]
Out[17]=

Analysis (4) 

Compute how many new mothers smoked and/or used alcohol at the time of the child's birth:

In[18]:=
pt = PivotTable[ResourceData["Sample Tabular Data: New Mothers"], {"smoked" -> (Length[#smoke] &)}, "alcohol", "smoke", IncludeGroupAggregates -> True]
Out[18]=

The resulting pivot table has a key column and therefore has row keys:

In[19]:=
Keys[pt]
Out[19]=

Use RowKey and ExtendedKey to extract values from the pivot table:

In[20]:=
allSmoked = pt[RowKey["All"], ExtendedKey["smoked", True]]
Out[20]=
In[21]:=
allAlcohol = pt[RowKey[True], ExtendedKey["smoked", "All"]]
Out[21]=
In[22]:=
allSmokedAndAlcohol = pt[RowKey[True], ExtendedKey["smoked", True]]
Out[22]=

Create a Venn diagram:

In[23]:=
smoke = Disk[{0.5, 0}];
alcohol = Disk[{-0.5, 0}];
subsets = Subsets[{smoke, alcohol}, {1, 2}];
In[24]:=
subsetscolors = Function[cd, {cd[1], cd[2], Blend[{cd[1], cd[2]}]}][ColorData[93]];
labels = (Placed[Style[Column[#, Alignment -> Center], 16, Bold], Center, Background -> None] & /@ {{"smoke", allSmoked}, {"alcohol", allAlcohol}, {"alcohol", "and", "smoke",
       allSmokedAndAlcohol}});
In[25]:=
RegionPlot[
 Evaluate[
  DiscretizeRegion[
     RegionDifference[BooleanRegion[And, #], BooleanRegion[Or, Complement[{smoke, alcohol, EmptyRegion[2]}, #]]]] & /@ subsets], PlotLabels -> labels, PlotStyle -> subsetscolors, BoundaryStyle -> Directive[White, Thickness[.0085]], Frame -> False, PerformanceGoal -> "Speed", AspectRatio -> .7]
Out[25]=

Gosia Konwerska, "Sample Tabular Data: New Mothers" from the Wolfram Data Repository (2025)  
