Sample Tabular Data: NYC Trees

2015 Street Tree Census in New York City

Details

The 2015 NYC Street Tree Census was conducted by volunteers and staff organized by NYC Parks & Recreation and partner organizations. The collected data includes tree species, diameter, perceived health, and location.

Examples

Basic Examples (3) 

In[1]:=
ResourceData["Sample Tabular Data: NYC Trees"]
Out[1]=

Column keys and types:

In[2]:=
ResourceData["Sample Tabular Data: NYC Trees", "ColumnKeys"]
Out[2]=
In[3]:=
ResourceData["Sample Tabular Data: NYC Trees", "ColumnTypes"]
Out[3]=

Full tabular structure:

In[4]:=
TabularStructure[ResourceData["Sample Tabular Data: NYC Trees"], All, All]
Out[4]=

Column descriptions:

In[5]:=
ResourceData["Sample Tabular Data: NYC Trees", "ColumnDescriptions"] // Dataset
Out[5]=

Scope & Additional Elements (2) 

Plot a histogram of tree diameters at breast height:

In[6]:=
Histogram[ResourceData["Sample Tabular Data: NYC Trees"] -> "tree_dbh", AxesLabel -> {"in"}]
Out[6]=

Compute the mean diameter at breast height for each tree species and sort in descending order of the mean:

In[7]:=
meandbh = ReverseSortBy[AggregateRows[ResourceData["Sample Tabular Data: NYC Trees"], "meandbh" -> Function[Mean[#"tree_dbh"]], "spc_common"], "meandbh"]
Out[7]=
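
The grouping step can be tried on a small hand-made table without downloading the census. This is a minimal sketch: the name toy and its three rows are invented for illustration and are not census values, and it assumes that Tabular accepts a list of associations as rows.

```wolfram
(* toy is a hypothetical three-row table; the values are made up, not census data *)
toy = Tabular[{
    <|"spc_common" -> "oak", "tree_dbh" -> 10|>,
    <|"spc_common" -> "oak", "tree_dbh" -> 20|>,
    <|"spc_common" -> "maple", "tree_dbh" -> 6|>}];

(* group rows by species, average the diameters within each group,
   then order the groups from largest to smallest mean *)
ReverseSortBy[
  AggregateRows[toy, "meandbh" -> Function[Mean[#"tree_dbh"]], "spc_common"],
  "meandbh"]
```

The two oak rows average to 15 and the single maple row to 6, so the result should contain one row per species with oak listed first.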

Compute the ratio of each species mean to the average of the species means:

In[8]:=
TransformColumns[meandbh, "ratio" -> Function[#meandbh/ColumnwiseValue[Mean[#meandbh]]]]
Out[8]=

Select the health and species columns, discard every row that contains a missing value (an empty string), and sort by species name:

In[9]:=
health = SortBy[Discard[ResourceData["Sample Tabular Data: NYC Trees"][All, {"spc_common", "health"}], Count[#, ""] > 0 &], "spc_common"]
Out[9]=
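
The predicate Count[#, ""] > 0 & is applied to each row and is true when at least one value in that row is an empty string, so Discard drops the row. A minimal sketch on a made-up table shows the effect; toyHealth and its values are hypothetical, and the sketch again assumes that Tabular accepts a list of associations.

```wolfram
(* toyHealth is a hypothetical table; the middle row has empty strings in both columns *)
toyHealth = Tabular[{
    <|"spc_common" -> "oak", "health" -> "Good"|>,
    <|"spc_common" -> "", "health" -> ""|>,
    <|"spc_common" -> "maple", "health" -> "Fair"|>}];

(* drop rows containing at least one empty string, then sort by species name *)
SortBy[Discard[toyHealth, Count[#, ""] > 0 &], "spc_common"]
```

Only the oak and maple rows should remain, since the middle row is the only one that contains an empty string.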

Tally the health conditions for each species:

In[10]:=
status = PivotTable[health, Function[Length[#health]], "spc_common", "health",
   IncludeGroupAggregates -> True]
Out[10]=

Visualize the health conditions across all species, using the totals row:

In[11]:=
PieChart[Normal[status[-1, 2 ;; -2]], ChartLabels -> ColumnKeys[status][[2 ;; 4]], ColorFunction -> "SandyTerrain", ImageSize -> Small]
Out[11]=

Visualizations (2) 

Visualize the tree species counts:

In[12]:=
counts = AggregateRows[ResourceData["Sample Tabular Data: NYC Trees"], "count" -> Function[Length[#"spc_common"]], "spc_common"]
Out[12]=

Remove rows with missing species names and sort by count:

In[13]:=
counts = SortBy[Discard[counts, Count[#, ""] > 0 &], "count"]
Out[13]=
In[14]:=
labels = Map[Style[# ~~ " ", 9] &, Normal[counts[All, "spc_common"]]];
In[15]:=
BarChart[counts -> "count", ChartLabels -> labels, LabelingFunction -> (Placed[Style[#, 8], After] &), BarOrigin -> Left, ColorFunction -> "RoseColors", AspectRatio -> 2.5]
Out[15]=

Analysis (5) 

Take a subset of columns including location information and remove rows with missing values:

In[16]:=
tab = Discard[ResourceData["Sample Tabular Data: NYC Trees"][
   All, {"tree_id", "spc_latin", "spc_common", "borough", "latitude", "longitude"}], Count[#, ""] > 0 &]
Out[16]=

Select all the magnolias:

In[17]:=
mags = Select[tab, StringContainsQ[#"spc_common", "magnolia"] &]
Out[17]=

Count the number of each magnolia species in each borough:

In[18]:=
PivotTable[mags, Function[Length[#"spc_common"]], "spc_common", "borough"]
Out[18]=

Include summary counts:

In[19]:=
PivotTable[mags, Function[Length[#"spc_common"]], "spc_common", "borough", IncludeGroupAggregates -> True]
Out[19]=

Visualize the magnolia tree locations:

In[20]:=
GeoGraphics[{Purple, PointSize[Small], Point@GeoPosition@
    FromTabular[mags[All, {"latitude", "longitude"}], "Matrix"]}]
Out[20]=

Gosia Konwerska, "Sample Tabular Data: NYC Trees" from the Wolfram Data Repository (2024)  