Sample Tabular Data: Diamonds

A dataset containing the prices and other attributes of almost 54,000 diamonds

Details

The dataset records the price of almost 54,000 diamonds together with their typical attributes: carat weight, cut, color, and clarity. It also includes direct measurements in mm (length, width, and depth) and two relative measurements: depth (total depth percentage) and table (width of the top of the diamond relative to its widest point).
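
As a rough guide to the total depth percentage, the relation below assumes this dataset follows the convention of the widely used ggplot2 diamonds data; that is an assumption, since the description above does not state the formula. The symbols x, y, and z are hypothetical names for the length, width, and depth in mm, not confirmed column keys:

```latex
\text{depth}\,(\%) \;=\; 100 \times \frac{z}{\operatorname{mean}(x, y)} \;=\; 100 \times \frac{2z}{x + y}
```

Under that assumption, an illustrative stone measuring 4.0 mm by 4.0 mm with a depth of 2.4 mm would have a total depth percentage of 60.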

Examples

Basic Examples (3) 

In[1]:=
ResourceData["Sample Tabular Data: Diamonds"]
Out[1]=

Dimensions:

In[2]:=
Dimensions[ResourceData["Sample Tabular Data: Diamonds"]]
Out[2]=

Column keys and column descriptions:

In[3]:=
ResourceData["Sample Tabular Data: Diamonds", "ColumnKeys"]
Out[3]=
In[4]:=
ResourceData["Sample Tabular Data: Diamonds", "ColumnDescriptions"] // Dataset
Out[4]=

Column types:

In[5]:=
ResourceData["Sample Tabular Data: Diamonds", "ColumnTypes"]
Out[5]=

Scope & Additional Elements (4) 

Find the heaviest diamond in the data:

In[6]:=
MaximalBy[ResourceData["Sample Tabular Data: Diamonds"], "carat"]
Out[6]=

Find the most expensive diamond in the data:

In[7]:=
MaximalBy[ResourceData["Sample Tabular Data: Diamonds"], "price"]
Out[7]=

Compute the average price per carat for each color:

In[8]:=
SortBy[AggregateRows[ResourceData["Sample Tabular Data: Diamonds"], {"price" -> Function[Mean[#price/#carat]]}, "color"], "color"]
Out[8]=

Compute the average price per carat for each combination of the four Cs (color, cut, clarity, and carat) and sort in descending order by that average:

In[9]:=
ReverseSortBy[AggregateRows[ResourceData["Sample Tabular Data: Diamonds"], {"price" -> Function[Mean[#price/#carat]]}, {"color", "cut", "clarity", "carat"}], "price"]
Out[9]=

Create a pivot table for the average price per carat depending on color and clarity:

In[10]:=
PivotTable[ResourceData["Sample Tabular Data: Diamonds"], {"price" -> Function[Mean[#price/#carat]]}, {"clarity"}, {"color"}]
Out[10]=

Visualizations (6) 

Visualize the price as a function of weight:

In[11]:=
ListPlot[ResourceData["Sample Tabular Data: Diamonds"] -> {"carat", "price"}, AxesLabel -> {"carat", "price"}]
Out[11]=

Analyze how color is distributed across carat-price space:

In[12]:=
colorSorted = SortBy[ResourceData["Sample Tabular Data: Diamonds"], "color"];
In[13]:=
pts = QuantityMagnitude@
   FromTabular[colorSorted[[All, {"carat", "price"}]], "Matrix"];

To make the plot more readable, take a random sample of 500 points from the data:

In[14]:=
pos = Partition[Sort@RandomSample[Range[Length[pts]], 500], 1];
In[15]:=
points = Extract[pts, pos];
In[16]:=
colors = Extract[Normal[colorSorted[[All, "color"]]], pos];

Define the bounding rectangle for the carat-price points, using the full dataset rather than the sample:

In[17]:=
reg = Rectangle[Min /@ {pts[[All, 1]], pts[[All, 2]]}, Max /@ {pts[[All, 1]], pts[[All, 2]]}];

Create a SpatialPointData object with a "color" annotation:

In[18]:=
spd = SpatialPointData[points -> {"color" -> colors}, reg]
Out[18]=

Use PointValuePlot to visualize the diamond colors across carat-price space:

In[19]:=
PointValuePlot[spd, PlotStyle -> "StarryNightColors", PlotLegends -> Automatic]
Out[19]=

Gosia Konwerska, "Sample Tabular Data: Diamonds" from the Wolfram Data Repository (2025)  
