GDB-9 Database

Database of molecular quantum calculations

Computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of carbon, hydrogen, nitrogen, oxygen, and fluorine

(1 entity type, 133885 entities, 39 properties)

Examples

Basic Examples

Retrieve the resource:

In[1]:=
ResourceObject["GDB-9 Database"]
Out[1]=

Retrieve the default content:

In[2]:=
ResourceData["GDB-9 Database"]
Out[2]=

Add the entity store to the global list of entity stores:

In[3]:=
PrependTo[$EntityStores, ResourceData["GDB-9 Database"]]
Out[3]=

Analysis

Find the total number of entities:

In[4]:=
EntityValue["GDB9Chemical", "EntityCount"]
Out[4]=

Get the available properties for the entity store:

In[5]:=
EntityProperties["GDB9Chemical"]
Out[5]=

View the property descriptions in a Dataset:

In[6]:=
Dataset[<|# -> #["Description"] & /@ EntityProperties["GDB9Chemical"]|>]
Out[6]=

Quickly view the properties of a random entity:

In[7]:=
RandomEntity["GDB9Chemical"]["Dataset"]
Out[7]=
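
To compare a few properties across several entities instead of inspecting one, EntityValue can return a Dataset directly. The following is a minimal sketch, assuming the entity store has been registered as shown above and that RandomEntity accepts a count for this entity type. The property names are the ones used elsewhere on this page, and the count of 3 is arbitrary.

```wolfram
(* Three random entities; the count 3 is an arbitrary, illustrative choice *)
entities = RandomEntity["GDB9Chemical", 3];

(* Rows are entities, columns are the requested properties *)
EntityValue[entities, {"NAtoms", "HOMOLUMOGap", "ZeroPointEnergy"}, "Dataset"]
```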

Visualization

Make a scatter plot of the HOMO-LUMO gap against the internal energy calculated at absolute zero:

In[8]:=
ListPlot[
 EntityValue[
  RandomSample[EntityList["GDB9Chemical"]],
  {EntityProperty["GDB9Chemical", "InternalEnergy0K"],
   EntityProperty["GDB9Chemical", "HOMOLUMOGap"]}],
 Frame -> True]
Out[8]=

Make a scatter plot of the zero-point energy against the number of atoms:

In[9]:=
ListPlot[
 EntityValue[
  RandomSample[EntityList["GDB9Chemical"]], {"NAtoms", 
   "ZeroPointEnergy"}],
 Frame -> True]
Out[9]=
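
Both plots above pass every entity to EntityValue, because RandomSample without a count only shuffles the list. A minimal sketch of plotting a smaller random subset with labeled axes is given below. The sample size of 1000 and the label text are illustrative choices, not part of the original example, and the entity store is assumed to be registered as shown earlier.

```wolfram
(* Illustrative subset size; adjust as needed *)
sample = RandomSample[EntityList["GDB9Chemical"], 1000];

(* Each row is {NAtoms, ZeroPointEnergy} for one entity *)
ListPlot[
 EntityValue[sample, {"NAtoms", "ZeroPointEnergy"}],
 Frame -> True,
 FrameLabel -> {"Number of atoms", "Zero-point energy"}]
```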

Use the Wolfram Language graph functions to count the distinct cycles (rings) in each molecule's structure graph, then display the counts in a histogram:

In[10]:=
structureGraph = Graph[UndirectedEdge @@@ #["EdgeRules"]] &;
nRings = Length@FindCycle[
     structureGraph@#,
     Infinity, All
     ] &;
ringsData = # -> nRings[#] & /@ EntityList["GDB9Chemical"] // 
   Apply[Association];
Histogram[ringsData]
Out[10]=

Some of the molecules have as many as 14 rings. View the 3D structures of the five molecules with the most rings:

In[11]:=
EntityValue[Keys@TakeLargest[ringsData, 5], "MoleculePlot"]
Out[11]=
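
FindCycle with the All option enumerates every simple cycle, so fused ring systems can yield more cycles than independent rings. As a hedged illustration, the sketch below contrasts the two counts on a small toy graph (two triangles sharing an edge), which is not a database entity. The independent-ring count uses the cyclomatic number, edges minus vertices plus connected components. The names toy, allCycles, and independentRings are introduced here for illustration only.

```wolfram
(* Toy graph, not from the database: two triangles sharing the edge 2-3 *)
toy = Graph[UndirectedEdge @@@ {{1, 2}, {2, 3}, {3, 1}, {2, 4}, {4, 3}}];

(* Count of all simple cycles, the same approach as nRings above.
   By inspection this graph has three: two triangles and the outer 4-cycle. *)
allCycles = Length@FindCycle[toy, Infinity, All]

(* Cyclomatic number: 5 edges - 4 vertices + 1 component = 2 independent rings *)
independentRings =
 EdgeCount[toy] - VertexCount[toy] + Length@ConnectedComponents[toy]
```

The same expression could be applied to structureGraph@# in place of the FindCycle count if independent rings are the quantity of interest.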

Make a histogram of the maximum graph distance between any two atoms in each molecule:

In[12]:=
maxPathLength = Max@Flatten@GraphDistanceMatrix[structureGraph@#] &;
maxPathLengthData = # -> maxPathLength[#] & /@ 
    EntityList["GDB9Chemical"] // Apply[Association];
Histogram[maxPathLengthData]
Out[12]=

Connect to PubChem to find all the names for a given entity (if it is in the PubChem database), using its SMILES string as the search parameter:

In[13]:=
smi = Entity["GDB9Chemical", "GDB128993"]["SMILES"]
ServiceExecute["PubChem", "CompoundSynonyms", {"SMILES" -> smi}]
Out[13]=
Out[14]=

Wolfram Research, "GDB-9 Database" from the Wolfram Data Repository. (2017) https://doi.org/10.24097/wolfram.99752.data

License Information

CC0 waiver

Source Metadata