SWEETLEAD Molecule Database

A cheminformatics database of medicines, drugs, and herbal isolates

Details

A curated set of chemical structures with known therapeutic effects.
Information was gathered from publicly available databases, and a consensus-building scheme was used to combine entries and arrive at the correct chemical structure for each drug compound.
A set of over 4,200 unique molecules, expanded to over 9,100 Entity objects after undefined stereocenters were enumerated.
Contains chemical structures, computed 3D geometries, external identifiers, synonyms, and information from regulatory agencies.

Examples

Basic Examples (6) 

Retrieve and load the entity store:

In[1]:=
EntityRegister[ResourceData["SWEETLEAD Molecule Database"]]
Out[1]=

Find out the number of molecules in the set:

In[2]:=
EntityValue["SWEETLEAD", "EntityCount"]
Out[2]=

Find out the available properties:

In[3]:=
EntityProperties["SWEETLEAD"]
Out[3]=

Each entity's canonical name is its index in the set. Entities display as their 2D structure diagrams:

In[4]:=
ent = Entity["SWEETLEAD", "2243"]
Out[4]=

Retrieve all the property values for this entity as an Association:

In[5]:=
EntityValue[ent, "PropertyAssociation"]
Out[5]=

Retrieve the Molecule for an entity and plot it in 3D:

In[6]:=
mol = EntityValue[Entity["SWEETLEAD", "800"], "Molecule"]
Out[6]=
In[7]:=
MoleculePlot3D[mol]
Out[7]=

Scope & Additional Elements (4) 

The "OfficialNames" property records the approvals for each compound, keyed by approval type. Count how often each key occurs:

In[8]:=
EntityRegister[ResourceData["SWEETLEAD Molecule Database"]];
Counts[Flatten[Keys@EntityValue["SWEETLEAD", "OfficialNames"]]]
Out[8]=

Create a FilteredEntityClass of those compounds approved in China:

In[9]:=
chinaApproved = FilteredEntityClass["SWEETLEAD", EntityFunction[c, KeyExistsQ[c["OfficialNames"], "China Approved Name"]]];
EntityValue[chinaApproved, "EntityCount"]
Out[10]=

Use MoleculePlot to make a grid of labeled structure diagrams:

In[11]:=
Grid[Partition[
  MoleculePlot[#["Molecule"], PlotLabel -> #["OfficialNames"]["China Approved Name"]] & /@ RandomEntity[chinaApproved, 9], 3]]
Out[11]=

For an idea of the size of the molecules in the set, make a histogram of their atom counts:

In[12]:=
Histogram[
 MoleculeValue[EntityValue["SWEETLEAD", "Molecule"], "AtomCount"]]
Out[12]=

Visualizations (2) 

Entries that share a "DatabaseID" derive from a single compound whose stereochemistry was undefined. Find enantiomer pairs by selecting the IDs that have exactly two entries:

In[13]:=
EntityRegister[ResourceData["SWEETLEAD Molecule Database"]];
enantiomers = Select[
   GroupBy[EntityList["SWEETLEAD"],
    EntityProperty["SWEETLEAD", "DatabaseID"] -> EntityProperty["SWEETLEAD", "Molecule"]],
   Length[#] === 2 &];

View two enantiomers in 2D and 3D:

In[14]:=
GraphicsGrid[
 Through[{Map@MoleculePlot, Map@MoleculePlot3D}@enantiomers[[22]]]]
Out[14]=

Find all the compounds containing a furan group:

In[15]:=
furanPattern = MoleculePattern["c1ccoc1"];
furans = FilteredEntityClass["SWEETLEAD", EntityFunction[e, MoleculeContainsQ[e["Molecule"], furanPattern]]];
EntityValue[furans, "EntityCount"]
Out[16]=

There were 128 matches. Make 3D plots of 4 of them, highlighting the furan group:

In[17]:=
MoleculePlot3D[#["Molecule"], furanPattern] & /@ RandomEntity[furans, 4]
Out[17]=

Filter the entities further by selecting the furan-containing species with an FDA approval:

In[18]:=
furansFDA = FilteredEntityClass[furans, EntityFunction[e, KeyExistsQ[e["OfficialNames"], "FDA Approved Drug"]]];
EntityValue[furansFDA, "EntityCount"]
Out[19]=

Visualize them in labeled 2D structure diagrams:

In[20]:=
Grid@Partition[
  MoleculePlot[#["Molecule"], furanPattern, PlotLabel -> Style[#["OfficialNames"]["FDA Approved Drug"], 12], ImageSize -> 200] & /@ RandomEntity[furansFDA, 4], 2]
Out[20]=

Visualize molecular shape using ConvexHullMesh:

In[21]:=
EntityRegister[ResourceData["SWEETLEAD Molecule Database"]];
Table[
  mol = RandomEntity["SWEETLEAD"]["Molecule"];
  Show[
   HighlightMesh[
    ConvexHullMesh[QuantityMagnitude[mol["AtomCoordinates"]]],
    Style[2, Opacity[0.5, RandomColor[]]], ImageSize -> 100],
   MoleculePlot3D[mol, PlotTheme -> "Tubes"]],
  {3}, {4}] // Grid
Out[21]=

Analysis (2) 

Find the five most "synthetically accessible" compounds:

In[22]:=
EntityRegister[ResourceData["SWEETLEAD Molecule Database"]];
mols = TakeSmallestBy[EntityValue["SWEETLEAD", "Molecule"], MoleculeProperty["SyntheticAccessibilityScore"], 5]
Out[22]=

Use ToEntity to find these in the Wolfram Knowledgebase:

In[23]:=
ToEntity /@ %
Out[23]=

JasonB, "SWEETLEAD Molecule Database" from the Wolfram Data Repository (2020)  

License Information

The data is licensed under the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/3.0/).
