SWEETLEAD Molecule Database


A cheminformatics database of medicines, drugs, and herbal isolates

Details

A curated set of chemical structures with known therapeutic effects.
Information was gathered from publicly available databases, and a consensus-building scheme was used to combine entries and arrive at the correct chemical structure for each drug compound.
The set holds over 4,200 unique molecules, expanded to over 9,100 Entity objects after undefined stereocenters were enumerated.
Each entry contains a chemical structure, a computed 3D geometry, external identifiers, synonyms, and information from regulatory agencies.

Examples

Basic Examples

Retrieve and load the entity store:

In[1]:=
EntityRegister[ResourceData[
ResourceObject["SWEETLEAD Molecule Database"]]]
Out[1]=

Find out the number of molecules in the set:

In[2]:=
EntityValue["SWEETLEAD", "EntityCount"]
Out[2]=

Find out the available properties:

In[3]:=
EntityProperties["SWEETLEAD"]
Out[3]=

Each entity's canonical name is its index in the set. Entities are displayed as their 2D structure diagrams:

In[4]:=
ent = Entity["SWEETLEAD", "2243"]
Out[4]=

Retrieve all the property values for this entity as an Association:

In[5]:=
EntityValue[ent, "PropertyAssociation"]
Out[5]=

Retrieve the Molecule for an entity and plot it in 3D:

In[6]:=
mol = EntityValue[Entity["SWEETLEAD", "800"], "Molecule"]
Out[6]=
In[7]:=
MoleculePlot3D[mol]
Out[7]=
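
Once a Molecule has been retrieved, its computed properties can be queried with MoleculeValue. The sketch below is only illustrative: to keep it runnable without downloading the resource, it assumes a stand-in molecule (aspirin, built from a SMILES string) rather than an actual database entry. The same calls should apply to any Molecule returned by EntityValue.

```wolfram
(* Assumption: aspirin built from SMILES stands in for a database molecule *)
aspirin = Molecule["CC(=O)Oc1ccccc1C(=O)O"];

(* Request several computed properties in one call *)
props = MoleculeValue[aspirin, {"MolecularFormula", "MolecularMass", "AtomCount"}]

(* A single property can also be read directly from the molecule *)
aspirin["AtomCount"]
```

Passing a list of property names returns the values in the same order, which is convenient when tabulating several molecules at once.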

Scope & Additional Elements


The "OfficialNames" property is an association whose keys identify the approvals a compound has received. Count how often each key occurs:

In[8]:=
Counts[Flatten[Keys@EntityValue["SWEETLEAD", "OfficialNames"]]]
Out[8]=

Create a FilteredEntityClass of those compounds approved in China:

In[9]:=
chinaApproved = FilteredEntityClass["SWEETLEAD", EntityFunction[c, KeyExistsQ[c["OfficialNames"], "China Approved Name"]]];
EntityValue[chinaApproved, "EntityCount"]
Out[10]=

Use MoleculePlot to make a grid of labeled structure diagrams:

In[11]:=
Grid[Partition[
  MoleculePlot[#["Molecule"], PlotLabel -> #["OfficialNames"]["China Approved Name"]] & /@ RandomEntity[chinaApproved, 9], 3]]
Out[11]=

For an idea of the size of the molecules in the set, make a histogram of their atom counts:

In[12]:=
Histogram[
 MoleculeValue[EntityValue["SWEETLEAD", "Molecule"], "AtomCount"]]
Out[12]=

Visualizations

Entries that share a "DatabaseID" come from a single compound with undefined stereochemistry. Find enantiomer pairs by grouping the entities by ID and keeping the groups that contain exactly two molecules:

In[13]:=
enantiomers = Select[GroupBy[EntityList["SWEETLEAD"], EntityProperty["SWEETLEAD", "DatabaseID"] -> EntityProperty["SWEETLEAD", "Molecule"]], Length[#] === 2 &];
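
The grouping idiom used here, GroupBy with a key -> value rule followed by Select on the group size, can be checked on toy data. This is a minimal sketch: the records, IDs, and labels below are made up for illustration and are not taken from the database.

```wolfram
(* Hypothetical records: "id" plays the role of "DatabaseID", "mol" of "Molecule" *)
records = {
   <|"id" -> "17", "mol" -> "mol-a"|>,
   <|"id" -> "17", "mol" -> "mol-b"|>,
   <|"id" -> "42", "mol" -> "mol-c"|>
   };

(* Group by id, keeping only the "mol" value of each record *)
groups = GroupBy[records, Key["id"] -> Key["mol"]];

(* Keep only the ids shared by exactly two records *)
pairs = Select[groups, Length[#] === 2 &]
(* <|"17" -> {"mol-a", "mol-b"}|> *)
```

The result is an Association keyed by ID, which is why the pairs can later be picked out by position with Part.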

View two enantiomers in 2D and 3D:

In[14]:=
GraphicsGrid[
 Through[{Map@MoleculePlot, Map@MoleculePlot3D}@enantiomers[[22]]]]
Out[14]=

Find all the compounds containing a furan group:

In[15]:=
furanPattern = MoleculePattern["c1ccoc1"];
furans = FilteredEntityClass["SWEETLEAD", EntityFunction[e, MoleculeContainsQ[e["Molecule"], furanPattern]]];
EntityValue[furans, "EntityCount"]
Out[16]=

This gives 128 matches. Make 3D plots of four of them, highlighting the furan group:

In[17]:=
MoleculePlot3D[#["Molecule"], furanPattern] & /@ RandomEntity[furans, 4]
Out[17]=
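
The substructure test behind this filter can be tried on standalone molecules. The following is a small sketch, assuming two molecules built from SMILES strings (furfural and benzene) that are chosen here only for illustration and are not drawn from the database.

```wolfram
(* Same SMARTS pattern as in the furan search *)
furanPattern = MoleculePattern["c1ccoc1"];

(* Assumption: illustrative molecules built from SMILES strings *)
furfural = Molecule["O=Cc1ccco1"]; (* contains a furan ring *)
benzene = Molecule["c1ccccc1"];    (* no oxygen in the ring *)

(* Test each molecule for the furan substructure *)
matches = MoleculeContainsQ[#, furanPattern] & /@
  <|"furfural" -> furfural, "benzene" -> benzene|>
```

Furfural should match and benzene should not, since the pattern requires an aromatic five-membered ring containing oxygen.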

Filter the entities further by selecting the furan-containing species with an FDA approval:

In[18]:=
furansFDA = FilteredEntityClass[furans, EntityFunction[e, KeyExistsQ[e["OfficialNames"], "FDA Approved Drug"]]];
EntityValue[furansFDA, "EntityCount"]
Out[19]=

Visualize them in labeled 2D structure diagrams:

In[20]:=
Grid@Partition[
  MoleculePlot[#["Molecule"], furanPattern, PlotLabel -> Style[#["OfficialNames"]["FDA Approved Drug"], 12], ImageSize -> 200] & /@ RandomEntity[furansFDA, 4], 2]
Out[20]=

Visualize molecular shape using ConvexHullMesh:

In[21]:=
Table[
  mol = RandomEntity["SWEETLEAD"]["Molecule"];
  Show[HighlightMesh[
    ConvexHullMesh[QuantityMagnitude[mol["AtomCoordinates"]]], Style[2, Opacity[0.5, RandomColor[]]], ImageSize -> 100], MoleculePlot3D[mol, PlotTheme -> "Tubes"]], {3}, {4}] // Grid
Out[21]=

Analysis

Find the five most "synthetically accessible" compounds, that is, those with the lowest synthetic accessibility scores:

In[22]:=
mols = TakeSmallestBy[EntityValue["SWEETLEAD", "Molecule"], MoleculeProperty["SyntheticAccessibilityScore"], 5]
Out[22]=

Use ToEntity to find these in the Wolfram Knowledgebase:

In[23]:=
ToEntity /@ %
Out[23]=

JasonB, "SWEETLEAD Molecule Database" from the Wolfram Data Repository (2020)  

License Information

The data is under the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/3.0/).
