Wolfram Research

Repurposing Therapeutics for COVID-19


Vina docking scores for drug molecules with the S-protein of SARS-CoV-2 and the human ACE2 receptor

Details

Molecules were tested for their binding affinity to a computational model of the spike protein (S-protein) of SARS-CoV-2, and to the S-protein interfaced with the human ACE2 receptor.
Test molecules were taken from the SWEETLEAD dataset of chemical structures with known therapeutic effects.

Examples

Basic Examples

Load the data:

In[1]:=
dockingScores = ResourceData[ResourceObject["Repurposing Therapeutics for COVID-19"]];
dockingScores[[;; 3]]
Out[2]=

The data consists of a set of Vina scores for the molecules from the SWEETLEAD dataset. The lower the score, the better the molecule is able to dock with either the isolated S-protein or the protein-receptor interface.

To view the data in connection with the underlying molecules, begin by loading the SWEETLEAD dataset as an EntityStore:

In[3]:=
EntityRegister[
 ResourceData[ResourceObject["SWEETLEAD Molecule Database"]]]
Out[4]=

Now add the Vina docking scores as properties for each entity:

In[5]:=
addPropertiesToEntity[entity_, properties_Association] := KeyValueMap[(Entity["SWEETLEAD", entity][#1] = #2) &, properties]
KeyValueMap[addPropertiesToEntity, ResourceData[ResourceObject["Repurposing Therapeutics for COVID-19"]]];
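
The nested KeyValueMap pattern can be tried on a small stand-in before touching the entity store. The sketch below assumes the resource data is an association keyed by molecule name whose values are associations of score properties. The molecule names and score values in toyScores are made up for illustration, and the helper flattenProperties is hypothetical.

```wolfram
(* Hypothetical stand-in for the resource data: name -> <|property -> score|> *)
toyScores = <|
   "moleculeA" -> <|"IsolatedSProteinVinaScore" -> -7.1, 
     "InterfaceDockingVinaScore" -> -6.4|>,
   "moleculeB" -> <|"IsolatedSProteinVinaScore" -> -5.8, 
     "InterfaceDockingVinaScore" -> -6.9|>
   |>;

(* Same argument shape as addPropertiesToEntity, but it returns
   {name, property, value} triples instead of assigning entity properties *)
flattenProperties[name_, properties_Association] := 
  KeyValueMap[{name, #1, #2} &, properties]

(* The outer KeyValueMap walks the molecules; the inner one walks each molecule's properties *)
Catenate[KeyValueMap[flattenProperties, toyScores]]
```

Each triple corresponds to one assignment that addPropertiesToEntity would make on the registered entity store.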

Verify that the docking scores have been added as EntityProperty objects:

In[6]:=
EntityProperties["SWEETLEAD"]
Out[6]=

Now find the three entities with the lowest (best) Vina score for the isolated S-protein:

In[7]:=
TakeSmallestBy[EntityList["SWEETLEAD"], EntityProperty["SWEETLEAD", "IsolatedSProteinVinaScore"], 3]
Out[7]=
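
Because lower Vina scores indicate better docking, TakeSmallestBy is a natural selector here. A minimal sketch of the same ranking on a made-up list of name and score pairs (the names and values are illustrative assumptions, not entries from the dataset):

```wolfram
(* Hypothetical {name, score} pairs *)
toyRanking = {{"moleculeA", -7.1}, {"moleculeB", -5.8}, 
   {"moleculeC", -8.3}, {"moleculeD", -6.0}};

(* Rank by the score in the last position and keep the two lowest, in ascending order *)
TakeSmallestBy[toyRanking, Last, 2]
(* {{"moleculeC", -8.3}, {"moleculeA", -7.1}} *)
```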

Create labeled structure diagrams from these molecules:

In[8]:=
Framed@Labeled[MoleculePlot[#["Molecule"]], Style[Row[{"Vina docking score: ", #@
        EntityProperty["SWEETLEAD", "IsolatedSProteinVinaScore"]}]]] & /@ %
Out[8]=

Find the compound with the best docking score for the protein-receptor interface:

In[9]:=
bestInterfaceEntity = First@TakeSmallestBy[EntityList["SWEETLEAD"], EntityProperty["SWEETLEAD", "InterfaceDockingVinaScore"], 1]
Out[9]=

Retrieve all properties from this Entity:

In[10]:=
bestInterfaceEntity["PropertyAssociation"]
Out[10]=

Search PubChem for similar molecules, using a Tanimoto similarity score of 99% or greater as a threshold:

In[11]:=
similar = ResourceFunction["PubChemSimilaritySearch"][bestInterfaceEntity, "TanimotoThreshold" -> 99]
Out[25]=
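
For context, the Tanimoto coefficient of two feature sets is the size of their intersection divided by the size of their union, so a threshold of 99% keeps only near-identical structures. The sketch below illustrates the formula on small integer sets standing in for fingerprint bits. The helper tanimoto and both sets are hypothetical, and PubChem's own fingerprints are not reproduced here.

```wolfram
(* Tanimoto coefficient of two sets: shared features / all distinct features *)
tanimoto[a_List, b_List] := Length[Intersection[a, b]]/Length[Union[a, b]]

(* Hypothetical "on" bit positions of two fingerprints *)
fingerprintA = {1, 2, 3, 5};
fingerprintB = {2, 3, 5, 8};

tanimoto[fingerprintA, fingerprintB]  (* 3 shared of 5 distinct features: 3/5 *)
```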

Visualize the entity along with the similar compounds:

In[26]:=
Column[{Framed[MoleculePlot[bestInterfaceEntity["Molecule"]]], 
  GraphicsGrid[Partition[MoleculePlot@*Molecule /@ similar, 3], ImageSize -> 300]}, 
 Alignment -> Center]
Out[26]=

Analysis


Create a PredictorFunction to predict the Vina score based on topological descriptors. First get a list of entities with scores for docking to the isolated S-protein:

In[27]:=
entities = Select[EntityList[
    "SWEETLEAD"], ! MissingQ[#[
       EntityProperty["SWEETLEAD", "IsolatedSProteinVinaScore"]]] &];
Length@entities
Out[28]=

Using a list of properties that return numeric values for features, prepare a set of labeled data to work with:

In[29]:=
molFeatures[e_] := MoleculeValue[e["Molecule"], {
    "AliphaticCarbocycleCount", "AliphaticHeterocycleCount", "AliphaticRingCount", 
    "AmideBondCount", "AromaticCarbocycleCount", "AromaticHeterocycleCount", 
    "AromaticRingCount", "BridgeheadAtomCount", 
    "Chi0n", "Chi0v", "Chi1n", "Chi1v", "Chi2n", "Chi2v", "Chi3n", "Chi3v", "Chi4n", "Chi4v", 
    "CrippenClogP", "CrippenMR", "DegreeOfUnsaturation", "FractionCarbonSP3", 
    "HBondAcceptorCount", "HBondDonorCount", "HeteroatomCount", "HeterocycleCount", 
    "Kappa1", "Kappa2", "Kappa3", "KierHallAlphaShape", "LabuteApproximateSurfaceArea", 
    "LipinskiHBondAcceptorCount", "LipinskiHBondDonorCount", "RingCount", 
    "RotatableBondCount", "SaturatedCarbocycleCount", "SaturatedHeterocycleCount", 
    "SaturatedRingCount", "SpiroAtomCount", "StereocenterCount", 
    "SyntheticAccessibilityScore", "UnspecifiedStereocenterCount"}]
labeledData = Thread[ResourceFunction["DynamicMap"][molFeatures, entities] -> QuantityMagnitude[
     EntityValue[entities, EntityProperty["SWEETLEAD", "IsolatedSProteinVinaScore"]]]];
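
The labeled data pairs each feature vector with its score using Thread. A small sketch of that construction follows, using two molecules built from SMILES strings and a short subset of the descriptor names listed above. The molecules and the score values are placeholders chosen for illustration, not data from this resource.

```wolfram
(* Two small molecules from SMILES: ethanol and benzene *)
toyMolecules = {Molecule["CCO"], Molecule["c1ccccc1"]};

(* One numeric feature vector per molecule, using a subset of the descriptors *)
toyFeatures = MoleculeValue[#, {"HBondDonorCount", "RingCount", "RotatableBondCount"}] & /@ 
   toyMolecules;

(* Placeholder scores, not real docking results *)
toyLabels = {-2.5, -3.1};

(* Thread turns {features1, features2} -> {label1, label2} into a list of rules *)
toyLabeledData = Thread[toyFeatures -> toyLabels]
```

The result is a list of rules of the form features -> score, which is the input format Predict accepts.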

Split the labeled data into training and test sets:

In[30]:=
SeedRandom[14];
With[{ld = RandomSample[labeledData]},
 training = ld[[;; 7000]];
 testset = ld[[7001 ;;]];
 ]

Create the predictor function:

In[31]:=
p = Predict[training]
Out[31]=

Visualize the results using PredictorMeasurements:

In[32]:=
PredictorMeasurements[p, testset, "ComparisonPlot"]
Out[32]=
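
A comparison plot can be complemented with numeric summaries. The sketch below shows the general workflow on synthetic data so that it runs without the SWEETLEAD entity store. The data, the variable names, and the choice of Method -> "LinearRegression" are assumptions made for illustration and say nothing about the predictor trained above.

```wolfram
SeedRandom[1];

(* Synthetic examples of the form x -> y, with y roughly 2 x plus a little noise *)
toyData = Table[x -> 2 x + RandomReal[{-0.1, 0.1}], {x, RandomReal[{0, 10}, 200]}];

(* Hold out the last 50 examples for testing *)
{toyTrain, toyTest} = TakeDrop[toyData, 150];

toyPredictor = Predict[toyTrain, Method -> "LinearRegression"];

(* Build the measurements object once, then query individual properties *)
pm = PredictorMeasurements[toyPredictor, toyTest];
pm["StandardDeviation"]  (* root mean square of the residuals *)
pm["RSquared"]           (* coefficient of determination *)
```

The same two properties could be requested from a measurements object built with p and testset.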

JasonB, "Repurposing Therapeutics for COVID-19" from the Wolfram Data Repository (2020)  

License Information

CC BY-NC-ND 4.0
