Human Protein Protein Interaction Network Genes

Dataset containing information on all protein-coding genes in the Human Protein Protein Interaction Network (PPIN)

Details

The data is based on The Human Protein Atlas version 23.0 and Ensembl version 109.
The latest version (24.0) of the interaction data, published on October 22, 2024, is also available.
The default content is an Association containing the Ensembl IDs of human protein-coding genes and their relevant information. These additional elements are also available:
"interactionsOnly"    Association of lists of interacting genes
"Version 24.0"        Data for all interacting genes in Version 24.0

(11351 elements)

Examples

Basic Examples (3) 

Retrieve the full data about the genes involved in the human protein-protein interaction network:

In[1]:=
Dataset[ResourceData["Human Protein Protein Interaction Network Genes"]]
Out[1]=

Retrieve the keys available for each gene:

In[2]:=
Keys[ResourceData["Human Protein Protein Interaction Network Genes"][[1]]] // Shallow
Out[2]=

Get the dataset of interacting proteins:

In[3]:=
Dataset[ResourceData["Human Protein Protein Interaction Network Genes", "interactionsOnly"]]
Out[3]=

Scope & Additional Elements (1) 

Data for Version 24.0 can be accessed using:

In[4]:=
Dataset[ResourceData["Human Protein Protein Interaction Network Genes", "Version 24.0"]]
Out[4]=

Visualizations (2) 

Build an Association that maps each connection count to the number of interacting proteins with that many connections:

In[5]:=
connectivitiesAssociation = KeySort@Counts@Values@Map[Length,
   ResourceData["Human Protein Protein Interaction Network Genes", "interactionsOnly"][[All, 3]]]
Out[5]=

Visualize the power-law nature of the connectivity frequencies in the human PPIN. The roughly power-law decay suggests that the human PPIN is a scale-free network with a few large hubs:

In[6]:=
exponent = -2;
fit = 40000*(#^exponent) & /@ Range[200];
exponent2 = -1;
fit2 = 3000*(#^exponent2) & /@ Range[200];
ListLogLogPlot[{connectivitiesAssociation, fit, fit2}, AxesLabel -> {"Connections", "Frequencies"}, ImageSize -> 600, PlotLegends -> {"Data", "x^-2", "x^-1"}]
Out[10]=
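
The reference curves are useful because a power law becomes a straight line on log-log axes. As a short derivation, write f(k) for the number of proteins with k connections and assume f(k) follows a power law with constant C and exponent γ (these symbols are introduced here for illustration and do not appear in the resource itself):

```latex
\begin{align}
f(k) &= C\,k^{-\gamma} \\
\log f(k) &= \log C - \gamma \log k
\end{align}
```

The two guide curves in the plot correspond to C = 40000, γ = 2 and C = 3000, γ = 1, so they appear as straight lines with slopes -2 and -1. Comparing the data against them gives only a visual impression of the exponent, not a fitted estimate.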

Analysis (2) 

Visualize the most connected proteins in the human protein-protein interaction network as word clouds, with each gene weighted by its number of connections:

In[11]:=
fullData = ResourceData["Human Protein Protein Interaction Network Genes"];
interactions = ResourceData["Human Protein Protein Interaction Network Genes", "interactionsOnly"];
geneConnectionsCount = Map[Length, Values@interactions[[All, 3]]];
genes = Values@interactions[[All, 1]];
geneGeneNameAssociation = AssociationThread[Values@fullData[[All, 1]],
   Values@fullData[[All, 4]]];
geneNames = Map[geneGeneNameAssociation, genes];
In[12]:=
GraphicsRow[{WordCloud[
   MapThread[{#1, #2} &, {genes, geneConnectionsCount}], PlotLabel -> Style["Genes", Black, 30], ColorFunction -> "Rainbow",
    ImageSize -> 500],
  WordCloud[MapThread[{#1, #2} &, {geneNames, geneConnectionsCount}], PlotLabel -> Style["Gene names", Black, 30], ColorFunction -> "Rainbow", ImageSize -> 600]}, ImageSize -> 1000]
Out[12]=
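
The word clouds show hubs only qualitatively. As a complementary sketch, the most connected entries can also be listed numerically. This assumes, as in the cells above, that position 3 of each "interactionsOnly" entry holds the list of interacting genes; the names degrees and topHubs are hypothetical and chosen here for illustration:

```wolfram
(* Load the interaction element of the resource, as in the cells above *)
interactions = ResourceData["Human Protein Protein Interaction Network Genes", "interactionsOnly"];

(* Number of interaction partners per entry;
   ASSUMPTION: position 3 holds the partner list, as used above *)
degrees = Map[Length, interactions[[All, 3]]];

(* The ten entries with the most interaction partners *)
topHubs = TakeLargest[degrees, 10]
```

If the element is an Association, the result keeps the original keys, so each count stays attached to its entry.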

We can also analyze the genes based on other characteristics, such as the chromosome each of them appears on:

In[13]:=
chromosomeCounts = SortBy[Normal[
    Counts[Table[
      element["Chromosome"], {element, Values[fullData]}]]], {If[
      NumericQ[ToExpression[First[#]]], ToExpression[First[#]], 100 + Position[{"X", "Y", "MT"}, First[#]][[1, 1]]], Last[#]} &];
PieChart[Values[chromosomeCounts], ChartLabels -> Callout[Keys[chromosomeCounts]], PlotLabel -> Style["Distribution of Genes by Chromosome", Black, 15],
  ImageSize -> 600, ColorFunction -> "DarkRainbow"]
Out[14]=
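
The same "Chromosome" field can be used to pull out a subset of genes rather than only counting them. A minimal sketch, assuming chromosome values are stored as strings such as "X" (as the sorting code above suggests); the name xGenes is hypothetical:

```wolfram
(* Load the default element of the resource *)
fullData = ResourceData["Human Protein Protein Interaction Network Genes"];

(* Keep only the entries whose "Chromosome" field is the string "X";
   ASSUMPTION: chromosome values are strings, as in the sorting code above *)
xGenes = Select[fullData, #["Chromosome"] === "X" &];

(* Number of genes found on chromosome X *)
Length[xGenes]
```

Wrapping xGenes in Dataset gives the same tabular view used in the basic examples.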

WolframChemistry, "Human Protein Protein Interaction Network Genes" from the Wolfram Data Repository (2024)  
