Wolfram Data Repository
Immediate Computable Access to Curated Contributed Data
Categorization of the pathogenicity of 89% of 71 million possible human missense variants
"Chromosome" | string including "Chromosome" and a number from 1 to 22 or a letter M,X,or Y. |
"RawDataSize" | total rows for each chromosome in the original database |
"TotalPositions" | total positions for each chromosome |
"TotalMutations" | total missense mutations for each chromosome |
"Genome" | the genome build |
"TotalUniprotID" | total uniprot IDs |
"TotalTranscriptID" | total transcript IDs |
"TotalAminoAcidVariations" | total amino acid changes |
"PathogenicityQuartiles" | pathogenicity Quartiles 1/4, 2/4, 3/4 |
"PathogenicityMean" | pathogenicity Mean |
"PathogenicityLikelyBenign" | percentage of "likely_benign" classifications |
"PathogenicityLikelyPathogenic" | percentage of "likely_pathogenic" classifications |
"PathogenicityAmbiguous" | percentage of "ambiguous" classifications |
"Chromosome" | string including "Chromosome" and a number from 1 to 22 or a letter M,X,or Y. |
"Position" | genome position (1-based) |
"ReferenceNucleotide" | reference nucleotide (GRCh38.p13 for hg38) |
"AlternativeNucleotide" | alternative nucleotide |
"Genome" | genome build |
"UniprotID" | UniProtKB accession number of the protein in which the variant induces a single amino-acid substitution (UniProt release 2021_02) |
"TranscriptID" | Ensembl transcript ID from GENCODE V32 (hg38) |
"AminoAcidVariation" | Amino acid change induced by the alternative allele,in the format: Reference aminoacid-POS_aa-Alternative amino acid |
"Pathogenicity" | predicted probability of a variant being clinically pathogenic |
"Classification" | derived using the following thresholds: "likely_benign" for Pathogenicity < 0.34; "likely_pathogenic" for Pathogenicity > 0.564; and "ambiguous" otherwise |
In[1]:= | ![]() |
Out[1]= | ![]() |
Retrieve the data for ChromosomeX:
In[2]:= | ![]() |
Out[21]= | ![]() |
Get a random sample of the positions in ChromosomeX:
In[22]:= | ![]() |
Out[22]= | ![]() |
Get a random sample of the transcript IDs in ChromosomeX:
In[23]:= | ![]() |
Out[23]= | ![]() |
Get the distribution of pathogenicity scores for all possible missense variants in ChromosomeX:
In[24]:= | ![]() |
Out[24]= | ![]() |
Get a summary of the pathogenicity classifications for all possible missense variants in ChromosomeX:
In[25]:= | ![]() |
Out[25]= | ![]() |
Get the pathogenicity information for position 71765227 in ChromosomeX:
In[26]:= | ![]() |
In[27]:= | ![]() |
Out[27]= | ![]() |
Get the pathogenicity information for a list of positions in ChromosomeX:
In[28]:= | ![]() |
In[29]:= | ![]() |
Out[29]= | ![]() |
Get the pathogenicity distribution across all positions of all chromosomes:
In[30]:= | ![]() |
In[31]:= | ![]() |
Out[31]= | ![]() |
Wolfram Research, "Alpha Missense" from the Wolfram Data Repository (2024)