Wolfram Data Repository
Immediate Computable Access to Curated Contributed Data
Categorization of the pathogenicity of 89% of 71 million possible human missense variants
Per-chromosome summary fields:

| Key | Description |
| --- | --- |
| "Chromosome" | string consisting of "Chromosome" followed by a number from 1 to 22 or one of the letters M, X, or Y |
| "RawDataSize" | total number of rows for each chromosome in the original database |
| "TotalPositions" | total number of positions for each chromosome |
| "TotalMutations" | total number of missense mutations for each chromosome |
| "Genome" | genome build |
| "TotalUniprotID" | total number of UniProt IDs |
| "TotalTranscriptID" | total number of transcript IDs |
| "TotalAminoAcidVariations" | total number of amino acid changes |
| "PathogenicityQuartiles" | first, second, and third quartiles (1/4, 2/4, 3/4) of the pathogenicity score |
| "PathogenicityMean" | mean of the pathogenicity score |
| "PathogenicityLikelyBenign" | percentage of "likely_benign" classifications |
| "PathogenicityLikelyPathogenic" | percentage of "likely_pathogenic" classifications |
| "PathogenicityAmbiguous" | percentage of "ambiguous" classifications |

Per-variant fields:

| Key | Description |
| --- | --- |
| "Chromosome" | string consisting of "Chromosome" followed by a number from 1 to 22 or one of the letters M, X, or Y |
| "Position" | genome position (1-based) |
| "ReferenceNucleotide" | reference nucleotide (GRCh38.p13 for hg38) |
| "AlternativeNucleotide" | alternative nucleotide |
| "Genome" | genome build |
| "UniprotID" | UniProtKB accession number of the protein in which the variant induces a single amino acid substitution (UniProt release 2021_02) |
| "TranscriptID" | Ensembl transcript ID from GENCODE V32 (hg38) |
| "AminoAcidVariation" | amino acid change induced by the alternative allele, given as the reference amino acid, the amino acid position (POS_aa), and the alternative amino acid |
| "Pathogenicity" | predicted probability of a variant being clinically pathogenic |
| "Classification" | label derived from the following thresholds: "likely_benign" for Pathogenicity < 0.34; "likely_pathogenic" for Pathogenicity > 0.564; "ambiguous" otherwise |
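
The thresholds given for "Classification" can be written as a small function. The following is a minimal Python sketch, independent of the repository's Wolfram Language interface; the function name `classify_pathogenicity` and the constant names are hypothetical and chosen only for illustration.

```python
# Thresholds taken from the "Classification" field description.
BENIGN_BELOW = 0.34       # scores strictly below this are "likely_benign"
PATHOGENIC_ABOVE = 0.564  # scores strictly above this are "likely_pathogenic"


def classify_pathogenicity(score: float) -> str:
    """Map a pathogenicity score (a probability in [0, 1]) to its label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score < BENIGN_BELOW:
        return "likely_benign"
    if score > PATHOGENIC_ABOVE:
        return "likely_pathogenic"
    # Everything in the closed interval [0.34, 0.564] is "ambiguous".
    return "ambiguous"
```

Both inequalities are strict, so scores exactly equal to 0.34 or 0.564 fall into the "ambiguous" class.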
Retrieve the data for ChromosomeX:
| In[21]:= |
| Out[21]= | ![]() |
Get a random sample of the positions in ChromosomeX:
| In[22]:= |
| Out[22]= |
Get a random sample of the transcript IDs in ChromosomeX:
| In[23]:= |
| Out[23]= |
Get the distribution of pathogenicity scores for all possible missense variants in ChromosomeX:
| In[24]:= | ![]() |
| Out[24]= | ![]() |
Get a summary of the pathogenicity classifications for all possible missense variants in ChromosomeX:
| In[25]:= |
| Out[25]= | ![]() |
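
A classification summary of this kind reduces a list of scores to the quantities named in the summary fields (quartiles, mean, and percentage per class). The following is a minimal Python sketch, assuming the scores are available as a plain list of at least two floats; the function name `summarize_scores` and the sample values are hypothetical, and the quartile method used here (the default of `statistics.quantiles`) may differ from the one used to build the dataset.

```python
from collections import Counter
from statistics import mean, quantiles


def summarize_scores(scores):
    """Summarize pathogenicity scores using the documented thresholds."""
    # Label each score: < 0.34 benign, > 0.564 pathogenic, otherwise ambiguous.
    labels = Counter(
        "likely_benign" if s < 0.34
        else "likely_pathogenic" if s > 0.564
        else "ambiguous"
        for s in scores
    )
    n = len(scores)
    return {
        "PathogenicityQuartiles": quantiles(scores, n=4),  # three cut points
        "PathogenicityMean": mean(scores),
        "PathogenicityLikelyBenign": 100 * labels["likely_benign"] / n,
        "PathogenicityLikelyPathogenic": 100 * labels["likely_pathogenic"] / n,
        "PathogenicityAmbiguous": 100 * labels["ambiguous"] / n,
    }


# Hypothetical scores, not taken from the dataset.
sample = [0.05, 0.10, 0.30, 0.45, 0.60, 0.70, 0.90, 0.95]
summary = summarize_scores(sample)
```

The dictionary keys mirror the summary field names so the result can be compared against the per-chromosome summary by eye.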
Get the pathogenicity information associated with position 71765227 in ChromosomeX:
| In[26]:= | ![]() |
| In[27]:= |
| Out[27]= | ![]() |
Get the pathogenicity information associated with a list of positions in ChromosomeX:
| In[28]:= | ![]() |
| In[29]:= |
| Out[29]= | ![]() |
Get the pathogenicity distribution across all positions of all chromosomes:
| In[30]:= | ![]() |
| In[31]:= | ![]() |
| Out[31]= | ![]() |
Wolfram Research, "Alpha Missense" from the Wolfram Data Repository (2024)