Wolfram Data Repository
Immediate Computable Access to Curated Contributed Data
Categorization of the pathogenicity data of human missense variants
| Key | Description |
| --- | --- |
| "Chromosome" | string consisting of "Chromosome" followed by a number from 1 to 22 or one of the letters M, X, or Y |
| "RawDataSize" | total rows for each chromosome in the original database |
| "TotalPositions" | total positions for each chromosome |
| "TotalMutations" | total missense mutations for each chromosome |
| "Genome" | the genome build |
| "TotalUniprotID" | total UniProt IDs |
| "TotalTranscriptID" | total transcript IDs |
| "TotalAminoAcidVariations" | total amino acid changes |
| "PathogenicityQuartiles" | pathogenicity quartiles (1/4, 2/4, 3/4) |
| "PathogenicityMean" | mean pathogenicity |
| "PathogenicityLikelyBenign" | percentage of "likely_benign" classifications |
| "PathogenicityLikelyPathogenic" | percentage of "likely_pathogenic" classifications |
| "PathogenicityAmbiguous" | percentage of "ambiguous" classifications |
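
The pathogenicity summary fields above can be illustrated with a minimal Python sketch. It is not the repository's own Wolfram Language code: the function name `summarize` and the sample records are hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`, which may differ slightly from the convention used to build the resource.

```python
from collections import Counter
from statistics import mean, quantiles


def summarize(records):
    """Summarize (pathogenicity, classification) pairs for one chromosome.

    `records` is a hypothetical list of tuples; at least two are required
    because quartiles are undefined for a single value.
    """
    scores = [score for score, _ in records]
    counts = Counter(label for _, label in records)
    total = len(records)

    def percentage(label):
        # Share of variants carrying this classification, in percent.
        return 100.0 * counts[label] / total

    return {
        "PathogenicityQuartiles": quantiles(scores, n=4),
        "PathogenicityMean": mean(scores),
        "PathogenicityLikelyBenign": percentage("likely_benign"),
        "PathogenicityLikelyPathogenic": percentage("likely_pathogenic"),
        "PathogenicityAmbiguous": percentage("ambiguous"),
    }


# Hypothetical sample values, not taken from the dataset.
sample = [
    (0.10, "likely_benign"),
    (0.20, "likely_benign"),
    (0.45, "ambiguous"),
    (0.90, "likely_pathogenic"),
]
print(summarize(sample))
```

The three percentage fields always sum to 100 when every record carries one of the three classifications.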

| Key | Description |
| --- | --- |
| "Chromosome" | string consisting of "Chromosome" followed by a number from 1 to 22 or one of the letters M, X, or Y |
| "Position" | genome position (1-based) |
| "ReferenceNucleotide" | reference nucleotide (GRCh38.p13 for hg38) |
| "AlternativeNucleotide" | alternative nucleotide |
| "Genome" | genome build |
| "UniprotID" | UniProtKB accession number of the protein in which the variant induces a single amino-acid substitution (UniProt release 2021_02) |
| "TranscriptID" | Ensembl transcript ID from GENCODE V32 (hg38) |
| "AminoAcidVariation" | amino acid change induced by the alternative allele, in the format: reference amino acid-POS_aa-alternative amino acid |
| "Pathogenicity" | predicted probability of a variant being clinically pathogenic |
| "Classification" | derived using the following thresholds: "likely_benign" for Pathogenicity < 0.34; "likely_pathogenic" for Pathogenicity > 0.564; and "ambiguous" otherwise |
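
The thresholds in the "Classification" row translate directly into a small function. The following is a minimal Python sketch rather than the resource's own code; the function name `classify` is hypothetical. Scores exactly equal to 0.34 or 0.564 fall in the "ambiguous" band because both inequalities are strict.

```python
def classify(pathogenicity: float) -> str:
    """Map a pathogenicity score to its classification label."""
    if pathogenicity < 0.34:
        return "likely_benign"
    if pathogenicity > 0.564:
        return "likely_pathogenic"
    # Everything from 0.34 to 0.564 inclusive is ambiguous.
    return "ambiguous"


# Hypothetical scores chosen to exercise each band and both boundaries.
for score in (0.05, 0.34, 0.5, 0.564, 0.97):
    print(score, classify(score))
```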

| Key | Description |
| --- | --- |
| "ExternalIdentifier" | external identifier of the protein |
| "Name" | common name of the protein |
| "Sequence" | amino acid sequence of the protein |
| "Mutations" | all possible mutations in the protein |
| "Score" | AlphaMissense pathogenicity score for each mutation of the protein |
| "Status" | status of pathogenicity (B: likely benign, A: ambiguous, P: likely pathogenic) |
| "MeanPathogenicity" | mean pathogenicity per residue of the protein |
| "MedianPathogenicity" | median pathogenicity per residue of the protein |
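
As an illustration of how per-mutation scores relate to the per-residue mean and median, here is a minimal Python sketch. It makes several assumptions that are not confirmed by this page: scores are keyed by hypothetical `(position, alternative amino acid)` tuples, the function names are invented for the example, and the one-letter status is assumed to follow the same 0.34 and 0.564 thresholds given for "Classification".

```python
from collections import defaultdict
from statistics import mean, median


def per_residue_stats(scores):
    """Group mutation scores by residue position.

    `scores` maps (position, alt_amino_acid) -> pathogenicity and is a
    hypothetical layout. Returns {position: (mean, median)}.
    """
    by_position = defaultdict(list)
    for (position, _alt), score in scores.items():
        by_position[position].append(score)
    return {
        position: (mean(values), median(values))
        for position, values in sorted(by_position.items())
    }


def status(score):
    """One-letter status; thresholds are an assumption (see lead-in)."""
    if score < 0.34:
        return "B"
    if score > 0.564:
        return "P"
    return "A"


# Hypothetical scores for two residues, not taken from the dataset.
example = {
    (1, "A"): 0.2, (1, "C"): 0.4, (1, "D"): 0.9,
    (2, "A"): 0.1, (2, "C"): 0.3,
}
for position, (avg, med) in per_residue_stats(example).items():
    print(position, avg, med, status(avg))
```

The mean and median can diverge noticeably when a residue tolerates most substitutions but a few are strongly pathogenic, which is why the resource reports both.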
| In[1]:= |
| Out[1]= |
Retrieve the data for ChromosomeX:
| In[2]:= |
| Out[3]= |
Get a random sample of the positions in ChromosomeX:
| In[4]:= |
| Out[4]= |
Get a random sample of the transcript IDs in ChromosomeX:
| In[5]:= |
| Out[5]= |
Get the distribution of pathogenicity scores for all possible missense variants in ChromosomeX:
| In[6]:= |
| Out[6]= |
Get a summary of the pathogenicity classifications for all possible missense variants in ChromosomeX:
| In[7]:= |
| Out[7]= |
| In[8]:= |
| Out[8]= |
Select a protein from the set:
| In[9]:= |
| Out[9]= |
Find the protein's properties:
| In[10]:= |
| Out[10]= |
Get the protein's name:
| In[11]:= |
| Out[11]= |
Get the protein's amino acid sequence:
| In[12]:= |
| Out[12]= |
Visualize the mean and median AlphaMissense pathogenicity per residue:
| In[13]:= |
| Out[13]= |
Get the AlphaMissense pathogenicity score for a specific residue:
| In[14]:= |
| Out[15]= |
Get the AlphaMissense pathogenicity status for a specific residue:
| In[16]:= |
| Out[17]= |
Get the name of a specific protein:
| In[18]:= |
| Out[18]= |
Get the pathogenicity information associated with position 71765227 on ChromosomeX:
| In[19]:= |
| In[20]:= |
| Out[20]= |
Get the pathogenicity information associated with a list of positions on ChromosomeX:
| In[21]:= |
| In[22]:= |
| Out[22]= |
Get the pathogenicity distribution across all positions on all chromosomes:
| In[23]:= |
| In[24]:= |
| Out[24]= |
Wolfram Research, "Alpha Missense" from the Wolfram Data Repository (2024)