PTMD 1.0 - PTMs that associate with diseases

※ Online Resources for Post-translational Modification & Disease:

    <> Protein post-translational modification & Disease Resources.
        (1) PTMs & Disease related database.
        (2) Disease database.
        (3) PTMs site database.

==================================================================================

<> Protein post-translational modification & Disease Resources.

1. PTMs & Disease related databases.

(1) KinMutBase v3.0: is a comprehensive database of disease-causing mutations in protein kinase domains. This release of the database contains 582 mutations in 20 tyrosine kinase domains and 13 serine/threonine kinase domains. The database refers to 1790 cases from 1322 families (Ortutay C, et al., Human Mutation. 2005).

(2) MoKCa: (Mutations of Kinases in Cancer) has been developed to structurally and functionally annotate, and where possible predict, the phenotypic consequences of mutations in protein kinases implicated in cancer (Richardson CJ, et al., Nucleic Acids Res. 2009).

(3) HHMD: is a comprehensive database for human histone modifications, which focuses on integrating useful histone modification information from experimental data that is essential for understanding these modifications at a systematic level. The current release of HHMD incorporates 43 location-specific histone modifications in human. HHMD also provides a comprehensive resource of histone modification regulation in 9 human cancer types. HisModView was developed to help users browse histone modifications in the context of existing human genomic annotations (Zhang Y, et al., Nucleic Acids Res. 2010).

(4) KIDFamMap: is the first database grouping 189,987 kinase-inhibitor interactions (from BindingDB and kinase profiling) into 1,210 pharma-interfaces (called kinase-inhibitor families). It derives the relationship between the moiety preferences and physico-chemical properties of the binding sites of kinases and inhibitors, as well as relationships among 399 human protein kinases (from Kinase.Com), 35,788 kinase inhibitors (from BindingDB and kinase profiling) and 339 diseases (from OMIM and KEGG) (Chiu YY, et al., Nucleic Acids Res. 2013).

2. Disease database.

(1) Online Mendelian Inheritance in Man: (OMIM) is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. The full-text, referenced overviews in OMIM contain information on all known Mendelian disorders and over 12,000 genes. OMIM focuses on the relationship between phenotype and genotype, and its entries contain copious links to other genetics resources. (Hamosh A, et al., Nucleic Acids Res. 2015).

(2) The Human Gene Mutation Database: (HGMD) constitutes a comprehensive core collection of data on germ-line mutations in nuclear genes underlying or associated with human inherited disease. Data cataloged include single-base-pair substitutions in coding, regulatory, and splicing-relevant regions, micro-deletions and micro-insertions, indels, and triplet repeat expansions, as well as gross gene deletions, insertions, duplications, and complex rearrangements. Each mutation is entered into HGMD only once, in order to avoid confusion between recurrent and identical-by-descent lesions. (Stenson PD, et al., Hum Genet. 2014).

(3) The Comparative Toxicogenomics Database: (CTD) is a data resource that describes relationships between chemicals, genes, and human diseases. CTD includes curated data describing cross-species chemical–gene/protein interactions and chemical– and gene–disease associations to illuminate molecular mechanisms underlying variable susceptibility and environmentally influenced diseases. These data will also provide insights into complex chemical–gene and protein interaction networks. (Davis AP, et al., Nucleic Acids Res. 2017).

(4) DisGeNET: is a database integrating gene-disease associations from several public data sources and the literature. The current version contains 381056 associations between 16666 genes and 13172 diseases. Given the large number of gene-disease associations compiled in DisGeNET, a score has also been developed to rank the associations based on the supporting evidence. (Piñero J, et al., Nucleic Acids Res. 2017).

(5) Cancer Gene Census: is an ongoing effort to catalogue those genes for which mutations have been causally implicated in cancer. So far, 291 cancer genes have been reported, more than 1% of all the genes in the human genome. 90% of cancer genes show somatic mutations in cancer, 20% show germline mutations and 10% show both. The most common mutation class among the known cancer genes is a chromosomal translocation that creates a chimeric gene or apposes a gene to the regulatory elements of another gene. Many more cancer genes have been found in leukaemias, lymphomas and sarcomas than in other types of cancer, despite the fact that they represent only 10% of human cancer. These genes are usually altered by chromosomal translocation. The most common domain that is encoded by cancer genes is the protein kinase. Several domains that are involved in DNA binding and transcriptional regulation are common in proteins that are encoded by cancer genes. (Futreal PA, et al., Nat Rev Cancer. 2004).

(6) Genetic Association Database: is an archive of human genetic association studies of complex diseases and disorders. This includes summary data extracted from papers published in peer-reviewed journals on candidate gene and GWAS studies. The goal of this database is to allow the user to rapidly identify medically relevant polymorphisms from the large volume of polymorphism and mutational data, in the context of standardized nomenclature. (Becker KG, et al., Nature Genetics. 2004).

(7) GWASdb v2: is a one-stop shop which combines collections of genetic variants (GVs) from GWAS and their comprehensive functional annotations, as well as disease classifications. It aims to help researchers and clinicians maximize the utility of the most recent GWAS data and gain biological insights through an integrative, multi-dimensional functional annotation portal (Li MJ, et al., Nucleic Acids Res. 2016).

(8) MalaCards: is an integrated database of human maladies and their annotations, modeled on the architecture and richness of the popular GeneCards database of human genes. The MalaCards disease and disorders database is organized into "disease cards", each integrating prioritized information, and listing numerous known aliases for each disease, along with a variety of annotations, as well as inter-disease connections, empowered by the GeneCards relational database, searches, and GeneDecks set-analyses. Annotations include: symptoms, drugs, articles, genes, clinical trials, related diseases/disorders and more. (Rappaport N, et al., Nucleic Acids Res. 2017).

3. PTMs site database.

(1) dbPAF: The dbPAF (database of Phospho-sites in Animals and Fungi) is an online data resource specifically designed for protein phosphorylation in seven eukaryotic species, including 294,370 non-redundant phosphorylation sites of 40,432 proteins. (Lin S, et al., Scientific Reports. 2016).

(2) PLMD: is an online data resource specifically designed for protein lysine modifications (PLMs). The PLMD database was extended and adapted from our CPLA and CPLM databases, and the PLMD release contains 284780 modification events in 53501 proteins for 20 types of PLMs, including ubiquitination, acetylation, sumoylation, methylation, succinylation, malonylation, glutarylation, glycation, formylation, hydroxylation, butyrylation, propionylation, crotonylation, pupylation, neddylation, 2-hydroxyisobutyrylation, phosphoglycerylation, carboxylation, lipoylation and biotinylation. (Xu, et al., J Genet Genomics. 2017).

(3) PhosphoSitePlus: is a knowledgebase dedicated to mammalian post-translational modifications (PTMs). It contains over 330,000 non-redundant PTMs, including phospho, acetyl, ubiquityl and methyl groups. Over 95% of the sites are from mass spectrometry (MS) experiments. (Hornbeck PV, et al., Nucleic Acids Res. 2015).

(4) PhosphoNET: PhosphoNET presently holds data on more than 74,000 phosphorylation sites in over 12,400 human proteins that have been collected from the scientific literature and other reputable websites. It features direct links to several other useful websites, and will continue to expand as a useful portal for phosphoproteomics information.

(5) HPRD release 9: uses profile hidden Markov models (HMMs) to predict phosphorylation sites within given protein sequences. The HMM profiles are learned, for each group, from the sequences surrounding the phosphorylated residues (Keshava Prasad, et al., Nucleic Acids Res. 2009).

(6) PHOSIDA: comprises more than 80,000 phosphorylated, N-glycosylated or acetylated sites from nine different species. All sites are obtained from high-resolution mass spectrometric data using the same stringent quality criteria. (Gnad, et al., Nucleic Acids Res. 2011).

(7) PhosPhAt 4.0: contains information on Arabidopsis phosphorylation sites identified by mass spectrometry in large-scale experiments from different research groups, comprising 6,282 phosphopeptides (Durek P, et al., Nucleic Acids Res. 2010).

(8) dbPTM 2016: integrates experimentally verified PTMs from several databases and annotates the predicted PTMs on Swiss-Prot proteins (Huang, et al., Nucleic Acids Res. 2016).

(9) SysPTM 2.0: provides a systematic and sophisticated platform for proteomic PTM research, equipped not only with a knowledge base of manually curated multi-type modification data, but also with four fully developed, in-depth data mining tools. (Li, et al., Mol Cell Proteomics. 2009).

(10) PhosphoPOINT: is a comprehensive human kinase interactome and phospho-protein database, containing 4,195 phospho-proteins with a total of 15,738 phosphorylation sites (Yang CY, et al., Bioinformatics. 2008).

(11) Phospho.ELM: is a relational database designed to store in vivo and in vitro phosphorylation data extracted from the scientific literature and phosphoproteomic analyses. The resource has been actively developed for more than 7 years and currently comprises 42,574 serine, threonine and tyrosine non-redundant phosphorylation sites. (Dinkel H, et al., Nucleic Acids Res. 2011).