TY - JOUR
T1 - The Monarch Initiative in 2019
T2 - An integrative data and analytic platform connecting phenotypes to genotypes across species
AU - Shefchek, Kent A.
AU - Harris, Nomi L.
AU - Gargano, Michael
AU - Matentzoglu, Nicolas
AU - Unni, Deepak
AU - Brush, Matthew
AU - Keith, Daniel
AU - Conlin, Tom
AU - Vasilevsky, Nicole
AU - Zhang, Xingmin Aaron
AU - Balhoff, James P.
AU - Babb, Larry
AU - Bello, Susan M.
AU - Blau, Hannah
AU - Bradford, Yvonne
AU - Carbon, Seth
AU - Carmody, Leigh
AU - Chan, Lauren E.
AU - Cipriani, Valentina
AU - Cuzick, Alayne
AU - Rocca, Maria D.
AU - Dunn, Nathan
AU - Essaid, Shahim
AU - Fey, Petra
AU - Grove, Chris
AU - Gourdine, Jean Phillipe
AU - Hamosh, Ada
AU - Harris, Midori
AU - Helbig, Ingo
AU - Hoatlin, Maureen
AU - Joachimiak, Marcin
AU - Jupp, Simon
AU - Lett, Kenneth B.
AU - Lewis, Suzanna E.
AU - McNamara, Craig
AU - Pendlington, Zoë M.
AU - Pilgrim, Clare
AU - Putman, Tim
AU - Ravanmehr, Vida
AU - Reese, Justin
AU - Riggs, Erin
AU - Robb, Sofia
AU - Roncaglia, Paola
AU - Seager, James
AU - Segerdell, Erik
AU - Similuk, Morgan
AU - Storm, Andrea L.
AU - Thaxon, Courtney
AU - Thessen, Anne
AU - Jacobsen, Julius O.B.
AU - McMurry, Julie A.
AU - Groza, Tudor
AU - Köhler, Sebastian
AU - Smedley, Damian
AU - Robinson, Peter N.
AU - Mungall, Christopher J.
AU - Haendel, Melissa A.
AU - Munoz-Torres, Monica C.
AU - Osumi-Sutherland, David
N1 - Funding Information:
National Institutes of Health (NIH) Office of the Director (OD); The Monarch Initiative [1R24OD011883]; Forums for Integrative Phenomics [1U13CA221044]; Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy [DE-AC02-05CH11231 to S.C., N.L.H., N.D., M.J., S.E.L., C.J.M., J.R., D.U.]; EMBL-EBI Core Funds, Open Targets [OTAR005]; European Union’s Horizon 2020 Research and Innovation Programme [654248 (CORBEL), 676559 (ELIXIR Excelerate) to S.J., P.R., Z.M.P.]; National Human Genome Research Institute at the US National Institutes of Health [U24 HG002223 to C.G.]; UK Medical Research Council; UK Biotechnology and Biological Sciences Research Council; National Human Genome Research Institute at the US National Institutes of Health [U41HG006627 to A.H.]; Wellcome Trust [104967/Z/14/Z]; National Human Genome Research Institute at the US NIH [U41HG000330 to S.M.B.]; National Human Genome Research Institute (NHGRI) at the US NIH [U41 HG002659 to Y.M.B.]. Funding for open access charge: NIH OD; The Monarch Initiative [1R24OD011883]; Forums for Integrative Phenomics [1U13CA221044]. Conflict of interest statement. None declared.
Funding Information:
We regularly update our database with the latest gene-to-phenotype data from research organism databases (e.g. MGD (8), ZFIN (10), WormBase (15), FlyBase (16), IMPC (30)), human variants and gene-to-disease data (from OMIM, ClinVar, Orphanet, GWAS Catalog (31)) and other organismal gene-to-phenotype resources (OMIA (32), Animal QTLdb (33)). As well, we ingest other genomic data types, such as GO annotations, gene expression in specific tissues (BgeeDB (34)), protein-to-protein interaction (BioGRID (35)), pathway data (KEGG (36), Reactome (37)), chemical-disease associations (CTD (38)), cell line genotypes-to-disease data (Coriell; https://www.coriell.org), data from the Mouse Phenome Database (39) and from the Mutant Mouse Resource and Research Centers (MMRRC; https://www.mmrrc.org/, Office of the Director grant number OD010921). We recently also added data from the Rat Genome Database (RGD (40)), the Saccharomyces Genome Database (SGD (41)) and data from protein-to-protein interaction networks from STRING (42). The latest release of the Monarch knowledge graph (September 2019, https://archive.monarchinitiative.org/latest) contains over 32.9 million nodes and 160 million edges. In comparison to our previous report, we have 134 244 additional gene-to-phenotype associations. Nearly half (68 640) of the new associations were the result of adding SGD and RGD as new sources of data for Monarch, while 65 604 were added from new data available for mouse, zebrafish, nematode and human combined. Our database now has 26 433 models of disease, a 44% increase since our 2017 report in NAR (43). There are 2 982 400 high-quality protein-protein interactions from STRING from 6 species, and 931 518 from BioGRID. Figure 4 summarizes the sources, their data types, the ontologies used for integration and their delivery within Monarch’s knowledge graph.
Publisher Copyright:
© 2019 The Author(s). Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2020/1/1
Y1 - 2020/1/1
N2 - In biology and biomedicine, relating phenotypic outcomes with genetic variation and environmental factors remains a challenge: patient phenotypes may not match known diseases, candidate variants may be in genes that haven't been characterized, research organisms may not recapitulate human or veterinary diseases, environmental factors affecting disease outcomes are unknown or undocumented, and many resources must be queried to find potentially significant phenotypic associations. The Monarch Initiative (https://monarchinitiative.org) integrates information on genes, variants, genotypes, phenotypes and diseases in a variety of species, and allows powerful ontology-based search. We develop many widely adopted ontologies that together enable sophisticated computational analysis, mechanistic discovery and diagnostics of Mendelian diseases. Our algorithms and tools are widely used to identify animal models of human disease through phenotypic similarity, for differential diagnostics and to facilitate translational research. Launched in 2015, Monarch has grown with regards to data (new organisms, more sources, better modeling); new API and standards; ontologies (new Mondo unified disease ontology, improvements to ontologies such as HPO and uPheno); user interface (a redesigned website); and community development. Monarch data, algorithms and tools are being used and extended by resources such as GA4GH and NCATS Translator, among others, to aid mechanistic discovery and diagnostics.
AB - In biology and biomedicine, relating phenotypic outcomes with genetic variation and environmental factors remains a challenge: patient phenotypes may not match known diseases, candidate variants may be in genes that haven't been characterized, research organisms may not recapitulate human or veterinary diseases, environmental factors affecting disease outcomes are unknown or undocumented, and many resources must be queried to find potentially significant phenotypic associations. The Monarch Initiative (https://monarchinitiative.org) integrates information on genes, variants, genotypes, phenotypes and diseases in a variety of species, and allows powerful ontology-based search. We develop many widely adopted ontologies that together enable sophisticated computational analysis, mechanistic discovery and diagnostics of Mendelian diseases. Our algorithms and tools are widely used to identify animal models of human disease through phenotypic similarity, for differential diagnostics and to facilitate translational research. Launched in 2015, Monarch has grown with regards to data (new organisms, more sources, better modeling); new API and standards; ontologies (new Mondo unified disease ontology, improvements to ontologies such as HPO and uPheno); user interface (a redesigned website); and community development. Monarch data, algorithms and tools are being used and extended by resources such as GA4GH and NCATS Translator, among others, to aid mechanistic discovery and diagnostics.
UR - http://www.scopus.com/inward/record.url?scp=85077668398&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85077668398&partnerID=8YFLogxK
U2 - 10.1093/nar/gkz997
DO - 10.1093/nar/gkz997
M3 - Article
C2 - 31701156
AN - SCOPUS:85077668398
SN - 0305-1048
VL - 48
SP - D704-D715
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - D1
ER -