Dark Matter


Dark matter in physics can be defined as the material in the universe that is thought to exist in order to account for the estimated mass or dynamic behavior of the universe. An early definition in physics referred to non-luminous material. The early definition of dark matter in cell biology, by analogy with physics, was framed around the fraction of genomic sequence that might code for the expression of protein, estimated to be 1-2% (Ponting, C.P., The functional repertoires of metazoan genomes, Nat.Rev.Genet. 9, 689-698, 2009), while other studies have suggested higher values (Levitt, M., Nature of the protein universe, Proc.Natl.Acad.Sci. USA 106, 11079-11084, 2009; Scaiewicz, A. and Levitt, M., The language of the protein universe, Curr.Opin.Genet. 35, 50-56, 2015). Earlier work defined short proteins without homology matches (dissimilar to any known protein) as dark matter (Frith, M.C., Forrest, A.R., Nourbakhsh, E., et al., The abundance of short proteins in the mammalian proteome, PLoS Genet. 2:e52, 2006). Microbial dark matter is a term used to designate genetic information from uncultured bacterial phyla (candidate phyla) (Rinke, C., Schwientek, P., Sczyrba, A., et al., Insights into the phylogeny and coding potential of microbial dark matter, Nature 499, 431-437, 2013; Solden, L., Lloyd, K., and Wrighton, K., The bright side of microbial dark matter: lessons learned from the uncultivated majority, Curr.Opin.Microbiol. 31, 217-226, 2016). It is estimated that there are 1,500 bacterial phyla, of which fewer than 100 have been cultivated (Yarza, P., Yilmaz, P., Pruesse, E., et al., Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences, Nat.Rev.Microbiol. 12, 635-645, 2014). The concept of dark matter can be expanded considerably beyond genomic expression to include ionic species (e.g., metal ions) and weakly interacting systems that form transient complexes (Ross, J.L., The dark matter of biology, Biophys.J. 111, 909-916, 2016). Another approach to dark matter is based on the number of possible protein folds relative to the number of folds that have been described (Taylor, W.R., Chelliah, V., Hollup, S.M., et al., Probing the “dark matter” of protein fold space, Structure 17, 1244-1252, 2009). Dark matter may also reflect mRNA and protein that undergo very rapid catabolism (Baboo, S. and Cook, P.R., “Dark matter” worlds of unstable RNA and protein, Nucleus 5, 281-286, 2014). The production of non-coding RNA has been shown to be another genomic function, separate from the production of coding RNA. The amount of non-coding RNA is equal to or larger than the amount of coding RNA (mRNA, polyA RNA) (Kapranov, P., St.Laurent, G., Raz, T., et al., The majority of total nuclear-encoded non-ribosomal RNA in a human cell is ‘dark matter’ un-annotated RNA, BMC Biology 8:149, 2010). More recent work suggests that 75% of the genome is transcribed into RNA (Djebali, S., Davis, C.A., Merkel, A., et al., Landscape of transcription in human cells, Nature 489, 101-108, 2012). The investigation of various RNA species, including long non-coding RNA (lncRNA), as dark matter is of interest in oncology research (Evans, J.R., Feng, J.Y., and Chinnaiyan, A.M., The bright side of dark matter: lncRNAs in cancer, J.Clin.Invest. 126, 2775-2782, 2015; Ling, H., Vincent, K., Pichler, M., et al., Junk DNA and the long non-coding RNA twist in cancer genetics, Oncogene 34, 5003-5011, 2015; Diederichs, S., Bartsch, L., Berkmann, J.C., et al., The dark matter of the cancer genome: aberrations in regulatory elements, untranslated regions, splice sites, non-coding RNA and synonymous mutations, EMBO Mol.Med. 6, 442-457, 2016; Ling, H., Girnita, L., Buda, O. and Calin, G.A., Non-coding RNAs: the cancer genome dark matters?, Clin.Chem.Lab.Med. 55, 705-714, 2017). See dark proteome.