
Philosophy of Biology

Conservation and Function in Comparative Genomics

Stephan Guttinger and Alan C. Love

Abstract

A central tool in comparative genomics is sequence alignment, which makes it possible to identify stretches of DNA that exhibit different degrees of similarity across closely and distantly related organisms. In much of this work, phylogenetic conservation of sequence structure (i.e., homology) is treated as a proxy for genomic function. After the Human Genome Project, there was growing interest in moving from structural or descriptive genomics to functional genomics. The goal of the ENCyclopedia Of DNA Elements (ENCODE) project was to identify and catalogue all the functional elements or active structures of the human genome. A parallel project, modENCODE, attempted to catalogue the shared functional elements of genomes across model organisms.

 

The key methodological question for both projects was how to identify functional elements in the first place (Kellis et al. 2014). As noted, an evolutionary approach uses comparative analysis of sequence similarity to identify sequence conservation across species, which is treated as a proxy for the functional relevance of those genomic elements. A biochemical approach instead treats the signatures of activity that functional elements leave behind as proxies for the elements themselves. ENCODE primarily utilized the biochemical approach. modENCODE ended up doing something distinctive: it fused the biochemical and evolutionary approaches to genomic function and adopted a more abstract view of what counts as a genomic property.

 

This was a novel methodological maneuver. ENCODE focused on functional elements of the genome (i.e., structures), whereas modENCODE shifted to general regulatory principles (i.e., abstract functional rules). Evolutionary conservation was combined with biochemical activity to isolate shared relational functional properties of metazoan genomes, rather than elements or structures. One reported finding illustrates the shift: “Quantitative relationships among chromatin state, transcription, and cotranscriptional RNA processing are deeply conserved” (Brown and Celniker 2015). This methodological shift by modENCODE introduced a theoretical tension: the notion of conservation applies straightforwardly to sequence-based functional elements (i.e., structures), but less clearly to quantitative functional relationships. These rules are better described as prerequisites for how genomes operate than as outcomes of evolutionary conservation.

 

modENCODE identified distinctive physicochemical rules rather than mechanistic structure. These abstract, quantitative relationships are disconnected from modENCODE’s stated goal of discovering “how the information encoded in a genome can produce a complex multicellular organism” (Celniker et al. 2009). Although modENCODE advanced our knowledge of how the genome works, it was relatively silent on the translation of genomic form into organismal complexity. The original research question was transformed in the process of inquiry: from detecting functional elements in the genome that contribute to organismal phenotypes, to identifying the properties or rules of the genome that make it possible for the genome to function at all.

 

References

 

Brown, J.B. and S.E. Celniker. 2015. Lessons from modENCODE. Annual Review of Genomics and Human Genetics 16:31–53.


Celniker, S.E., L.A.L. Dillon, M.B. Gerstein, K.C. Gunsalus, S. Henikoff, et al. 2009. Unlocking the secrets of the genome. Nature 459:927–930.


Kellis, M., B. Wold, M.P. Snyder, B.E. Bernstein, A. Kundaje, et al. 2014. Defining functional DNA elements in the human genome. Proceedings of the National Academy of Sciences 111:6131–6138.