Casey E. Romanoski & Christopher K. Glass

All the cells in the body contain essentially the same genome, and arise from the progeny of a single fertilized egg. How does each cell type interpret this common set of instructions to achieve its specific identity? The Roadmap Epigenomics Project has tackled this question by defining the epigenomic signatures of a broad spectrum of human tissues and cells undergoing crucial developmental transitions (for an overview2, see page 317). Collectively, these papers and the associated data sets provide an unprecedented resource for understanding relationships between cells and tissues, and for delineating how cell-specific programs of gene expression are achieved.

Only about half of the approximately 25,000 protein-coding genes in the mammalian genome are expressed in any given cell type. Although many of these genes are required for general functions and are ubiquitously expressed, others are active in only one or a few cell types, or exhibit different patterns of regulation from cell to cell. A remarkable achievement of the ENCODE Project was the use of epigenomic signatures to infer the existence of hundreds of thousands of enhancer-like regions in the mammalian genome that regulate gene expression at long range. From this vast palette, each cell type is regulated by a subset of perhaps 20,000–40,000 enhancers, which determine its particular gene-expression profile.

Enhancers are activated through interactions with transcription factors, which recognize and bind to specific DNA sequences within the enhancer region. Bound transcription factors recruit co-regulators, many of which deposit or remove modifications on histones. The way in which each cell type interprets genomic information is therefore closely linked to the organization of its DNA regulatory elements. Enhancers that exhibit cell-type-specific epigenomic signatures are typically highly enriched in DNA sequences to which lineage-determining and signal-dependent transcription factors bind. Therefore, the delineation of a particular cell's active enhancer repertoire provides a powerful means of predicting the transcription factors required for that cell's identity. By extension, changes in epigenomic signatures during developmental transitions reflect activation or inhibition of such factors.

Four of the papers in this issue2,3,4,5 exploit these relationships to identify combinations of transcription factors that might define different cell types during development. Ziller et al.4 (page 355) modelled neuronal development in vitro by generating six lineages of neuronal progenitors from embryonic stem (ES) cells, which can give rise to almost every cell type of the body. The authors developed computational models to predict the transcription factors that bind to core neural-differentiation enhancers, as well as those that bind only to enhancers of distinct neural lineages.

Tsankov et al.5 (page 344) studied the sets of transcription factors that bind to promoters and enhancers in the first three cell lineages that differentiate from ES cells. Sequences bound by transcription factors in one of the three lineages exhibited molecular modifications that promote gene expression, such as loss of DNA methylation. By contrast, the same DNA regions exhibited repressive modifications in the other two cell types. Both Ziller et al. and Tsankov et al. found that regulatory elements controlling genes that are essential for cellular identity are often also epigenetically modified in parental cells, highlighting the importance of existing regulatory landscapes and stage-specific expression of transcription factors for defining the developmental potential of cells.

Nature special issue: Epigenome roadmap (nature.com/epigenomeroadmap)

Some major caveats should be noted. These studies are based on analysis of cell populations, and therefore miss potentially crucial aspects of cellular variability within populations. When tissues are examined, enhancer landscapes represent the composite of the cell types that make up that tissue, not a pure cell population. Studies10,11 of different populations of white blood cells called macrophages suggest that the tissue environment can shape enhancer landscapes, emphasizing the value of studying purified cell populations from in vivo sources. Finally, although the DNA sequences found in cell-specific enhancers provide clues to the identities of the transcription factors that regulate enhancer activation, the functional roles of these factors must be validated experimentally. The Roadmap Epigenomics Project has made some efforts along these lines, but the large number of hypotheses generated by the current papers means that this step is largely left for future work.