(A) Aggregated raw Reb1 tag distribution around 1,058 primary Reb1 motif midpoints determined in this study. The 5′ ends of unshifted tags on the forward (blue) and reverse strands (red, inverted scale) were plotted separately.

(B) Frequency distribution of Reb1 peak-pair distances for the 1,058 primary peak-pairs determined in this study (see Extended Experimental Procedures).

(C) Primary and secondary Reb1-bound locations within clusters. (a) Bar graph showing the number of Reb1 peak pairs for 462 clustered primary, 596 isolated primary, and 718 secondary bound locations that contain the indicated motifs. A cluster is defined as a peak pair that is located within 100 bp of another peak pair. A secondary location is defined as a less-occupied location within 100 bp of another Reb1 location. (b) A stacked bar graph depicting the number of primary Reb1 peak-pairs (n = 1,058) that contain the indicated sequence. (c) Frequency distribution of the distances between Reb1 peak-pair midpoints and their motif midpoint for 462 primary (clustered), 596 primary (isolated), and 718 secondary bound locations. (d) Frequency distribution of Reb1 peak-pair distances for 462 primary (clustered), 596 primary (isolated), and 718 secondary bound locations. Distances were binned in 2 bp intervals, and the bins were smoothed using a 3-bin moving average.
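
The binning and smoothing described for panel C can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the centered window, and the handling of edge bins (averaging over whichever bins exist) are assumptions, since the legend specifies only the 2 bp bin width and the 3-bin moving average.

```python
from collections import Counter


def binned_smoothed(distances, bin_width=2, window=3):
    """Bin distances into fixed-width bins, then smooth the bin counts
    with a centered moving average.

    Returns a list of (bin_start, smoothed_count) tuples.
    """
    if not distances:
        return []
    # Count how many distances fall into each bin (bin index = d // bin_width).
    counts = Counter(int(d // bin_width) for d in distances)
    lo, hi = min(counts), max(counts)
    raw = [counts.get(b, 0) for b in range(lo, hi + 1)]

    half = window // 2
    smoothed = []
    for i in range(len(raw)):
        # Assumption: at the edges, average over the bins that exist.
        chunk = raw[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return [(b * bin_width, s) for b, s in zip(range(lo, hi + 1), smoothed)]


# Hypothetical distances in bp, for illustration only.
result = binned_smoothed([0, 1, 2, 2, 3, 6])
```

Empty bins between the smallest and largest observed distance are kept as zeros so that the moving average spans adjacent 2 bp intervals rather than adjacent non-empty bins.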

(D) Occupancy level for primary and secondary Reb1-bound locations. (a) For the indicated type of motif, or pooled motifs (excluding TTACCCT in the pool) having the indicated number of mismatches to TTACCCG, the median tag count is reported. Note that the number of locations evaluated within each bar differs markedly. For example, very few secondary locations use TTACCCG. (b) Median tag count per peak pair for 462 clustered primary (cyan), 596 isolated primary (orange), and 718 secondary sites (green). (c) Distribution of isolated primary (orange trace) Reb1-bound locations around the TSS. The gray filled background plot indicates the distribution of nucleosomes for those genes having an isolated primary Reb1-bound location. (d) Illustration of weak/strong binding to weak/strong sites. (e) Telomeric regions used TTACCCT 60% (62/104) and ACACCCA 16% (17/104) of the time instead of TTACCCG. These sequences resemble telomeric repeat sequences (CCCTGTAACC and ACACCCACACA). (f) Browser shots of a Reb1-bound location having 2–3 mismatches to the consensus. Tags were not shifted.
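
The pooling in panel D(a) amounts to counting mismatches to TTACCCG and taking a median tag count per mismatch class. The sketch below assumes that motifs are already oriented on the same strand as the consensus and that each site is given as a (motif, tag count) pair; the function names and the example tag counts are hypothetical.

```python
from collections import defaultdict
from statistics import median

CONSENSUS = "TTACCCG"


def mismatches(motif, consensus=CONSENSUS):
    """Number of positions at which motif differs from the consensus."""
    if len(motif) != len(consensus):
        raise ValueError("motif and consensus must have equal length")
    return sum(a != b for a, b in zip(motif, consensus))


def median_tags_by_mismatch(sites, excluded=("TTACCCT",)):
    """Pool sites by mismatch count and report the median tag count per pool.

    sites: iterable of (motif, tag_count) pairs.
    excluded: motifs left out of the pools (TTACCCT, as in the legend).
    """
    pooled = defaultdict(list)
    for motif, tags in sites:
        if motif in excluded:
            continue
        pooled[mismatches(motif)].append(tags)
    return {k: median(v) for k, v in sorted(pooled.items())}


# Hypothetical tag counts, for illustration only.
example_sites = [
    ("TTACCCG", 100),
    ("TTACCCG", 50),
    ("TTACCCT", 40),   # excluded from the pools
    ("TTACCCA", 30),
    ("ACACCCA", 10),
]
medians = median_tags_by_mismatch(example_sites)
```

Under this counting, TTACCCT differs from TTACCCG at one position and ACACCCA at three.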

(E) Example of primary and secondary Reb1-bound locations at the YTM1 promoter. Green arrows indicate Reb1 motifs. Three distinct Reb1-binding locations within the ∼120 bp YOR271C1-YTM1 intergenic region were resolved. Vertical blue and red bars demarcate the 5′ ends of forward and reverse strand tags, respectively, shifted in the 3′ direction by 14 bp.

(F) Secondary Reb1-bound locations. Smoothed frequency distribution of the distances between adjacent Reb1-bound locations within clusters. A cluster is defined as a peak pair that is located within 100 bp of another peak pair.
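
The cluster and secondary-location definitions used in this legend can be expressed as a small classification routine. This is a sketch under stated assumptions: each location is a hypothetical (chromosome, midpoint, tag count) tuple, "within 100 bp" is taken as a midpoint distance of at most 100 bp, and a location tied with its neighbor for the highest occupancy is treated as primary.

```python
def classify_locations(locations, max_gap=100):
    """Label each bound location as isolated primary, clustered primary,
    or secondary.

    locations: list of (chrom, midpoint, tag_count) tuples.
    max_gap: maximum midpoint distance (bp) for two locations to cluster.
    """
    labels = []
    for i, (chrom, mid, tags) in enumerate(locations):
        # Neighbors are other locations on the same chromosome within max_gap.
        neighbors = [
            other for j, other in enumerate(locations)
            if j != i and other[0] == chrom and abs(other[1] - mid) <= max_gap
        ]
        if not neighbors:
            labels.append("isolated primary")
        elif any(n[2] > tags for n in neighbors):
            # A more-occupied neighbor exists, so this one is secondary.
            labels.append("secondary")
        else:
            labels.append("clustered primary")
    return labels


# Hypothetical locations, for illustration only.
example = [
    ("chrI", 1000, 50),
    ("chrI", 1060, 20),
    ("chrI", 5000, 30),
    ("chrII", 1000, 10),
]
labels = classify_locations(example)
```

The pairwise scan is quadratic in the number of locations, which is acceptable for a few thousand sites; a sorted sweep per chromosome would scale better for larger sets.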

(G) False discovery evaluation of prior ChIP-chip and current ChIP-seq datasets. (a) The Venn diagram illustrates the type of analysis being reported in the color-matched bar graphs. For the purposes of this analysis, we define false positives as those regions identified as Reb1-bound in one or more other studies but not by ChIP-exo (i.e., putative erroneous calls) and false negatives as those regions identified by ChIP-exo, but not by the other studies (putative missed calls). True positives and negatives are determined as the regions (bound genes or peaks, depending on the resolution of the dataset) that are within the indicated distance from a ChIP-exo Reb1 peak-pair midpoint. (b) Overlap of putative false positives obtained from ChIP-chip (5 bp probe density Affymetrix tiling arrays from Venters and Pugh, 2009) and the ChIP-seq study presented here. The p value indicates a nearly 90% probability that these regions were selected at random. An overlap occurred if a ChIP-chip peak was <100 bp from a Reb1 ChIP-seq-defined peak-pair. (c) Reb1-occupied genes in Harbison et al. and this study were identified using 6,219 Pol II-transcribed genes. Comparisons performed at various p value thresholds (0.005, 0.05, 0.1, and 0.2, defined by Harbison et al., 2004) are shown. The description in the main text refers to the 0.05 threshold. (d) A similar analysis was performed using the Affymetrix high density tiling array Reb1 data of Venters and Pugh (2009), with the various thresholds defined as the top-most occupied locations (hybridization signal within peaks). An overlap occurred if a ChIP-chip peak was <250 bp from a Reb1 ChIP-exo-defined peak-pair. (e) Reb1-bound peak pairs in ChIP-seq were selected in terms of highest tag counts. Comparisons were performed for Reb1-bound locations with the various thresholds defined as the top-most occupied locations (tag count within peaks). The distance threshold was 100 bp, which, as in panel c, reflects the broadness (and uncertainty) of the peaks in the published datasets. (f–h) Reb1 motif count for the various categories of locations defined in the color-matched bars in the panel immediately above each graph. False positives (putative erroneous calls) in the published studies tended to have a low motif count, although by random chance degenerate versions of the motif can be found in the broad distances allowed (100–250 bp), all of which is consistent with them being erroneous calls. False negatives (putative missed calls) in the published studies tended to have a high motif count and a low count of degenerate motifs, which provides evidence that they were missed calls. (i and j) Occupancy levels are reported for those regions defined in the panels above them. Note that false positives and false negatives tended to have low occupancy levels. Thus, false negatives likely arose in previous studies due to high thresholding of peak heights (occupancy level).
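
The false positive and false negative definitions in panel G reduce to a nearest-neighbor distance test against the ChIP-exo peak-pair midpoints. The following sketch assumes all coordinates lie on a single chromosome and uses the strict "<100 bp" overlap rule stated for panel b; the function names and example coordinates are hypothetical, and this is not the authors' pipeline.

```python
from bisect import bisect_left


def nearest_distance(sorted_positions, x):
    """Distance from x to the closest value in a sorted list (inf if empty)."""
    if not sorted_positions:
        return float("inf")
    i = bisect_left(sorted_positions, x)
    # Only the elements just left and right of the insertion point can be closest.
    candidates = sorted_positions[max(0, i - 1): i + 1]
    return min(abs(x - c) for c in candidates)


def compare_calls(exo_midpoints, other_peaks, max_dist=100):
    """Split calls into putative true positives, false positives, and
    false negatives, treating the ChIP-exo midpoints as the reference.

    An overlap occurs when a peak lies strictly less than max_dist bp
    from a ChIP-exo peak-pair midpoint.
    """
    exo = sorted(exo_midpoints)
    other = sorted(other_peaks)
    return {
        # Other-study peaks near a ChIP-exo location.
        "true_pos": [p for p in other if nearest_distance(exo, p) < max_dist],
        # Other-study peaks with no ChIP-exo location nearby (putative erroneous calls).
        "false_pos": [p for p in other if nearest_distance(exo, p) >= max_dist],
        # ChIP-exo locations with no other-study peak nearby (putative missed calls).
        "false_neg": [m for m in exo if nearest_distance(other, m) >= max_dist],
    }


# Hypothetical coordinates, for illustration only.
calls = compare_calls([100, 500, 900], [150, 1200], max_dist=100)
```

Setting `max_dist=250` would mirror the looser overlap rule stated for the comparison in panel d.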

(H) Gene Ontology (GO) analysis of Reb1-bound genes clustered by transcription frequency. p values for enrichment of GO terms above the significance threshold of 2.27 × 10⁻³ are indicated by the color scale (GO Term Finder on SGD). Genes were divided into three groups by transcription frequency.

(I) Reb1-nucleosomal interactions. Venn diagram depicting the overlap between nucleosomes defined as Reb1-interacting from the study of Koerber et al. (2009) and the peak pairs identified in this study. 176 Reb1 locations identified by ChIP-exo overlapped with 165 Reb1-interacting nucleosomes identified from Koerber et al. (some nucleosomes encompassed multiple Reb1 locations).

(J) Evidence that nucleosome binding by Reb1 does not alter its detection properties. (a) Frequency distribution of Reb1 intra-peak-pair distances for the two classes of Reb1 locations determined in this study. Note that if Reb1 were preferentially crosslinking to histones, it should display a different peak-pair distance compared to non-nucleosomal Reb1. (b) Nucleosomal Reb1 has a higher incidence of strong Reb1 motifs, which might explain why nucleosomal Reb1 displays higher occupancy levels (Figure S2I). (c) Nucleosomal Reb1 is detected at the same distance from its motif as non-nucleosomal Reb1. If nucleosomal interactions were altering detection, then the standard deviation of distances for nucleosomal Reb1 would be expected to increase.

(K) Model for how primary and secondary sites may be placed on a nucleosome.