Table S2. Analysis Data, Related to Figures 1, 2, 3, 4, 5, 6, and 7

Root-to-tip tab: The dates for EBOV and LASV strains used for root-to-tip versus date of collection calculations. The individual strains used for rooting of the trees are marked with “1.”
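
A root-to-tip versus collection date comparison is commonly summarized with a least-squares line, where the slope approximates the substitution rate and the x-intercept approximates the date of the root. The sketch below illustrates that general idea only; it is an assumption that a simple linear fit is used, and the function name and example values are hypothetical rather than taken from the table.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance against collection date.

    dates      -- collection dates as decimal years (hypothetical input format)
    distances  -- root-to-tip distances in substitutions per site
    Returns (slope, intercept, root_date); root_date is the x-intercept.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (date, distance) pairs of equal length")

    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    if sxx == 0:
        raise ValueError("all collection dates are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))

    slope = sxy / sxx                      # substitutions per site per year
    intercept = mean_y - slope * mean_x    # fitted distance at year 0
    root_date = -intercept / slope if slope != 0 else None
    return slope, intercept, root_date


# Illustrative values only (not data from the table).
slope, intercept, root_date = root_to_tip_regression(
    [2000.0, 2005.0, 2010.0], [0.010, 0.015, 0.020]
)
```

With these made-up inputs the fitted slope is about 0.001 substitutions per site per year and the x-intercept falls at about 1990.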

CAI tab: Codon adaptation index (CAI) of NP-luciferase fusion proteins used in this study. CAI was calculated for the fusion proteins containing NP 1–699 fragments fused to either firefly luciferase (fLuc) or Gaussia luciferase (gLuc). Two of the NP 1–699 fragments were codon optimized before being fused to luciferase.
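
For orientation, CAI is conventionally the geometric mean of per-codon relative adaptiveness values, each defined as a codon's frequency in a reference set divided by the frequency of the most used synonymous codon. The following is a minimal sketch of that definition. The reference codon counts, the two-amino-acid codon table, and the function names are all hypothetical, and the reference set actually used for this tab is not stated here.

```python
import math

# Toy synonymous codon families (assumption: only lysine and glutamate are
# listed to keep the sketch short; a real calculation needs the full code).
SYNONYMOUS = {
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
}


def relative_adaptiveness(ref_counts):
    """Map each codon to count / (max count among its synonymous codons)."""
    weights = {}
    for codons in SYNONYMOUS.values():
        top = max(ref_counts.get(c, 0) for c in codons)
        if top == 0:
            continue  # family absent from the reference set
        for c in codons:
            weights[c] = ref_counts.get(c, 0) / top
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the scored codons in an in-frame sequence."""
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Codons without a positive weight are skipped in this sketch.
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical reference counts, for illustration only.
w = relative_adaptiveness({"AAA": 10, "AAG": 40, "GAA": 30, "GAG": 30})
optimized = cai("AAGGAG", w)    # uses the most frequent codon at each position
unoptimized = cai("AAAGAG", w)  # uses the rarer lysine codon
```

In this toy case the sequence built from preferred codons scores 1.0 and the one with the rarer lysine codon scores 0.5, which is the direction of change expected from codon optimization.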

Epitopes tab: Replacing major with minor iSNV alleles reduces epitope scores. All samples containing nonsynonymous iSNVs in predicted GPC B cell epitopes are shown. The epitope start position refers to the amino acid position in the consensus sequence, which carries the major iSNV allele. Replacing the major iSNV allele with the minor allele did not change the predicted epitope start position in the majority of cases: only four epitopes were offset by 2–10 amino acids, and these still overlapped by at least ten amino acids with the epitope predictions based on the consensus sequences. The epitope score assigned by BCPREDS ranges from 0 to 1, with higher scores denoting higher-confidence predictions. Here, scores are multiplied by 1,000 for clarity, and epitopes that were not predicted after replacing the major with the minor iSNV allele were assigned a score of zero. The last column (sign test) indicates the direction of the change in score after swapping major and minor iSNVs: + denotes that the score with the major allele is higher than with the minor allele (major > minor); – denotes the reverse (minor > major). A sign test was performed using the direction of the change to assess whether minor iSNVs tended to increase or decrease epitope scores. Only epitopes containing iSNVs are shown (all other epitope predictions were unchanged).
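
The sign test on the last column can be sketched as an exact two-sided binomial test on the counts of + and – entries, under the null hypothesis that either direction is equally likely. This is a generic illustration; the function name and the example input are hypothetical and do not reproduce the values in the tab.

```python
from math import comb


def sign_test(signs):
    """Exact two-sided sign test.

    signs -- iterable of "+" (major > minor) and "-" or "–" (minor > major);
             any other entry is ignored as a tie.
    Returns (n_plus, n_minus, p_value).
    """
    signs = list(signs)
    n_plus = sum(1 for s in signs if s == "+")
    n_minus = sum(1 for s in signs if s in ("-", "–"))
    n = n_plus + n_minus
    if n == 0:
        raise ValueError("no non-tied observations")

    # Probability of a split at least as extreme as the observed one in the
    # smaller tail under Binomial(n, 0.5), doubled for a two-sided test.
    k = min(n_plus, n_minus)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_plus, n_minus, min(1.0, 2 * tail)


# Illustrative input only: four epitopes scoring higher with the major allele,
# one scoring higher with the minor allele.
n_plus, n_minus, p = sign_test(["+", "+", "+", "+", "–"])
```

For this made-up input the two-sided p value is 0.375, so five observations with a 4:1 split would not by themselves indicate a consistent direction.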

tMRCA - models tab: Key statistics of selected BEAST analyses run as part of this study. The final analyses were run on the matched dataset, whereas parameter comparisons were run on the batch 1 dataset (hpd, 95% highest posterior density interval; srd06, separate partitions for the first and second codon positions combined and for the third codon position; 3pos, separate partitions for the first, second, and third codon positions; BSP, Bayesian skyline; exp, exponential growth).

tMRCA - BSREL tab: Modeling branch-specific rate variation gives tMRCA estimates similar to those identified under a GTRγ nucleotide substitution model. We calculated tMRCAs for key nodes under a model that allows for site- and branch-specific variation in selective pressure, as it has been shown that molecular dating methods may severely underestimate the ages of viral strains in the presence of purifying selection (Wertheim and Kosakovsky Pond, 2011; Wertheim et al., 2013). Under this model, we again found our estimates to be consistent. Starting with our maximum-likelihood phylogenies constructed using RAxML under the GTRγ nucleotide model, we re-estimated the branch lengths across the phylogeny using a branch-site random effects likelihood model (BS-REL), implemented in the HyPhy package (Kosakovsky Pond et al., 2011; Pond et al., 2005) (batch 1 dataset). We compared the node heights of the most recent common ancestors of key lineages on the phylogeny under the GTRγ nucleotide model with those obtained after re-estimating the branch lengths under the BS-REL model. The estimates of time in years use the median clock rate from the final BEAST analysis (9.63E-4 subst/site/year, L segment; 9.56E-4 subst/site/year, S segment).
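
The conversion from node height to time can be sketched as a division of the height (in substitutions per site) by the clock rate (in substitutions per site per year). The two rates below are the ones quoted above; the function name and the example node heights are hypothetical.

```python
# Median clock rates from the final BEAST analysis, as quoted in the legend
# (substitutions per site per year).
CLOCK_RATE = {"L": 9.63e-4, "S": 9.56e-4}


def node_height_to_years(height_subst_per_site, segment):
    """Convert a node height in subst/site to years before the most recent tip."""
    if segment not in CLOCK_RATE:
        raise ValueError("segment must be 'L' or 'S'")
    if height_subst_per_site < 0:
        raise ValueError("node height cannot be negative")
    return height_subst_per_site / CLOCK_RATE[segment]


# Illustrative heights only (not values from the table).
years_l = node_height_to_years(0.963, "L")
years_s = node_height_to_years(0.0956, "S")
```

With these made-up heights the L segment node would date to roughly 1,000 years and the S segment node to roughly 100 years before the most recent sample.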