a, Matching bulk TCR-seq and scTCR-seq clonotypes. An example from the Yost et al.5 dataset is shown to illustrate issues in matching clonotypes across bulk and single-cell technologies. Bulk TCR-seq from Adaptive Biotechnologies immunoSEQ technology yields single 87-base-pair segments of individual β-chains, whereas scTCR-seq from 10x Genomics potentially yields combinations of α- and β-chain CDR3 sequences per clonotype, indicated here by four clonotype IDs and associated sequences in grey boxes. The immunoSEQ output also provides a CDR3 amino acid sequence (bulk-CDR3-aa, rectangle) for productive β-chains, which we used to facilitate matching. We considered clonotypes to match if either β-chain CDR3 from scRNA-seq aligned exactly to the bulk TCR-seq sequence at the nucleotide level, at the position consistent with bulk-CDR3-aa. α-chain CDR3 sequences were disregarded in this process. All four clonotypes shown were therefore considered matches to the bulk TCR-seq sequence. For the purpose of counting T cells, a sum was taken over all matching scRNA-seq clonotypes. Further considerations are provided in Methods. b, Correlation of tumour and blood clone sizes in novel CD8 clones. Scatter plots are shown for each patient in Yost et al.5 that had both single-cell RNA-seq and TCR-seq of tumour-infiltrating lymphocytes in pre- and post-treatment tumours as well as bulk TCR-seq of T cells in blood. Dots represent novel CD8 clones based on the primary post-treatment cluster from the original analysis. Novel clones are plotted by the count of transcripts in pre-treatment blood (resorting to post-treatment blood for bcc.su002, which lacked a pre-treatment blood sample), used as a proxy for blood clone size, and by clone size in post-treatment tumour. The vertical bar separates novel clones that matched a clonotype in blood (blood-associated, right) from those that did not (blood-independent, left).
Two-sided P values are shown for Pearson’s correlation coefficient r computed on blood-associated novel clones. Patients are ordered by their total (blood-associated plus blood-independent) number of novel CD8 clones. Two-sided P values from a Fisher’s z-test are also shown, comparing the correlation coefficient of the novel CD8 clones with that of the novel CD4 clones. c, Correlation of tumour and blood clone sizes in novel CD4 clones. Scatter plots are shown for the novel CD4 clones from the patients in b, in corresponding order and plotted as in b. d, Clonal diversity in blood. Scatter plots are shown for the patients in b and c, in corresponding order. Dots represent distinct TCR β-chain rearrangements as provided in the original immunoSEQ analysis, plotted by the numbers of templates reported in pre- and post-treatment blood. For patient bcc.su002, which lacked a pre-treatment blood sample, a one-dimensional strip chart shows the post-treatment TCR repertoire with horizontal jitter added to display points more clearly. Increasing clonal diversity can be observed qualitatively as the increasing presence of clones along the main diagonal, and is quantified using Shannon entropy. e, Completeness curves for blood TCR-seq samples. Each plot shows a sample completeness curve for a bulk TCR-seq sample in d based on a rarefaction and extrapolation analysis38, with pre- and post-treatment samples coloured as shown. Each curve indicates the estimated coverage of the total set of TCR β-chain rearrangements as a function of the total number of transcripts sampled. Dots indicate the actual numbers of transcripts sampled, solid lines indicate the interpolated completeness curves, and dashed lines indicate extrapolations of the completeness curves.
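The two summary statistics named in b to d can be illustrated with a minimal Python sketch. It is not the code used for the figure. The function names `shannon_entropy` and `fisher_z_compare` are hypothetical, the entropy is computed with the natural logarithm (the legend does not state the base), and the Fisher's z comparison assumes that the CD8 and CD4 correlations come from independent sets of clones.

```python
import math


def shannon_entropy(template_counts):
    """Shannon entropy (in nats) of a repertoire from per-clone template counts.

    Assumption: natural logarithm; the legend does not specify the base.
    """
    total = sum(template_counts)
    if total <= 0:
        raise ValueError("at least one template is required")
    entropy = 0.0
    for count in template_counts:
        if count > 0:  # clones with zero templates contribute nothing
            p = count / total
            entropy -= p * math.log(p)
    return entropy


def fisher_z_compare(r1, n1, r2, n2):
    """Two-sided Fisher's z-test for the difference of two correlations.

    Assumption: the two correlation coefficients are estimated on
    independent samples of sizes n1 and n2, with |r| < 1 and n > 3.
    Returns the z statistic and the two-sided P value.
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each sample needs more than 3 observations")
    z1 = math.atanh(r1)  # Fisher transformation of r1
    z2 = math.atanh(r2)  # Fisher transformation of r2
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

For example, `shannon_entropy(counts)` could be applied to the template counts of one blood sample in d, and `fisher_z_compare(r_cd8, n_cd8, r_cd4, n_cd4)` to the Pearson coefficients and clone numbers of one patient in b and c. These argument names are placeholders, not values from the figure.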