a, Distribution of full-length protein size, in amino acids, for genes that generate CLL-IPAs (n = 306) and B-IPAs (n = 2,690). Box plots are as in Fig. 1e. P = 0.87, two-sided Mann–Whitney U-test. b, TR rate (the ratio of TR mutations to total mutations) is shown for known TSGs obtained previously⁵. Box plots are as in Fig. 1e. P = 1 × 10⁻¹⁵⁵, two-sided Mann–Whitney U-test. c, Known TSGs (obtained previously⁵) that are targeted by CLL-IPAs (n = 21) are shown. Dark green bars indicate the fraction of retained CDRs for each IPA-generated protein. Black dots indicate the hot-spot positions of TR mutations obtained from the MSK cBioPortal. CLL-IPAs mostly occur upstream of the mutations or within 10% (of overall amino acid length) of them. P = 0.04, two-sided Wilcoxon rank-sum test. d, Contingency table for enrichment of TSGs among genes that generate CLL-IPAs. The P value was obtained from a two-sided Fisher's exact test. TSGs were obtained previously⁵. e, TSGs and genes that generate CLL-IPA isoforms have longer CDRs than genes that do not generate IPA isoforms. Box plots are as in Fig. 1e. P = 1 × 10⁻⁸⁰, two-sided Kruskal–Wallis test. f, Five control gene lists (n = 306 each) that are expressed in CLL and have a size distribution similar to that of CLL-IPAs were tested for enrichment of TSGs. Shown is the number of TSGs found. A χ² test did not show a significant enrichment of TSGs among the control genes. g, Contingency table for enrichment of TR mutation genes in CLL among genes that generate CLL-IPAs. The P value was obtained from a two-sided Fisher's exact test. h, ZMYM5 is truncated by a TR mutation and by an IPA isoform in the same patient, but the two aberrations are predicted to result in different truncated proteins. A 10-bp deletion in exon 3 causes a frameshift that generates a truncated ZMYM5 protein, whereas the ZMYM5 IPA (not yet annotated) produces a truncated protein containing 352 more amino acids in the same patient.
The genes shown in h and i are the only genes, out of n = 268 tested, in which a TR mutation and a CLL-IPA are present simultaneously. The position of the TR mutation is indicated in green. CLL7 and CLL11 3′-seq and RNA-seq tracks are shown for comparison. i, MGA is truncated by a TR mutation and by an IPA isoform in the same patient. The TR mutation affects the 5′ splice site of intron 7, thus generating two additional amino acids downstream of exon 7, whereas the IPA isoform encodes a truncated MGA protein containing three more amino acids downstream of exon 9. Mutation and 3′-seq analyses were performed once. CLL7 and CLL11 are shown for comparison. j, Additional recurrent (n > 1) DNA mutations found by exome sequencing of CLL patient samples, stratified by a high or low number of CLL-IPAs per patient. Only the 16 samples with the most CLL-IPAs and the 16 samples with the fewest are shown, so that both groups contain the same number of samples. This analysis is descriptive only and no test was performed. k, Significant enrichment of SF3B1 mutations in the group of CLL samples with abundant CLL-IPA isoforms. A two-sided Mann–Whitney U-test was performed. l, Abundance of CLL-IPAs is not associated with IGVH mutational status. Shown is the number of CLL-IPAs per sample for patients with mutated (MUT, n = 30) or unmutated (UN, n = 21) IGVH genes. Box plots are as in Fig. 1e. P = 0.4, two-sided Mann–Whitney U-test.
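The TR rate in b and the contingency-table tests in d and g can be illustrated with a minimal sketch. It is not the analysis code used for this figure: the function names `tr_rate` and `fisher_exact_two_sided` are hypothetical, the two-sided P value is assumed to follow the common convention of summing the probabilities of all tables that are no more likely than the observed one, and the counts in the usage lines are invented for illustration only.

```python
from math import comb


def tr_rate(tr_mutations: int, total_mutations: int) -> float:
    """TR rate as defined in panel b: TR mutations divided by total mutations."""
    if total_mutations <= 0:
        raise ValueError("total_mutations must be positive")
    return tr_mutations / total_mutations


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Assumed convention: the P value is the summed hypergeometric probability
    of every table (with the same margins) that is no more likely than the
    observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    # Small relative tolerance so ties with the observed table are included.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Invented counts, for illustration only (not data from this figure).
print(tr_rate(3, 12))
print(fisher_exact_two_sided(3, 1, 1, 3))
```

In such a table the rows would correspond to genes that do or do not generate CLL-IPAs and the columns to TSGs or non-TSGs (d) or to genes with or without TR mutations (g); the actual counts are those given in the panels themselves.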