Generation of Mouse NSCLC Cell Lines with Cas9-GFP The lentiviral Cas9-NLS-FLAG-2A-EGFP (lentiCas9-EGFP) plasmid was co-transfected into HEK293FT cells (Invitrogen R700-07) with lentiviral packaging plasmids psPAX2 and pMD2.G (Addgene 12260 and 12259). HEK293FT cells were cultured in high-glucose DMEM (Invitrogen 10566-024) + 10% FBS (Hyclone) (hereafter referred to as “D10” media) and seeded in a 15 cm culture dish the day before transfection such that they would be 80%–90% confluent at the time of transfection. Two hours before transfection, the media was replaced with 15 ml of prewarmed OptiMEM (Invitrogen 51985-091). The transfection mix was aspirated after 6 hr and replaced with fresh prewarmed DMEM + 10% FBS. Viral particles were harvested 48 hr after this media change and frozen at −80°C.

A mouse NSCLC cell line with genotype KrasG12D/+;p53−/−;Dicer1+/− (KPD cell line) was derived as described previously (Chen et al., 2014) and cultured in D10. This cell line was transduced with the lentiCas9-EGFP virus at a low MOI (MOI < 0.01). GFP-positive cells were sorted as single cells into 96-well plates and cultured as clonal cell lines. Multiple clonal lines were established and genotyped by PCR. Lines with 100% GFP-positive cells were kept and those with segregating GFP expression were discarded (Figure S1A). FLAG-Cas9 expression was confirmed by western blot with antibodies against FLAG (Sigma) or Cas9 (Diagenode) (Figure S1B). This established the Cas9-GFP KPD cell lines.

Pooled Guide-Only Library Cloning and Viral Production To produce virus, the mGeCKOa pooled plasmid was co-transfected into HEK293FT cells (Invitrogen R700-07) with lentiviral packaging plasmids psPAX2 and pMD2.G (Addgene 12260 and 12259). HEK293FT cells were cultured in D10 media and seeded in T-225 flasks the day before transfection such that they would be 80%–90% confluent at the time of transfection. Two hours before transfection, the media was replaced with 13 ml of prewarmed OptiMEM (Invitrogen 51985-091). For transfection of each T-225 flask, 200 μl of Plus reagent (Invitrogen 11514-015) was diluted into 4 ml of OptiMEM and then the following DNA was added: 20 μg mGeCKOa, 15 μg psPAX2, 10 μg pMD2.G. Separately, 100 μl of Lipofectamine 2000 (Invitrogen 11668019) was diluted into 4 ml of OptiMEM, briefly vortexed, and incubated at room temperature for 5 min. After incubation, the Plus+DNA and Lipofectamine 2000 mixtures were combined, briefly vortexed, and incubated at room temperature for 20 min. The mixture was then gently added to the T-225 flask containing 13 ml of OptiMEM. All media was aspirated after 6 hr and replaced with fresh prewarmed DMEM + 10% FBS. Viral particles were harvested 48 hr after this media change and frozen at −80°C.

Pooled Library Transduction into Mouse NSCLC Cell Line The virus was titered by spinfection of 3 × 10^6 Cas9-GFP KPD cells per well in a 12-well plate with different dilutions of the virus (and a no-virus control) in each well. After adding virus, cells were spun at 2000 rpm for 2 hr at 37°C and then placed into the incubator overnight. The next day, 1.5 × 10^5 cells from each viral concentration were plated into two replicate wells of a 6-well plate. At 24 hr post-transduction, 2 μg/ml puromycin (Sigma) was added to one of the replicate wells. Cells not transduced with the library (no-virus control) did not survive past 24 hr with puromycin at 2 μg/ml. After 72 hr, cells were counted in all wells to determine the viral volume that results in 20%–40% of cells surviving in puromycin. Assuming infection events occur independently, this corresponds to an MOI of 0.2-0.5 and a single-infection percentage (SIP) of 77% (at 40% puromycin survival) to 89% (at 20% puromycin survival). The SIP is calculated directly from the puromycin survival ($p_{survival}$), as shown below.

The probability of a cell being infected by $n$ viral particles if the MOI is $m$ is given by the Poisson distribution,

$$P(n) = \frac{m^{n} e^{-m}}{n!}$$

and the observed fraction of cells surviving puromycin selection is

$$p_{survival} = P(n > 0) = 1 - P(n = 0) = 1 - e^{-m}$$

Solving for the SIP as a function of $p_{survival}$, we get:

$$SIP = \frac{P(n = 1)}{P(n \geq 1)} = \frac{P(n = 1)}{P(n > 0)} = \frac{-(1 - p_{survival}) \ln(1 - p_{survival})}{p_{survival}}$$

For each experimental replicate of the pooled screen, a total of 1.2 × 10^8 cells were infected at MOI ∼0.4 and selected with puromycin at 2 μg/ml for 7 days. MOI was calculated with an in-line control using a procedure similar to the one given in the previous paragraph. Infected cells were expanded under puromycin selection for 7 days and split every 2-3 days. After 7 days, 3 × 10^7 cells were spun down and frozen for genomic DNA extraction. At the same time, 1.7 × 10^8 cells were washed twice in sterile PBS and resuspended at 5 × 10^7 cells/ml in PBS for transplantation. In the three infection replicates of our primary screen, the MOI was on average 0.4 ± 0.02, which ensures that over 80% of the cells surviving puromycin selection received only one sgRNA-expressing lentiviral integrant.
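The relationship between puromycin survival, MOI, and SIP described above can be checked numerically. The following is a minimal sketch under the same Poisson assumption (independent infection events); the function names are illustrative and not part of the original analysis.

```python
import math


def moi_from_survival(p_survival: float) -> float:
    """MOI m implied by the surviving fraction, from p_survival = 1 - exp(-m)."""
    if not 0.0 < p_survival < 1.0:
        raise ValueError("p_survival must be strictly between 0 and 1")
    return -math.log(1.0 - p_survival)


def single_infection_fraction(p_survival: float) -> float:
    """SIP = P(n = 1) / P(n >= 1) for Poisson-distributed infection events."""
    m = moi_from_survival(p_survival)
    # P(n = 1) = m * exp(-m); P(n >= 1) is the surviving fraction itself.
    return m * math.exp(-m) / p_survival


if __name__ == "__main__":
    for p in (0.20, 0.40):
        moi = moi_from_survival(p)
        sip = single_infection_fraction(p)
        print(f"survival={p:.0%}  MOI={moi:.2f}  SIP={sip:.0%}")
```

For 20% and 40% survival this reproduces the 89% and 77% SIP values quoted in the text, and an MOI of 0.4 gives a SIP just above 80%.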

Genomic DNA Extraction from Cells and Mouse Tissues After a hands-on comparison of several commercial DNA extraction kits, such as the QIAamp Blood Midi/Max (QIAGEN), the Quick-gDNA MidiPrep (Zymo) and Puregene (QIAGEN), we found that our homemade salt precipitation method provided consistent, high-quality yields with a low-cost and simple protocol. We describe this protocol below. The same procedure was used for gDNA extraction from either 100-200 mg of frozen ground tissue or 3 × 10^7 to 5 × 10^7 frozen cells. For different amounts of tissue or cells, the quantities were scaled proportionally. In a 15 ml conical tube, 6 ml of NK Lysis Buffer (50 mM Tris, 50 mM EDTA, 1% SDS, pH 8) and 30 μl of 20 mg/ml Proteinase K (QIAGEN 19131) were added to the tissue/cell sample and incubated at 55°C overnight. The next day, 30 μl of 10 mg/ml RNase A (QIAGEN 19101, diluted in NK Lysis Buffer to 10 mg/ml and then stored at 4°C) was added to the lysed sample, which was then inverted 25 times and incubated at 37°C for 30 min. Samples were cooled on ice before addition of 2 ml of pre-chilled 7.5 M ammonium acetate (Sigma A1542) to precipitate proteins. The 7.5 M ammonium acetate stock solution was made in sterile dH2O and kept at 4°C until use. After adding ammonium acetate, the samples were vortexed at high speed for 20 s and then centrifuged at ≥ 4,000 × g for 10 min. After the spin, a tight pellet was visible in each tube and the supernatant was carefully decanted into a new 15 ml conical tube. Then 6 ml of 100% isopropanol was added, and the tube was inverted 50 times and centrifuged at ≥ 4,000 × g for 10 min. Genomic DNA was visible as a small white pellet in each tube. The supernatant was discarded, 6 ml of freshly prepared 70% ethanol was added, and the tube was inverted 10 times and then centrifuged at ≥ 4,000 × g for 1 min. The supernatant was discarded by pouring; the tube was briefly spun, and remaining ethanol was removed using a P200 pipette. After air drying for 10-30 min, the DNA changed appearance from a milky white pellet to slightly translucent. At this stage, 500 μl of 1x TE buffer (Sigma T9285) was added, and the tube was incubated at 65°C for 1 hr and then at room temperature overnight to fully resuspend the DNA. The next day, the gDNA samples were vortexed briefly. The gDNA concentration was measured using a Nanodrop (Thermo Scientific).

sgRNA Library Readout by Deep Sequencing The sgRNA library readout was performed using two steps of PCR, where the first PCR includes enough genomic DNA to preserve full library complexity and the second PCR adds appropriate sequencing adapters to the products from the first PCR. The sgRNA library for each sample (plasmid, genomic DNA from cells and tissues) was amplified and prepared for Illumina sequencing using the following two-step PCR procedure. For PCR#1, a region containing the sgRNA cassette was amplified using primers specific to the sgRNA-expression vector (lentiGuide-PCR1-F and lentiGuide-PCR1-R):

lentiGuide-PCR1-F CCCGAGGGGACCCAGAGAG
lentiGuide-PCR1-R GCGCACCGTGGGCTTGTAC

All PCR was performed using Phusion Flash High Fidelity Master Mix (Thermo). For PCR#1, the thermocycling parameters were: 98°C for 30 s, 18-24 cycles of (98°C for 1 s, 62°C for 5 s, 72°C for 35 s), and 72°C for 1 min. In each PCR#1 reaction, we used 3 μg of gDNA. For each sample, the appropriate number of PCR#1 reactions was used to capture the full representation of the screen. For example, at ∼400x coverage of our 67,405 mGeCKOa sgRNA library, gDNA from 3 × 10^7 cells was used. Assuming 6.6 pg of gDNA per cell, ∼200 μg of gDNA was used per sample. Since 3 μg of gDNA was used per 100 μl PCR#1 reaction, each biological sample required at least 67 PCR#1 reactions. PCR#1 products for each biological sample were pooled and used for amplification with barcoded second PCR primers (Table S3). For each sample, we performed at least 7 PCR#2 reactions (one 100 μl reaction per 10^4 constructs in the library) using 10 μl of the pooled PCR#1 product per PCR#2 reaction. Second PCR products were pooled and then normalized for each biological sample before combining uniquely barcoded separate biological samples. The pooled product was then gel purified from a 2% E-gel EX (Life Technologies) using the QiaQuick kit (QIAGEN). The purified pooled library was then quantified with dsDNA High-Sensitivity Qubit (Life Technologies) and/or a gel-based method using the Low-Range Quantitative Ladder (Life Technologies). Diluted libraries with 5%–20% PhiX were sequenced with MiSeq, HiSeq 2000 or HiSeq 2500 (Illumina).
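The reaction counts quoted above follow from simple arithmetic on library size, coverage, and gDNA mass per cell. The sketch below restates that arithmetic; the constants are taken from the text, while the function names are hypothetical helpers introduced only for illustration.

```python
import math

LIBRARY_SIZE = 67_405          # sgRNAs in the mGeCKOa library (from the text)
GDNA_PG_PER_CELL = 6.6         # gDNA mass per cell assumed in the text
GDNA_UG_PER_PCR1 = 3.0         # gDNA input per 100 ul PCR#1 reaction
CONSTRUCTS_PER_PCR2 = 10_000   # one PCR#2 reaction per 10^4 constructs


def cells_for_coverage(coverage: int, library_size: int = LIBRARY_SIZE) -> int:
    """Number of cells needed to represent each sgRNA `coverage` times."""
    return coverage * library_size


def gdna_ug(cells: int, pg_per_cell: float = GDNA_PG_PER_CELL) -> float:
    """Total gDNA mass in micrograms for a given number of cells."""
    return cells * pg_per_cell / 1e6


def pcr1_reactions(total_gdna_ug: float,
                   ug_per_reaction: float = GDNA_UG_PER_PCR1) -> int:
    """Minimum number of PCR#1 reactions needed to use all of the gDNA."""
    return math.ceil(total_gdna_ug / ug_per_reaction)


def pcr2_reactions(library_size: int = LIBRARY_SIZE,
                   constructs_per_reaction: int = CONSTRUCTS_PER_PCR2) -> int:
    """Minimum number of PCR#2 reactions for a library of this size."""
    return math.ceil(library_size / constructs_per_reaction)


if __name__ == "__main__":
    print("cells at 400x coverage:", cells_for_coverage(400))
    print("gDNA from 3e7 cells (ug):", round(gdna_ug(30_000_000), 1))
    print("PCR#1 reactions for ~200 ug:", pcr1_reactions(200.0))
    print("PCR#2 reactions:", pcr2_reactions())
```

Note that 3 × 10^7 cells at 6.6 pg per cell is about 198 μg; the figure of 67 PCR#1 reactions corresponds to the rounded value of ∼200 μg used in the text.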

sgRNA Deep Sequencing Data Processing Deep sequencing data were processed for sgRNA representation using custom scripts. Briefly, sequencing reads were first demultiplexed using the 8 bp barcodes in the reverse primer, and then demultiplexed using the 8 bp barcodes in the forward primer. Demultiplexed reads were trimmed using cutadapt (Martin, 2011), leaving only the 20 bp spacer (guide) sequences. The spacer sequences were then mapped to the spacers of the designed sgRNA library using bowtie (Langmead et al., 2009). For mapping, a maximum of one mismatch was allowed in the 20 bp spacer sequence. Mapped sgRNA spacers were then quantified by counting the total number of reads. In each biological sample, any sgRNA spacer with only a single read was filtered out. The total numbers of reads for all sgRNAs in each sample were normalized. Figures were generated using the normalized read counts in R and RStudio (R project, Revolution Analytics), Matlab (Mathworks), and the Gene Set Enrichment Analysis tool (Broad Institute) (Subramanian et al., 2005).
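The counting, filtering, and normalization steps described above can be sketched as follows. This is an illustration only, not the authors' custom scripts: the naive Hamming-distance lookup stands in for bowtie, the reads-per-million scaling is an assumption (the text does not state the normalization used), and all function names are hypothetical.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_spacers(reads, library, max_mismatches=1):
    """Count reads per library spacer, allowing up to `max_mismatches`.

    Reads that match no spacer, or more than one spacer, are dropped.
    """
    library_set = set(library)
    counts = Counter()
    for read in reads:
        if read in library_set:          # exact match
            counts[read] += 1
            continue
        hits = [s for s in library
                if len(s) == len(read) and hamming(read, s) <= max_mismatches]
        if len(hits) == 1:               # unambiguous near match
            counts[hits[0]] += 1
    return counts


def filter_and_normalize(counts, scale=1_000_000):
    """Drop spacers with a single read, then scale counts to `scale` total.

    The per-million scaling is an assumed normalization for illustration.
    """
    kept = {s: c for s, c in counts.items() if c > 1}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {s: c * scale / total for s, c in kept.items()}


if __name__ == "__main__":
    # Toy 20 bp spacers and reads, purely hypothetical.
    library = ["A" * 20, "C" * 20, "G" * 20]
    reads = ["A" * 20] * 3 + ["A" * 19 + "T"] + ["C" * 20] * 2 + ["G" * 20]
    reads.append("TT" + "A" * 18)        # two mismatches: left unmapped
    counts = count_spacers(reads, library)
    print(dict(counts))
    print(filter_and_normalize(counts))
```

In the toy example, the spacer observed only once is removed before normalization, matching the order of operations given in the text.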

Individual sgRNA Design, Lentiviral Transduction, and In Vivo Transplantation sgRNAs targeting individual genes/miRs were cloned into the lentiGuide-Puro vector (Addgene 52963). Single sgRNA viruses were generated by transfection of HEK293FT cells using the same procedure as described for the mGeCKOa library virus. After harvest, viruses were functionally titered (percent puromycin survival) and used to transduce Cas9-GFP NSCLC cells at an MOI << 1 for all transductions. Cells were maintained in D10 media (with 2 μg/ml puromycin added at 24 hr post-transduction) and split 1:4 every 2-3 days.

After 3 days in culture, a portion of the cells infected with each single sgRNA virus was collected in QuickExtract buffer (Epicentre) and subjected to amplicon sequencing (Illumina) for indels, as described previously (Hsu et al., 2013). After 7 days in culture, another portion of the cells was collected in RIPA buffer for western blot of protein levels using antibodies against Nf2 or Pten (CST). sgRNAs that were efficient in generating indels or in reducing protein levels were chosen for in vivo experiments. After 7 days in culture, cells were injected subcutaneously into the right flank of Nu/Nu mice at 5 × 10^6 cells per mouse. Each gene or microRNA was targeted with three independent sgRNAs that went into validation experiments, with two mice injected per sgRNA. A total of 6 mice were used to validate each gene or miRNA. After five weeks, mice were sacrificed and primary tumors and lungs were dissected. Histology samples were collected and analyzed in a similar manner as described above for the mGeCKOa screen.

Validation and Control Minipool Pooled Cloning and Viral Transduction Validation and control minipools were synthesized in a single oligonucleotide pool using a semiconductor-based electrochemical detritylation synthesis (CustomArray) and SAFC Proligo reagents (Sigma). Each minipool (Tables S5, S6) was separately PCR amplified from the pooled synthesis, gel purified, and cloned into the lentiGuide-Puro vector using Gibson assembly, as previously described (Sanjana et al., 2014; Shalem et al., 2014). Cloned minipool libraries were deep-sequenced to verify representation (Illumina) and lentivirus was produced in the same manner as for the mGeCKOa library. As before, validation and control minipool library viruses were titered by spinfection of 3 × 10^6 Cas9-GFP KPD cells per well in a 12-well plate with different dilutions of the virus (and a no-virus control) in each well to find the viral volume needed to yield 20%–40% survival after treatment for 48 hr with 2 μg/ml puromycin (Sigma). Using these viral volumes, the same clonal Cas9-GFP KPD cell line as used for the mGeCKOa screen was transduced via spinfection with either control or validation minipool virus. Representation for both screens (validation and control minipool) was above 1,000-fold. Cells were maintained in D10 media with 2 μg/ml puromycin and split 1:4 every 2-3 days. After 7 days in culture, cell samples were taken for genomic DNA extraction or for transplantation (3 × 10^7 cells per mouse with 5 replicate mice). After five weeks, mice were sacrificed, and primary tumors and lungs were dissected. Primary tumor growth, metastasis phenotype scoring and genomic DNA extraction were performed in a similar manner as described above for the mGeCKOa primary screen.

TCGA Data Analysis For lung cancer types in TCGA (LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma), mRNA-seq and clinical data were retrieved from the TCGA data portal ( https://tcga-data.nci.nih.gov/tcga/ ) and via Firehose from the Broad Institute Genome Data Analysis Center ( http://gdac.broadinstitute.org ). Patients were grouped as having metastatic or non-metastatic tumors according to pathological TNM staging. Patient samples with a pathologic_m value of “m1” were considered to have a metastatic tumor, since metastases were detected in distant organs (beyond regional lymph nodes); patients with a value of “m0” were considered to have a non-metastatic tumor, since metastases were not detected. Cases were excluded where the status of metastases in distant organs could not be evaluated (“mx”) or where such data were missing (“NA”). Human orthologs of mouse genes were inferred using the orthologous gene table from Mouse Genome Informatics ( http://www.informatics.jax.org ). The mRNA expression levels (as normalized RNA-Seq by Expectation-Maximization, RSEM, values) for each human ortholog in patients with metastatic and non-metastatic tumors were retrieved from the mRNA-seq data matrix and compared.
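The patient grouping rule described above can be sketched as a small filter on the pathologic_m field. This is a hypothetical illustration: the patient identifiers and expression values are invented toy data, the function names are not from the original analysis, and comparing group means is only one possible reading of "compared", since the text does not name the statistic used.

```python
from statistics import mean


def group_by_metastasis(clinical):
    """Split patients into metastatic (m1) and non-metastatic (m0) groups.

    `clinical` maps patient ID -> pathologic_m value. Patients with "mx",
    "NA", or a missing value are excluded, as described in the text.
    """
    groups = {"metastatic": [], "non_metastatic": []}
    for patient_id, value in clinical.items():
        label = (value or "NA").strip().lower()
        if label == "m1":
            groups["metastatic"].append(patient_id)
        elif label == "m0":
            groups["non_metastatic"].append(patient_id)
        # any other value ("mx", "na", ...) is left out of both groups
    return groups


def mean_expression(expression, patients):
    """Mean RSEM value over the listed patients, or None if none have data."""
    values = [expression[p] for p in patients if p in expression]
    return mean(values) if values else None


if __name__ == "__main__":
    # Toy data, purely hypothetical.
    clinical = {"P1": "m0", "P2": "m1", "P3": "mx", "P4": None, "P5": "m0"}
    rsem = {"P1": 120.0, "P2": 40.0, "P5": 100.0}
    groups = group_by_metastasis(clinical)
    for name, patients in groups.items():
        print(name, patients, mean_expression(rsem, patients))
```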

CTC Chip Design and Capture We designed a microfluidic device for CTC capture similar to one previously described (Chung et al., 2013). This device has two functions: magnetic depletion of leukocytes, and capture of CTCs based on size (5-30 μm). The microfluidic system was fabricated with standard soft lithography and bonded to a glass substrate. Each CTC chip contains a large number of capturing sites (> 10,000). Peripheral blood was collected by terminal cardiac puncture from mice that had been injected with cells five weeks prior. The blood samples were depleted of red blood cells (RBC) using red blood cell lysis buffer (ACK Lysis Buffer, GIBCO). After RBC lysis, the remaining cells, composed mostly of CTCs and leukocytes, were fixed with 2% formaldehyde. Next, each sample was incubated for 15 min at room temperature with 20 μl of CD45-specific magnetic beads (CD45 Microbeads, Miltenyi Biotec) to label leukocytes, and then run through the chip at a flow rate of 2-5 ml/hour. The magnetically labeled leukocytes were depleted in the magnetic depletion region and CTCs were captured in the size-based capture region. The captured CTCs were imaged under a fluorescence microscope, with 15-17 images tiling the whole CTC capture region for each sample. GFP-positive cells were counted across all images for each chip using custom Matlab scripts.

Standard Molecular Biology Routine DNA cloning, nucleic acid purification, and western blotting were performed using standard molecular biology protocols with commercially available kits unless otherwise noted. Antibodies used for western blot analysis: Anti-FLAG (Sigma #F1804, 1:3000), anti-Cas9 (Diagenode #C15200203, 1:1000), anti-Nf2 (Cell Signaling Technologies #6995, 1:1000), anti-Pten (Cell Signaling Technologies #9559, 1:1000).