Construction of strains

Strain Sc.A1 was made by replacing the TRP1 gene in BY4741 with the pcbAB-npgA segment from the pESC-npgA-pcbAB plasmid from a previous study12. A URA3 gene from K. lactis was integrated upstream of the pcbAB gene to allow genomic integration, and media without uracil was used to enable comparison of the ACV production of this strain with that of a BY4741 strain transformed with the ura-marked pESC-npgA-pcbAB plasmid. A full genetic map of the altered TRP1 locus is provided as an annotated GenBank file in Supplementary Data 1. The PEX5 deletion strain was constructed by CRISPR-enhanced recombination. Linear fragments encoding Cas9, a gRNA retargeted to PEX5 and an overlap extension PCR product encoding a whole-CDS deletion of PEX5 were co-transformed into BY4741 following the protocol outlined at benchling.com/pub/ellis-crispr-tools. Oligonucleotides used to retarget the gRNA and generate the deletion template PCR product are listed in the Supplementary Information. To change the native P. chrysogenum peroxisome targeting sequences (PTS1) to those of S. cerevisiae for the pclA and penDE genes, the native C-terminal tripeptides (SKI for pclA and ARL for penDE) were both changed to the S. cerevisiae PTS1 tripeptide SKL.
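At the protein level, the PTS1 retargeting described above amounts to swapping the last three residues. The following is a minimal illustrative sketch in Python; the helper name `retarget_pts1` and the example sequences are hypothetical placeholders, not the real PclA or PenDE sequences.

```python
def retarget_pts1(protein: str, native: str, new: str = "SKL") -> str:
    """Replace the C-terminal PTS1 tripeptide of a protein sequence."""
    if len(native) != 3 or len(new) != 3:
        raise ValueError("PTS1 signals are tripeptides")
    if not protein.endswith(native):
        raise ValueError(f"sequence does not end in {native}")
    return protein[:-3] + new


# Placeholder sequences for illustration only (not real gene products).
pcla_like = "MAAAAGGSKI"
pende_like = "MLLLLTTARL"

print(retarget_pts1(pcla_like, "SKI"))
print(retarget_pts1(pende_like, "ARL"))
```

The check on the native tripeptide guards against applying the substitution to a sequence that does not carry the expected signal.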

All other strains used in all experiments were constructed by transforming plasmids into BY4741 or into the Δpex5 strain. These are specified in Supplementary Table 4. Annotated GenBank files of all plasmids are provided in Supplementary Data 1.

Growth of strains for ACV and penicillin production

For all ACV and penicillin producing experiments, cultures were prepared in the following manner. After initial construction, strains were stored in 25% glycerol stocks at −80 °C. For recovery of strains from glycerol stocks, strains were streaked onto the appropriate selective media agar plates and incubated at 30 °C for 2–3 days. Single colonies were picked using a pipette tip and used to inoculate 4 ml overnight cultures in synthetic complete media minus the appropriate amino acids for selective pressure, with either glucose or galactose as the carbon source. There were no secondary precultures. For plate-based assays, cells were grown overnight at 700 r.p.m. at 30 °C. For 50 ml Falcon tube-based assays, cells were grown at 225 r.p.m. at 30 °C. Overnight cultures were then back-diluted into production media and grown at 20 °C (216 r.p.m.) for 20 h (for plate-based assays) or until the OD600 reached between 0.6 and 0.8 (for 50 ml Falcon tube assays). Supplementary Table 5 details the composition of production media for different experiments.

Fluorescence microscopy

Microscopy for Supplementary Fig. 1 was carried out with a Nikon Eclipse Ti, using the NIS Elements AR software. A ×60 objective was used. Yeast cells were fixed on slides for visualization. The excitation wavelengths for detection of Venus, mRuby2 and mTurquoise2 fluorescence were 535, 590 and 535 nm, respectively.

Preparation of standards and samples for LCMS

For all LCMS experiments, standards were prepared as follows. ACV standards were prepared by dissolving ACV (BACHEM H-4204) in water to a concentration of 10 ng μl−1 and making three 10-fold dilutions. This gave four standards with concentrations of 10 ng μl−1, 1 ng μl−1, 100 pg μl−1 and 10 pg μl−1. Benzylpenicillin standards were prepared by dissolving the sodium salt of penicillin G (Sigma P3032) in water. The same concentrations were used for benzylpenicillin as were used for ACV.
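The dilution series above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `serial_dilution` is hypothetical, and concentrations are expressed in pg μl−1 so that all four standards are whole numbers.

```python
def serial_dilution(stock: float, fold: float, n_dilutions: int) -> list:
    """Return the stock concentration followed by each successive dilution."""
    concentrations = [stock]
    for _ in range(n_dilutions):
        concentrations.append(concentrations[-1] / fold)
    return concentrations


# 10 ng/ul stock = 10,000 pg/ul, followed by three 10-fold dilutions.
standards_pg_per_ul = serial_dilution(10_000, 10, 3)
print(standards_pg_per_ul)
```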

Cellular extracts for LCMS for the data in Fig. 1 were prepared as follows: 30 ml of cell culture was collected at an OD600 of 0.6. The cell culture was centrifuged at 7,000g for 10 min, and the supernatant was either kept for LCMS (as in Fig. 1d) or discarded. The cell pellet was resuspended in 100 μl methanol. A volume of 50 μl of the resuspension was transferred to a microcentrifuge tube with 25 μl of glass beads (Sigma G8772-100G) on ice. The tube was then vortexed for 30 s and then placed on ice for 30 s, and these two steps were repeated three times (for a total of four sets of vortexing and incubation on ice). The tube was then centrifuged at 12,000g for 30 min, and 40 μl of supernatant was aliquoted to a separate tube for LCMS measurement. Supernatants for LCMS from cultures grown in 96-well plates in Fig. 2 were obtained by centrifuging plates at 3,000g for 30 min.

Liquid chromatography mass spectrometry

An LC/MS/MS method was developed for the measurement of ACV and benzylpenicillin, using an Agilent 1290 LC and 6550 quadrupole time-of-flight (Q-ToF) mass spectrometer with electrospray ionization (Santa Clara, CA). The LC column used was an Agilent Zorbax Extend C-18, 2.1 × 50 mm and 1.8 μm particle size. The LC buffers were 0.1% formic acid in water and 0.1% formic acid in acetonitrile (v/v).

The gradient elution method is detailed in Supplementary Table 6. Quantification was based on the LC retention times of standards and the area of the accurately measured diagnostic fragment ion for each molecule (Supplementary Table 6). The protonated molecules of each analyte, [M+H]+, were targeted and subjected to collision-induced dissociation (collision energy 16 eV), with product ions accumulated throughout the analysis. Solutions of benzylpenicillin and ACV standards in water were used to generate calibration curves.

The linear range of the method was determined by injecting standards over a range of concentrations. The lower limit of detection was taken to be the amount of analyte that produced a peak with a signal-to-noise ratio of 3:1. The lower limit of quantification was taken to be the amount of analyte that produced a signal-to-noise ratio of 10:1. The lower limit of detection and lower limit of quantification for benzylpenicillin were found to be on-column injections of 5 and 20 pg, respectively. Further specifications are found in Supplementary Table 6. All mass spectra are included as Supplementary Data 2.
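The two signal-to-noise thresholds can be expressed as a simple classification rule. The sketch below is illustrative only: the function name `classify_peak` is hypothetical, and how the noise level is estimated from a chromatogram is left as an assumption, since the text does not specify it.

```python
LOD_SNR = 3.0   # signal-to-noise ratio defining the lower limit of detection
LOQ_SNR = 10.0  # signal-to-noise ratio defining the lower limit of quantification


def classify_peak(signal: float, noise: float) -> str:
    """Classify a peak by its signal-to-noise ratio."""
    if noise <= 0:
        raise ValueError("noise must be positive")
    snr = signal / noise
    if snr >= LOQ_SNR:
        return "quantifiable"
    if snr >= LOD_SNR:
        return "detectable"
    return "not detected"


# Hypothetical peak heights against a hypothetical noise level of 10 counts.
for height in (20, 50, 120):
    print(height, classify_peak(height, 10))
```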

Calculation of ACV and benzylpenicillin yield from LCMS data

For both ACV and benzylpenicillin, pure chemical standards were run at the following concentrations: 10 ng ml−1, 100 ng ml−1, 10 μg ml−1, 100 μg ml−1. The corresponding LCMS counts for these standards were plotted against the concentrations of the standards, and the linear range of the resulting plot was used to construct a line of best fit in Excel. The equation of this line was used to obtain values for the yield in ng ml−1 of ACV and benzylpenicillin from experimental samples based on the LCMS counts for these molecules.
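The same calculation can be reproduced outside a spreadsheet. The following is a minimal sketch of an ordinary least-squares calibration line and its inversion, written in plain Python; the function names and the example counts are hypothetical and chosen only to illustrate the arithmetic, not taken from the measured data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def counts_to_concentration(counts, slope, intercept):
    """Invert the calibration line to convert LCMS counts to ng/ml."""
    return (counts - intercept) / slope


# Invented standards within an assumed linear range (ng/ml versus counts).
standard_conc = [10.0, 100.0, 1000.0]
standard_counts = [25.0, 205.0, 2005.0]

slope, intercept = fit_line(standard_conc, standard_counts)
sample_conc = counts_to_concentration(405.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_conc:.1f} ng/ml")
```

Only standards that fall within the linear range should be passed to the fit, mirroring the restriction described in the text.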

Promoter screens for optimizing ACV to penicillin conversion

The assembly of multigene (pcbC, pclA, penDE) plasmids with ten randomized promoters (Supplementary Table 7) was split into two stages: assembly of single-gene constructs, then assembly of multigene constructs. For single-gene construct assembly, an equimolar mix of all ten promoters was made with a final concentration of 50 fM (referred to as ‘promoter mix’, Supplementary Table 9). This was used as a type 2 plasmid according to the Yeast Toolkit specification17. Then, Golden Gate reactions were set up with the following parts according to the Yeast Toolkit cassette plasmid Golden Gate assembly protocol (Supplementary Table 9).

Each of the three reactions was transformed into E. coli, and for each of the three transformation plates, transformant colonies were mixed together into a single overnight culture. From each of the three resulting cultures a plasmid library was prepared, of which an aliquot was used to construct a pooled sample for nanopore sequencing (see nanopore sequencing section). The three resulting single-gene plasmid libraries were used to set up a single multigene Golden Gate reaction according to the Yeast Toolkit protocol (Supplementary Table 9).

This Golden Gate reaction was transformed into E. coli, and all transformant colonies were mixed together into a single overnight culture. A multigene plasmid library was prepared from this overnight culture. Part of this plasmid library was prepared for nanopore sequencing (see nanopore sequencing section), while 4 μg was used to transform into S. cerevisiae strain Sc.A2. The resulting transformants were screened by LCMS for the production of benzylpenicillin (Supplementary Table 2) and the promoter regions of the multigene plasmids from producer strains were identified by Sanger sequencing.
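The size of the design space explored by this screen follows from the assembly scheme: three genes, each preceded by one of ten promoters. A short illustrative sketch, assuming every promoter can be paired with every gene and using placeholder promoter names (the actual promoters are listed in Supplementary Table 7):

```python
from itertools import product

genes = ("pcbC", "pclA", "penDE")
promoters = [f"promoter_{i}" for i in range(1, 11)]  # placeholder names

# One promoter choice per gene, in gene order.
designs = list(product(promoters, repeat=len(genes)))
print(len(designs))  # 1000

# Example: pair the first design's promoters with their genes.
print(dict(zip(genes, designs[0])))
```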

A second promoter screen was carried out to test the suitability of the S. pyogenes growth inhibition assay for identifying strains with improved benzylpenicillin yield. Strains were constructed in analogous fashion to that described above, but with different promoters (Supplementary Tables 10 and 11).

Nanopore sequencing of library construction

To enrich for the penicillin pathway assembly DNA and remove assembly vector backbone DNA, the multigene assembly library was digested with EcoRI and AlwNI. Restriction digest products ranging from 5,616 to 6,117 bp were isolated by agarose gel electrophoresis and purified using a QIAquick gel extraction kit (Qiagen). The single pathway gene assembly libraries were similarly enriched by digestion with BsmBI and AlwNI. Restriction digest products ranging from 2,062 to 2,229, 2,549 to 2,716 and 1,885 to 2,052 bp for the pcbC, pclA and penDE assemblies, respectively, were purified.

Enriched assembly DNA for the multigene and single-gene assemblies was quantified on a Qubit 2.0 fluorometer (Thermo Fisher Scientific) using a Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific). The four samples were combined to give an equimolar mix (assuming a molecular weight for each assembly based on the mean promoter length) with a total DNA content of 2.6 μg in 45 μl dH2O.
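For an equimolar mix of double-stranded fragments, the mass contributed by each sample scales with its length. The sketch below illustrates that arithmetic under stated assumptions: the midpoint of each gel-purified size range is used as a stand-in for the mean assembly length (the text bases this on the mean promoter length, which is not reproduced here), and molecular weight is taken to be proportional to length.

```python
# Size ranges (bp) of the gel-purified fragments, as given in the text.
SIZE_RANGES_BP = {
    "multigene": (5616, 6117),
    "pcbC": (2062, 2229),
    "pclA": (2549, 2716),
    "penDE": (1885, 2052),
}
TOTAL_MASS_UG = 2.6


def equimolar_masses(size_ranges, total_mass):
    """Split total_mass so that every fragment is present in equal moles."""
    # Assumption: midpoint of each range approximates the mean length.
    mean_len = {name: (lo + hi) / 2 for name, (lo, hi) in size_ranges.items()}
    total_len = sum(mean_len.values())
    return {name: total_mass * length / total_len
            for name, length in mean_len.items()}


masses_ug = equimolar_masses(SIZE_RANGES_BP, TOTAL_MASS_UG)
for name, mass in masses_ug.items():
    print(f"{name}: {mass:.2f} ug")
```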

DNA underwent end repair using NEBNext FFPE DNA Repair Mix (M6630, New England Biolabs) according to the manufacturer’s instructions. The repaired DNA was recovered using Agencourt AMPure XP beads (A63880, Beckman Coulter), washed twice in 200 μl 70% ethanol and eluted in 46 μl dH2O. DNA was then dA-tailed using NEBNext Ultra II End Repair/dA-Tailing Module (E7546, New England Biolabs) according to the manufacturer’s instructions and recovered using Agencourt AMPure XP beads as before, eluting in 30 μl dH2O.

The dA-tailed library DNA was then processed using Blunt/TA Ligase Master Mix (M0367, New England Biolabs) and a Nanopore Sequencing Kit (SQK-NSK007, Oxford Nanopore Technologies) according to the manufacturer’s instructions to ligate adaptors and tethers to the library. Dynabeads MyOne Streptavidin C1 beads (50 μl) were washed twice in buffer BBB (Nanopore Sequencing Kit) and then resuspended in 100 μl BBB. These beads were then added to the processed DNA sample and mixed for 5 min at room temperature. Beads were washed twice with 150 μl BBB before eluting the sample in 25 μl ELB (Nanopore Sequencing Kit). The library was quantified by Qubit as before and yielded a total of 253 ng.

Nanopore sequencing

A fresh MinION R7 Flow Cell Mk I (FLO-MIN104, Oxford Nanopore Technologies) was loaded into a MinION Mk I (MIN-MAP002, Oxford Nanopore Technologies) and primed using a Nanopore Sequencing Kit according to the manufacturer’s instructions. The sequencing mix was generated by combining 75 μl RNB, 65 μl NFW and 4 μl Fuel Mix (Nanopore Sequencing Kit) before adding 6 μl of the processed DNA library. This sequencing mix was loaded into the flow cell and sequenced using the 48 h sequencing script on MinKNOW (Oxford Nanopore Technologies). After 18 h, the script was stopped and a fresh sequencing mix was prepared and loaded into the flow cell. The 48 h sequencing script was then restarted. This reloading process was repeated after a further 4.5 h. The sequencing script was stopped once read acquisition had slowed to less than 1 successful read in a 5 min period.

Analysis of nanopore sequencing data

Oxford Nanopore’s cloud-based Metrichor application ‘2D Basecalling for SQK-MAP006 v1.69’ was used to basecall data from a MinION run that used R7.3 chemistry. Poretools30 (https://github.com/arq5x/poretools) was used to extract sequence files and the program lastal v658 (http://last.cbrc.jp/) was used to align the 2D reads to a database of all the potential promoters and all combined CDS+terminators using options -s2 -T0 -Q0 -a1 -fTAB -e50*. To remove reads corresponding to other DNA sequences present, the database was also populated with sequences for AmpRTerm_AmpR_AmpRProm (part of the single assembly plasmid backbone), His3Prom_His3Term_2Micron_KanRTerm_KanR_KanRProm (part of the multiple assembly plasmid backbone), ColE1 (part of both the single and multiple assembly plasmid backbone), and the lambda phage whole genome.
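For reproducibility, the alignment step can be written down as an explicit command line. The sketch below only assembles the argument list from the options quoted above; the database name and FASTA file name are hypothetical placeholders, and the database is assumed to have been built beforehand with `lastdb`.

```python
# Options exactly as quoted in the text.
LASTAL_OPTIONS = ["-s2", "-T0", "-Q0", "-a1", "-fTAB", "-e50"]


def build_lastal_command(database: str, reads_fasta: str) -> list:
    """Assemble a lastal invocation: options, then database, then query file."""
    return ["lastal", *LASTAL_OPTIONS, database, reads_fasta]


# Placeholder names for the parts database and the extracted 2D reads.
cmd = build_lastal_command("assembly_parts_db", "reads_2d.fasta")
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run`, with standard output redirected to a file, since lastal writes its alignments to standard output.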

To build up a picture of which part of a plasmid each read represented, a custom-built script ordered alignments first by read, then by read coordinates. It then identified reads as originating from a particular plasmid by looking at plasmid regions in the database that each read had aligned to. These identified reads were required to be a similar length (within 15%) to the sum of lengths of regions of plasmids that they were aligned to. The script identified digested multiple gene assemblies, digested single gene assemblies and, due to reduced function of BsmBI, non-digested single assemblies. To be identified as a digested single gene assembly, the read was also required to start and finish within 50 bp of the start and end of the first and final regions respectively. This minimized the chance of misidentifying a multigene assembly as a single-gene assembly. The promoters at each position within these identified reads were recorded.
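The filtering logic described above can be sketched as follows. This is not the custom script provided in Supplementary Data 3; it is a simplified illustration in which the `Alignment` record, the function names and the region names are all hypothetical, the 15% tolerance is applied relative to the summed region lengths, and the 50 bp end criterion is interpreted in database-region coordinates.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    read_id: str
    read_len: int
    read_start: int    # start of the aligned segment on the read
    region: str        # name of the database region that was hit
    region_len: int
    region_start: int  # start of the aligned segment on the region
    region_end: int    # end of the aligned segment on the region


def group_by_read(alignments):
    """Order alignments by read, then by read coordinate, and group them."""
    by_read = defaultdict(list)
    for aln in sorted(alignments, key=lambda a: (a.read_id, a.read_start)):
        by_read[aln.read_id].append(aln)
    return dict(by_read)


def length_consistent(hits, tolerance=0.15):
    """Read length must be within 15% of the summed lengths of its regions."""
    expected = sum(h.region_len for h in hits)
    return abs(hits[0].read_len - expected) <= tolerance * expected


def ends_complete(hits, max_offset=50):
    """Read must start and finish within 50 bp of the outer region ends."""
    first, last = hits[0], hits[-1]
    return (first.region_start <= max_offset
            and last.region_len - last.region_end <= max_offset)


def promoters_in_order(hits, promoter_names):
    """Record the promoter found at each position along the read."""
    return [h.region for h in hits if h.region in promoter_names]


# Toy data: one plausible single-gene read and one truncated read.
alignments = [
    Alignment("r1", 2100, 700, "pcbC_CDS_term", 1400, 0, 1400),
    Alignment("r1", 2100, 0, "promoter_A", 700, 0, 700),
    Alignment("r2", 1000, 0, "promoter_A", 700, 0, 700),
    Alignment("r2", 1000, 700, "pcbC_CDS_term", 1400, 0, 300),
]
reads = group_by_read(alignments)
for read_id, hits in reads.items():
    keep = length_consistent(hits) and ends_complete(hits)
    print(read_id, keep, promoters_in_order(hits, {"promoter_A"}))
```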

*[-s2 use both query strands. -T0 local alignment. -Q0 use fasta as input. -a1 gap existence cost of 1. -fTAB tabular output. -e50 minimum gapped alignment score of 50.]

S. pyogenes growth inhibition assays

An overnight culture of S. pyogenes (H584, M1 type) was grown for 24 h at 37 °C (5% CO2) in Todd Hewitt Broth. This culture was diluted in 10× Todd Hewitt Broth to an OD600 of 0.2. In separate wells of an optically transparent 96-well plate (VWR 3596), 10 μl of this solution was added to 90 μl of either a known concentration of benzylpenicillin dissolved in 0.25 mM phenylacetic acid or spent culture media from Sc.P2, Sc.P2x or other potential benzylpenicillin-producing strains grown in production conditions. Sc.P2x is a variant of Sc.P2 known not to produce benzylpenicillin due to an inactive pcbAB caused by a mutation in the coding region of the gene’s terminal module. OD600 values of the bacterial cultures were then measured again after overnight growth at 37 °C. The percentage growth inhibition caused by spent culture media for each strain was calculated by dividing the overnight fold-change in OD600 in that spent culture media by the overnight fold-change in OD600 in the control media (from Sc.P2x cells), expressing this ratio as a percentage and subtracting it from 100%.
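The growth inhibition calculation can be written out explicitly. The following is a minimal sketch with hypothetical function names and invented OD600 readings, used only to show the arithmetic:

```python
def fold_change(od_start: float, od_end: float) -> float:
    """Overnight fold-change in OD600."""
    return od_end / od_start


def percent_growth_inhibition(sample_fc: float, control_fc: float) -> float:
    """100% minus the sample fold-change as a percentage of the control."""
    return 100.0 * (1.0 - sample_fc / control_fc)


# Invented readings: control is spent media from the non-producing Sc.P2x.
control_fc = fold_change(od_start=0.02, od_end=0.10)
sample_fc = fold_change(od_start=0.02, od_end=0.05)

inhibition = percent_growth_inhibition(sample_fc, control_fc)
print(f"{inhibition:.1f}% growth inhibition")
```

A sample that grows as well as the control gives 0% inhibition under this definition, and a sample with a smaller fold-change than the control gives a positive value.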

Code availability

Custom code for assigning nanopore sequencing reads to promoters in the first promoter screen (Fig. 2) is provided as part of Supplementary Data 3.

Data availability

Plasmid construct sequences are provided in Supplementary Data 1, LCMS spectra are available in Supplementary Data 2 and nanopore sequencing files are provided as Supplementary Data 3.