The A. annua cytochrome b 5 cDNA sequence was identified from a trichome expressed sequence tag library (NCBI accession 35608) by searching for sequence similarity to Crepis alpina cytochrome b 5 type 11 (ref. 23). Dominant selection markers for yeast strain engineering were d-serine24, nourseothricin25 and hygromycin B25.

Yeast strain engineering

S. cerevisiae codon-optimized synthetic genes of A. annua ADS, CYP71AV1 and CPR1 have been previously described3. Codon-optimized synthetic genes for A. annua CYB5 (GenBank accession JQ582841), A. annua ALDH1 (JQ609276) and A. annua ADH1 (JQ582842) were synthesized by DNA 2.0 (https://www.dna20.com/) or Biosearch Technologies.

Construction of genome integration cassettes

The oligonucleotide primers used in this study are listed in Supplementary Table 9.

dsdA-P CTR3 -ERG9

Replacement of the MET3 promoter with the CTR3 promoter in Y301 and Y592 was accomplished as follows. The dsdA gene (encoding d-serine deaminase) was amplified from pAM577 (containing the promoter and terminator of Kluyveromyces lactis TEF1) by PCR with oligonucleotides PW91-031-CPK275-G and DE_PW91-027-CPK262-G. PCR amplification of the wild-type CTR3 promoter from positions −1 to −734 was performed with oligonucleotides PW61-104-CPK116-G and DE_PW91-027-CPK263-G using CEN.PK2-1C (ref. 3) genomic DNA as the template. These two PCR products shared a 44-base-pair (bp) overlap at the 3′ end of the promoter and the 5′ end of the gene. For the secondary PCR, 25 ng each of the purified CTR3 promoter (−1 to −734) and dsdA PCR products were used as the DNA templates and PCR amplified with oligonucleotides PW91-031-CPK275-G and PW61-104-CPK116-G to give P CTR3(−1 to −734) -dsdA.

gal1/10/7::natA_P GAL3 -CPR1-T CYC1

Targeted integration of the P GAL3 -CPR1 expression cassette in Y657 at the GAL7 locus was accomplished as follows. PCR amplification of the wild-type GAL7 locus from positions 30 to 1021 was performed with oligonucleotides PW91-014-CPK236-G and PW-91-079-CPK384-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the CPR1 ORF and CYC1 terminator was performed with oligonucleotides PW-91-079-CPK385-G and PW-91-079-CPK392-G using plasmid pAM322 (ref. 3) as the template. PCR of the wild-type GAL3 promoter from positions −1 to −660 was performed with oligonucleotides PW-91-079-CPK393-G and PW-91-079-CPK394-G using CEN.PK2-1C genomic DNA as the template. PCR of the natA marker (nourseothricin resistance) was performed with oligonucleotides PW-91-079-CPK383-G and PW-91-079-CPK395-G. Each of the DNA elements from the first round of PCR was designed to share a 20–30-bp overlap with the adjacent element, by using non-templated tails on the oligonucleotides. For the secondary PCR amplification, 25 ng each of the purified GAL7, CPR1-CYC1, GAL3 promoter and natA PCR products were used as the DNA template and PCR amplified with oligonucleotides PW91-014-CPK236-G and PW-91-079-CPK383-G to give GAL7(30 to 1021)_P GAL3 (−1 to −660) -CPR1-T CYC1 _natA.
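The two-round assembly described above relies on each first-round product sharing a terminal overlap with its neighbour. The following is a minimal sketch of that joining logic only, not the authors' primer design: the sequences, the names OVERLAP_1 and OVERLAP_2, and the functions stitch and assemble are hypothetical, and the 20-bp minimum is taken from the lower end of the 20–30-bp range stated in the text.

```python
def stitch(upstream: str, downstream: str, min_overlap: int = 20) -> str:
    """Join two fragments through their longest shared end of >= min_overlap bases."""
    longest = min(len(upstream), len(downstream))
    for k in range(longest, min_overlap - 1, -1):
        # The 3' end of the upstream fragment must match the 5' end of the next one.
        if upstream.endswith(downstream[:k]):
            return upstream + downstream[k:]
    raise ValueError("fragments do not share a sufficient terminal overlap")


def assemble(fragments, min_overlap: int = 20) -> str:
    """Stitch an ordered list of fragments into one product."""
    product = fragments[0]
    for fragment in fragments[1:]:
        product = stitch(product, fragment, min_overlap)
    return product


# Toy sequences (hypothetical, not taken from any construct in this study).
OVERLAP_1 = "ACGTTGCAAGCTTGGATCCA"  # 20-bp tail shared by fragments 1 and 2
OVERLAP_2 = "GGCATCGATTACGCTAGCTG"  # 20-bp tail shared by fragments 2 and 3
fragments = [
    "A" * 30 + OVERLAP_1,
    OVERLAP_1 + "T" * 30 + OVERLAP_2,
    OVERLAP_2 + "C" * 30,
]
assembled = assemble(fragments)
print(len(assembled))
```

In the sketch, a fragment pair without a shared end raises an error rather than being silently concatenated, which mirrors the requirement that every junction be designed into the oligonucleotide tails.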

leu2::hisMXΔ::kanA_P GAL7 -CYB5-T CYC1

Targeted replacement of the leu2::hisMX locus in Y657 was accomplished as follows. PCR amplification of the wild-type ERG19 locus from positions 489 to 1341 was performed with oligonucleotides AM/PW-91-093-CPK461-G and AM/PW-91-093-CPK462-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the kanA marker (G418 resistance) was performed with oligonucleotides AM/PW-91-093-CPK460-G and AM-125-50-CPK514-G using pAM575 (containing the promoter and terminator of K. lactis TEF1) as the template. PCR amplification of the GAL7 promoter from positions −1 to −725 was performed with oligonucleotides AM-125-50-CPK513-G and AT-126-103-CPK593-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the S. cerevisiae codon-optimized A. annua CYB5 ORF was performed with oligonucleotides AT-126-103-CPK592-G and PW-91-093-CPK426-G. PCR amplification of the CYC1 terminator (T CYC1 ) from positions 331 to 830 was performed with oligonucleotides PW-91-093-CPK425-G and AT-126-103-CPK595-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the LEU2 locus from positions 1 to 450 was performed with oligonucleotides AT-126-103-CPK594-G and AM/PW-91-093-CPK457-G using CEN.PK2-1C genomic DNA as the template. For the secondary PCR, 25 ng each of the purified ERG19(489 to 1341), kanA, P GAL7(−1 to −725) , A. annua CYB5, T CYC1(331 to 830) , and LEU2(1 to 450) PCR products were used as the DNA template and PCR amplified with oligonucleotides AM/PW-91-093-CPK462-G and AM/PW-91-093-CPK457-G to give ERG19(489 to 1341)_kanA_P GAL7(−1 to −725) -CYB5-T CYC1(331 to 830) _LEU2(1 to 450).

ndt80Δ::P TDH1 -HEM1_hphA_P PGK1 -CTT1

Targeted replacement of the NDT80 locus in Y1368 with constitutively expressed CTT1 and HEM1 was accomplished as follows. PCR amplification of the wild-type NDT80 locus from positions −187 to −951 was performed with oligonucleotides PW-091-144-CPK640-G and PW-091-144-CPK654-G using CEN.PK2-1C genomic DNA as the template. PCR of the wild-type HEM1 locus from positions 1 to 1947 was performed with oligonucleotides PW-091-144-CPK655-G and PW-091-144-CPK656-G using CEN.PK2-1C genomic DNA as the template. PCR of the TDH1 promoter (P TDH1 ) from positions −1 to −577 was performed with oligonucleotides PW-091-144-CPK657-G and PW-091-144-CPK658-G using CEN.PK2-1C genomic DNA as the template. PCR of the hphA marker was performed with oligonucleotides PW-091-144-CPK659-G and PW-091-144-CPK643-G using BY4710 (ref. 4) genomic DNA as the template. PCR of the PGK1 promoter (P PGK1 ) from positions −1 to −623 was performed with oligonucleotides PW-091-144-CPK644-G and PW-091-144-CPK645-G using CEN.PK2-1C genomic DNA as the template. PCR of the CTT1 locus from positions 1 to 2000 was performed with oligonucleotides PW-091-144-CPK646-G and PW-091-144-CPK647-G using CEN.PK2-1C genomic DNA as the template. PCR of the wild-type NDT80 locus from positions 1684 to 2470 was performed with oligonucleotides PW-091-144-CPK648-G and PW-091-144-CPK649-G using CEN.PK2-1C genomic DNA as the template. For the secondary PCR, 25 ng each of the purified NDT80(−187 to −951), HEM1(1 to 1947), P TDH1(−1 to −577) , hphA, P PGK1(−1 to −623) , CTT1(1 to 2000) and NDT80(1684 to 2470) PCR products were used as the DNA template and PCR amplified with oligonucleotides PW-091-144-CPK640-G and PW-091-144-CPK649-G to give NDT80(−187 to −951)_P TDH1(−1 to −577) -HEM1(1 to 1947)_ hphA_ P PGK1(−1 to −623) - CTT1(1 to 2000)_ NDT80(1684 to 2470).

ndt80::hphA_P GAL7 -ALDH1-T TDH1

Targeted replacement of the NDT80 locus with P GAL7 -ALDH1 in Y973 was accomplished as follows. PCR amplification of the wild-type NDT80 locus from positions −187 to −951 was performed with oligonucleotides PW-091-144-CPK640-G and PW-091-144-CPK641-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the hphA marker was performed with oligonucleotides PW-091-144-CPK642-G and AM-125-50-CPK514-G using pAM578 (containing the promoter and terminator of K. lactis TEF1) as the template. PCR amplification of the GAL7 promoter (P GAL7 ) from positions −1 to −725 was performed with oligonucleotides AM-125-50-CPK513-G and AM-125-107-CPK756-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the S. cerevisiae codon-optimized ALDH1 ORF was performed with oligonucleotides AM-125-107-CPK754-G and AM-125-107-CPK755-G. PCR amplification of the TDH1 terminator (T TDH1 ) from positions 1000 to 1997 was performed with oligonucleotides AM-125-107-752G and AM-125-107-CPK753-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the wild-type NDT80 locus from positions 1684 to 2470 was performed with oligonucleotides AM-125-107-CPK751-G and PW-091-144-CPK649-G using CEN.PK2-1C genomic DNA as the template. For the secondary PCR amplification, 25 ng each of the purified NDT80(−187 to −951), hphA, P GAL7(−1 to −725) , A. annua ALDH1, T TDH1(1000 to 1997) and NDT80(1684 to 2470) PCR products were used as the DNA templates and PCR amplified with oligonucleotides PW-091-144-CPK640-G and PW-091-144-CPK649-G to give NDT80(−187 to −951)_ hphA_P GAL7(−1 to −725) -ALDH1-T TDH1(1000 to 1997) _ NDT80(1684 to 2470).

his3::hphA_P GAL7 -ALDH1-T TDH1

Targeted replacement of the his3::hisMX locus with P GAL7 -ALDH1 to create Y1368 was accomplished as follows. PCR amplification of the wild-type HIS3 locus from positions −32 to −630 was performed with oligonucleotides PW-91-129-CPK543-G and PW-91-129-CPK544-G using BY4710 genomic DNA as the template. PCR of the hphA marker was performed with oligonucleotides PW-91-129-CPK545-G and AM-125-50-CPK514-G using pAM578 as the template. PCR amplification of the GAL7 promoter (P GAL7 ) from positions −1 to −725 was performed with oligonucleotides AM-125-50-CPK513-G and AM-125-107-CPK756-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the A. annua ALDH1 ORF was performed with oligonucleotides AM-125-107-CPK754-G and AM-125-107-CPK755-G using a synthetic, S. cerevisiae codon-optimized template (DNA2.0). PCR amplification of the TDH1 terminator (T TDH1 ) from positions 1000 to 1997 was performed with oligonucleotides PW-191-015-CPK859-G and AM-125-107-CPK753-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the wild-type ERG12 locus from positions 883 to 1456 was performed with oligonucleotides PW-191-015-CPK860-G and PW-91-129-CPK550-G using CEN.PK2-1C genomic DNA as the template. For the secondary PCR amplification, 25 ng each of the purified HIS3(−32 to −630), hphA, P GAL7(−1 to −725) , A. annua ALDH1, T TDH1(1000 to 1997) and ERG12(883 to 1456) PCR products were used as the DNA templates and PCR amplified with oligonucleotides PW-91-129-CPK543-G and PW-91-129-CPK550-G to give HIS3(−32 to −630)_ hphA_P GAL7 (−1 to −725) -ALDH1-T TDH1(1000 to 1997) _ERG12(883 to 1456).

natAΔ::URA3_P GAL7 -ADH1-T TDH1

Targeted replacement of the natA locus with P GAL7 -ADH1 for creation of Y1283 was accomplished as follows. PCR amplification of the wild-type GAL3 promoter (P GAL3 ) from positions −77 to −660 was performed with oligonucleotides PW-191-015-CPK866-G and PW-191-015-CPK867-G using BY4710 genomic DNA as the template. PCR amplification of the URA3 locus from position −226 to 884 was performed with oligonucleotides PW-191-015-CPK868-G and PW-191-015-CPK869-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the GAL7 promoter (P GAL7 ) from positions −1 to −725 was performed with oligonucleotides PW-191-015-CPK870-G and PW-191-015-CPK871-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the S. cerevisiae codon-optimized A. annua ADH1 ORF was performed with oligonucleotides PW-191-015-CPK872-G and PW-191-015-CPK873-G. PCR amplification of the TDH1 terminator (T TDH1 ) from positions 1000 to 1750 was performed with oligonucleotides PW-191-015-CPK874-G and PW-191-015-CPK875-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the wild-type GAL1 locus from positions 1637 to 2436 was performed with oligonucleotides PW-191-015-CPK876-G and PW-191-015-CPK877-G using CEN.PK2-1C genomic DNA as the template. For the secondary PCR amplification, 25 ng each of the purified P GAL3(−77 to −660) , URA3(−226 to 884), P GAL7(−1 to −725) , A. annua ADH1, T TDH1(1000 to 1750) and GAL1(1637 to 2436) PCR products were used as the DNA templates and PCR amplified with oligonucleotides PW-191-015-CPK866-G and PW-191-015-CPK877-G to give P GAL3(−77 to −660) _URA3(−226 to 884)_ P GAL7(−1 to −725) - ADH1-T TDH1(1000 to 1750) -GAL1(1637 to 2436).

gal80Δ::URA3_P GAL7 -ADH1-T GAL80

Targeted replacement of the GAL80 locus with P GAL7 -ADH1 to create Y1284 was accomplished as follows. PCR amplification of the wild-type GAL80 locus from positions −28 to −760 was performed with oligonucleotides PW-191-015-CPK882-G and PW-191-015-CPK883-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the URA3 locus from position −226 to 884 was performed with oligonucleotides PW-191-015-CPK884-G and PW-191-015-CPK869-G using BY4710 genomic DNA as the template. PCR of the GAL7 promoter (P GAL7 ) from positions −1 to −725 was performed with oligonucleotides PW-191-015-CPK870-G and PW-191-015-CPK871-G using CEN.PK2-1C genomic DNA as the template. PCR amplification of the A. annua ADH1 ORF was performed with oligonucleotides PW-191-015-CPK872-G and PW-191-015-CPK873-G using a synthetic S. cerevisiae codon-optimized template (DNA2.0). PCR amplification of the wild-type GAL80 locus from positions 1320 to 2117 was performed with oligonucleotides PW-191-015-CPK886-G and PW-191-015-CPK887-G using CEN.PK2-1C genomic DNA as the template. For the secondary PCR amplification, 25 ng each of the purified GAL80(−28 to −760), URA3(−226 to 884), P GAL7(−1 to −725) , A. annua ADH1, and GAL80(1320 to 2117) PCR products were used as the DNA templates and PCR amplified with oligonucleotides PW-191-015-CPK882-G and PW-191-015-CPK887-G to give GAL80(−28 to −760)_URA3(−226 to 884)_P GAL7(−1 to −725) -ADH1-GAL80(1320 to 2117).

All strains were confirmed with diagnostic PCR to contain the expected integration constructs and, where appropriate, all integrations were verified by sequence analysis.

Cloning and characterization of A. annua ADH1

Analysis of a previously developed A. annua EST collection12,26,27 identified a contig corresponding to an apparently full-length ORF encoding a putative trichome-expressed alcohol dehydrogenase. The corresponding gene, designated A. annua ADH1, was associated with 2.2%, 1.3% and 0.06% of ESTs in the ‘trichome-minus-flower-bud’ (designated GSTSUB in ref. 12), glandular trichome (designated AAGST12) and flower bud (designated AAFB12) collections, respectively. Similarly, based on the generation of expressed sequence tags by 454 sequencing of A. annua28, the gene expression pattern of ADH1 was found to be comparable to that of CYP71AV1 in a range of tissues, with negligible expression in cotyledons and mature leaf trichomes and 0.21, 0.53 and 0.03% of sequences in each of the EST collections derived from young leaf trichomes, flower bud trichomes and meristem/young leaf, respectively. A full-length ADH1 ORF was cloned by reverse transcriptase PCR (RT–PCR) (using oligonucleotide primers PSC1 and PSC2 and the vector pENTR/D TOPO (Invitrogen)). The A. annua ADH1 gene has an ORF encoding a polypeptide of 378 amino acids with a relative molecular mass of 40,415 daltons. On the basis of sequence similarities, A. annua ADH1 is a member of the medium chain alcohol dehydrogenase/reductase superfamily that is related to predicted proteins of Populus trichocarpa (61% identity, GenBank accession XP_002324694) and Cynara cardunculus (72% identity over 214 amino acids, GenBank accession GE588275).

The A. annua ADH1 ORF was subcloned into the pET15b vector modified to contain a PreScission protease cleavage site. The vector containing the A. annua ADH1 ORF was used to transform E. coli BLR (DE3). Protein expression was induced by adding 0.4 mM isopropyl-β-d-thiogalactoside (IPTG) and cultures totalling 5 l were incubated at 16 °C for 16 h. ADH1 was subjected to two rounds of Ni-column purification, analysis by SDS–PAGE and dialysis. The final fractions containing ADH1 were pooled and dialysed against protein storage buffer (20 mM Tris-HCl, pH 8.0, 200 mM NaCl, 200 mM KCl, 10% glycerol and 1 mM dithiothreitol (DTT)). Protein concentration was determined by Bradford assay and aliquots were stored at −80 °C. ADH1 purity by SDS–PAGE was judged to be 95%.

Unless otherwise stated, ADH1 enzyme assays included 50 mM Tris buffer, pH 8.5, 250 mM NaCl, 0.4 mg ml−1 BSA, 50 μM substrate, 1 mM NAD, 3 μg of octadecane (as internal standard; Sigma-Aldrich) and 80 ng of recombinant A. annua ADH1 in a total volume of 200 μl. Negative controls were carried out in the absence of NAD. Reactions were allowed to proceed for 4 min at 30 °C with shaking (500 r.p.m.), and immediately stopped by extraction with 500 μl pentane. All quantitative analyses were done with 3–6 technical replicates per treatment. Pentane extracts were concentrated to ∼30 μl under a stream of nitrogen and either 10 μl ethyl acetate or 10 μl of a mixture of 1:1 N,O-bis-(trimethylsilyl)acetamide (Sigma-Aldrich)/pyridine (Fluka) was added. The remainder of the pentane was carefully removed under a stream of nitrogen and the final 10-μl sample was analysed by gas chromatography–mass spectrometry (GC–MS)26.

Substrate specificity was determined in 15-min, 600-μl assays using (+)-borneol (Fluka), (−)-borneol (Fluka) and artemisinic, dihydroartemisinic, artemisia26, coniferyl (Sigma-Aldrich), and cinnamyl (Sigma-Aldrich) alcohols. ADH1 only showed considerable dehydrogenase activity with artemisinic alcohol and to a lesser extent with dihydroartemisinic alcohol (4.2% relative to artemisinic alcohol; Supplementary Fig. 5). The identity of the aldehyde products was confirmed by GC–MS in comparison to authentic standards. When assayed with artemisinic alcohol and NAD, recombinant A. annua ADH1 showed a pH optimum of 8.5. Using 1 mM NADP as the cofactor, oxidation of artemisinic alcohol by ADH1 was 30-fold lower (Supplementary Fig. 5). The linear range of the ADH1 assay with respect to time was tested by reactions under standard assay conditions except varying the time up to 30 min. The pH optimum of the purified ADH1 was determined to be 8.5 based on a series of 15-min assays with the pH range from 5.5 to 10 in intervals of 0.5-pH units using 50 mM citrate, phosphate, Tris, CHES and CAPS buffers. Kinetic parameters were determined by varying the concentrations of artemisinic alcohol (3.0–25 μM) in 600-μl assays using 240 ng ADH1. Substrate solubility prevented the use of higher concentrations. Octadecane was used as an internal standard to quantify the substrate and product from the reactions by gas chromatography using response factors determined by using known concentrations of standards. Kinetic constants were determined by fitting the data to the Michaelis–Menten equation using nonlinear regression and EnzFitter software (Biosoft).
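The kinetic fit described above can be illustrated with a small nonlinear least-squares routine. This is a sketch only: it uses a damped Gauss–Newton iteration written for this example rather than the EnzFitter software named in the text, the function names are hypothetical, and the data are synthetic values generated from assumed constants (V max = 2.0 arbitrary units, K m = 10 μM), not measurements from this study. Only the substrate range (3.0–25 μM) is taken from the text.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate: v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def sum_sq(substrate, rates, vmax, km):
    """Sum of squared residuals for the current parameter estimates."""
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, iterations=100):
    """Estimate (Vmax, Km) by damped Gauss-Newton nonlinear regression."""
    vmax = 1.5 * max(rates)                        # starting guess above the top rate
    km = sorted(substrate)[len(substrate) // 2]    # starting guess mid-range
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(substrate, rates):
            d_vmax = s / (km + s)                  # partial derivative w.r.t. Vmax
            d_km = -vmax * s / (km + s) ** 2       # partial derivative w.r.t. Km
            resid = v - mm_rate(s, vmax, km)
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            b1 += d_vmax * resid
            b2 += d_km * resid
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        step_vmax = (b1 * a22 - a12 * b2) / det    # solve the 2x2 normal equations
        step_km = (a11 * b2 - a12 * b1) / det
        current = sum_sq(substrate, rates, vmax, km)
        scale = 1.0
        for _halving in range(30):                 # halve the step if the fit worsens
            new_vmax = vmax + scale * step_vmax
            new_km = km + scale * step_km
            if new_km > 0 and sum_sq(substrate, rates, new_vmax, new_km) <= current:
                vmax, km = new_vmax, new_km
                break
            scale /= 2.0
    return vmax, km


# Synthetic, noise-free example data (assumed constants, not results from the paper).
substrate_um = [3.0, 5.0, 10.0, 15.0, 20.0, 25.0]
rates = [mm_rate(s, 2.0, 10.0) for s in substrate_um]
vmax_fit, km_fit = fit_michaelis_menten(substrate_um, rates)
print(round(vmax_fit, 3), round(km_fit, 3))
```

Because the highest usable substrate concentration was limited by solubility, real data of this kind may cover only part of the saturation curve, in which case the uncertainty on V max from any such fit would be expected to be larger than on the initial slope.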

Media and growth conditions

Fermentation media: The media used for this work were based on media described previously29. The trace metal solution contained 5.75 g l−1 ZnSO 4 ·7H 2 O, 0.32 g l−1 MnCl 2 ·4H 2 O, 0.47 g l−1 CoCl 2 ·6H 2 O, 0.48 g l−1 Na 2 MoO 4 ·2H 2 O, 2.9 g l−1 CaCl 2 ·2H 2 O, 2.8 g l−1 FeSO 4 ·7H 2 O and 80 ml l−1 0.5 M EDTA, pH 8.0. The vitamin solution contained 0.05 g l−1 biotin, 1 g l−1 calcium pantothenate, 1 g l−1 nicotinic acid, 25 g l−1 myo-inositol, 1 g l−1 thiamine HCl, 1 g l−1 pyridoxal HCl and 0.2 g l−1 p-aminobenzoic acid. The batch medium for all fermentations contained 19.5 g l−1 glucose, 15 g l−1 (NH 4 ) 2 SO 4 , 8 g l−1 KH 2 PO 4 , 6.2 g l−1 MgSO 4 ·7H 2 O, 12 ml l−1 vitamin solution and 10 ml l−1 trace metal solution.

The batch medium also contained additional components depending on the strain and the process being run. For the glucose and ethanol mixed-feed process, for all strains except Y285, CuSO 4 was added to the batch medium to a concentration of 0.25 μM. For Y285, the batch medium contained 20 μM CuSO 4 .

The bioreactor feed media also varied for the different processes and for different strains. All mixed glucose/ethanol processes used bioreactor feed base that contained 386 g l−1 glucose, 9 g l−1 KH 2 PO 4 , 5.12 g l−1 MgSO 4 ·7H 2 O, 3.5 g l−1 K 2 SO 4 , 0.28 g l−1 Na 2 SO 4 and 237 ml l−1 ethanol (95% v/v).

For the glucose and ethanol mixed feed process, two different feed media were prepared for the fermentation: pre-induction-feed media and induction-feed media (includes small molecule inducers and repressors). For both feed media, stock solutions of vitamins and trace metals were added to the bioreactor feed base as follows: 12 ml vitamin solution per litre feed base, and 10 ml trace metals solution per litre feed base. The pre-induction-feed media contained 0.25 μM CuSO 4 and no other additions. For fermentation of Y285, the pre-induction-feed media contained 20 μM CuSO 4 .

Two different inducers/repressors were added to the induction-feed medium, depending on the strain. For Y285, concentrated solutions of galactose and methionine were added to the induction-feed medium to bring the final concentrations to 10 g l−1 galactose and 1 g l−1 methionine. For all other strains, concentrated solutions of galactose and CuSO 4 were added to the feed medium to bring the final concentrations to 10 g l−1 galactose and 150 μM CuSO 4 . All additions to the medium were made in a sterile hood.

Shake-flask media: Seed medium for pre-culture was the batch fermentation medium modified with the addition of 100 ml l−1 succinate buffer (0.5 M, pH 5.0). For strain Y285, the concentration of CuSO 4 in the seed medium was 20 μM. For all other strains tested (which contain a copper-repressible promoter controlling ERG9 expression), low-copper seed medium, containing only 0.25 μM CuSO 4 , was used.

Flask-production medium was modified seed media which contained 40 g l−1 glucose, 5 g l−1 galactose, 1.7 mM methionine and 150 μM CuSO 4 .

Shake-flask methods

To prepare seed vials, single isolates of each strain from agar plates were grown for 18–24 h in 20-ml low-copper seed medium containing 0.25 μM CuSO 4 . Cultures were then inoculated at an attenuance (D 600 nm ) of 0.05 into fresh low-copper seed medium and grown for a further 18–24 h to a D 600 nm of between 2 and 3 (measured using a Thermo Scientific Genesys 10 Vis spectrophotometer). Six-hundred microlitres of this culture was added to 400 μl of 50% glycerol and stored in 1-ml aliquots (20% glycerol (v/v) final) at −80 °C.
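The seed-vial arithmetic above can be checked with two small helpers. This is a minimal sketch: the function names are hypothetical, the culture D 600 nm of 2.5 is an assumed value within the stated range of 2 to 3, and the inoculum calculation assumes that D 600 nm scales linearly with cell density over the dilution.

```python
def final_percent(stock_percent, stock_volume, total_volume):
    """Final concentration (%) after diluting a stock into a total volume."""
    return stock_percent * stock_volume / total_volume


def inoculum_volume_ml(culture_d600, target_d600, final_volume_ml):
    """Volume of culture needed to reach a target D600 in a given final volume."""
    return target_d600 * final_volume_ml / culture_d600


# 400 ul of 50% glycerol added to 600 ul of culture (values from the text).
glycerol_final = final_percent(50.0, 400.0, 600.0 + 400.0)

# Assumed culture at D600 = 2.5, diluted to D600 = 0.05 in 20 ml of seed medium.
inoculum_ml = inoculum_volume_ml(2.5, 0.05, 20.0)

print(glycerol_final, inoculum_ml)
```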

To acclimate cells before inoculation into production medium, frozen seed vials were thawed to room temperature and inoculated into 20-ml low-copper seed medium. Cultures were grown for 18–24 h at 30 °C with shaking at 200 r.p.m. The next day, the cultures were diluted to a D 600 nm of 0.05 in 20-ml low-copper seed medium and grown for ∼18 h at 30 °C with shaking at 200 r.p.m.

Cells from the second overnight acclimation were diluted to a D 600 nm of 0.05 in 250 ml unbaffled flasks containing 25 ml flask-production medium. Flasks contained an additional 5 ml IPM where indicated. All cultures were inoculated in triplicate and incubated at 30 °C for 72 h with shaking at 200 r.p.m. in a humidified Innova incubator. Flasks were sampled periodically for growth (D 600 nm ), viability and product titres. Viability was measured using the LIVE/DEAD FungaLight yeast viability kit for flow cytometry (Invitrogen Corporation) and a Guava Technologies EasyCyte Plus flow cytometer.

For the production of insoluble artemisinic acid in shake-flask cultures, 15.8 g l−1 of 95% ethanol was added to the production flask after 72, 96 and 120 h growth. The flask was inspected for formation of insoluble material after 144 h.

Glucose and ethanol mixed-feed process

Preparation of seed cultures and procedures for setting up and running glucose and ethanol mixed-feed fermentations have been described3. The production process was induced with the addition of 10 g l−1 galactose and 0.25 g l−1 methionine (for Y285), or 10 g l−1 galactose and 150 μM CuSO 4 (all other strains) to the bioreactor after the culture reached a D 600 nm value of approximately 50. At this time, the feed bottle containing pre-induction-feed medium was exchanged for a feed bottle containing induction-feed medium.

Mixed glucose/ethanol feed process with early induction/repression

The mixed glucose/ethanol feed process was modified by changing the time of induction (the time of addition of galactose and methionine, or galactose and high CuSO 4 ) from the time the culture reached a D 600 nm of 50 to the time of inoculation. The mixed glucose/ethanol feed process with early induction/repression was identical to the mixed feed process except that batch and feed media were modified. Immediately before inoculation, concentrated solutions of galactose, methionine and/or CuSO 4 were added to the batch medium to bring the final concentrations to 10 g l−1 galactose and 0.25 g l−1 methionine (for Y285), or 10 g l−1 galactose and 150 μM CuSO 4 (for all other strains). Only the induction-feed medium was used in the fed-batch phase of the process (no pre-induction medium). All other parameters were the same as the mixed glucose/ethanol feed process.

At later time points (42–96 h), in some fermentation runs of strains Y1283 and Y1284, artemisinic acid precipitated from the liquid fermentor broth. Solid precipitate was visible in the fermentor and adhered onto the side of the bioreactor and the head plate. During the runs, artemisinic acid concentration was still assayed from fermentor broth samples over the course of the fermentation as described below. However, for select fermentations of Y1284, artemisinic acid was also assayed at the end of the fermentation after complete solubilization of the precipitate by high pH treatment of the fermentor broth. At the end of the fermentation, the culture was adjusted to pH 8.1 with 10 M NH 4 OH and allowed to stir at 1,500 r.p.m. (maximum r.p.m.) at 30 °C for at least 1 h to dissolve the precipitated artemisinic acid. After the pH adjustment, the cell broth was collected and water was added to the tank to wash any residual precipitate from the tank. The water was adjusted to pH 9.1 with 10 M NH 4 OH and allowed to stir at 1,200 r.p.m. overnight. Together, this provided a more accurate measurement of artemisinic acid at the final time point.

Mixed glucose/ethanol feed process with induction/repression and IPM

The addition of an IPM phase to yeast cultures was tested at the fermentor scale using strains Y1283 and Y1284. The fermentation process used was the mixed glucose/ethanol feed process with early induction/repression. The only process change was the addition of 200 ml IPM to the fermentor before inoculation. The initial aqueous batch volume of the fermentations was 0.7 l.

The IPM phase and the aqueous cell broth formed a well-mixed emulsion in the reactor at later times in the fermentation (>24 h). To assay artemisinic acid titre in fermentations with IPM, samples of the combined IPM and cell broth mixture/emulsion were extracted with solvent. The mixture was first vortexed then added to the methanol/formic acid as described below. The concentrations measured by liquid chromatography with ultraviolet detection are reported in terms of grams per litre total volume (aqueous cell broth plus IPM). Using the ratio of aqueous cell broth volume to IPM volume at the time of sampling, the titres are converted to terms of grams per litre aqueous volume to allow for direct comparison with runs that do not use IPM.
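The conversion from a total-volume basis to an aqueous-volume basis described above amounts to rescaling by the ratio of total to aqueous volume. The sketch below shows that rescaling; the function name and the 7.0 g l−1 input are hypothetical, and the 0.7 l and 0.2 l volumes are the starting batch and IPM volumes given earlier in this section rather than volumes at an actual sampling time.

```python
def aqueous_basis_titre(total_basis_g_per_l, aqueous_volume_l, ipm_volume_l):
    """Convert a titre per litre of (broth + IPM) to a titre per litre of aqueous broth."""
    if aqueous_volume_l <= 0:
        raise ValueError("aqueous volume must be positive")
    total_volume_l = aqueous_volume_l + ipm_volume_l
    # Same mass of product, expressed over the smaller aqueous volume only.
    return total_basis_g_per_l * total_volume_l / aqueous_volume_l


# Hypothetical measurement of 7.0 g/l on a total-volume basis.
example_titre = aqueous_basis_titre(7.0, aqueous_volume_l=0.7, ipm_volume_l=0.2)
print(round(example_titre, 2))
```

Because the aqueous volume grows during the fed-batch phase while the IPM volume does not, the correction factor would be expected to shrink towards 1 as the run proceeds.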

At 30 °C, artemisinic acid has a solubility of approximately 100–115 g l−1 in IPM (empirically determined). At later times in the fed-batch fermentation, after a large volume of aqueous feed has been added to the fermentor, the ratio of IPM/aqueous volume is significantly lower and the solubility limit could restrict additional production.

Ethanol pulse-feed process with IPM

Yeast strain Y1284 was tested in an ethanol pulse-feed process with the addition of IPM to the culture medium. The temperature, pH and dissolved oxygen were controlled at the set points described above. The batch medium for this process was the same as for the glucose/ethanol mixed-feed process with early induction, described above, except that no galactose was added (Y1284 does not require galactose for induction). Four-hundred millilitres of IPM was added to a starting aqueous fermentor volume of 0.8 l, before inoculation.

The feed for the ethanol pulse-feed fed-batch phase of the process was 95% (v/v) ethanol. Because none of the salts, trace metals or vitamins was soluble in 95% (v/v) ethanol, concentrated feed components were combined into a concentrated post-sterile addition (PSA) solution. The concentrated PSA solution consisted of 72.9 g l−1 KH 2 PO 4 , 41.4 g l−1 MgSO 4 ·7H 2 O, 28.3 g l−1 K 2 SO 4 , 2.3 g l−1 Na 2 SO 4 , 1.2 mM CuSO 4 , 10 ml l−1 trace metals solution and 12 ml l−1 vitamin solution. The concentrated PSA solution was injected through a septum in the bioreactor head plate with a syringe once per day according to the volume of 95% (v/v) ethanol that had been delivered since the previous addition of feed components. One-hundred-and-twenty-four millilitres of concentrated PSA solution was added per litre of 95% (v/v) ethanol added.
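The daily PSA addition follows directly from the 124 ml per litre ratio. As a sketch (function and constant names are hypothetical), the calculation below also shows the mass of each salt that accompanies one litre of ethanol. The results, roughly 9.04, 5.13, 3.51 and 0.29 g, are close to the per-litre salt concentrations of the mixed-feed base given earlier (9, 5.12, 3.5 and 0.28 g l−1), although the text does not state that this was the design intent.

```python
PSA_ML_PER_L_ETHANOL = 124.0  # from the text

# Salt concentrations in the concentrated PSA solution (g per litre), from the text.
PSA_SALTS_G_PER_L = {
    "KH2PO4": 72.9,
    "MgSO4.7H2O": 41.4,
    "K2SO4": 28.3,
    "Na2SO4": 2.3,
}


def psa_volume_ml(ethanol_delivered_l):
    """PSA volume to inject for the ethanol delivered since the last addition."""
    return PSA_ML_PER_L_ETHANOL * ethanol_delivered_l


def salts_delivered_g(ethanol_delivered_l):
    """Mass of each salt carried in by the matching PSA injection."""
    psa_l = psa_volume_ml(ethanol_delivered_l) / 1000.0
    return {salt: conc * psa_l for salt, conc in PSA_SALTS_G_PER_L.items()}


per_litre_ethanol = salts_delivered_g(1.0)
for salt, grams in per_litre_ethanol.items():
    print(salt, round(grams, 2))
```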

After the batch carbon was consumed (detected as described above) the ethanol pulse-feed algorithm was initiated. As the culture grew and consumed O 2 , dissolved O 2 was maintained at 40% by an agitation cascade followed by oxygen enrichment (as described above). In the first phase of the fed-batch fermentation, before the stir rate of the reactor reached the maximum allowed for the unit, the pulse feed algorithm used stir rate (Stir) measurements to control ethanol feed delivery (Supplementary Fig. 14a). The computer algorithm assigned a variable (Stir Max) that tracked the maximum stir rate obtained so far in the process. While growing on ethanol, O 2 demand increased and stir rate increased until the substrate was depleted from the fermentor medium. At that point, the dissolved O 2 increased and the controller decreased stir rate to maintain dissolved O 2 = 40%. When Stir decreased to less than 75% of the value of Stir Max, the ethanol feed pump was activated for the length of time necessary to add 10 g ethanol per litre fermentor volume to the reactor. The computer algorithm calculated the time necessary to add 10 g ethanol per litre fermentor volume to the reactor (Timer Max) after each cycle. The first phase of the algorithm iterated until O 2 enrichment was required.

After the stir rate of the reactor reached the maximum allowed for the unit, oxygen enrichment was used to maintain dissolved O 2 = 40%. During this stage of the fed-batch fermentation, the second phase of the control algorithm was initiated. Dissolved O 2 measurements were used to control ethanol feed delivery (Supplementary Fig. 14b). When ethanol was depleted from the fermentor medium, the dissolved O 2 began to increase rapidly—faster than the dissolved O 2 controller could compensate. When dissolved O 2 > 50%, the ethanol feed pump was activated for the length of time necessary to add 10 g ethanol per litre fermentor volume to the reactor. After the addition of ethanol, the dissolved O 2 would rapidly decrease to <50%. The variable Timer Max was again calculated by the computer algorithm after each cycle. This algorithm iterated for the remainder of the fermentation.
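The two-phase trigger logic described above can be sketched as a small controller. This is an illustration under stated assumptions, not the authors' control software: the class and method names are hypothetical, the pump rate (grams of ethanol per minute) is an assumed parameter that the text does not give, and the sketch evaluates one sample at a time without modelling how repeated triggers within a single cycle were suppressed. The 75% stir threshold, the 50% dissolved O 2 threshold and the 10 g l−1 dose are taken from the text.

```python
from dataclasses import dataclass

STIR_TRIGGER_FRACTION = 0.75   # pulse when Stir < 75% of Stir Max (phase 1)
DO_TRIGGER_PERCENT = 50.0      # pulse when dissolved O2 > 50% (phase 2)
ETHANOL_DOSE_G_PER_L = 10.0    # ethanol added per litre of fermentor volume


@dataclass
class PulseFeedController:
    pump_rate_g_per_min: float  # assumed pump delivery rate (hypothetical)
    stir_max: float = 0.0       # highest stir rate seen so far ("Stir Max")

    def timer_max(self, fermentor_volume_l: float) -> float:
        """Pump-on time (min) needed to deliver the dose ("Timer Max")."""
        return ETHANOL_DOSE_G_PER_L * fermentor_volume_l / self.pump_rate_g_per_min

    def pump_minutes(self, stir: float, dissolved_o2: float,
                     fermentor_volume_l: float, stir_saturated: bool) -> float:
        """Return the pump-on time for this sample, or 0.0 if no pulse is due."""
        if stir_saturated:
            # Phase 2: stirring is at its maximum, so watch dissolved O2 instead.
            pulse = dissolved_o2 > DO_TRIGGER_PERCENT
        else:
            # Phase 1: a fall in stir rate signals that ethanol has run out.
            self.stir_max = max(self.stir_max, stir)
            pulse = stir < STIR_TRIGGER_FRACTION * self.stir_max
        return self.timer_max(fermentor_volume_l) if pulse else 0.0


controller = PulseFeedController(pump_rate_g_per_min=5.0)
print(controller.pump_minutes(800.0, 40.0, 1.0, stir_saturated=False))
print(controller.pump_minutes(500.0, 45.0, 1.0, stir_saturated=False))
```

Recomputing the pump-on time from the current fermentor volume at every cycle reflects the statement that Timer Max was recalculated after each cycle, since the volume increases as feed is added.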

Purification of artemisinic acid from IPM

IPM was isolated from artemisinic acid fermentations by centrifugation. IPM was mixed with 1% NaH 2 PO 4 ·12H 2 O and the pH was adjusted to 10.7 by the addition of 5 M NaOH. The solution was then stirred at ambient temperature for 60 min. After mixing, the solution was allowed to separate by gravity in a separatory funnel at ambient temperature. The bottom aqueous phase was drawn off from the upper IPM phase. The bottom aqueous phase was run through a liquid–liquid annular centrifugal contactor (CINC Industries) to ensure complete removal of any residual IPM. A 10% (w/v) SDS solution was added to the aqueous phase to bring the final SDS concentration to 0.03%. The solution was mixed and the pH adjusted to 5.0 with 2.5 M H 2 SO 4 . The acidification resulted in the formation of a fine white precipitate, which was captured on a 0.45-μm PTFE (polytetrafluoroethylene) filter, rinsed with purified water and then dried. Analysis of the IPM before and after aqueous extraction showed that 5% of the artemisinic acid remained in the IPM after extraction (∼95% step yield). Analysis of the filtrate after precipitation showed that 2% of the artemisinic acid present in the aqueous phase remained in the filtrate after acidification (∼98% step yield). The overall purification yield obtained was ∼93%. Additional aqueous extractions of the remaining IPM should increase the overall yield. Analysis of the dried precipitate by gas chromatography with flame-ionization detection (GC–FID) gave artemisinic acid purities of ∼96% by area and ∼98% by weight.
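The overall yield quoted above is the product of the two step yields (0.95 × 0.98 ≈ 0.93). The sketch below reproduces that arithmetic and, as an assumption not tested in the text, estimates the effect of repeated extractions if each pass were to leave the same 5% fraction of the remaining artemisinic acid in the IPM. Function names are hypothetical.

```python
def overall_yield(step_yields):
    """Multiply sequential step yields (each expressed as a fraction)."""
    result = 1.0
    for step in step_yields:
        result *= step
    return result


def extraction_yield(fraction_left_per_pass, passes):
    """Fraction recovered after several extractions, assuming a constant fraction
    of the remaining product stays in the IPM on every pass (an assumption)."""
    return 1.0 - fraction_left_per_pass ** passes


single_pass = overall_yield([0.95, 0.98])
double_pass = overall_yield([extraction_yield(0.05, 2), 0.98])
print(round(single_pass, 3), round(double_pass, 3))
```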

Broth extraction

Amorpha-4,11-diene, artemisinic alcohol and artemisinic aldehyde were extracted from cells and broth as follows. Cell lysis cocktail was prepared by combining two parts Novagen YeastBuster protein reagent (EMD Biosciences) and one part 2 M HCl. Samples were prepared by mixing 0.4 ml cell lysis cocktail with 0.1 ml whole broth and 1 ml ethyl acetate containing 10 mg l−1 trans-caryophyllene (internal standard, ≥98.5% purity; Sigma-Aldrich) in a 2-ml glass vial. The sample was mixed for 30 min on a vortex mixer. After mixing, the vial was placed on the bench top to allow the phases to separate. If necessary, the vial was centrifuged at 1,000g to break any emulsion that had formed. Six-hundred microlitres of the ethyl acetate layer was transferred to a gas chromatography vial for analysis.

Gas chromatography

The production of amorpha-4,11-diene, artemisinic alcohol and artemisinic aldehyde was monitored by GC–FID. The ethyl acetate-extracted samples were analysed on the GC–FID. Amorpha-4,11-diene, artemisinic alcohol and artemisinic aldehyde peak areas were converted to concentration measurements from external standard calibrations using authentic compounds. To expedite run times, the temperature program and column were modified to achieve optimal resolution and the shortest overall run-time with minimal interferences. A 10-μl sample was split 1:20 and was separated using a DB-WAX column (50 m × 200 μm × 0.2 μm; Agilent), with hydrogen as the carrier gas at a flow rate of 1.57 ml min−1. The temperature program for the analysis was as follows: the column was initially held at 150 °C for 3 min, followed by a temperature gradient of 5 °C min−1 to a temperature of 250 °C, and then the column was held at 250 °C for 5 min to elute all remaining components. Under these conditions, trans-caryophyllene, amorpha-4,11-diene, artemisinic aldehyde and artemisinic alcohol elute at 4.95, 5.77, 12.94 and 18.60 min, respectively.
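The oven program above implies a 20-min ramp and a 28-min total run. As a sketch (names are hypothetical; the set points and retention times are those given in the text), the nominal oven temperature at each elution time can be computed as follows. The values are programmed set points and do not account for any oven lag.

```python
INITIAL_C = 150.0
INITIAL_HOLD_MIN = 3.0
RAMP_C_PER_MIN = 5.0
FINAL_C = 250.0
FINAL_HOLD_MIN = 5.0

RAMP_MIN = (FINAL_C - INITIAL_C) / RAMP_C_PER_MIN
TOTAL_RUN_MIN = INITIAL_HOLD_MIN + RAMP_MIN + FINAL_HOLD_MIN


def oven_temperature_c(time_min):
    """Programmed oven temperature at a given time into the run."""
    if not 0.0 <= time_min <= TOTAL_RUN_MIN:
        raise ValueError("time is outside the run")
    if time_min <= INITIAL_HOLD_MIN:
        return INITIAL_C
    if time_min <= INITIAL_HOLD_MIN + RAMP_MIN:
        return INITIAL_C + RAMP_C_PER_MIN * (time_min - INITIAL_HOLD_MIN)
    return FINAL_C


# Retention times (min) reported in the text.
RETENTION_MIN = {
    "trans-caryophyllene": 4.95,
    "amorpha-4,11-diene": 5.77,
    "artemisinic aldehyde": 12.94,
    "artemisinic alcohol": 18.60,
}

print(TOTAL_RUN_MIN)
for compound, rt in RETENTION_MIN.items():
    print(compound, round(oven_temperature_c(rt), 2))
```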

Broth preparation with and without IPM

A 1-ml aliquot of well-mixed fermentation broth was diluted in 9 ml of methanol plus 0.1% formic acid (IPM formed an emulsion with the cell broth when it was used). The mixture was then mixed on a vortex mixer for 30 min and centrifuged at 16,000g for 5 min. One-hundred microlitres of the supernatant was diluted into 900 μl methanol plus 0.1% formic acid, and analysed by the HPLC method described below.
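The two dilution steps above combine to a 100-fold dilution of the original broth. The sketch below makes the back-calculation explicit; the function names and the 0.25 g l−1 example reading are hypothetical, and the calculation assumes that volumes are additive on mixing, which is only approximately true for methanol and aqueous broth.

```python
def dilution_factor(sample_volume, diluent_volume):
    """Fold dilution when a sample is mixed with diluent (additive volumes assumed)."""
    return (sample_volume + diluent_volume) / sample_volume


# 1 ml broth into 9 ml solvent, then 100 ul supernatant into 900 ul solvent.
TOTAL_DILUTION = dilution_factor(1.0, 9.0) * dilution_factor(100.0, 900.0)


def broth_concentration(measured_g_per_l):
    """Back-calculate the concentration in the undiluted broth sample."""
    return measured_g_per_l * TOTAL_DILUTION


print(TOTAL_DILUTION, broth_concentration(0.25))
```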

In-process assay for titre measurement

A screening method was developed to rank artemisinic-acid-producing strains. This method was used only to rank strains, and not determine final titre. A 20-μl aliquot was injected on an Agilent 1200 HPLC with ultraviolet detection at 212 nm. A Supelco Discovery C 8 column (4.6 mm × 100 mm × 5.0 μm; Supelco) equipped with the appropriate guard column (4.0 mm × 20.0 mm; Supelco) was used for separation, with the following gradient at a flow rate of 1 ml min−1 (channel A: water plus 0.1% formic acid; channel B: methanol plus 0.1% formic acid): 0–0.5 min 70% B, gradually increased to 97% B from 0.5 to 6.7 min, held at 97% B until 7 min, decreased to 70% B from 7 to 7.5 min, and re-equilibrated to 70% B from 7.5 to 9.5 min. The column was held at 25 °C during the separation. Under these conditions, artemisinic acid was found to elute at 6.3 min. Artemisinic acid peak areas were converted to concentrations from external standard calibrations of authentic compounds.
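The gradient above can be written as a table of breakpoints and interpolated to give the programmed mobile-phase composition at any time. This is a sketch that assumes the "gradually increased" and "decreased" segments are linear, which the text does not state explicitly; the names are hypothetical, and the value returned is the pump set point, not the composition reaching the column, since system dwell volume is ignored.

```python
# (time in min, % channel B) breakpoints taken from the gradient in the text.
GRADIENT = [
    (0.0, 70.0),
    (0.5, 70.0),
    (6.7, 97.0),
    (7.0, 97.0),
    (7.5, 70.0),
    (9.5, 70.0),
]


def percent_b(time_min):
    """Programmed % B at a given time, assuming linear segments between breakpoints."""
    if not GRADIENT[0][0] <= time_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)


# Programmed composition at the reported artemisinic acid retention time (6.3 min).
print(round(percent_b(6.3), 1))
```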

Final titre measurement