Whole midgut agarose diffusion assay for substrate activity

To test whether Phasmatodea guts were pectinolytic, we filled square Petri dishes with 0.1% solutions of either citrus pectin (Sigma) or polygalacturonic acid (PGA) (Megazyme) in 0.4% agarose and 50 mM citrate-phosphate buffer (pH 5.0). We made wells in the plates and filled them with 5 μL of macerated whole midguts, cleared of their contents, dissected from the aforementioned six species with published midgut transcriptomes7: Aretaon asperrimus (Heteropteryginae), Peruphasma schultei (Pseudophasmatinae), Sipyloidea sipylus (Necrosciinae), Extatosoma tiaratum (Lanceocercata: Extatosomatinae), and Medauroidea extradentata and Ramulus artemis (both Clitumninae). We used pectinases from Aspergillus niger (Sigma) as a positive control. Plates were incubated upside-down at 40 °C overnight, stained for one hour in 0.01% Ruthenium Red (Colour Index No. 77800) on a shaker at 20 rpm, and destained in diH2O. Enzyme activity was detectable as clearings in the stained gel.

Creating cDNA libraries and cloning of full length genes

Amino acid sequences from known pectinases of the glycoside hydrolase (GH) family 28 (www.cazy.org) were retrieved from GenBank (Accession Numbers: JQ728556.1, Y17906.1, EU450666.1). We used the tBLASTn algorithm36 with an e-value cutoff of 1E-10 to mine the six published phasmatodean midgut transcriptomes (GenBank Accession No. PRJNA238833 & PRJNA221630) for sequences homologous to these. For incomplete transcripts, we designed specific primers for 5′- and 3′-Rapid Amplification of cDNA Ends (RACE) PCR using the Primer3 program v0.4.0 (http://bioinfo.ut.ee/primer3-0.4.0/primer3/). From living, cultured specimens of these six species, we dissected the anterior midgut, removed the gut contents, and stored the tissue in RNAlater® solution (Qiagen). After maceration in a frozen Tissue Lyser, RNA was extracted using the innuPREP RNA MiniKit (Analytik-Jena) and purified with the RNeasy® MinElute® cleaning kit (Qiagen) following the manufacturers’ protocols. From the RNA, we synthesized cDNA and performed RACE PCR as needed with the SMARTer RACE cDNA Amplification Kit (BD Clontech) following the manufacturer’s instructions. PCR products were cloned into One Shot® Top10 Chemically Competent E. coli cells with the pCR™4-TOPO/TA® Vector (Invitrogen), and subsequently sequenced by the Sanger method using M13 forward and reverse primers on an ABI 3730 xl automatic DNA sequencer (PE Applied Biosystems). Once we had obtained complete open reading frames (ORFs) for every pectinase gene, we translated them into amino acid sequences and checked these for eukaryote-specific signal peptides using the SignalP 4.1 server (http://www.cbs.dtu.dk/services/SignalP/). We annotated the sequences accordingly and deposited them in GenBank (Accession Numbers in Table S1).
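The e-value filtering step can be illustrated with a minimal sketch. It assumes the search results were exported in the standard 12-column tab-separated BLAST layout, in which the eleventh column holds the e-value; the contig names in the example are hypothetical placeholders, and treating the cutoff as inclusive is an assumption rather than something stated above.

```python
import csv
import io

EVALUE_CUTOFF = 1e-10  # cutoff stated in the text


def filter_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return (query, subject, evalue) tuples for hits at or below the cutoff.

    Assumes the standard 12-column tabular BLAST layout, where the
    eleventh column (index 10) is the e-value.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        evalue = float(row[10])
        if evalue <= cutoff:
            hits.append((row[0], row[1], evalue))
    return hits


# Hypothetical example rows: contig names and alignment statistics are made up.
EXAMPLE = (
    "JQ728556.1\tcontig_0001\t45.2\t310\t160\t5\t20\t329\t100\t1020\t2e-60\t210\n"
    "JQ728556.1\tcontig_0042\t28.0\t90\t60\t2\t40\t129\t10\t270\t3e-04\t42.1\n"
)

kept = filter_hits(EXAMPLE)
```

In this toy input only the first row passes, because its e-value lies below 1E-10 while the second does not.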

Multiple alignment and inference of pectinase gene tree

For a general pectinase tree, the phasmid proteins were combined with those of bacteria, nematodes, and beetles as identified using an NCBI database search for glycoside hydrolase (GH) family 28 enzymes, which have a conserved GH28 pectinolytic domain (Table S2), with Arabidopsis thaliana (GenBank Accession Number NP_850359.1) as outgroup37. These GH28 sequences were aligned using MAFFT38 with the L-INS-i option, which is optimized for alignment of protein sequences with one conserved domain and allows for long gaps whenever necessary. The alignment was pruned using trimAl39 to remove all positions containing more than 70% gaps. The pruned alignment was used to infer gene trees with both Bayesian and Maximum Likelihood (ML) methods. For the Bayesian tree we used MrBayes40 3.2.6 with an estimated gamma distribution of among-site rate variation (ASRV), along with a mixture of evolutionary models. A total of 300,000 generations were run on 8 Markov chain Monte Carlo (MCMC) chains. We discarded a burn-in of 25% of sampled generations for inference of a consensus tree and calculation of posterior probabilities. The ML phylogeny was obtained with RAxML41 8.1.24 using the automated selection (AUTO) of the best-fitting evolutionary model and an estimated gamma distribution of ASRV. The autoMRE option (default 0.03) was used for automatic bootstopping, which halts bootstrapping once enough replicates have been sampled for bootstrap values to converge among the tree topologies. In both cases, the A. thaliana GH28 sequence37 was used as outgroup to root the trees. The trees were visualized and exported as figures using FigTree software (http://tree.bio.ed.ac.uk/software/figtree/).
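The gap-based pruning rule can be sketched as follows. This is a minimal illustration of the criterion described above (drop every column in which more than 70% of sequences have a gap), not a reimplementation of trimAl; the function name and the toy alignment are hypothetical.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.7, gap_chars="-"):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence; all
    sequences must have the same length.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for i in range(length):
        gaps = sum(1 for seq in seqs if seq[i] in gap_chars)
        # Keep the column unless MORE than the allowed fraction is gaps.
        if gaps / len(seqs) <= max_gap_fraction:
            keep.append(i)
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Hypothetical toy alignment: the third column is all gaps and is removed;
# the second column is 50% gaps and is retained.
toy = {
    "seq_a": "MK-LV",
    "seq_b": "M--LV",
    "seq_c": "M--L-",
    "seq_d": "MR-LV",
}
trimmed = trim_gappy_columns(toy)
```

A column with exactly 70% gaps is kept under this reading, since the text specifies removal only above that fraction.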

Expression of pectinase genes in specific insect tissue

We chose four exemplar species, one from each taxonomic family, for downstream analysis: A. asperrimus, P. schultei, R. artemis, and S. sipylus. We designed gene-specific forward and reverse primers (Table S1) to amplify the complete ORF of each putative enzyme from the cDNA, and cloned the amplicons into Top10 cells with the pIB/V5-His TOPO/TA® vector (Invitrogen). We included Kozak sequences (RCCATGG) at the 3′ end of the forward primers and did not include the stop codon in the reverse primers. Colony PCR or direct sequencing was used to confirm that the genes were cloned in the correct orientation; we then extracted the plasmids with a GeneJET™ Plasmid Miniprep Kit (Thermo Scientific) and transfected them into Sf9 cells (Invitrogen) using the reagent FuGENE HD (Promega). Culture medium was harvested after 72 hours of incubation at 27 °C and centrifuged, and the supernatant was tested for successful expression via Western blot (Figure S1) with anti-V5-HRP antibody (Invitrogen). Plate assays for substrate activity were performed on the individual enzymes following the same protocol as for the whole gut extracts.

Thin Layer Chromatography (TLC) assays for substrate activity

Enzyme solutions were dialyzed in three baths of 50 mM citrate-phosphate buffer (pH 5.0) at 4 °C using Slide-A-Lyzer Dialysis Cassettes (Thermo Scientific) with 10 kDa cutoffs, desalted in Zeba™ Desalt Spin Columns (Thermo Scientific) with 7 kDa cutoffs, and stored at 4 °C until use. We combined 10 μL of desalted enzyme in microcentrifuge tubes with 2 μL of 0.2 M citrate-phosphate buffer (pH 5.0) and 8 μL of the following ratios of 1% w/v substrate stock solution to diH2O: 4:4 citrus pectin (Sigma), 4:4 soy or potato rhamnogalacturonan (Megazyme), 4:4 demethylated polygalacturonic acid (PGA) from citrus (Sigma), 1:7 trigalacturonic acid (TGA) (Megazyme), 2:6 digalacturonic acid (DGA) (Megazyme), and 4:4 xylogalacturonan produced following the protocol published by Beldman et al.25. We used pectinases from Aspergillus niger (Sigma) as a positive control. The tubes were incubated for 16 hours at 40 °C, and the reactions were then spotted onto TLC plates (silica gel 60, 20 × 10 cm, Merck) and developed with 9:3:1:4 ethyl acetate:acetic acid:formic acid:water. As reference standards we used 2 μg each of galacturonic acid, DGA, and TGA, as well as xylose mono-, di-, and trimers and galactose as needed (Megazyme). The dried plates were sprayed with 0.2% (w/v) orcinol in 9:1 methanol/sulfuric acid, and subsequently heated with a heat gun until spots appeared.
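The final substrate concentrations implied by these volumes can be worked out with a short calculation. The sketch assumes that each ratio gives microlitres of stock to microlitres of water within the 8 μL portion (each pair sums to 8), and it counts only the buffer added as the 0.2 M stock, ignoring any buffer carried over with the enzyme; the variable and function names are illustrative.

```python
STOCK_PERCENT = 1.0   # substrate stock concentration, % w/v
ENZYME_UL = 10.0      # desalted enzyme
BUFFER_UL = 2.0       # 0.2 M citrate-phosphate buffer
MIX_UL = 8.0          # substrate stock plus diH2O
TOTAL_UL = ENZYME_UL + BUFFER_UL + MIX_UL  # 20 uL reaction volume

# Stock:water ratios from the text (assumed to be parts of the 8 uL portion).
RATIOS = {
    "citrus pectin": (4, 4),
    "rhamnogalacturonan": (4, 4),
    "PGA": (4, 4),
    "TGA": (1, 7),
    "DGA": (2, 6),
    "xylogalacturonan": (4, 4),
}


def final_substrate_percent(stock_parts, water_parts):
    """Final substrate concentration (% w/v) in the 20 uL reaction."""
    stock_ul = MIX_UL * stock_parts / (stock_parts + water_parts)
    return STOCK_PERCENT * stock_ul / TOTAL_UL


final_percent = {name: final_substrate_percent(*r) for name, r in RATIOS.items()}

# Contribution of the added 0.2 M (200 mM) buffer stock only.
added_buffer_mM = 200.0 * BUFFER_UL / TOTAL_UL
```

Under these assumptions the 4:4 substrates end up at 0.2% w/v, DGA at 0.1%, and TGA at 0.05%, with the added buffer stock contributing 20 mM to the reaction.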

Timing the origin of the phasmatodean pectinases

As we did for the six transcriptomes, we mined the transcriptomes of 38 phasmatodean species, which broadly represent all recognized major lineages32, and 16 representative polyneopteran outgroups (GenBank Accession No: PRJNA183205, Table S3) for target genes. Digestive enzymes in stick insects are expressed in the anterior midgut7, which starts approximately at the thorax/abdomen border8. For most of the studied phasmids, the transcriptome was not taken from the entire animal but rather from the head and parts of the thorax. Table S3 provides a detailed list of which body parts were used for each species. To ensure that the relevant midgut tissue had been included, we also mined for endogenous insect cellulase enzymes23 from GH family 9, which are also highly and differentially expressed in the anterior midgut7,22. Phasmids should have more than five genes from this group7. We therefore excluded from the present study all phasmatodean transcriptomes with five or fewer GH9 cellulase genes, because such a low count suggested that the transcriptome did not include midgut tissue. For all outgroup taxa, the entire animal was used to generate the transcriptomes, ensuring the presence of all digestive tissue. To date the origin of the horizontal gene transfer, we mapped the presence of endogenous pectinase genes on the dated phylogenies of Misof et al.20 and Bradler et al.12.
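The exclusion rule can be expressed as a simple filter. This is a sketch of our reading of the criterion: phasmatodean transcriptomes are kept only if they contain more than five GH9 cellulase genes, while outgroup transcriptomes, which came from whole animals, are not filtered. The taxon labels and gene counts below are hypothetical placeholders, not data from Table S3.

```python
GH9_THRESHOLD = 5  # transcriptomes with five or fewer GH9 genes are excluded


def keep_transcriptome(gh9_count, is_phasmid, threshold=GH9_THRESHOLD):
    """Return True if the transcriptome is retained for pectinase mining."""
    if not is_phasmid:
        # Outgroup transcriptomes were generated from entire animals,
        # so the midgut-coverage check is not applied to them here.
        return True
    return gh9_count > threshold


# Hypothetical records: (label, GH9 gene count, is_phasmid).
records = [
    ("phasmid_A", 9, True),
    ("phasmid_B", 5, True),
    ("phasmid_C", 2, True),
    ("outgroup_X", 1, False),
]

retained = [label for label, count, phasmid in records
            if keep_transcriptome(count, phasmid)]
```

In this toy set, phasmid_B is dropped because exactly five genes falls on the excluded side of the threshold.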

Ethics

All methods were carried out in accordance with local guidelines for animal research. All experimental protocols were approved by the Max Planck Institute for Chemical Ecology in accordance with these guidelines.

Data accessibility

Phasmatodea pectinase sequences are available under GenBank Accession Numbers KT921897-KT921989. 1KITE RNA-Seq data (BioProject PRJNA183205) are available from the Consortium upon request, with the Accession Numbers for published individual transcriptomes listed in Table S3.