Source material

Animals (Pleurobrachia bachei, Euplokamis dunlapae, Dryodora glandiformis, Beroe abyssicola, Bolinopsis infundibulum and Mertensiid) were collected at Friday Harbour Laboratories (Pacific North-Western Coast of USA) and maintained in running seawater for up to 2 weeks. Other species were collected at the Atlantic coast of Florida and around Woods Hole, Massachusetts (Pleurobrachia pileus, Pleurobrachia sp., Mnemiopsis leidyi) as well as central Pacific (Palau, Hawaii, Coeloplana astericola, Vallicula multiformis). Animals were anaesthetized in 60% (volume/body weight) isotonic MgCl 2 (337 mM). Specific tissues were surgically removed with sterile fine forceps and scissors and processed for DNA/RNA isolations as well as metabolomics or pharmacological/electrophysiological tests. Whole animals were used for all in situ hybridization and immunohistochemical tests as described35. Genomic DNA (gDNA) was isolated using Genomic-tip (QIAGEN) and total RNA was extracted using RNAqueous-Micro (Ambion/Life Technology) or RNAqueous according to manufacturers’ recommendations. Quality and quantity of gDNA was analysed on a Qubit2.0 Fluorometer (Life Technologies) and for RNA we used a 2100 Bioanalyzer (Agilent Technologies). For all details see Supplementary Methods sections 1.1–1.3.

Genome sequencing

All genomic sequence data for de novo assembly were generated on Roche 454 Titanium and Illumina Genome Analyzer IIx, HiSeq2000 and MiSeq instruments using both shotgun pair-end and mate-pair sequencing libraries with 3–9 kb inserts as summarized in Supplementary Tables 1 and 2. Shotgun sequencing was performed from a single individual. Owing to a limited amount of starting gDNA, mate pair libraries were constructed from 10–12 individuals. In total, the genome sequencing is composed of 132,015,600,107 bp or ∼132 Gb of data, which corresponds to 733–825× physical coverage of the Pleurobrachia genome (the size of the P. bachei genome is estimated to be ∼160–180 Mb); see Supplementary Methods sections 1.4–2.1.2.

Genome assemblies

The Pleurobrachia bachei draft genome was assembled using a custom approach designed to leverage the individual strengths of three popular de novo assembly packages and strategies: Velvet36, SOAPdenovo37, and pseudo-454 hybrid assembly with ABySS38. First, using filtered and corrected data, we performed individual assemblies from 454 and Illumina reads by the Newbler (Roche, Inc.) software. Then the merged/hybrid assembly was achieved using three individual assemblies (SOAPdenovo, Velvet and ABySS/Newbler as described in Supplementary Methods 2.2). Three gene model predictions were performed by Augustus39 and Fgenesh predictions with the Softberry Inc. Fgenesh++ pipeline40,41 to incorporate information from full-length cDNA alignments and similar proteins from the eukaryotic section of the NCBI NR database42. After initial gene predictions in each of the three sets of genomic scaffolds, we screened each set of gene models for internal redundancy with the BLASTP program from NCBI’s BLAST+ software suite43. A model was considered redundant if it: had 90% identity to other model; the alignment between the two models had a bit score of at least 100; and the model was shorter than the other model.

Scaffolds producing these gene models were pooled and then screened for prokaryotic contamination using UCSC’s BLAT software package44 to produce the draft genome assembly version 1.0 (statistics can be found in Supplementary Table 5 and Supplementary Methods 2).

Genome annotation

For annotation, gene models were uploaded to the In-VIGO BLAST interface, a blastp alignment of gene models was performed against the entirety of NCBI’s non-redundant protein database and the Swiss-Prot protein database, and subsequently annotated in terms of Gene Ontology and KEGG pathways as well as Pfam domain identification. Transposable elements (TEs) were identified using not only WU-BLAST and its implementation in CENSOR but also databases for all known classes, superfamilies and clades of TEs described in the literature and/or collected in Repbase45. Detected sequences have been clustered based on their pairwise identities by using BLASTclust. All autonomous non-LTR retrotransposons have been classified based on RTclass1 (ref. 46). To merge partially predicted, non-redundant gene models with assembled transcriptome data, a custom Java tool was developed. This Java tool extended partial gene model predictions based on using transcriptome sequences to bridge 5′ and 3′ fragments of partially predicted genes. Using this Java tool, analysis of alignments of non-redundant gene models to assembled Pleurobrachia transcriptomes resulted to 19,523 (Supplementary Table 30) gene models. These gene models were used to also identify their possible homologues in assembled transcriptomes from 10 other ctenophore species sequenced (Supplementary Tables 10 and 11). All genomic sequences were submitted to NCBI on SRA accession number Project SRP001155 (Supplementary Methods 3.1–3.2).

Transcriptome sequencing and annotation

Three sequencing technology platforms were used for transcriptome profiling (RNA-seq): Roche 454 Titanium, Illumina HiSeq2000 and Ion Proton/PGM (Ion Torrent, Life Technologies). RNA-seq was performed from all major embryonic and developmental stages (1 cell, 2 cells, 4 cells, 8 cells, 16 cells, 32 cells, 64 cells, early and later gastrula, 1 day and 3 day larvae), major adult tissues and organs (combs, mouth, tentacles, stomach, the aboral organ, body walls), and whole body of Pleurobrachia bachei. We developed a reduced representation sequencing protocol for the 454 and Ion Torrent sequencing platforms that can detect low abundance transcripts47. The method reduces the amount of sequencing and gives more accurate quantification and additional details of the procedure are reported elsewhere47,48. In summary, we have generated 499,699,347 reads or ∼47.9 Gb to achieve approximately 2,000× coverage of the Pleurobrachia transcriptome.

In addition, Illumina HiSeq sequencing was also performed with RNA extracted from the following ctenophore species: Euplokamis dunlapae, Coeloplana astericola, Vallicula multiformis, Pleurobrachia pileus, Pleurobrachia sp. (collected from the Middle Atlantic and later identified as a subspecies of P. pileus), Dryodora glandiformis, Beroe abyssicola, Mnemiopsis leidyi, Bolinopsis infundibulum and an undescribed species which belongs to the family Mertensiidae (Supplementary Table 3). Each sequencing project was individually assembled using the Trinity de novo assembly package49 and in selected cases using MIRA. Reads from developmental stages were also assembled using the CLCBio Genomics Workbench. Before each assembly, reads were quality trimmed and had adaptor contamination removed with cutadapt50. Full summaries of the transcriptome assemblies are presented in Supplementary Tables 4 and 10. Each transcriptome was mapped to the Pleurobrachia genome, and aligned to both NCBI’s non-redundant protein database (NR) and the UniProtKB/Swiss-Prot (SP) protein database. Gene Ontology51 and Kyoto Encyclopedia of Genes and Genomes52,53 (KEGG) terms were associated with each transcript. By first translating transcripts in all six reading frames, Pfam/SMART domains54 were assigned to each reference transcriptome.

Each reference transcriptome and its full set of annotation and expression data was uploaded to our transcriptome database http://moroz.hpc.ufl.edu/slimebase2/browse.php for downstream analysis and visualization55,56. The database is integrated with UCSC type genome browser. Via the genome project homepage (http://neurobase.rc.ufl.edu/Pleurobrachia) all data sets have direct download options. Quantification of gene expression profiling was performed on all transcriptional data as described in Supplementary Methods 4.4. Hierarchical clustering was performed by Spotfire agglomerative algorithm. All primary transcriptome data was submitted to NCBI on SRA accession number Project SRP000992. (See Supplementary Methods 4.1–4.2.3 for details.)

Phylogenetic analyses

To reconstruct basal metazoan phylogeny (see controversies in10,11,12,13,14,15,57), we conducted two sets of phylogenomic analysis using tools described elsewhere58. All analyses included new data from Pleurobrachia bachei and the sponges Sycon (Calcarea) and Aphrocallistes (Hexactinellida). For the first set of analyses, Ctenophora was represented by two species of Pleurobrachia and Mnemiopsis leidyi. Initial analyses included the taxa in Supplementary Table 12. For a subsequent analysis, sampling within Ctenophora was expanded to include ten additional taxa, each represented by a relatively deeply sequenced Illumina transcriptome (Supplementary Table 13). In order to reduce noise in the phylogenetic signal, we used strict criteria to exclude paralogues, highly derived sequences, mistranslated sequence regions, and ambiguously aligned positions in sequence alignments. Analyses were conducted in RAxML 7.2.7 (refs 59) using maximum likelihood (ML) with the CAT +WAG + F model. Topological robustness (that is, nodal support) for all ML analyses was assessed with 100 replicates of nonparametric bootstrapping. Details of phylogenomic analyses are presented in Supplementary Methods 7. Shimodaira–Hasegawa test17 was implemented in RAxML with the PROTGAMMAWAGF model17.

In order to examine evolution of single genes or gene families, alignments were performed with either ClustalX260,61,62 or Muscle63 then, if appropriate, either trimmed manually or trimmed using GBlocks64 to exclude ambiguously aligned positions. Once alignments were obtained, gene trees were reconstructed in MEGA 565 using ML with the Whelan and Goldman (WAG) model. The bootstrap consensus tree was inferred from 100 replicates. All positions containing gaps and missing data were eliminated. Pfam composition54, Gene Ontology51 and KEGG52,53 were used to further validate P. bachei orthologues. Analyses of gene gain and gene loss were performed using custom scripts as described elsewhere66 and in Supplementary Methods 7.

Analysis of DNA methylation

ELIZA-based colorimetric assays (Epigenteck) were performed to quantify both global 5-mC and 5-hmC methylation in the P. bachei genome. A total of 6 individual P. bachei and three rat (positive control) were used (Supplementary Methods 1.2). Three biological and technical replicates were performed for every sample. Absolute quantification of 5-mC and 2-hmC were determined and date is reported as a mean ± s.e.m. (Supplementary Methods 8).

Molecular cloning, in situ hybridization and immunohistochemistry

Methods were similar as reported elsewhere35,47,48,67 with some modifications (Supplementary Methods 9–11).

Scanning electron microscopy

Animals were fixed in 2.5% glutaraldehyde in 0.2 M phosphate-buffered saline (pH 7.6) for 3–4 h at room temperature, and washed. For secondary fixation, we used 2% osmium tetroxide in 1.25% sodium bicarbonate for 2–3 h at room temperature. After dehydration in ethanol, samples were placed for drying in Samdri-790 Critical Point Drying. After drying the samples were coated on Sputter Coater. SEM observations and recordings were done on NeuScope JCM-5000 microscope (Supplementary Methods 12).

Electrophysiological methods, calcium imaging and pharmacological assays

Patch electrodes for extracellular and whole-cell recordings were pulled from borosilicate capillary (P-87, Sutter Instruments). All currents were recorded using an Axopatch or 200B amplifier controlled by a Digidata 1322A and pClamp 9.2. Action potentials (APs, spikes) were recorded in track mode using cell-attached loose-patch configuration. Whole-cell currents were recorded in voltage-clamp mode at a holding potential of −70 mV. Neurotransmitter candidate (see Supplementary Method 15) application for both extracellular AP and whole cell recordings were performed with a rapid solution changer, RSC-160 (Bio-Logic-Science Instruments, France). Data were analysed with Clampfit 9.0 (Molecular Devices) in combination with SigmaPlot 10.0. Videomicroscopy and time-lapse series were acquired with QImaging EXi CCD camera using DIC mode of Nikon Eclipse 2000 inverted microscope. Calcium imaging was performed on isolated ctenophore muscle cells using an Olympus IX-71inverted microscope equipped with a cooled CCD camera (ORCA R2, Hamamatsu). Cells were injected with calcium-sensitive probe (Fluo-4, ∼5 μM) through a patch pipette. Fluorescence imaging was performed under the control of Imaging Workbench 6 software. Stored time series image stacks were analysed off-line using Imaging Workbench 6, Clampfit 10.3, SigmaPlot 10/11 or exported as TIFF files into ImageJ 1.42. Pharmacological tests and behavioural assays with video recording were performed on intact animals in 5–40 l aquaria or on semi-intact preparations in a Sylgard-coated Petri dish with free cilia beating and muscle contractions. To monitor and quantify cilia movements we used glass microelectrodes filled with 2 M potassium acetate with resistances of 5–20 MΩ with electrical signals recorded by A-M System amplifiers (Neuroprobe 1600) and Gould Recorder (WindoGraf 980).

Determination of the presence of classical neurotransmitters by capillary electrophoresis (CE)

Two CE separation techniques were used to analyse tissue extracts for the presence of a number of neurotransmitters (Supplementary Tables 22 and 23; Supplementary Methods 17). While both methods used CE separations, complimentary detection methods, laser-induced native fluorescence (LINF)68 and electrospray ionization mass spectrometry (ESI-MS)69,70 were used to ensure broad coverage and low detection limits for the specific analytes of interest. Whole bodies of small animals as well as individual organs and tissues were removed, rinsed with ultrapure water and analytes were extracted using 49.5/49.5/1, methanol (LC-MS grade)/water/glacial acetic acid (99%) by volume, homogenized, centrifuged and supernatant was removed and frozen at −80 °C until analysis. The CE-LINF instrument used ultraviolet excitation at 264 nm and the native fluorescence emission was collected and recorded using a UV-enhanced CCD array (Spec-10; 2KBUV/LN; Princeton Instruments). CE separations were performed by hydrodynamic injection of 10 nl of sample and using 25 mM citric acid (pH 2.5, applied voltage +30 kV) or 50 mM borate (pH 9.5, applied voltage +21 kV). Analytes were identified based on comparison of both the migration time and fluorescence spectrum to that of standard mixtures of analytes. CE-ESI-MS analysis was performed using a Bruker Microtof or a Maxis (Bruker Daltonics) mass spectrometer for detection. All separations were performed using 1% formic acid in water as the electrolyte and applied voltage of +30 kV. Sheath liquid was 0.1% formic acid in 50/50 methanol/water. Samples were hydrodynamically injected for a total volume of ∼6 nl. Mass spectra were collected and recorded at a rate of 2 Hz with calibration performed using sodium formate clusters. Analytes were identified based on comparison of both the CE migration time and mass match to that of standard mixtures of analytes.