Significance Marine oxygen-deficient zones (ODZs) are naturally occurring midlayer oxygen-poor regions of the ocean, sandwiched between oxygenated surface and deep layers. In the absence of oxygen, microorganisms in ODZs use other compounds, such as oxidized forms of nitrogen and sulfur, as terminal electron acceptors. We identified the presence and expression of genes for both arsenic reduction and oxidation in marine ODZs, suggesting the microbial community in these waters is also cycling arsenic for respiratory gain. The existence of an arsenic respiratory cycle in pelagic waters suggests microbial arsenic metabolisms may be underestimated in the modern ocean and were likely an even more significant contributor to biogeochemical cycles in the anoxic ancient oceans when arsenic concentrations were higher.

Abstract Microbial capacity to metabolize arsenic is ancient, arising in response to its pervasive presence in the environment, which was largely in the form of As(III) in the early anoxic ocean. Many biological arsenic transformations are aimed at mitigating toxicity; however, some microorganisms can respire compounds of this redox-sensitive element to reap energetic gains. In several modern anoxic marine systems concentrations of As(V) are higher relative to As(III) than what would be expected from the thermodynamic equilibrium, but the mechanism for this discrepancy has remained unknown. Here we present evidence of a complete respiratory arsenic cycle, consisting of dissimilatory As(V) reduction and chemoautotrophic As(III) oxidation, in the pelagic ocean. We identified the presence of genes encoding both subunits of the respiratory arsenite oxidase AioA and the dissimilatory arsenate reductase ArrA in the Eastern Tropical North Pacific (ETNP) oxygen-deficient zone (ODZ). The presence of the dissimilatory arsenate reductase gene arrA was enriched on large particles (>30 um), similar to the forward bacterial dsrA gene of sulfate-reducing bacteria, which is involved in the cryptic cycling of sulfur in ODZs. Arsenic respiratory genes were expressed in metatranscriptomic libraries from the ETNP and the Eastern Tropical South Pacific (ETSP) ODZ, indicating arsenotrophy is a metabolic pathway actively utilized in anoxic marine water columns. Together these results suggest arsenic-based metabolisms support organic matter production and impact nitrogen biogeochemical cycling in modern oceans. In early anoxic oceans, especially during periods of high marine arsenic concentrations, they may have played a much larger role.

For most of Earth’s history the oceans were characterized by a lack of available oxygen and as a result ancient microbes employed a variety of alternate electron acceptors (1). Proxies for these early environments are found in marine pelagic oxygen-deficient zones (ODZs) where redox gradients through the water column impact the associated microbial communities (2, 3). Naturally occurring marine ODZs are created by a combination of organic matter respiration and slow midlayer ventilation (4) and are functionally anoxic (<10 nmol L−1) (2, 5). Denitrifying metabolisms, which reduce nitrate and nitrite, are prominent (3) and ultimately result in the production of N 2 O and N 2 gas and thus loss of N from the marine system. There is also evidence for a cryptic sulfur cycle in marine ODZs, where dissimilatory sulfate reduction produces sulfide, which is consumed by autotrophic sulfur oxidizers (6).

The oxidized inorganic arsenic compound arsenate is the thermodynamically stable form in oxygenated aqueous environments [As(V) as H 2 AsO 4 − and HAsO 4 2−] and arsenite is the thermodynamically stable form in anoxic environments [As(III) as H 3 AsO 3 0 and H 2 AsO 3 −] (7). Microbes utilizing the arsenic redox potential for energetic gains have been studied and isolated in a range of systems with moderate to high arsenic concentrations, including polluted and unpolluted soils, sediments, hot springs, lakes, wastewater, and mine drainage (8). In some arsenic-rich aquatic systems, like the alkaline Mono Lake in California (USA), arsenic supports microbial communities through a cycle of dissimilatory heterotrophic reduction of As(V) and autotrophic oxidation of As(III), analogous to the cryptic sulfur cycle in ODZs (7). A complete metabolic arsenic cycle is ancient, existing since the Archaean (9, 10), as suggested by fossilized arsenic-rich organic globules from 2.72 billion years ago (11). Arsenic concentrations have fluctuated over geologic time, with early anoxic oceans likely experiencing periods of greater arsenic abundance than today (12, 13).

While the modern ocean is not considered an arsenic-loaded system, inorganic arsenic is ubiquitous in marine waters, ranging from 15 to 20 nmol/L in the open ocean (14, 15). Measurements of arsenic in the anoxic waters of the Eastern Tropical South Pacific (16), Black Sea (17), and the anoxic fjord Saanich Inlet, British Columbia (18) show thermodynamic disequilibrium of the arsenic species in the water column which may be due, in part, to the microbial biotransformations of arsenic species. In radioisotope arsenic tracer experiments carried out on the anoxic waters from Saanich Inlet, rates of arsenite oxidation greatly diminished when antibiotics were added to the seawater (18), suggesting microbially mediated impacts on arsenic transformations. We hypothesized that the apparent redox disequilibrium of arsenic species in anoxic marine waters may be due to the presence of arsenotrophic microbes.

Methods Samples were collected during a research cruise to the ETNP in April 2012 aboard the research vessel (R/V) Thompson (cruise TN278) using 10-L Niskin bottles in a 24-bottle conductivity, temperature, and depth (CTD) water sampling rosette. The top of the ODZ was determined to be at 105 m based upon oxygen concentrations determined by switchable trace oxygen (STOX) sensor (5). Oxygen and nutrient measurements were collected as previously described (20, 42). DNA samples were obtained from station 136 (106.543° W 17.043° N) in the anoxic zone at multiple depths (100, 110, 120, 160, 180 m). Two liters of Niskin water was vacuum filtered onto a 0.2-µm SUPOR filter. At the offshore multiday station BB2/141 (107.148° W 16.527° N), approximately four liters of water from 120 m were first filtered through a >30-µm nylon filter and the prefiltered (<30 µm) water was then filtered onto 0.2-µm Supor membrane filters (PALL Corporation). Filters were immediately frozen in a −80 °C freezer and transported on dry ice from the ship to the University of Washington where they were maintained at −80 °C. DNA library preparation and sequencing with Illumina HiSeq 2500 were conducted in 2014 and previously described in another publication (20). DNA samples were extracted using freeze thaw followed by incubation with lysozyme and proteinase K and phenol/chloroform extraction. A Rubicon THRUPLEX kit was used for library prep using 50 ng of DNA per sample. Two libraries (ETNP_100m & ETNP_110m) were sequenced on a HiSeq 2500 in rapid mode (∼22 million 150-bp paired-end read pairs per sample) at Michigan State University. The other five libraries (ETNP_120m, ETNP_160m, ETNP_180m, ETNP_BB2_120m_free, and ETNP_BB2_120m_particle) were sequenced on a HiSeq 2500 high-output mode (∼55–150 million 125-bp paired-end read pairs per sample) at the University of Utah. Sequences were quality checked and trimmed using Trimmomatic (43). Metagenomic reads from each depth sample were first assembled separately, processed with diginorm (44) to normalize coverage, and then de novo assembled with the VELVET assembler (45). Contigs assembled from individual depths as previously described (20) were functionally annotated using the prokka annotation pipeline (46) with reads and assembled contigs deposited at NCBI GenBank under bioproject PRJNA350692 v.1 assemblies (SI Appendix, Table S4 lists accessions for each assembly). Prokka alone was insufficient to identify DMSO reductase family-related sequences (aioA, aioA-like, arrA, and arxA). To identify these arsenotrophy-related enzymes, blast databases of all assembled contigs were created using blast version 2.2.28 (47, 48) and queried with tblastn using known reference queries. Contigs which had hits to arsenotrophy-related sequences with e values ≤ 1 × 10−50 were parsed out, and submitted to MetaGeneMark for gene prediction (49, 50). Potential genes that overlapped with the region of best blast hit were then added to the DMSO reductase tree for further identification, with full-length sequences that grouped with arsenotrophy-related clades depicted on the DMSO reductase tree (SI Appendix, Fig. S4). To assemble the longest possible arsenotrophy-related contigs, a targeted assembly approach was used to combine metagenomes from all anoxic samples. Partial and full arsenotrophy-related sequences identified from assembled contigs from individual depths and reference regions from known arsenotrophy-related sequences (gene sequence with 10-kb regions up- and downstream) were used as queries for sequence read recruitment from each individual depth metagenome using the program FR-HIT (51). Reads recruited with FR-HIT were then assembled with the de novo assembler IDBA-UD (52) optimized for assembling data of uneven depth coverage. Arsenotrophy-related genes from these contigs were identified as above. In addition, all genes called on contigs identified as containing full-length arsenotrophy-related enzymes were annotated with InterProScan (53) with general taxonomic identity on a gene-by-gene basis being inferred at the bacterial class level through a blastn best hit to the nr/nt database. The assembled arrA amino acid sequence was then pairwise aligned with that from Shewanella ANA-3 using Jalview 2 (24, 54). Gene trees were constructed with amino acid translated sequences from metagenomic assemblies and known reference sequences aligned with the program MUSCLE version 3.6 (55). Phylogenetic trees were then constructed with the program RAxML version 8.0.23 (56). A maximum-likelihood tree was inferred from the best of 20 trees using the amino acid substitution matrix model which resulted in the best likelihood score during a trial run (DMSO: BLOSUM62F; DsrA, AioB, and ArrB: WAGF) along with empirical character frequencies and a gamma model of rate heterogeneity using the RAxML estimated alpha value. Bootstrap analyses were conducted on all trees at n = 100. The DMSO reductase family tree and DsrA tree were used as reference trees for phylogenetic placement of metagenomic and metatranscriptomic reads. Publicly available pelagic marine metagenomic assemblies in the NCBI WGS database were searched for the presence of arrA, aioA, and aioA-like. Multiple amino acid query sequences spanning the taxonomic range of these genes were used to identify potential genes in WGS using the SRA toolkit tool tblastn_vdb to remotely blast search assembled metagenomes annotated as marine metagenomes from seawater. Hits receiving an e value ≤ 1 × 10−80 were selected and their contig sequences obtained from WGS using Batch Entrez. Contigs were then submitted to MetaGeneMark for gene prediction (49, 50). Potential full-length arsenotrophy genes that overlapped with the region of best blast hit were added to the DMSO reductase tree for further identification. Recruitment and placement of metagenomic reads from the samples at 120-m depth from station BB2/141 comparing the >30-µm (particulate) and <30-µm (free-living) fractions was conducted. Reads were combined into a single blast database using blast 2.2.28 (47, 48) and queried with tblastn (e-value cutoff 1 × 10−5) using full-length known reference sequences, including full-length identified environmental arsenotrophy sequences assembled in this work. Translation of blast recruited reads which were at least 33 amino acids in length after quality trimming were then identified using a phylogenetically informed placement approach by comparison with known reference sequences (20, 37, 57⇓–59). To obtain the abundance of these reads relative to the overall prokaryotic community in the two size fractions, the length-normalized number of identified read pairs for each gene was compared with the length-normalized abundance of the single-copy core gene RNA polymerase (rpoB) in the sequenced prokaryotic community (20, 37). A similar read recruitment and phylogenetic placement method was used to identify reads from ODZ metatranscriptomes (37, 57⇓–59). Publicly available metatranscriptomes from oxygen deficient zones in the ETNP (39), ETSP (41), and within a sulfide plume in the ETSP (40) were downloaded locally. Reads were also recruited for identification of bacterial dsrA for comparison.

Acknowledgments We thank John Baross and Rika Anderson for helpful discussions and feedback on this project. We also thank the chief scientists of the research cruise, Al Devol and Bess Ward, as well as the captain and crew of the R/V Thomas G. Thompson. This work was supported through a NASA Earth and Space Sciences Graduate Research Fellowship to J.K.S. and National Science Foundation Grant OCE-1138368 (to G.R.).

Footnotes Author contributions: J.K.S. and G.R. designed research; J.K.S., C.A.F., and C.M. performed research; J.K.S. analyzed data; and J.K.S. and G.R. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The sequence reported in this paper has been deposited in the GenBank database (GenBank Bio Project PRJNA350692).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1818349116/-/DCSupplemental.