

Anyi Hu1, Jibing He1, Kung-Hui Chu, Chang-Ping Yu1*

1. Key Lab of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, Xiamen 361021, China

2. College of Earth and Environment, Anhui University of Science and Technology, Huainan, 232001, China

3. Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, USA

*Correspondence to: Chang-Ping Yu, cpyu@iue.ac.cn, Key Lab of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, Xiamen 361021, P. R.





Abstract

Sphingomonas strain KC8 is known for its ability to utilize 17β-estradiol, a natural estrogen and an environmental endocrine-disrupting compound, as the sole carbon and energy source. Here, we report the draft genome sequence of strain KC8 (4,074,265 bp, with a GC content of 63.7%) and major findings from its annotation.

Main text: Estrogens are among the environmental endocrine-disrupting compounds of greatest concern because exposure to estrogens or estrogen-like chemicals is known to cause adverse health effects in wildlife (11, 13). As biodegradation is considered the major removal mechanism in both man-made and natural environments, a better understanding of microbial transformation of estrogens is warranted (3). Among known estrogen-degrading isolates, Sphingomonas strain KC8 is of particular interest because it can degrade 17β-estradiol into non-estrogenic end products (14). However, the degradation mechanism remains largely unclear. Here, we present the draft genome sequence of strain KC8. To our knowledge, this is the first genome report for an estrogen-degrading bacterium.

The genome of KC8 was sequenced by a whole-genome shotgun strategy using Roche 454 GS-FLX Titanium pyrosequencing technology. A total of 217,810 reads comprising 85,408,792 bp were produced, providing about 21-fold coverage of the genome. The reads were assembled in silico using Newbler Assembler 2.3 (Roche), resulting in 70 contigs (>1,000 bp in size) with an N50 length of 142,404 bp. Protein-coding genes were predicted using Glimmer 3.02 (4), while tRNAscan-SE (9) and RNAmmer (8) were used to identify tRNA and rRNA genes, respectively. The genome sequence was also uploaded to the Rapid Annotation using Subsystem Technology (RAST) server (1) to check the annotated sequences. The functions of the predicted protein-coding genes were then annotated through comparisons with the NCBI-NR (2), COG (12), and KEGG (6) databases.
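The coverage and N50 figures quoted above follow from simple arithmetic, which the minimal Python sketch below illustrates. It assumes that coverage is taken as total sequenced bases divided by the draft assembly length, and that N50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the assembly. The function names and the toy contig lengths are hypothetical and are not taken from the KC8 assembly.

```python
def fold_coverage(total_read_bases: int, genome_size: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome length."""
    return total_read_bases / genome_size


def n50(contig_lengths: list[int]) -> int:
    """Length of the contig at which the running sum of contig lengths,
    taken from longest to shortest, first reaches half of the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Figures reported in the text.
READS = 217_810
READ_BASES = 85_408_792
GENOME_SIZE = 4_074_265  # draft assembly length, used here as the genome size

coverage = fold_coverage(READ_BASES, GENOME_SIZE)
mean_read_length = READ_BASES / READS
print(f"coverage ~ {coverage:.1f}x, mean read length ~ {mean_read_length:.0f} bp")

# Toy contig lengths (made up for illustration, not the KC8 contigs).
toy_contigs = [500, 400, 300, 200, 100]
print("toy N50:", n50(toy_contigs))
```

With the reported numbers, the ratio comes to roughly 21, in line with the stated coverage, and the implied mean read length is a little under 400 bp.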

The KC8 draft genome sequence has a total of 4,074,265 bp with an average GC content of 63.7%. It contains 3,950 predicted coding sequences (CDSs), one 16S-23S-5S operon, and 46 tRNAs. Using COG functional assignment, the majority of predicted proteins (89.4%) could be classified into 22 COG categories. According to the subsystem-based annotation generated by RAST, strain KC8 has 338 subsystems. The four most abundant subsystems are related to amino acids and derivatives (number of CDSs, n = 328), carbohydrates (n = 264), protein metabolism (n = 213), and fatty acids, lipids, and isoprenoids (n = 199). In addition, a large number of CDSs are related to resistance to antibiotics and toxic compounds (n = 101), stress response (n = 125), and motility and chemotaxis (n = 128). These findings suggest that strain KC8 has a highly diverse catabolic capacity and can adapt to and/or survive in different environments. Consistent with the proposed estrogen and testosterone degradation pathways (5, 7, 10), several genes encoding enzymes putatively involved in estrogen degradation, such as hydroxysteroid dehydrogenase, 3-ketosteroid-delta1-dehydrogenase, Rieske dioxygenase, and catechol 2,3-dioxygenase, were also found in the genome of KC8. Further studies are needed to clone these genes and confirm their functions. In addition, a more detailed analysis of this genome and comparative genome analysis with other PAH-degrading Sphingomonas members should help reveal the unique biochemical and molecular characteristics of this strain.

Nucleotide sequence accession number. The draft genome sequence of Sphingomonas strain KC8 has been deposited at GenBank under accession number AFMP01000000.
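Summary statistics such as the total assembly length and GC content can be recomputed from a FASTA file of the deposited contigs. The sketch below is one minimal way to do so in plain Python and is not the procedure used by the authors. It assumes the GC percentage is calculated over all bases in the file (including any ambiguous bases), and the function names and toy sequences are hypothetical.

```python
import io
from typing import Iterator, TextIO


def read_fasta(handle: TextIO) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(handle: TextIO) -> tuple[int, float]:
    """Return (total length in bp, GC percentage) over all contigs."""
    total = gc = 0
    for _, seq in read_fasta(handle):
        total += len(seq)
        gc += seq.count("G") + seq.count("C")
    if total == 0:
        raise ValueError("no sequence data found")
    return total, 100.0 * gc / total


# Toy input (made up); for real data, open the downloaded FASTA file instead.
toy = io.StringIO(">contig1\nGGCC\nAT\n>contig2\nGCAT\n")
length, gc_percent = assembly_stats(toy)
print(length, f"{gc_percent:.1f}%")
```

For real data, the `io.StringIO` object would be replaced by an open file handle on the downloaded contig file.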



Acknowledgements

We thank the members of the Chinese National Human Genome Center at Shanghai for their support in sequencing and assembling the genome. This work was supported by the Hundred Talents Program of the Chinese Academy of Sciences, the Special Program for Key Basic Research of the Ministry of Science and Technology, China (2010CB434802), the Science and Technology Planning Project of Xiamen, China (3502Z20102017), and the CAS/SAFEA International Partnership Program for Creative Research Teams (KZCX2-YW-T08).



References

1. Aziz, R. K., D. Bartels, A. A. Best, M. DeJongh, T. Disz, R. A. Edwards, K. Formsma, S. Gerdes, E. M. Glass, M. Kubal, F. Meyer, G. J. Olsen, R. Olson, A. L. Osterman, R. A. Overbeek, L. K. McNeil, D. Paarmann, T. Paczian, B. Parrello, G. D. Pusch, C. Reich, R. Stevens, O. Vassieva, V. Vonstein, A. Wilke, and O. Zagnitko. 2008. The RAST server: rapid annotations using subsystems technology. BMC Genomics 9:75.

2. Benson, D. A., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler. 2008. GenBank. Nucleic Acids Res 36:D25-D30.

3. Combalbert, S., and G. Hernandez-Raquet. 2010. Occurrence, fate, and biodegradation of estrogens in sewage and manure. Appl Microbiol Biot 86:1671-1692.

4. Delcher, A. L., K. A. Bratke, E. C. Powers, and S. L. Salzberg. 2007. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23:673-679.

5. Horinouchi, M., T. Hayashi, and T. Kudo. 2004. The genes encoding the hydroxylase of 3-hydroxy-9,10-secoandrosta-1,3,5(10)-triene-9,17-dione in steroid degradation in Comamonas testosteroni TA441. J Steroid Biochem 92:143-154.

6. Kanehisa, M., M. Araki, S. Goto, M. Hattori, M. Hirakawa, M. Itoh, T. Katayama, S. Kawashima, S. Okuda, T. Tokimatsu, and Y. Yamanishi. 2008. KEGG for linking genomes to life and the environment. Nucleic Acids Res 36:D480-D484.

7. Kurisu, F., M. Ogura, S. Saitoh, A. Yamazoe, and O. Yagi. 2010. Degradation of natural estrogen and identification of the metabolites produced by soil isolates of Rhodococcus sp and Sphingomonas sp. J Biosci Bioeng 109:576-582.

8. Lagesen, K., P. Hallin, E. A. Rodland, H. H. Staerfeldt, T. Rognes, and D. W. Ussery. 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 35:3100-3108.

9. Lowe, T. M., and S. R. Eddy. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25:955-964.

10. Ren, H. Y., S. L. Ji, N. U. D. Ahmad, W. Dao, and C. W. Cui. 2007. Degradation characteristics and metabolic pathway of 17α-ethynylestradiol by Sphingobacterium sp JCR5. Chemosphere 66:340-346.

11. Sumpter, J. P., and A. C. Johnson. 2008. 10th Anniversary perspective: reflections on endocrine disruption in the aquatic environment: from known knowns to unknown unknowns (and many things in between). J Environ Monitor 10:1476-1485.

12. Tatusov, R. L., E. V. Koonin, and D. J. Lipman. 1997. A genomic perspective on protein families. Science 278:631.

13. Tyler, C. R., S. Jobling, and J. P. Sumpter. 1998. Endocrine disruption in wildlife: a critical review of the evidence. Crit Rev Toxicol 28:319-361.

14. Yu, C. P., H. Roh, and K. H. Chu. 2007. 17β-estradiol-degrading bacteria isolated from activated sludge. Environ Sci Technol 41:486-492.







