Zebrafish has emerged as a valuable model to study live cellular processes15,16 due to its sequenced genome, transparency through early adulthood and amenability to high-throughput screens17,18. To determine if zebrafish is a useful in vivo model for the study of mucus physiology, we searched its genome for genes with homology to secreted mucins from other vertebrates. We focused on polymeric secreted mucins, the major gel-forming building blocks of the mucus barrier19,20,21. One characteristic of gel-forming mucins is the co-occurrence of two protein domains: the Proline, Threonine and Serine (PTS) domain, which is the main site of O-linked glycosylation on the protein, and the Von Willebrand Factor D (VWD) domain, which contributes to the polymerization of mucins. Using previously characterized computational tools22, we identified five putative mucin genes in zebrafish that contain coding regions for both PTS and VWD domains (Supplementary Fig. S1). Based on the subsequent analysis (see below), we named the five putative mucin genes muc5.1 (Ensembl ID ENSDARG00000070331), muc5.2 (ENSDARG00000058556), muc5.3 (ENSDARG00000089847), muc2.1 (ENSDARG00000074142) and muc2.2 (ENSDARG00000078994).
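To illustrate how a PTS-rich region might be flagged in a protein sequence, the following Python sketch slides a fixed-size window along the sequence and reports stretches in which proline, threonine and serine together exceed a chosen fraction. This is not the published tool cited above: the function names, the window size of 100 residues and the threshold of 0.4 are assumptions made purely for illustration, and a real search would also require detection of VWD domains (for example with profile-based domain annotation).

```python
def pts_fraction(window: str) -> float:
    """Fraction of residues in the window that are Pro (P), Thr (T) or Ser (S)."""
    return sum(residue in "PTS" for residue in window) / len(window)


def find_pts_rich_regions(protein: str, window: int = 100, min_fraction: float = 0.4):
    """Return merged (start, end) spans, 0-based and end-exclusive.

    A span is reported wherever a sliding window of `window` residues has a
    combined P/T/S fraction of at least `min_fraction`. Both defaults are
    illustrative assumptions, not values taken from the study.
    """
    regions = []
    for start in range(0, len(protein) - window + 1):
        if pts_fraction(protein[start:start + window]) >= min_fraction:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous hit: extend that region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

Applied to a translated transcript, the returned spans would mark candidate PTS regions whose positions could then be compared with those of the annotated VWD domains.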

The organization of the protein domains in the identified zebrafish mucins is shown in Figure 1a. The better-studied mammalian polymerizing mucins MUC2, MUC5AC and MUC5B are characterized by an arrangement of four VWD domains with an extensive PTS domain between the third and fourth VWD domains. In addition, they contain CysD domains that are interspersed in the PTS domains and a cystine knot at the C terminus; both CysD domains and the cystine knot are involved in polymerization. The mammalian MUC6 mucin is similar in architecture but lacks the CysD domains as well as a fourth VWD domain at its C terminus23. Our data show that two identified members of the zebrafish Muc5-family, Muc5.1 and Muc5.2, contain the characteristic arrangement of four VWD domains and a PTS domain localized between the third and fourth VWD domains (Figure 1a, Supplementary Figs. S1, S2, S3). Muc5.3 is different in that it appears to contain only the first three VWD domains. For the members of the Muc2-family (Muc2.1 and Muc2.2) we had less sequence information available and therefore offer a more preliminary interpretation. For Muc2.1 we identified three VWD domains and a truncated PTS domain in the predicted N-terminal portion of the protein. For Muc2.2, only a short Ensembl transcript was available, which was the basis for the depicted C-terminal VWD domain (Fig. 1a). In addition, muc2.1 mRNA was used as a guide to predict the exon/intron structure of the muc2.2 gene from the available genomic sequence. From the resulting muc2.2 gene model, two further VWD domains were identified at the N-terminus of Muc2.2. In the genomic prediction we also found evidence for a PTS region (Supplementary Fig. S1). The PTS region was not included in the protein domain illustration because it was absent from the Ensembl transcript model at the time (Fig. 1a). However, should future updated transcript models include the sequence for the PTS, Muc2.2 would have the standard architecture of a MUC2-type mucin.
For further details regarding the genomic organization and protein homology of the zebrafish mucins, as well as information on additional mucin-like transcripts, the reader is referred to Supplementary Figs. S1 and S3 and Supplementary Table S1.

Figure 1 Identification of five polymeric secreted mucins in zebrafish. (a) Illustration of predicted mucin protein domain architectures. Two members of the Muc5 family (Muc5.1 and Muc5.2) are composed of three successive VWD domains, followed by a PTS domain and a fourth VWD domain at the C-terminus. This architecture is typical for mammalian gel-forming secreted mucins. The third Muc5 family member, Muc5.3, has a similar predicted domain composition but lacks the fourth VWD domain at the C-terminus. For the Muc2 family members, Muc2.1 and Muc2.2, regions of the protein sequence are missing as the current genome assembly is incomplete. The depicted Muc2.2 protein lacks a PTS domain because it was absent from the Ensembl transcript prediction, although such a domain was found at the genomic level (Supplementary Fig. S1). VWD: Von Willebrand Factor type D domain; PTS: Proline, Threonine and Serine domain; CysD and Cys-knot are cysteine-rich domains. All domain structures except the N-terminal portion of Muc2.2 were identified based on Ensembl transcripts. The sequences used for the construction of the depicted protein models are listed in Supplementary Add. S1. (b) Phylogenetic tree comparing zebrafish mucins with chicken and human polymeric secreted mucins, based on the N-terminal portions of the mucins containing the first three VWD domains. The numbers at the branches represent posterior probabilities. The tree shows that the identified mucins group with MUC5 and MUC2, but not with MUC6, from chicken and human. (c) Tissue distribution of the mucin transcripts as detected by RT-PCR. The muc5 family of mucins is expressed in respiratory organs (skin, gills, pharynx and esophagus). muc2.1 expression is detected in the digestive system, predominantly in the gut, which is typical for MUC2 mucins in mammals. muc2.2 expression is detected in reproductive organs.

Based on the N-terminal portions of the mucin proteins that contain the first three VWD domains we constructed phylogenetic trees (Fig. 1b: MrBayes, Supplementary Fig. S2: neighbor-joining tree). The results show that Muc5.1, Muc5.2 and Muc5.3 group with the human and chicken MUC5AC and MUC5B. Muc5.1 and Muc5.2 appear closely related to each other. Muc2.1 and Muc2.2 group with the human and chicken MUC2 mucins. Our data also suggest that none of the studied zebrafish genes are related to the vertebrate MUC6 mucin.

From the genomic structure we derived that the genes muc5.1, muc5.2 and muc2.2 are localized in a cluster on chromosome 25 (Supplementary Fig. S1). muc2.1 currently has an unassigned position in the genome, Zv9_NA774, with approximately two thirds of its length unknown. The muc5.3 gene, which is located on chromosome 7, has the gene tollip as its immediate neighbor. The occurrence of zebrafish mucin genes in a cluster, as well as the genomic synteny with the tollip gene, is reminiscent of the mucin gene organization in other vertebrates, including humans24.

To determine the tissue distribution of the putative zebrafish mucin gene transcripts, we performed RT-PCR on tissue isolated from adult zebrafish (Fig. 1c). Our data show that muc5.1 and muc5.2 are both expressed in the skin, the gills and the pharynx/esophagus, while muc5.3 expression appears to be restricted to the pharynx/esophagus. By in situ hybridization on adult zebrafish sections we were able to further specify the distribution of muc5.1 and muc5.2 to the pharynx and muc5.3 to the esophagus (Supplementary Fig. S4). In addition, by separating the gill lamellae from the gill arches we saw that only muc5.1 is expressed in the lamellar part (Supplementary Fig. S5) and hence appears to represent a bona fide respiratory mucin. muc2.1 is predominantly found in the gut, while muc2.2 is expressed in testes and ovaries (Fig. 1c). Together, the expression pattern of the zebrafish muc5 family appears reminiscent of the human MUC5 family, which is found in the respiratory and the upper digestive tracts25. Moreover, muc2.1 in zebrafish shares its tissue localization with human MUC2, the major mucin in the gut. The outlier is muc2.2, which is expressed in the reproductive organs in zebrafish; in humans, MUC5AC, MUC5B and MUC6 are found in the male urogenital tract26 and female endocervix. We also observed that expression of the mucin genes during zebrafish development correlates with the initiation of development of the respective organs in which mucins are found in the adult fish27,28 (Supplementary Fig. S6).

The above sequence information on mucin genes was used to create a fluorescent reporter of mucin activity in zebrafish. The complete open reading frames of secreted mucins are difficult to tag due to their large size. As a consequence, thus far only one full-length mucin, MUC5AC from the mouse, has been successfully fluorescently tagged and expressed under a constitutive promoter29. Our goal here was to generate a reporter in zebrafish that is expressed under the endogenous mucin promoter, which would enable the real-time tracking of mucin production in response to physiological changes. To achieve this, we excised from the BAC CH211-19808 9.8 kb of genomic sequence of muc5.1 that comprises 4.6 kb upstream and 5.2 kb downstream of the mucin start codon ATG and cloned it into the pBSII KS(+) vector. We aimed to express the Red Fluorescent Protein (RFP) in frame with the ATG and the muc5.1 secretory signaling sequence to enable secretion of the mucin reporter. We inserted a Tag-RFP targeting cassette 93 bp downstream of the mucin ATG, using Lambda-Red homologous recombination (Supplementary Fig. S7). In this construct, the RFP is terminated with a stop codon and hence does not include the mucin coding sequence beyond the signaling sequence.
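The requirement that the RFP be in frame with the ATG can be checked with simple arithmetic: an insertion point preserves the reading frame only if its distance from the first base of the start codon is a multiple of three. The short Python sketch below makes this explicit. The function names are hypothetical, and the sketch assumes that the 93 bp offset is counted from the A of the ATG and that no intron lies within this stretch; if the offset were instead counted from the base after the ATG, the distance would be 96 bp, which is also a multiple of three.

```python
CODON_LENGTH = 3  # nucleotides per codon


def is_in_frame(offset_bp: int) -> bool:
    """True if an insertion `offset_bp` bases after the first base of the ATG keeps the reading frame."""
    return offset_bp % CODON_LENGTH == 0


def codons_before_insert(offset_bp: int) -> int:
    """Number of complete codons (including the ATG) that precede an in-frame insertion."""
    if not is_in_frame(offset_bp):
        raise ValueError(f"An offset of {offset_bp} bp would shift the reading frame")
    return offset_bp // CODON_LENGTH


# Offset stated in the text, under the counting assumption described above.
print(is_in_frame(93), codons_before_insert(93))
```

Under these assumptions, the cassette would follow 31 codons of muc5.1 coding sequence.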

To test for the functionality of the mucin reporter, the construct was linearized and injected into one-cell stage zebrafish embryos, and the animals were raised to adulthood and screened for germ line transmission. In germ line transgenics of the muc5.1:S-RFP reporter, fluorescence became detectable at two days post-fertilization (dpf) as dots on the skin (Fig. 2a). At day four, the fluorescent signal was evident along the body axis including in the mouth of the larva (Fig. 2a). At two weeks post-fertilization, the fluorescent signal was scattered throughout the skin at an average of 2500 dots/mm² (Fig. 2b). Such dotted distribution has been reported for mucus-secreting cells in the zebrafish intestine and in the human respiratory epithelium30,31. A closer inspection of the fluorescent loci by confocal microscopy shows that the mucin reporter is stored in relatively large granules inside the cells, which resemble secretory vesicles characteristically produced by mucus-secreting cells (Fig. 2c)32. The cellular membranes in Fig. 2c were visualized with GFP fused to a CAAX motif, which targets the GFP to the membrane and which is expressed under the regulation of the muc5.1 promoter. The GFP-CAAX reporter delineates the cellular and vesicular membranes of the muc5.1-producing cells (Fig. 2c). Together these data suggest that the mucin reporter is produced in secretory cells and compartmentalized in granules, as is expected for secreted mucins.

Figure 2 Expression of the fluorescent mucin reporter muc5.1:S-RFP in zebrafish. (a) Visualization of embryos at 2, 4, 7 and 14 days post-fertilization (dpf) by fluorescence (top row) and bright field (bottom row) microscopy shows that the mucin reporter is expressed in distinct loci distributed across the skin of the fish. Scale bars are 0.5 mm. (b) Live visualization of the fluorescent mucin reporter in the head (top) and trunk (bottom) of 14 dpf fish. Scale bar is 200 μm. (c) Confocal image of a muc5.1:S-RFP-positive locus shows that the mucin reporter is packed inside secretory vesicles in cells within the skin. The cell membranes are labeled with GFP that was expressed under the promoter of muc5.1 and targeted to the membrane via a CAAX motif. Scale bar is 5 μm. (d) Exposure of fish to LPS results in the partial loss of muc5.1:S-RFP loci and a simultaneous appearance of fluorescence in the immediate surroundings of the fish, suggesting the secretion of the mucin reporter. The bright field image (bottom) shows that fish remain intact during this treatment. (e) Quantification of fluorescent loci within a consistent region (approximately between the head and top of the trunk) in ten individual fish before and after exposure to LPS. The red square is the mean of 10 control fish (no LPS addition) at t = 0, 5 and 10 minutes. The mean value is 35.7 across all three time points. For the LPS addition, each point represents the number of fluorescent loci in the same fish at the various time points (10 fish total). The error bars indicate standard deviation. One asterisk (*) indicates p < 0.05 between 0 minutes and 5 minutes using a paired two-tailed t-test. Two asterisks (**) indicate p < 0.01 between 0 minutes and 10 minutes using the same test. In most fish, a substantial proportion of loci are lost on treatment with LPS, suggesting that the mucin reporter is secreted on this stimulus.

To test if muc5.1:S-RFP can be expelled from the cells, hence allowing the observation of live secretion, we used lipopolysaccharide (LPS) from E. coli as a characterized stimulant33. Our data show that in 4 dpf fish, on exposure to 0.5 μg/μl LPS, the majority of muc5.1:S-RFP-producing cells lose fluorescence within minutes while the released RFP collects as a halo around the fish (Fig. 2d). A quantification of cells that expel muc5.1:S-RFP reveals that roughly 40% of the putative goblet cells per fish release the fluorescence within five minutes after induction with LPS, with further signal loss within the next five minutes (Fig. 2d,e). Ten minutes after the initial addition of LPS, the anesthetized fish displayed normal heartbeat and circulation, suggesting that the expulsion of the mucin reporter is not due to toxicity.
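For readers who wish to reproduce this type of quantification on their own locus counts, the following Python sketch computes the mean per-fish fraction of loci lost and the statistic of a paired two-tailed t-test, the test named in the legend of Fig. 2e. It is a minimal sketch rather than the analysis code used in this study: the function names are hypothetical, no data from the study are included, and significance is judged against the standard two-tailed critical values of Student's t for 9 degrees of freedom (2.262 for p < 0.05 and 3.250 for p < 0.01), which assumes exactly ten paired fish.

```python
import math
import statistics

# Two-tailed critical values of Student's t for 9 degrees of freedom (10 paired fish).
T_CRIT_DF9 = {0.05: 2.262, 0.01: 3.250}


def mean_fraction_lost(before, after):
    """Mean over fish of the fraction of loci lost between two time points."""
    if len(before) != len(after):
        raise ValueError("before and after must list the same fish in the same order")
    return statistics.mean((b - a) / b for b, a in zip(before, after))


def paired_t_statistic(before, after):
    """t statistic of a paired t-test on per-fish differences (before minus after)."""
    if len(before) != len(after):
        raise ValueError("before and after must list the same fish in the same order")
    diffs = [b - a for b, a in zip(before, after)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def is_significant(t_statistic, alpha):
    """Compare |t| with the tabulated critical value; valid only for 10 paired fish."""
    return abs(t_statistic) > T_CRIT_DF9[alpha]
```

The two functions would be called with the locus counts of the same ten fish at 0 and 5 minutes, and again at 0 and 10 minutes, with each list ordered by fish.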

In summary, we show evidence for five gel-forming secreted mucin genes in zebrafish with a high degree of homology to other vertebrate mucins in their genomic and protein domain organization, as well as in their tissue-specific expression. We developed a strategy to build a fluorescent mucin reporter expressed under native regulatory elements and show that its release can be triggered and quantified in the live fish. Together, our work offers a useful set of tools to study the dynamics of mucin secretion and expression in the unperturbed fish and upon pathogenic, pharmacological or genetic challenges. Our hope is that this experimental system may allow for screening of conditions that not only trigger mucus secretion but also cause long-term effects on goblet cell differentiation, seen as metaplasia/hyperplasia in mucosal diseases.