In this study, higher prenatal fluoride exposure, in the general range of exposures reported for other general population samples of pregnant women and nonpregnant adults, was associated with lower scores on tests of cognitive function in the offspring at age 4 and 6–12 y. https://doi.org/10.1289/EHP655

We had complete data on 299 mother–child pairs, of whom 287 and 211 had data for the GCI and IQ analyses, respectively. Mean (SD) values for urinary fluoride in all of the mothers (n = 299) and children with available urine samples (n = 211) were 0.90 (0.35) mg/L and 0.82 (0.38) mg/L, respectively. In multivariate models we found that an increase in maternal urine fluoride of 0.5 mg/L (approximately the IQR) predicted 3.15 (95% CI: −5.42, −0.87) and 2.50 (95% CI: −4.12, −0.59) lower offspring GCI and IQ scores, respectively.

We studied participants from the Early Life Exposures in Mexico to Environmental Toxicants (ELEMENT) project. An ion-selective electrode technique was used to measure fluoride in archived urine samples taken from mothers during pregnancy and from their children when 6–12 y old, adjusted for urinary creatinine and specific gravity, respectively. Child intelligence was measured by the General Cognitive Index (GCI) of the McCarthy Scales of Children’s Abilities at age 4 and full scale intelligence quotient (IQ) from the Wechsler Abbreviated Scale of Intelligence (WASI) at age 6–12.

Some evidence suggests that fluoride may be neurotoxic to children. Few of the epidemiologic studies have been longitudinal, had individual measures of fluoride exposure, addressed the impact of prenatal exposures or involved more than 100 participants.

Introduction

Community water, salt, milk, and dental products have been fluoridated in varying degrees for more than 60 y to prevent dental caries, while fluoride supplementation has been recommended to prevent bone fractures (Jones et al. 2005). In addition, people may be exposed to fluoride through the consumption of naturally contaminated drinking water, dietary sources, dental products, and other sources (Doull et al. 2006). Whereas fluoride is added to drinking water [in the United States at levels of 0.7–1.2 mg/L (Doull et al. 2006)] to promote health, populations with exceptionally high exposures, often from naturally contaminated drinking water, are at risk of adverse health effects, including fluorosis.

In the United States, the U.S. Environmental Protection Agency (EPA) is responsible for establishing maximum permissible concentrations of contaminants, including fluoride, in public drinking-water systems. These standards are guidelines for restricting the amount of fluoride contamination in drinking water, not standards for intentional drinking-water fluoridation. In 2006 the U.S. EPA asked the U.S. National Research Council (NRC) to reevaluate the existing U.S. EPA standards for fluoride contamination, including the maximum contaminant level goal (MCLG, a concentration at which no adverse health effects are expected) of 4 mg/L, to determine if the standards were adequate to protect public health (Doull et al. 2006). The committee concluded that the MCLG of 4 mg/L should be lowered because it puts children at risk of developing severe enamel fluorosis, and may be too high to prevent bone fractures caused by fluorosis (Doull et al. 2006). The committee also noted some experimental and epidemiologic evidence suggesting that fluoride may be neurotoxic (Doull et al. 2006).

The National Toxicology Program (NTP) recently reviewed animal studies on the effects of fluoride on neurobehavioral outcomes and concluded that there was a moderate level of evidence for adverse effects of exposures during adulthood, a low level of evidence for effects of developmental exposures on learning and memory, and a need for additional research, particularly on the developmental effects of exposures consistent with those resulting from water fluoridation in the United States (Doull et al. 2006; NTP 2016). Human studies have shown a direct relationship between the serum fluoride concentrations of maternal venous blood and cord blood, indicating that the placenta is not a barrier to the passage of fluoride to the fetus (Shen and Taves, 1974). Fluoride was shown to accumulate in rat brain tissues after chronic exposures to high levels, and investigators have speculated that accumulation in the hippocampus might explain effects on learning and memory (Mullenix et al. 1995). An experimental study on mice has shown that fluoride exposure may have adverse effects on neurodevelopment, manifesting as both cognitive and behavioral abnormalities later in life (Liu et al. 2014).

Most epidemiologic studies demonstrating associations between fluoride exposure and lower neuropsychological indicators have been conducted in populations living in regions with endemic fluorosis that are exposed to high levels of fluoride in contaminated drinking water. The epidemiologic evidence is limited, however, with most studies using an ecologic design to estimate childhood exposures based on neighborhood measurements of fluoride (e.g., drinking water levels) rather than personal exposure measures. Moreover, almost all existing studies of childhood outcomes are cross-sectional in nature, rendering them weak contributors towards causal inference.

The main objective of this study was to assess the potential impact of prenatal exposure to fluoride on overall cognitive function in the offspring. We hypothesized that fluoride concentrations in maternal urine samples collected during pregnancy, a proxy measure of prenatal fluoride exposure, would be inversely associated with cognitive performance in the offspring. To our knowledge, this is one of the first and largest longitudinal epidemiologic studies either to address the association between early-life exposure to fluoride and childhood intelligence or to study the association between fluoride and cognition using an individual biomarker of fluoride exposure.

Methods

This is a longitudinal birth cohort study of measurements of fluoride in the urine of pregnant mothers and their offspring (as indicators of individual prenatal and postnatal exposures to fluoride, respectively) and their association with measures of offspring cognitive performance at 4 and 6–12 y old. The institutional review boards of the National Institute of Public Health of Mexico, University of Toronto, University of Michigan, Indiana University, and Harvard T.H. Chan School of Public Health and participating clinics approved the study procedures. Participants were informed of study procedures prior to signing an informed consent required for participation in the study.

Participants

Mother–child pairs in this study were participants from the successively enrolled longitudinal birth cohort studies in Mexico City that comprise the Early Life Exposures in Mexico to Environmental Toxicants (ELEMENT) project. Of the four ELEMENT cohorts [that have been described elsewhere (Afeiche et al. 2011)], Cohort 1 and Cohort 2B recruited participants at birth and did not have the archived maternal-pregnancy urine samples required for this analysis; they were thus excluded. Mothers for Cohort 2A (n = 327) and Cohort 3 (n = 670) were all recruited from the same three hospitals in Mexico City that serve low-to-moderate income populations. Cohort 2A was an observational study of prenatal lead exposure and neurodevelopmental outcomes in children (Hu et al. 2006). Women who were planning to become pregnant or were pregnant were recruited during May 1997–July 1999 and were considered eligible if they consented to participate; were ≤14 wk of gestation at the time of recruitment; planned to stay in the Mexico City study area for at least 5 y; did not report a history of psychiatric disorders, high-risk pregnancies, or gestational diabetes; did not report current use of daily alcohol, illegal drugs, or continuous prescription drugs; and were not diagnosed with preeclampsia, renal disease, circulatory diseases, hypertension, or seizures during the index pregnancy.

Cohort 3 mothers were pregnant women (≤14 wk of gestation) recruited from 2001 to 2003 for a randomized trial of the effect of calcium supplementation during pregnancy on maternal blood lead levels (Ettinger et al. 2009). Eligibility criteria were the same as for Cohort 2A, and 670 agreed to participate.

Exposure Assessment

By virtue of living in Mexico, individuals participating in the study have been exposed to fluoridated salt (at 250 ppm) (Secretaría-de-Salud 1995, 1996) and to varying degrees of naturally occurring fluoride in drinking water. Previous reports, based on samples taken from different urban and rural areas, indicate that natural water fluoride levels in Mexico City may range from 0.15 to 1.38 mg/L (Juárez-López et al. 2007; Martínez-Mier et al. 2005). Mean fluoride content for Mexico City’s water supply is not available because fluoride is not reported as part of water quality control programs in Mexico.

Mother–child pairs with at least one archived urine sample from pregnancy and measures of neurocognitive function in the offspring were included in this study. The pregnant mothers were invited for assessments, with collection of samples, during trimester 1 (13.6 ± 2.1 wk for Cohort 3 and 13.7 ± 3.5 wk for Cohort 2A), trimester 2 (25.1 ± 2.3 wk for Cohort 3 and 24.4 ± 2.9 wk for Cohort 2A), and trimester 3 (33.9 ± 2.2 wk for Cohort 3 and 35.0 ± 1.8 wk for Cohort 2A).

A spot (second morning void) urine sample was targeted for collection during each trimester of pregnancy of ELEMENT mothers as well as from the offspring children at the time of their measurements of intelligence at 6–12 y old. The samples were collected into fluoride-free containers, immediately frozen at the field site, shipped and stored at −20°C at the Harvard T.H. Chan School of Public Health (HSPH), and then stored at −80°C at the University of Michigan School of Public Health (UMSPH).

A procedure for urine analysis of fluoride described elsewhere (Martínez-Mier et al. 2011) was adapted and modified for this study. The fluoride content of the urine samples was measured using ion-selective electrode-based assays. First, 3 M sulfuric acid saturated with hexamethyldisiloxane (HMDS) was added to the sample to allow fluoride to diffuse from the urine for 20–24 hr. The diffused fluoride was allowed to collect in 0.05 M sodium hydroxide on the interior of the petri dish cover. Once the diffusion was complete, 0.25 M acetic acid was added to the sodium hydroxide to neutralize the solution, which was then analyzed directly using a fluoride ion-selective electrode (Thermo Scientific Orion, Cat#13-642-265) and pH/ISE meter (Thermo Scientific Orion, Cat#21-15-001). All electrode readings (in millivolts) were converted to fluoride concentrations using a standard curve. Analyses were performed in a Class 100/1,000 clean room. Quality control measures included daily instrument calibration, procedural blanks, replicate runs, and the use of certified reference materials (Institut National de Santé Publique du Québec, Cat #s 0910 and 1007; NIST3183, Fluoride Anion Standard). Urinary fluoride concentrations were measured at the UMSPH and the Indiana University Oral Health Research Institute (OHRI) as previously described (Thomas et al. 2016). A validation study comparing measures taken by the two labs in the same samples revealed a between-lab correlation of 0.92 (Thomas et al. 2016).
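The standard-curve step can be illustrated with a minimal sketch. Ion-selective electrodes respond approximately linearly to the logarithm of ion concentration, so a standard curve is commonly a straight-line fit of millivolt readings against log10(concentration). The calibration standards, the readings, and the function names below are hypothetical illustration values and are not taken from this study's laboratory protocol.

```python
import math

def fit_standard_curve(concs_mg_l, readings_mv):
    """Least-squares line: mV = intercept + slope * log10(concentration)."""
    xs = [math.log10(c) for c in concs_mg_l]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(readings_mv) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, readings_mv))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def mv_to_concentration(reading_mv, intercept, slope):
    """Invert the fitted line to recover concentration in mg/L."""
    return 10 ** ((reading_mv - intercept) / slope)

# Hypothetical standards (mg/L) with idealized readings (assumed, not measured).
standards = [0.1, 0.5, 1.0, 5.0]
readings = [100.0 - 58.0 * math.log10(c) for c in standards]

intercept, slope = fit_standard_curve(standards, readings)

# A hypothetical sample reading that corresponds to 0.9 mg/L on this curve.
sample_mv = 100.0 - 58.0 * math.log10(0.9)
sample_conc = mv_to_concentration(sample_mv, intercept, slope)
```

In practice the curve would be refit at each daily calibration, consistent with the quality control measures listed above.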

A total of 1,484 prenatal samples were measured at the UMSPH lab, all in duplicate. Of these, 305 (20%) did not meet the quality control criteria for ion-selective electrode-based methods (i.e., RSD < 20% for samples with F level < 0.2 ppm or RSD < 10% when F level > 0.2 ppm) (Martínez-Mier et al. 2011). Of these 305, 108 had a second aliquot available and were successfully measured at the OHRI lab in Indiana (sufficient urine volume was not available for the remaining 197 samples). The OHRI lab also measured an additional 289 samples. Of the 397 total samples measured at the OHRI lab, 139 (35%) were measured in duplicate, of which > 95% complied with the quality control criteria above; thus, all 139 values were retained. The remaining 258 (65%) were not measured in duplicate because of limitations in available urine volume, but were included in the study given the excellent quality control at the OHRI lab. In total, we ended up with 1,576 prenatal urine samples with acceptable measures of fluoride.
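The sample accounting and the duplicate-based quality control rule above can be checked with a short sketch. The RSD calculation assumes the sample standard deviation of the two duplicate readings divided by their mean, and assumes that a level of exactly 0.2 ppm falls under the 10% limit; neither detail is stated in the text, and the function name is ours.

```python
from statistics import mean, stdev

def passes_qc(duplicate_ppm):
    """Apply the stated criterion: RSD < 20% if F < 0.2 ppm, otherwise RSD < 10%.

    Assumption: RSD = 100 * sample SD / mean of the duplicate readings.
    """
    avg = mean(duplicate_ppm)
    rsd_percent = 100.0 * stdev(duplicate_ppm) / avg
    limit = 20.0 if avg < 0.2 else 10.0
    return rsd_percent < limit

# Counts reported in the text.
measured_umsph = 1484
failed_qc = 305
remeasured_ohri = 108   # failed samples with a second aliquot
additional_ohri = 289   # further samples measured only at OHRI

ohri_total = remeasured_ohri + additional_ohri
acceptable_total = measured_umsph - failed_qc + ohri_total
not_remeasured = failed_qc - remeasured_ohri
```

With these counts, `ohri_total` is 397, `not_remeasured` is 197, and `acceptable_total` is 1,576, matching the totals reported above.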

Of these 1,576 urine samples, 887 also had data on urinary creatinine and were associated with mother–offspring pairs who had data on the covariates of interest and GCI or IQ in the offspring. The urinary creatinine data were used to correct for variations in urine dilution at the time of measurement (Baez et al. 2014). Creatinine-adjusted urinary fluoride concentrations were obtained for each maternally derived sample by dividing the fluoride concentration (MUF) in the sample by the sample’s creatinine concentration (MUC), and multiplying by the average creatinine concentration of samples available at each trimester (MUCaverage) using the formula: (MUF / MUC) × MUCaverage. The values of average creatinine concentration used for MUCaverage at each trimester were derived from the larger pool of trimester-1, -2, and -3 samples from Cohorts 2A and 3 examined in our previous report on maternal fluoride biomarker levels (Thomas et al. 2016): 100.81, 81.60, and 72.41 mg/L, respectively. For each woman, an average of all her available creatinine-adjusted urinary fluoride concentrations during pregnancy (maximum three samples and minimum one sample) was computed and used as the exposure measure (MUFcr). For children, as creatinine measurements were not available, urinary fluoride values (CUF) were corrected for specific gravity (SG) using the formula CUFsg = CUF × (1.02 − 1) / (SG − 1) (Usuda et al. 2007).
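The two dilution corrections described above can be written out as a minimal sketch. The trimester averages and the reference specific gravity of 1.02 come from the text; the function names and the example sample values are hypothetical, and creatinine is assumed to be expressed in the same units (mg/L) as the trimester averages.

```python
# Trimester-specific average creatinine concentrations (mg/L) given in the text.
TRIMESTER_MEAN_CREATININE = {1: 100.81, 2: 81.60, 3: 72.41}
REFERENCE_SG = 1.02

def creatinine_adjusted(muf, muc, trimester):
    """(MUF / MUC) * MUCaverage for one maternal sample."""
    return (muf / muc) * TRIMESTER_MEAN_CREATININE[trimester]

def mean_prenatal_exposure(samples):
    """Average the adjusted values over a woman's 1 to 3 available samples.

    samples: list of (muf_mg_l, muc_mg_l, trimester) tuples.
    """
    adjusted = [creatinine_adjusted(muf, muc, tri) for muf, muc, tri in samples]
    return sum(adjusted) / len(adjusted)

def sg_corrected(cuf, sg):
    """CUFsg = CUF * (1.02 - 1) / (SG - 1) for one child sample."""
    return cuf * (REFERENCE_SG - 1) / (sg - 1)

# Hypothetical woman with samples in trimesters 1 and 3.
muf_cr = mean_prenatal_exposure([(0.80, 100.81, 1), (0.50, 36.205, 3)])

# Hypothetical child sample: dilute urine (SG 1.01) doubles the raw value.
cuf_sg = sg_corrected(0.40, 1.01)
```

A sample whose creatinine equals the trimester average is left unchanged by the adjustment, and a sample with half the average creatinine is doubled, which is the intended behavior of a dilution correction.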

After calculating MUFcr for the 887 urine samples noted above, 10 values of MUFcr were identified as extreme outliers (> 3.5 SDs) and were dropped, leaving 877 measures of MUFcr. These 877 measures of MUFcr stemmed from 512 unique mothers. Of these 512, 71 participants had measurements from each of the three trimesters; 224 had measurements from two of the three trimesters (74, T1 and T2; 131, T1 and T3; and 19, T2 and T3); and 217 had measurements from only one of the trimesters (159, T1; 34, T2; and 24, T3).

Measurement of Outcomes

At age 4 y, neurocognitive outcomes were measured using a standardized version of the McCarthy Scales of Children’s Abilities (MSCA) translated into Spanish (McCarthy 1991). The MSCA evaluates verbal, perceptual-performance, quantitative, memory, and motor abilities of preschool-aged children, and it has previously been used successfully in translated versions (Braun et al. 2012; Julvez et al. 2007; Kordas et al. 2011; Puertas et al. 2010). For this analysis, we focused on the General Cognitive Index (GCI), which is the standardized composite score produced by the MSCA (McCarthy 1991). For children 6–12 y old, a Spanish version of the Wechsler Abbreviated Scale of Intelligence (WASI) (Wechsler 1999) was administered. The WASI includes four subtests (Vocabulary, Similarities, Block Design, and Matrix Reasoning), which provide estimates of Verbal, Performance, and Full-Scale IQ (Wechsler 1999). All of the McCarthy tests and the WASI-FSIQ tests were administered by a team of three psychologists who were trained and supervised by an experienced developmental psychologist (L.S.). At the time of follow-up visits (age 4 and 6–12 y), each child was evaluated by one of the psychologists, who was blind to the child’s fluoride exposure. The inter-examiner reliability of the psychologists was evaluated by having all three participate in assessments on a set of 30 individuals. For these 30, the correlation was calculated between the GCI scores given by two of the psychologists and the scores given by the third psychologist, whom they observed applying the test, in all three possible combinations with 10 participants for each observers–examiner combination (i.e., psychologist A (applicant) was observed by psychologists B and C; psychologist B (applicant) was observed by psychologists A and C; and psychologist C (applicant) was observed by psychologists A and B). The mean observer–examiner correlation was 0.99. All raw scores were standardized for age and sex (McCarthy 1991). Inter-examiner reliability was not examined on the WASI test.

Measurement of Covariates

Data were collected from each subject by questionnaire on maternal age (and date of birth), education, and marital status at the first pregnancy visit; on birth order, birth weight, and gestational age at delivery; and on maternal smoking at every prenatal and postnatal visit. Gestational age was estimated by registered nurses. Maternal IQ was estimated using selected subtests of the Wechsler Adult Intelligence Scale (WAIS)-Spanish (Information, Comprehension, Similarities, and Block Design), which was standardized for Mexican adults (Renteria et al. 2008; Wechsler et al. 1981). Maternal IQ was measured at the study visit 6 mo after birth or at the 12-mo visit if the earlier visit was not completed.

The quality of the children’s individual home environments was assessed using an age-appropriate version of the HOME score. However, the measure was not available for all observations because it was added to ongoing cohort evaluation protocols only in April 2003, when a version of the HOME score instrument that is age-appropriate for children 0–5 y old was adopted; a version that is age-appropriate for children ≥6 y old was adopted in September 2009 (Caldwell and Bradley 2003). Thus, in our analyses of GCI we adjusted for HOME score using the measures for 0- to 5-y-old children in the subset of children who had these data, and in our analyses of IQ we adjusted for HOME score using the measures for ≥6-y-old children in the subset of children who had these data.

Statistical Analyses

Univariate distributions and descriptive statistics were obtained for all exposure variables, outcome variables, and model covariates. For each variable, observations were classified as outliers if they were outside the bounds of the mean ± 3.5 SDs. Primary analyses were conducted with exposure and outcome outliers excluded. Statistical tests of bivariate associations were conducted using chi-square tests for categorical variables and analysis of variance (ANOVA) to compare the means of the outcomes or exposure within groups defined according to the distribution of each covariate. Spearman correlation coefficients were used to measure the correlation between MUFcr and CUFsg. Regression models were used to assess the adjusted associations between prenatal fluoride and each neurocognitive outcome separately. Generalized additive models (GAMs) were used to visualize the adjusted association between fluoride exposure and measures of intelligence [SAS statistical software (version 9.4; SAS Institute Inc.)]. Because the pattern appeared curvilinear, and because GAMs do not yield exact p-values for deviations from linearity, we used the Wald p-value of a quadratic term of fluoride exposure to test the null hypothesis that the quadratic model fit the data no better than the model assuming a linear relationship (i.e., that the coefficient of the quadratic term was zero), and thus obtained a p-value for deviation from linearity of the fluoride–outcome associations. Residual diagnostics were used to examine other model assumptions and identify any additional potentially influential observations. Visual inspection of the default studentized residual versus leverage plot from SAS PROC REG did not identify potentially influential observations. Visual inspection of the histogram of the residuals did not indicate lack of normality; however, a fanning pattern in the residual versus predicted value plot indicated lack of constant variance (data not shown). Hence, robust standard errors were obtained using the “empirical” option in SAS PROC GENMOD.
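The outlier rule stated above (mean ± 3.5 SDs) can be sketched as follows. The analyses in this study were run in SAS; this Python version is only an illustration, it assumes the sample standard deviation (the text does not say which SD estimator was used), and the data are hypothetical.

```python
from statistics import mean, stdev

def flag_outliers(values, n_sd=3.5):
    """Return one boolean per value: True if outside mean +/- n_sd * SD."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD (n - 1 denominator)
    lower, upper = m - n_sd * s, m + n_sd * s
    return [not (lower <= v <= upper) for v in values]

# Hypothetical exposure values (mg/L) with one extreme observation at the end.
values = [0.8, 0.9, 1.0, 1.1, 1.2] * 6 + [6.0]
flags = flag_outliers(values)
kept = [v for v, is_outlier in zip(values, flags) if not is_outlier]
```

Note that the bounds are computed with the extreme value included, so in small samples a single outlier can inflate the SD enough to escape the rule; with the 31 hypothetical values here, only the final observation is flagged.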

Our overall strategy for selecting covariates for adjustment was to identify those that are well known to have potential associations with either fluoride exposure or cognitive outcomes and/or are typically adjusted for as potential confounders in analyses of environmental toxicants and cognition. All models were adjusted for gestational age at birth (in weeks), birth weight (kilograms), birth order (first born yes vs. no), sex, and child’s age at the time of the neurocognitive test (in years). All models were also adjusted for maternal characteristics including marital status (married vs. others), smoking history (ever-smoker vs. never-smoker), age at delivery, IQ, and education (itself also a proxy for socioeconomic status). Finally, all models were adjusted for potential cohort effects by including indicator variables denoting from which cohort (Cohort 2A, Cohort 3 + Ca supplement, and Cohort 3-placebo) the participants came. We used 0.5 mg/L, which was close to the interquartile range of MUFcr for the analyses of both GCI (IQR = 0.45) and IQ (IQR = 0.48), as a standard measure of incremental exposure. SAS statistical software (version 9.4; SAS Institute Inc.) was used for all data analyses described.
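Because fluoride enters these models as a linear term, an estimate expressed per 0.5 mg/L can be converted to any other exposure increment by simple multiplication, as sketched below. The input is the GCI estimate and confidence interval reported in the abstract, written as a negative change in score; the function name is ours, and the conversion holds only under the linearity assumption.

```python
def rescale_effect(estimate, ci_low, ci_high, factor):
    """Rescale a linear-model coefficient and its CI to a new exposure increment."""
    scaled = (estimate * factor, ci_low * factor, ci_high * factor)
    # min/max keeps the interval ordered even if the factor is negative.
    return scaled[0], min(scaled[1:]), max(scaled[1:])

# Reported change in GCI per 0.5 mg/L increase in maternal urinary fluoride.
gci_per_half = (-3.15, -5.42, -0.87)

# Equivalent change per 1.0 mg/L (factor = 1.0 / 0.5).
gci_per_one = rescale_effect(*gci_per_half, factor=2.0)
gci_per_one_rounded = tuple(round(x, 2) for x in gci_per_one)
```

Reporting per 0.5 mg/L, roughly one interquartile range, keeps the estimate within the span of exposures actually observed in the cohort.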

Sensitivity Analyses

Models were further adjusted for variables that relate to relatively well-known potential confounders (but for which we were missing a significant amount of data) and variables that were less well known but possible confounders. The HOME scores were subject to sensitivity analyses because, as noted in the “Methods” section, they were not added to the subject evaluation protocols until 2003, resulting in a significantly smaller subsample of participants with these data. Models of the association between prenatal fluoride exposure (MUFcr) and IQ at 6–12 y old were also adjusted for the child’s urine fluoride concentration at 6–12 y of age (CUFsg), a measure that was collected in a significantly smaller subset of individuals, to evaluate the potential role of contemporaneous exposure. Associations between prenatal fluoride exposure (MUFcr) and GCI at 4 y old could not be adjusted for contemporaneous fluoride exposure because urine samples were not collected from children when the MSCA (from which the GCI is derived) was administered. Maternal bone lead measured by a 109-Cd K-X-ray fluorescence (KXRF) instrument at 1 mo postpartum, a proxy for lead exposure from mobilized maternal bone lead stores during pregnancy (Hu et al. 2006), was included in the model to test for the possible confounding effect of lead exposure during pregnancy. We focused on the subset of women who had patella bone lead values because these were found to be most influential in our previous prospective study of offspring cognition (Gomaa et al. 2002). Average maternal mercury level during pregnancy was also tested as a potential confounder (Grandjean and Herz 2011). Mercury was measured as total mercury content in the subsample of women who had archived whole blood samples taken during pregnancy with sufficient volume to be analyzed using a Direct Mercury Analyzer 80 (DMA-80, Milestone Inc., Shelton, CT, USA) as previously described (Basu et al. 2014).

To address the potential confounding effect of socioeconomic status (SES), we conducted sensitivity analyses that adjusted our model for SES (family possession score). The socioeconomic questionnaire asked about the availability of certain items and assets in the home. Point values were assigned to each item, and SES was calculated based on the sum of the points across all items (Huang et al. 2016). Given that the calcium intervention theoretically could have modified the impact of fluoride, we repeated the analyses with and without the Cohort 3 participants who were randomized to the calcium intervention to omit any potential confounding effect of this intervention. Another sensitivity test was performed to examine the potential effect of the psychologist who performed the WASI test by including tester in the regression model. The information about psychologists who performed the WASI was available for 75% of participants, as recording these data was added later to the study protocol. We also re-ran models with exposure outliers included as a sensitivity step. Finally, we ran models that focused on the cross-sectional relationship between children’s exposure to fluoride (reflected by CUFsg) and IQ score: unadjusted; adjusting for the main covariates of interest; and adjusting for prenatal exposure (MUFcr) as well as the covariates of interest.

Results

Flow of Participants

Of the 997 total mothers from two cohorts evaluated, 971 were eligible after removing mothers <18 y old. Of these 971, 825 had enough urine sample volume to measure fluoride in at least one trimester urine sample, and of these 825 participants, 515 had urine samples with previously measured creatinine values, enabling calculation of creatinine-adjusted urinary fluoride (MUFcr) concentrations. Of these 515, 3 participants were excluded because their only MUFcr values were among the 10 extreme outlier values identified for MUFcr (see the “Methods” section, “Exposure Assessment” subsection), leaving them with no MUFcr value for the analysis. Thus, we had a total of 512 participants (mothers) with at least one value of MUFcr for our analyses (Figure 1).

Figure 1. Flowchart describing source of mother–offspring subject pairs, fluoride and cognition study. Cohort 2A was designed as an observational birth cohort of lead toxicodynamics during pregnancy, with mothers recruited early during pregnancy from 1997 to 2001. Cohort 3 was designed as a randomized double-blind placebo-controlled trial of calcium supplements, with mothers recruited early during pregnancy from 2001 to 2006. “Ca” denotes subjects who were randomized to the calcium supplement; “placebo” denotes subjects who were randomized to the placebo. GCI is the McCarthy Scales General Cognitive Index (administered at age 4 y). IQ is the Wechsler Abbreviated Intelligence Scales Intelligence Quotient (administered at age 6–12 y and age-adjusted).

Of these 512 mothers, 312 had offspring with outcome data at age 4 (i.e., GCI), and 234 had offspring with outcome data at age 6–12 (i.e., IQ). Of these, complete data on all the covariates of main interest (as specified in the “Methods” section) were available on 287 mother–child pairs for the GCI analysis and 211 mother–child pairs for the IQ analysis. A total of 299 mother–child pairs had data on either GCI or IQ, and 199 mother–child pairs had data on both GCI and IQ (Figure 1).
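The participant flow and the overlap between the two analytic samples can be verified with simple arithmetic, sketched below using only the counts reported in this subsection. The stage labels are our shorthand.

```python
# Successive stages of the participant flow, with counts from the text.
flow = [
    ("mothers evaluated", 997),
    ("eligible (>= 18 y old)", 971),
    ("fluoride measured in >= 1 sample", 825),
    ("creatinine available", 515),
    ("retained after outlier removal", 512),
]

# Number dropped at each transition between consecutive stages.
dropped = [
    (later_label, earlier_n - later_n)
    for (_, earlier_n), (later_label, later_n) in zip(flow, flow[1:])
]

# Complete-case samples: pairs with GCI, with IQ, and with both.
n_gci, n_iq, n_both = 287, 211, 199

# Inclusion-exclusion gives the number with data on either outcome.
n_either = n_gci + n_iq - n_both
```

The inclusion-exclusion result, 287 + 211 − 199 = 299, agrees with the 299 mother–child pairs reported as having data on either outcome.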

Number of Exposure Measures per Subject

In terms of repeated measures of MUFcr across trimesters, of the 287 participants with data on GCI outcomes, 25 participants had MUFcr data for all three trimesters (11 from Cohort 2A and 14 from Cohort 3), 121 participants had MUFcr data from two trimesters (48 from Cohort 2A and 73 from Cohort 3), and 141 participants had MUFcr data from one trimester (51 from Cohort 2A and 90 from Cohort 3). Of the 211 participants with data on IQ outcomes, 10 participants had MUFcr data for all three trimesters (6 from Cohort 2A and 4 from Cohort 3), 82 participants had data from two trimesters (32 from Cohort 2A and 50 from Cohort 3), and 119 participants had data from one trimester (40 from Cohort 2A and 79 from Cohort 3).
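As a consistency check, the per-cohort counts above can be tallied and compared with the sample sizes reported elsewhere in the paper. The sketch below uses only numbers from this paragraph; the data layout and function name are ours.

```python
# Participants with MUFcr data from (three, two, one) trimesters, by cohort.
gci_counts = {"Cohort 2A": (11, 48, 51), "Cohort 3": (14, 73, 90)}
iq_counts = {"Cohort 2A": (6, 32, 40), "Cohort 3": (4, 50, 79)}

def tally(counts):
    """Return totals by number of trimesters, totals by cohort, and overall n."""
    by_trimesters = tuple(sum(column) for column in zip(*counts.values()))
    by_cohort = {cohort: sum(values) for cohort, values in counts.items()}
    return by_trimesters, by_cohort, sum(by_trimesters)

gci_by_trimesters, gci_by_cohort, gci_n = tally(gci_counts)
iq_by_trimesters, iq_by_cohort, iq_n = tally(iq_counts)
```

The totals reproduce the 287 and 211 participants in the GCI and IQ analyses, and the per-cohort sums (110 and 177 for GCI; 78 and 133 for IQ) match the cohort sizes listed in Table 1 when the two Cohort 3 arms are combined.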

Comparisons across the Cohorts

In terms of the mother–child pairs who had data on all covariates as well as data on either GCI or IQ (n = 299), the mean (SD) value of creatinine-corrected urinary fluoride for the mothers was 0.90 (0.36) mg/L. The distributions of the urinary fluoride, outcomes (GCI and IQ), and additional exposure variables examined in our sensitivity analyses (maternal bone lead, maternal blood mercury, and children’s contemporaneous urinary fluoride) across the three cohort strata (Cohort 3-Calcium, Cohort 3-placebo, and Cohort 2A) and all strata combined are shown in Table 1 for the mother–child pairs who had data for the GCI outcome (n = 287) and the IQ outcome (n = 211). The distributions showed little variation across the cohort strata except for bone lead and possibly blood mercury, for which, in comparison with Cohort 3, Cohort 2A clearly had higher mean bone lead levels (p < 0.001) and possibly higher blood mercury levels (p = 0.067). The mean (SD) value of specific gravity–corrected urinary fluoride for the children who had these measures (only available for those children who had IQ; n = 189) was 0.82 (0.38) mg/L.

Table 1. Comparisons across cohorts with respect to the distributions of biomarkers of exposure to prenatal fluoride (MUFcr), prenatal lead (maternal bone Pb), prenatal mercury (maternal blood Hg), and contemporaneous childhood fluoride (CUFsg); and cognitive outcomes (GCI and IQ).

| Analysis | Measurement | Cohort | N | Mean | SD | Min | 25th pct | 50th pct | 75th pct | Max | p-Value^a |
|---|---|---|---|---|---|---|---|---|---|---|---|
| GCI analysis | GCI | Cohort 3-Ca | 84 | 96.88 | 14.07 | 50 | 88 | 96 | 107 | 124 | 0.997 |
| | | Cohort 3-placebo | 93 | 96.80 | 13.14 | 50 | 89 | 96 | 105 | 125 | |
| | | Cohort 2A | 110 | 96.95 | 15.46 | 56 | 88 | 98 | 110 | 125 | |
| | | Total^b | 287 | 96.88 | 14.28 | 50 | 88 | 96 | 107 | 125 | |
| | MUFcr (mg/L) | Cohort 3-Ca | 84 | 0.92 | 0.41 | 0.28 | 0.60 | 0.84 | 1.14 | 2.36 | 0.57 |
| | | Cohort 3-placebo | 93 | 0.87 | 0.34 | 0.23 | 0.62 | 0.82 | 1.10 | 2.01 | |
| | | Cohort 2A | 110 | 0.92 | 0.33 | 0.23 | 0.68 | 0.86 | 1.11 | 2.14 | |
| | | Total^b | 287 | 0.90 | 0.36 | 0.23 | 0.65 | 0.84 | 1.11 | 2.36 | |
| | Maternal bone Pb (μg/g) | Cohort 3-Ca | 62 | 7.30 | 7.37 | 0.05 | 0.75 | 4.40 | 12.93 | 26.22 | <0.01 |
| | | Cohort 3-placebo | 43 | 9.21 | 7.31 | 0.11 | 1.50 | 8.60 | 13.97 | 27.37 | |
| | | Cohort 2A | 62 | 13.60 | 11.36 | 0.15 | 5.35 | 10.52 | 19.46 | 47.07 | |
| | | Total^c | 167 | 10.13 | 9.41 | 0.05 | 2.37 | 8.22 | 15.37 | 47.07 | |
| | Maternal blood Hg (μg/L) | Cohort 3-Ca | 38 | 3.32 | 1.40 | 0.73 | 2.40 | 3.00 | 4.15 | 7.06 | 0.12 |
| | | Cohort 3-placebo | 28 | 2.80 | 1.33 | 1.27 | 1.89 | 2.53 | 3.40 | 7.22 | |
| | | Cohort 2A | 75 | 4.53 | 5.61 | 0.77 | 2.30 | 3.24 | 4.37 | 35.91 | |
| | | Total^c | 141 | 3.86 | 4.25 | 0.73 | 2.20 | 3.08 | 4.15 | 35.91 | |
| IQ analysis | IQ | Cohort 3-Ca | 58 | 94.91 | 9.86 | 76 | 87 | 96 | 100 | 120 | 0.69 |
| | | Cohort 3-placebo | 75 | 96.29 | 9.63 | 75 | 89 | 97 | 102 | 124 | |
| | | Cohort 2A | 78 | 96.47 | 13.20 | 67 | 87 | 96 | 107 | 131 | |
| | | Total^d | 211 | 95.98 | 11.11 | 67 | 88 | 96 | 107 | 131 | |
| | MUFcr (mg/L) | Cohort 3-Ca | 58 | 0.89 | 0.38 | 0.29 | 0.57 | 0.84 | 1.10 | 1.85 | 0.86 |
| | | Cohort 3-placebo | 75 | 0.87 | 0.35 | 0.23 | 0.61 | 0.82 | 1.11 | 2.01 | |
| | | Cohort 2A | 78 | 0.90 | 0.34 | 0.23 | 0.67 | 0.85 | 1.09 | 2.14 | |
| | | Total^d | 211 | 0.89 | 0.36 | 0.23 | 0.64 | 0.82 | 1.07 | 2.14 | |
| | Maternal bone Pb (μg/g) | Cohort 3-Ca | 67 | 6.97 | 7.20 | 0.05 | 0.76 | 4.36 | 11.73 | 26.22 | <0.01 |
| | | Cohort 3-placebo | 48 | 9.07 | 7.42 | 0.11 | 1.00 | 8.49 | 14.41 | 27.37 | |
| | | Cohort 2A | 62 | 13.60 | 11.36 | 0.15 | 5.35 | 10.52 | 19.46 | 47.07 | |
| | | Total^e | 177 | 9.86 | 9.33 | 0.05 | 2.29 | 7.95 | 15.22 | 47.07 | |
| | Maternal blood Hg (μg/L) | Cohort 3-Ca | 43 | 3.25 | 1.41 | 0.51 | 2.43 | 2.87 | 4.02 | 7.06 | 0.067 |
| | | Cohort 3-placebo | 31 | 2.66 | 1.36 | 0.78 | 1.81 | 2.40 | 3.26 | 7.22 | |
| | | Cohort 2A | 75 | 4.53 | 5.61 | 0.77 | 2.30 | 3.24 | 4.37 | 35.91 | |
| | | Total^e | 149 | 3.77 | 4.16 | 0.51 | 2.19 | 2.90 | 4.11 | 35.91 | |
| | CUFsg (mg/L) | Cohort 3-Ca | 71 | 0.84 | 0.4 | 0.31 | 0.53 | 0.78 | 1.12 | 2.8 | 0.29 |
| | | Cohort 3-placebo | 53 | 0.85 | 0.38 | 0.35 | 0.57 | 0.75 | 1.14 | 1.85 | |
| | | Cohort 2A | 65 | 0.76 | 0.34 | 0.18 | 0.51 | 0.7 | 0.89 | 1.76 | |
| | | Total^e | 189 | 0.82 | 0.38 | 0.18 | 0.54 | 0.73 | 1.01 | 2.8 | |
| All available measurements | GCI | Cohort 3-Ca | 133 | 97.32 | 13.67 | 50 | 88 | 96 | 107 | 124 | 0.57 |
| | | Cohort 3-placebo | 149 | 95.99 | 13.07 | 50 | 88 | 96 | 106 | 125 | |
| | | Cohort 2A | 150 | 97.57 | 14.63 | 56 | 88 | 99 | 109 | 131 | |
| | | Total^f | 432 | 96.95 | 13.80 | 50 | 88 | 96 | 107 | 131 | |
| | IQ | Cohort 3-Ca | 91 | 95.92 | 10.15 | 76 | 88 | 95 | 103 | 120 | 0.92 |
| | | Cohort 3-placebo | 114 | 96.56 | 9.84 | 75 | 89 | 96 | 102 | 124 | |
| | | Cohort 2A | 111 | 96.25 | 12.67 | 67 | 87 | 95 | 105 | 131 | |
| | | Total^f | 316 | 96.27 | 10.97 | 67 | 88 | 96 | 103 | 131 | |
| | MUFcr (mg/L) | Cohort 3-Ca | 181 | 0.89 | 0.36 | 0.28 | 0.64 | 0.83 | 1.09 | 2.36 | 0.11 |
| | | Cohort 3-placebo | 183 | 0.84 | 0.31 | 0.02 | 0.61 | 0.81 | 1.02 | 2.01 | |
| | | Cohort 2A | 148 | 0.91 | 0.35 | 0.23 | 0.67 | 0.86 | 1.10 | 2.15 | |
| | | Total^f | 512 | 0.88 | 0.34 | 0.02 | 0.64 | 0.82 | 1.07 | 2.36 | |
| | Maternal bone Pb (μg/g) | Cohort 3-Ca | 97 | 7.07 | 7.26 | 0.01 | 0.83 | 4.36 | 11.78 | 26.22 | <0.01 |
| | | Cohort 3-placebo | 74 | 9.15 | 8.38 | 0.11 | 0.85 | 8.62 | 13.41 | 40.8 | |
| | | Cohort 2A | 86 | 13.77 | 11.30 | 0.15 | 5.49 | 10.52 | 20.58 | 47.07 | |
| | | Total^f | 257 | 9.91 | 9.51 | 0.01 | 2.01 | 7.64 | 15.31 | 47.07 | |
| | Maternal blood Hg (μg/L) | Cohort 3-Ca | 55 | 3.03 | 1.41 | 0.51 | 2.12 | 2.77 | 3.62 | 7.06 | 0.09 |
| | | Cohort 3-placebo | 48 | 2.87 | 2.09 | 0.34 | 1.82 | 2.37 | 3.34 | 13.47 | |
| | | Cohort 2A | 104 | 4.06 | 4.88 | 0.77 | 2.14 | 3.10 | 4.16 | 35.91 | |
| | | Total^f | 207 | 3.51 | 3.70 | 0.34 | 2.07 | 2.80 | 3.79 | 35.91 | |
| | CUFsg (mg/L) | Cohort 3-Ca | 104 | 0.84 | 0.39 | 0.31 | 0.56 | 0.75 | 1.07 | 2.80 | 0.227 |
| | | Cohort 3-placebo | 84 | 0.90 | 0.46 | 0.35 | 0.58 | 0.75 | 1.09 | 2.89 | |
| | | Cohort 2A | 96 | 0.79 | 0.34 | 0.18 | 0.53 | 0.73 | 0.92 | 2.11 | |
| | | Total^f | 284 | 0.84 | 0.40 | 0.18 | 0.57 | 0.74 | 1.00 | 2.89 | |

The distributions of covariates were very similar across Cohort 2A and Cohort 3, with two exceptions: the age of the offspring when IQ was measured, for which the mean ages were 7.6 and 10.0 y, respectively; and birth weight in the GCI analysis, for which Cohort 3 participants were slightly heavier than Cohort 2A participants (see Table S1).

GCI versus IQ Scores

There was a significant correlation between GCI at 4 y and IQ at 6–12 y old (Spearman r = 0.55 ; p < 0.01 ). There was no significant correlation between prenatal MUF cr and offspring CUF sg (Spearman r = 0.54 , p = 0.44 ).
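As a minimal sketch of how a Spearman rank correlation of this kind is computed, the example below ranks two score lists (averaging tied ranks) and takes the Pearson correlation of the ranks. The five pairs of scores are hypothetical values invented for illustration, not study data, and the function names are our own labels.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five children (illustration only, not study data).
gci_age4 = [88, 96, 107, 90, 120]
iq_age6_12 = [85, 101, 99, 95, 118]
rho = spearman(gci_age4, iq_age6_12)
print(round(rho, 2))
```

A significance test for the coefficient (the p-values quoted above) would additionally require the sample size and a reference distribution, which this sketch does not attempt.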

Comparisons of Participants in Relation to Missing Data

In comparing the participants who were included in the GCI and IQ analyses with those who were not included (because of missing data on GCI, IQ, or other covariates), the distributions of covariates were similar except for sex: the proportion of females was somewhat higher in the included than in the excluded group for both the GCI and IQ analyses (Table 2).

Table 2 Analysis comparing subjects with and without data of interest [n (%) or mean ± SD] with respect to characteristics of mothers and children and sensitivity analysis covariates. For the sensitivity analysis rows, N† is the number included and N‡ the number excluded.

| Characteristic | GCI analysis: Included | GCI analysis: Excluded | IQ analysis: Included | IQ analysis: Excluded |
|---|---|---|---|---|
| Total number^a | 287 | 710 | 211 | 786 |
| Sex: Female | 160 (56%) | 244 (47%) | 116 (55%) | 288 (48%) |
| Sex: Male | 127 (44%) | 275 (53%) | 95 (45%) | 307 (52%) |
| Birth order: First child | 96 (33%) | 184 (35%) | 93 (32%) | 279 (36%) |
| Birth order: ≥2nd child | 191 (67%) | 335 (65%) | 118 (68%) | 507 (65%) |
| Birth weight (kg) | 3.11 ± 0.45 | 3.11 ± 0.44 | 3.11 ± 0.46 | 3.11 ± 0.43 |
| Gestational age (wk) | 38.66 ± 1.84 | 38.58 ± 1.68 | 38.56 ± 1.80 | 38.63 ± 1.72 |
| Age at outcome assessment (y) | 4.04 ± 0.05 | 4.05 ± 0.05 | 8.50 ± 1.31 | 8.83 ± 1.64 |
| Maternal age at delivery (y) | 26.78 ± 5.53 | 26.49 ± 5.37 | 27.16 ± 5.61 | 26.41 ± 5.36 |
| Maternal education (y)^b | 10.63 ± 2.76 | 10.75 ± 3.08 | 10.80 ± 2.85 | 10.69 ± 3.03 |
| Maternal IQ^c | 88.63 ± 12.17 | 89.27 ± 14.6 | 89.01 ± 12.45 | 88.27 ± 13.00 |
| Marital status^d: Married | 201 (70%) | 493 (70%) | 149 (71%) | 544 (69%) |
| Marital status^d: Other | 86 (30%) | 216 (30%) | 62 (29%) | 240 (31%) |
| Maternal smoking^e: Ever | 141 (49%) | 335 (51%) | 102 (48%) | 374 (51%) |
| Maternal smoking^e: Never | 146 (51%) | 325 (49%) | 109 (52%) | 362 (49%) |
| Cohort: Cohort 3-Ca | 93 (32%) | 241 (34%) | 76 (36%) | 259 (33%) |
| Cohort: Cohort 3-placebo | 84 (29%) | 252 (36%) | 59 (28%) | 278 (35%) |
| Cohort: Cohort 2A | 110 (38%) | 217 (31%) | 78 (37%) | 249 (32%) |
| Sensitivity analyses | | | | |
| HOME score^f | N† = 138; 35.24 ± 6.31 | N‡ = 87; 33.23 ± 6.55 | N† = 124; 35.54 ± 7.46 | N‡ = 55; 35.8 ± 7.44 |
| SES^g | N† = 188; 6.35 ± 2.43 | N‡ = 110; 6.94 ± 2.72 | N† = 199; 6.36 ± 2.41 | N‡ = 98; 6.98 ± 2.79 |
| Maternal bone Pb (μg/g)^h | N† = 167; 9.26 ± 10.55 | N‡ = 91; 8.97 ± 10.32 | N† = 177; 9.02 ± 10.43 | N‡ = 80; 9.48 ± 10.55 |
| Maternal blood Hg (μg/L)^i | N† = 141; 3.86 ± 4.25 | N‡ = 67; 2.76 ± 1.95 | N† = 149; 3.77 ± 4.16 | N‡ = 58; 2.83 ± 2.01 |
| CUF sg^j (mg/L) | | | N† = 124; 35.54 ± 7.46 | N‡ = 55; 35.8 ± 7.44 |

For each sensitivity variable of interest, we compared participants who had data on our exposures, outcomes, covariates, and the sensitivity variable (and were thus included in the sensitivity analysis) with participants who had data on the sensitivity variable but were missing data on the exposure, outcomes, and/or covariates (and were thus excluded from the sensitivity analysis; Table 2). For each sensitivity analysis, most of the participants with data on the sensitivity variable also had data on the exposures, outcomes, and covariates and were therefore included. In addition, the distributions appeared to be similar between those included and those excluded in each sensitivity analysis (means were within 10% of each other), with the exception of maternal blood Hg, for which the mean levels for those included were 28.5% and 24.9% higher than the mean levels for those excluded in the GCI and IQ analyses, respectively.
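The blood Hg comparison can be checked directly from the Table 2 means. The short sketch below (the function name and dictionary are our own labels) computes the included-minus-excluded gap as a percentage of either group mean. The reported 28.5% and 24.9% are reproduced when the gap is divided by the included-group mean; dividing by the excluded-group mean gives larger figures.

```python
def pct_diff(included, excluded, reference):
    """Gap between group means as a percentage of a chosen reference mean."""
    return 100 * (included - excluded) / reference


# Maternal blood Hg means (ug/L) from Table 2: (included, excluded).
hg_means = {"GCI": (3.86, 2.76), "IQ": (3.77, 2.83)}

results = {}
for analysis, (inc, exc) in hg_means.items():
    results[analysis] = {
        "relative_to_included": round(pct_diff(inc, exc, inc), 1),
        "relative_to_excluded": round(pct_diff(inc, exc, exc), 1),
    }

print(results)
```

Which denominator is appropriate depends on how "higher than" is meant; the sketch only shows how each choice changes the figure.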

Comparisons of GCI and IQ across Covariates

Table 3 shows mean and SD values for MUF cr and offspring cognitive scores across categories of the covariates. In both analyses, offspring cognitive scores were higher among children whose mothers had higher levels of education, measured IQ, and HOME scores; among the participants with GCI data, scores were also higher among first children and girls. In the IQ analysis, a statistically significant difference was observed in MUF cr as a function of child sex. No significant differences in MUF cr values across levels of other covariates were observed. In the GCI analysis, modest differences that were not statistically significant were observed in MUF cr as a function of maternal IQ (p = 0.09) and child sex (p = 0.09). Among other covariates there were significant differences in age (p < 0.01) in both analyses.

Table 3 Distributions of maternal creatinine-adjusted urinary fluoride (MUF cr) and offspring cognitive scores across categories of main covariates.

| Covariate | Category | GCI analysis: n | GCI analysis: MUF cr^a | p-Value | GCI (age 4) | p-Value | IQ analysis: n | IQ analysis: MUF cr^a | p-Value | IQ (age 6–12) | p-Value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mothers | | | | | | | | | | | |
| Age | ≥25 y | 123 | 0.88 ± 0.36 | 0.45 | 96.22 ± 14.12 | 0.50 | 88 | 0.89 ± 0.37 | 0.98 | 95.75 ± 11.64 | 0.80 |
| | <25 y | 164 | 0.92 ± 0.36 | | 97.37 ± 14.43 | | 123 | 0.89 ± 0.35 | | 96.15 ± 10.76 | |
| Education | <12 y | 153 | 0.91 ± 0.4 | 0.92 | 94.22 ± 14.23 | 0.001 | 111 | 0.87 ± 0.37 | 0.53 | 93.09 ± 10.54 | <0.001 |
| | 12 y | 97 | 0.89 ± 0.34 | | 98.56 ± 14.46 | | 70 | 0.93 ± 0.35 | | 98.29 ± 10.72 | |
| | >12 y | 37 | 0.89 ± 0.42 | | 103.49 ± 11.21 | | 30 | 0.85 ± 0.31 | | 101.3 ± 11.16 | |
| Marital status | Married | 201 | 0.90 ± 0.37 | 0.81 | 96.40 ± 14.46 | 0.39 | 62 | 0.90 ± 0.35 | 0.79 | 96.55 ± 11.06 | 0.63 |
| | Other | 86 | 0.91 ± 0.33 | | 98.00 ± 13.88 | | 149 | 0.88 ± 0.36 | | 95.74 ± 11.16 | |
| Smoking | Ever smoker | 141 | 0.90 ± 0.36 | 0.80 | 97.77 ± 13.9 | 0.30 | 102 | 0.90 ± 0.36 | 0.56 | 97.21 ± 10.7 | 0.12 |
| | Nonsmoker | 146 | 0.91 ± 0.35 | | 96.01 ± 14.63 | | 109 | 0.87 ± 0.35 | | 94.83 ± 11.41 | |
| HOME score^b | Mid-low (≤30) | 49 | 0.88 ± 0.37 | 0.47 | 90.73 ± 13.36 | <0.001 | 32 | 0.87 ± 0.36 | 0.85 | 89.88 ± 8.45 | 0.011 |
| | High (>30) | 137 | 0.92 ± 0.38 | | 99.29 ± 14.61 | | 92 | 0.88 ± 0.38 | | 99.05 ± 11.65 | |
| Maternal IQ | Mid-low (≤85) | 116 | 0.95 ± 0.35 | 0.09 | 93.16 ± 15.04 | <0.001 | 86 | 0.92 ± 0.36 | 0.23 | 91.26 ± 9.72 | <0.001 |
| | High (>85) | 171 | 0.87 ± 0.36 | | 99.4 ± 13.21 | | 125 | 0.86 ± 0.35 | | 99.23 ± 10.87 | |
| Children | | | | | | | | | | | |
| Sex | Boy | 127 | 0.94 ± 0.36 | 0.09 | 93.93 ± 13.98 | 0.002 | 95 | 0.96 ± 0.38 | 0.008 | 96.82 ± 12.02 | 0.32 |
| | Girl | 160 | 0.87 ± 0.36 | | 99.22 ± 14.12 | | 116 | 0.83 ± 0.32 | | 95.29 ± 10.31 | |
| Birthweight | ≥3.5 kg | 241 | 0.91 ± 0.36 | 0.57 | 96.52 ± 14.36 | 0.33 | 201 | 0.89 ± 0.36 | 0.88 | 95.66 ± 11.29 | 0.58 |
| | <3.5 kg | 46 | 0.87 ± 0.35 | | 98.76 ± 13.88 | | 10 | 0.88 ± 0.34 | | 97.38 ± 9.42 | |
| Gestational age | ≤39 wk | 192 | 0.90 ± 0.35 | 0.90 | 96.66 ± 14.23 | 0.716 | 146 | 0.89 ± 0.36 | 0.712 | 95.71 ± 11.62 | 0.65 |
| | >39 wk | 95 | 0.90 ± 0.37 | | 97.32 ± 14.46 | | 65 | 0.88 ± 0.34 | | 96.58 ± 9.91 | |
| First child | Yes | 96 | 0.91 ± 0.38 | 0.75 | 99.97 ± 12.87 | 0.009 | 68 | 0.88 ± 0.36 | 0.91 | 97.00 ± 11.00 | 0.36 |
| | No | 191 | 0.90 ± 0.35 | | 95.32 ± 14.73 | | 143 | 0.89 ± 0.36 | | 95.50 ± 11.17 | |
| CUF sg^c | ≥0.80 mg/L | | | | | | 112 | 0.86 ± 0.32 | 0.49 | 96.80 ± 11.16 | 0.37 |
| | <0.80 mg/L | | | | | | 77 | 0.90 ± 0.38 | | 95.37 ± 10.31 | |

Regression Models of GCI

Before adjustment, a 0.5 mg / L increase in MUF cr was negatively associated with GCI at 4 y old [mean score − 3.76 ; 95% confidence interval (CI): − 6.32 , − 1.19 ] (Table 4). The association was somewhat attenuated after adjusting for the main covariates (model A, − 3.15 ; 95% CI: − 5.42 , − 0.87 ). The smooth plot of the association between GCI and maternal prenatal urinary fluoride from an adjusted GAM model suggested a linear relation over the exposure distribution (Figure 2).
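Because the coefficients are reported per 0.5 mg/L, readers comparing them with estimates expressed per 1 mg/L need to rescale both the point estimate and the interval. The sketch below does this for the adjusted GCI estimate and also back-calculates an approximate standard error from the interval width. It assumes a symmetric normal-approximation (Wald) interval, which is our assumption and may differ slightly from how the intervals were actually computed; the function names are illustrative.

```python
Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def rescale(beta, lower, upper, factor):
    """Multiply an estimate and its interval bounds by a unit-change factor."""
    return tuple(round(v * factor, 2) for v in (beta, lower, upper))


def se_from_ci(lower, upper, z=Z_95):
    """Approximate standard error implied by a symmetric Wald interval."""
    return (upper - lower) / (2 * z)


# Adjusted (model A) GCI estimate per 0.5 mg/L, as reported in the text.
per_half_mg = (-3.15, -5.42, -0.87)

per_one_mg = rescale(*per_half_mg, factor=2)  # same estimate per 1 mg/L
se_half_mg = se_from_ci(per_half_mg[1], per_half_mg[2])

print(per_one_mg, round(se_half_mg, 2))
```

The rescaling is only a change of units and is meaningful to the extent that the association is linear over the exposure range.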

Figure 2. Adjusted association of maternal creatinine-adjusted urinary fluoride ( MUF cr ) and General Cognitive Index (GCI) scores in children at age 4 y. Adjusted for gestational age, weight at birth, sex, parity (being the first child), age at outcome measurement, and maternal characteristics including smoking history (ever smoked vs. nonsmoker), marital status (married vs. others), age at delivery, IQ, education, and cohort (Cohort 3-Ca, Cohort 3-placebo and Cohort 2A). Shaded area is 95% confidence interval. Short vertical bars on the x-axis reflect the density of the urinary fluoride measures. Individual data points are individual observations, n = 287 .

Table 4 Multivariate regression models: unadjusted and adjusted differences in GCI and IQ per 0.5 mg/L higher maternal creatinine-adjusted urinary fluoride (MUF cr).

| Estimate | GCI: n | GCI: β (95% CI) | GCI: p-Value | IQ: n | IQ: β ± S.E. (95% CI) | IQ: p-Value |
|---|---|---|---|---|---|---|
| Unadjusted | 287 | −3.76 (−6.32, −1.19) | <0.01 | 211 | −2.37 (−4.45, −0.29) | 0.03 |
| Model A^a | 287 | −3.15 (−5.42, −0.87) | 0.01 | 211 | −2.50 (−4.12, −0.59) | 0.01 |
| Model A − HOME | 138 | −3.63 (−6.48, −0.78) | <0.01 | 124 | −2.36 (−4.48, −0.24) | 0.03 |
| Model A + HOME | 138 | −3.76 (−7.08, −0.45) | 0.03 | 124 | −2.49 (−4.65, −0.33) | 0.02 |
| Model A − CUF sg | | | | 189 | −1.79 (−3.80, 0.22) | 0.08 |
| Model A + CUF sg | | | | 189 | −1.73 (−3.75, 0.29) | 0.09 |
| Model A − SES | 188 | −4.55 (−7.23, −1.88) | 0.01 | 199 | −2.10 (−4.02, −0.18) | 0.03 |
| Model A + SES | 188 | −4.45 (−7.08, −1.81) | 0.01 | 199 | −2.10 (−4.06, −0.15) | 0.04 |
| Model A − Pb | 167 | −5.57 (−8.48, −2.66) | <0.01 | 177 | −3.21 (−5.17, −1.24) | <0.01 |
| Model A + Pb | 167 | −5.63 (−8.53, −2.72) | <0.01 | 177 | −3.22 (−5.18, −1.25) | <0.01 |
| Model A − Hg | 141 | −7.13 (−10.26, −4.01) | <0.01 | 149 | −4.59 (−7.00, −2.17) | <0.01 |
| Model A + Hg | 141 | −7.03 (−10.19, −3.88) | <0.01 | 149 | −4.58 (−6.99, −2.16) | <0.01 |
| Model A − Ca | 194 | −3.67 (−6.57, −0.77) | 0.01 | 136 | −3.23 (−5.88, −0.57) | 0.02 |

Regression Models of IQ

A 0.5 mg / L increase in prenatal fluoride was also negatively associated with IQ at age 6–12 y based on both unadjusted ( − 2.37 ; 95% CI: − 4.45 , − 0.29 ) and adjusted models ( − 2.50 ; 95% CI: − 4.12 , − 0.59 ) (Table 4). However, estimates from the adjusted GAM model suggest a nonlinear relation, with no clear association between IQ scores and values below approximately 0.8 mg / L , and a negative association above this value (Figure 3A). There was a nonsignificant improvement in the fit of the model when a quadratic term was added to the linear model ( p = 0.10 ).
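As a notational sketch, the three specifications compared in this paragraph can be written as follows. The symbols are illustrative labels introduced here, not notation from the study: F_i is MUF cr (mg/L) for child i, z_i is the vector of adjustment covariates, and s(·) is a smooth function estimated from the data.

```latex
\begin{align}
\text{Linear:}\quad
  & \mathrm{IQ}_i = \beta_0 + \beta_1 F_i
    + \boldsymbol{\gamma}^{\top}\mathbf{z}_i + \varepsilon_i \\
\text{Quadratic:}\quad
  & \mathrm{IQ}_i = \beta_0 + \beta_1 F_i + \beta_2 F_i^{2}
    + \boldsymbol{\gamma}^{\top}\mathbf{z}_i + \varepsilon_i \\
\text{GAM:}\quad
  & \mathrm{IQ}_i = \beta_0 + s(F_i)
    + \boldsymbol{\gamma}^{\top}\mathbf{z}_i + \varepsilon_i
\end{align}
```

Under the linear model, the reported difference per 0.5 mg/L corresponds to 0.5 β_1. The quadratic-term test asks whether β_2 = 0, and the GAM relaxes the functional form altogether, which is how a threshold-like pattern could appear in the smooth plot.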

Figure 3. (A) Adjusted association of maternal creatinine-adjusted urinary fluoride (MUF cr) and children’s IQ at age 6–12 y. Adjusted for gestational age, weight at birth, sex, parity (being the first child), age at outcome measurement, and maternal characteristics including smoking history (ever smoked vs. nonsmoker), marital status (married vs. others), age at delivery, IQ, education, and cohort (Cohort 3-Ca, Cohort 3-placebo and Cohort 2A). Short vertical bars on the x-axis reflect the density of the urinary fluoride measures. Individual data points are individual observations, n = 211. (B) Association of maternal creatinine-adjusted urinary fluoride (MUF cr) and children’s IQ at age 6–12 y, adjusted for specific gravity–adjusted child urinary fluoride (CUF sg). Adjusted for gestational age, weight at birth, sex, parity (being the first child), age and CUF sg at outcome measurement, and maternal characteristics including smoking history (ever smoked vs. nonsmoker), marital status (married vs. others), age at delivery, IQ, education, and cohort (Cohort 3-Ca, Cohort 3-placebo and Cohort 2A). Shaded area is 95% confidence interval. Short vertical bars on the x-axis reflect the density of the urinary fluoride measures. Individual data points are individual observations, n = 189.

Sensitivity Analyses

In sensitivity analyses, adjustment for HOME score increased the magnitude of the association between MUF cr and GCI, though the difference was less pronounced when associations with and without adjustment for HOME score were both estimated after restricting the model to the subset of 138 children with HOME score data (Table 4). The association of IQ scores with MUF cr did not substantially change after adding HOME score to the model (Table 4).

The association between MUF cr and IQ was attenuated slightly after adjusting for contemporaneous children’s urinary fluoride (CUF sg) when comparing estimates with [−1.73 (95% CI: −3.75, 0.29)] and without [−1.94 (95% CI: −4.15, 0.26)] adjustment for CUF sg among the 189 children with these data (Table 4). In addition, the evidence of nonlinearity was more pronounced, with no clear evidence of an association for MUF cr < 1.0 mg/L based on the GAM model (Figure 3B), and a significant improvement in model fit when a quadratic term was added to the linear regression model (p = 0.01).

When we restricted models to subsets of children with available data for each additional covariate, there was little difference between adjusted and unadjusted associations between MUF cr and GCI or IQ when socioeconomic status (family possessions), maternal bone lead, and blood mercury were added to models (Table 4). However, the effect estimates associated with MUF cr appear to be larger in magnitude in the subsets with available data for these variables than in the full analytic samples.

Adding tester (the psychologist who administered the WASI) to the model did not substantially change the results (data not shown). In the sensitivity analyses in which we excluded Cohort 3 participants who received the calcium supplement, we continued to observe a negative association between MUF cr and GCI [0.5 mg/L increase in MUF cr associated with 3.67-point lower GCI (95% CI: −6.57, −0.77), n = 194] and between MUF cr and IQ [0.5 mg/L increase in MUF cr associated with 3.23-point lower IQ (95% CI: −5.88, −0.57), n = 136].

In sensitivity analyses in which we re-ran models that included the 10 outliers with respect to fluoride exposure (for each of seven participants already in our models, an additional value of MUF cr [from a different trimester]; for three participants, a value of MUF cr that then allowed the participants to be added to our models), the results did not change in any meaningful way (data not shown). There were no outliers with respect to cognitive outcomes.

Independent Influence of Child Fluoride Exposure

Finally, in models that focused on the cross-sectional relationship between children’s exposure to fluoride (reflected by their specific gravity–adjusted urinary fluoride levels) and IQ score and that contained the main covariates of interest, there was no clear, statistically significant association between contemporaneous children’s urinary fluoride (CUF sg) and IQ, either without or with adjustment for MUF cr. A 0.5 mg/L increase in CUF sg was associated with a 0.89-point lower IQ (95% CI: −2.63, 0.85) when not adjusting for MUF cr, and a 0.77-point lower IQ (95% CI: −2.53, 0.99) when adjusting for MUF cr (n = 189).

Discussion

In our study population of Mexican women and children, which accounted for two of the three cohorts included in the ELEMENT study, higher prenatal exposure to fluoride (as indicated by average creatinine-adjusted maternal urinary fluoride concentrations during pregnancy) was associated with lower GCI scores in children at approximately 4 y old, and with lower Full-Scale IQ scores at 6–12 y old. Estimates from adjusted linear regression models suggest that mean GCI and IQ scores were about 3 and 2.5 points lower in association with a 0.5 mg / L increase in prenatal exposure, respectively. The associations with GCI appeared to be linear across the range of prenatal exposures, but there was some evidence that associations with IQ may have been limited to exposures above 0.8 mg / L . In general, the negative associations persisted in sensitivity analyses with further adjustment for other potential confounders, though the results of sensitivity analyses were based on subsets of the population with available data.

Overall, our results are somewhat consistent with the ecological studies suggesting children who live in areas with high fluoride exposure (ranging from 0.88 to 11.0 mg / L fluoride in water, when reported) have lower IQ scores than those who live in low-exposure or control areas (ranging from 0.20 to 1.0 mg / L fluoride in water) (Choi et al. 2012) and with results of a pilot study of 51 children (mean age 7 y) from southern Sichuan, China, that reported that children with moderate or severe dental fluorosis (60% of the study population) had lower WISC-IV digit span scores than other children (Choi et al. 2015). A distinction is that our study, which was longitudinal with repeated measures of exposure beginning in the prenatal period, found associations with respect to prenatal fluoride exposures.

To our knowledge, the only other study that is similar to ours was only recently published. Valdez Jiménez et al. (2017) studied the association of prenatal maternal urinary fluoride levels (not corrected for dilution) and scores on the Bayley Scales of Infant Development II among 65 children evaluated at age 3–15 mo (average of 8 mo). The mean urinary fluoride levels of the mothers in their study at each of the three trimesters of pregnancy (1.9, 2.0, 2.7 mg/L) were higher than the mean MUF cr in our participants (0.88 mg/L) (Valdez Jiménez et al. 2017). These levels of exposure were found to be associated with statistically significantly lower scores on the Bayley Scales’ Mental Development Index (MDI) after adjusting for gestational age, age of child, a marginality index, and type of drinking water (Valdez Jiménez et al. 2017). By comparison, our study had much longer periods of follow-up and larger sample sizes, controlled for a much larger set of covariates and sensitivity variables, and used creatinine-corrected urinary fluoride measures (which, by adjusting for urinary dilution effects, provide a more reliable measure of internal fluoride exposure).

With respect to understanding the generalizability of our findings to other populations, there are very few studies that measured prenatal fluoride levels among women derived from population-based samples. Gedalia et al. (1959) measured urinary fluoride in multiple samples collected from each of 117 healthy pregnant women living in Jerusalem, where fluoride in the water was approximately 0.50 mg/L, and reported mean levels per person that ranged from 0.29 to 0.53 mg/L. However, these analyses were not conducted using modern analytical techniques. Opydo-Szymaczek and Borysewicz-Lewicka (2005) measured urinary fluoride in 31 healthy pregnant women who were patients of a maternity hospital in Poland, where fluoride in the water ranged from 0.4 to 0.8 mg/L, and found a mean level of 0.65 mg/L for women in their 28th week of pregnancy, 0.84 mg/L in their 33rd week, and 1.30 mg/L in healthy nonpregnant women of similar age. This would suggest that the mothers in our study, who had a mean MUF cr value of 0.90 mg/L, had fluoride exposures slightly higher than those of the pregnant women in these earlier studies.

In terms of comparing our findings with other studies of fluoride (using urinary fluoride as a biomarker of exposure) and intelligence (i.e., those not involving prenatal exposures), of the 27 epidemiologic studies on fluoride and IQ reviewed by Choi et al. in their 2012 meta-analysis, only 2 had measures of urinary fluoride. Both measured urinary fluoride in children (not pregnant mothers), and neither corrected for dilution (by urinary creatinine or specific gravity). In comparison with the urinary fluoride levels of both our mothers (0.88 mg/L) and our children (0.82 mg/L), the mean levels of children’s urinary fluoride in these two studies were higher in the non-fluorosis (1.02 mg/L) and high-fluorosis (2.69 mg/L) groups found by Li et al. (1995) as well as in the control (1.5 mg/L) and high-fluorosis (5.1 mg/L) groups described by Wang et al. (2007).

One limitation of our study is that we measured fluoride in spot (second morning void) urine samples instead of 24-hr urine collections. However, others have noted a close relationship between the fluoride concentrations of early morning samples and 24-hr specimens (Watanabe et al. 1994; Zohouri et al. 2006). Another limitation relates to the potential differences in the distribution of covariates over our study cohorts, raising the issue of potential bias. In the analyses we conducted across cohorts, we saw that, in comparison with Cohort 3, Cohort 2A clearly had higher mean bone lead levels (p < 0.001) and possibly higher blood mercury levels (p = 0.067). However, we saw no other differences, and the differences in these measures have a likely explanation: Cohort 2A had bone lead levels measured in 1997–2001 and Cohort 3 had bone lead levels measured in 2001–2005. Given that environmental lead and mercury exposures were steadily decreasing during this time interval (due to the phase-out of lead from gasoline), this difference likely relates to an exposure–time–cohort effect. We do not anticipate that this phenomenon would have introduced a bias in our analyses of fluoride and cognition controlling for bone lead.

Another limitation relates to the missing data that pertain to our covariate and sensitivity variables. In the comparisons of participants in relation to missing data (Table 2A,B), the proportion of females was somewhat higher in the included versus excluded group for both the GCI and IQ analyses, and the mean levels of maternal blood Hg for those included were 28.5% and 24.9% higher than the mean levels for those excluded in the GCI and IQ analyses, respectively. We also note that the coefficients for the associations between fluoride and cognition varied substantially in some of the sensitivity analyses, particularly with respect to the subgroups of participants who have data on SES, lead exposure, and mercury exposure (for the latter, the effect estimates almost doubled). We do not have a ready explanation for this phenomenon, given that there is no obvious way that each of the selection factors governing which mothers had these measurements (discussed above) could have influenced the fluoride–cognition relationship. Nevertheless, it is not possible to entirely rule out residual confounding in the population as a whole (that might have been detected had we had full data on larger sample sizes) or bias (should the subpopulations that had the data for analysis have a different fluoride–cognition relationship than those participants who were excluded from the analyses).

Other limitations include the lack of information about iodine in salt, which could modify associations between fluoride and cognition; the lack of data on fluoride content in water given that determination of fluoride content is not reported as part of the water quality monitoring programs in Mexico; and the lack of information on other environmental neurotoxicants such as arsenic. We are not aware of evidence suggesting our populations are exposed to significant levels of arsenic or other known neurotoxicants; nevertheless, we cannot rule out the potential for uncontrolled confounding due to other factors, including diet, that may affect urinary fluoride excretion and that may be related to cognition.

Another potential limitation is that we adjusted maternal urinary fluoride levels based on urinary creatinine, whereas we adjusted children’s urinary fluoride levels based on urinary specific gravity; however, these two methods are almost equivalent in their ability to account for urinary dilution. We also had no data to assess the inter-examiner reliability of the testers administering the WASI test; however, the excellent reliability of these same testers in administering the McCarthy tests provides some reassurance that the WASI tests were conducted in a consistent manner.

Finally, our ability to extrapolate our results to how exposures may impact on the general population is limited given the lack of data on fluoride pharmacokinetics during pregnancy. There are no reference values for urinary fluoride in pregnant women in the United States. The Centers for Disease Control and Prevention has not included fluoride as one of the population exposures measured in urine or blood samples in its nationally representative sampling. The WHO suggests a reference value of 1 mg / L for healthy adults when monitoring renal fluoride excretion in community preventive programs (Marthaler 1999). As part of the NRC’s review of the fluoride drinking-water standard, it was noted that healthy adults exposed to optimally fluoridated water had urinary fluoride concentrations ranging from 0.62 to 1.5 mg / L .

Conclusion

In this study, higher levels of maternal urinary fluoride during pregnancy (a proxy for prenatal fluoride exposure) that are in the range of levels of exposure in other general population samples of pregnant women as well as nonpregnant adults were associated with lower scores on tests of cognitive function in the offspring at 4 and 6–12 y old.

Community water and salt fluoridation and fluoride toothpaste use substantially reduce the prevalence and incidence of dental caries (Jones et al. 2005) and are acknowledged as a public health success story (Easley 1995). Our findings must be confirmed in other study populations, and additional research is needed to determine how the urine fluoride concentrations measured in our study population are related to fluoride exposures resulting from both intentional supplementation and environmental contamination. However, our findings, combined with evidence from existing animal and human studies, reinforce the need for additional research on potential adverse effects of fluoride, particularly in pregnant women and children, and the need to ensure that the benefits of population-level fluoride supplementation outweigh any potential risks.