Data

LSAC is an internationally recognized and widely used longitudinal cohort survey of child health and development33, 34. There are multiple cohorts of LSAC that are designed for examining specific development periods. In this study, we use the LSAC ‘B cohort’, which has an initial sample of around 5,100 infants born between March 2003 and February 2004, drawn randomly from all registered national births at the time. The initial survey, conducted between 6 months and a year after birth, contains rich perinatal information about the mother and study child along with detailed information on the parents’ background and current life circumstances. Following the initial survey, children and their primary care giver are surveyed biennially. At the time of analysis, data were only available up until wave 5 (2014) of the survey, when the children were aged 8–9.

Of the initial 5,100 LSAC participants, we omit around 1,300 from the sample because of missing information by wave 5, due mainly to survey attrition. To examine the potential impact of attrition, for two measures that are observed in wave 3 (Peabody Picture Vocabulary and Who Am I?), we compare relations estimated using the full sample available at wave 3 with relations estimated at wave 3 on a restricted sample that remains in the survey until wave 5 (n = 3,666). Using a two-sample t-test, we find no evidence that results estimated at wave 3 are different for the full and restricted samples (\({\chi }^{2}\) = 0.54, p-value = 0.46 and \({\chi }^{2}\) = 1.64, p-value = 0.20 respectively). The implication is that non-random attrition does not appear to be seriously biasing our results.

Whether participants in LSAC are cesarean born is identified in the initial survey by the primary care giver’s response to the question, “was…the type of birth/ delivery method a cesarean?” In our sample, approximately 30% of primary care givers report that their children were cesarean-born (between March 2003 and February 2004). To the extent that there is a stigma associated with elective procedures, a potential concern is that the rate of cesarean birth may be under-reported in LSAC (social desirability bias). However, this is not borne out in national statistics from hospital records that show comparable rates of cesarean birth — 28% and 30% for 2003 and 2004 respectively35, 36. Compared to other OECD countries, Australia’s rate is similar to that of the United States and Germany, but lower than in Italy (38%) and Brazil (54%) and higher than Finland, Norway and Sweden (around 17%)37.

Data availability

No data were specifically collected for this project. The data used were previously collected by the Australian Institute of Family Studies (AIFS), and the collection was conducted in full accordance with all relevant guidelines and regulations as spelt out in the AIFS Ethics Committee approval. This includes obtaining consent for survey participation and use of anonymized information for research purposes, as approved by AIFS. AIFS provided anonymized data for this project under an individual license agreement. In the analysis and in preparing the manuscript, we have adhered to all requirements under the license agreement. Data can be made available upon request to the corresponding author, subject to approval from AIFS.

Outcomes

Measures of cognitive development

There are two types of cognitive development measures in LSAC. The first are scores from three interviewer-administered tests conducted between ages 4 and 9 that are part of the LSAC survey and the second are scores from a national standardized test in numeracy and literacy at age 8–9 that are matched to LSAC participants. All cognitive measures are age-normalized and standardized with respect to the weighted sample mean and standard deviation (descriptive statistics in Table 1 are age normalized, but unstandardized).
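As an illustration of the standardization step described above, the following minimal Python sketch rescales a score by a weighted sample mean and standard deviation. The function name, the toy scores and the weights are hypothetical, and the use of the population form of the weighted variance is an assumption; the LSAC weighting scheme itself is not reproduced here.

```python
from math import sqrt


def weighted_standardize(scores, weights):
    """Return z-scores based on the weighted mean and weighted (population) SD."""
    total_weight = sum(weights)
    mean = sum(w * x for x, w in zip(scores, weights)) / total_weight
    variance = sum(w * (x - mean) ** 2 for x, w in zip(scores, weights)) / total_weight
    sd = sqrt(variance)
    return [(x - mean) / sd for x in scores]


# Hypothetical age-normalized test scores and survey weights.
scores = [60.0, 64.0, 70.0, 58.0]
weights = [1.0, 2.0, 1.0, 1.0]
z = weighted_standardize(scores, weights)
```

By construction, the standardized scores have a weighted mean of zero and a weighted variance of one, so estimated coefficients can be read in standard deviation units.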

Table 1. Sample mean values for main control variables in LSAC B cohort.

The interviewer-administered cognitive tests from LSAC B are the Peabody Picture Vocabulary Test (PPVT)38, Who Am I? (WAI)39 and the Matrix Reasoning test (MR)40. The PPVT is an age-appropriate vocabulary test designed to measure a child’s knowledge of the meaning of spoken words and their comprehension and ability to respond. The test was carried out in the survey waves when the children were aged 4–5, 6–7 and 8–9. WAI is an assessment of the child’s readiness for school and measures the child’s ability to perform a range of tasks, such as reading, writing, copying, and symbol recognition. WAI was only carried out when the children were aged 4–5. Finally, MR is a test of problem-solving ability, based on the Wechsler Intelligence Scale for Children. The test is age-appropriate and was conducted when the children were aged 6–7 and 8–9.

The national standardized tests are from the National Assessment Program for Literacy and Numeracy (NAPLAN). NAPLAN is conducted in all Australian schools to measure performance in numeracy and literacy at grades 3, 5, 7 and 9, corresponding to ages 8–9, 10–11, 12–13 and 14–15. In the case of literacy, performance is measured over the domains of reading, writing, grammar and spelling. Each student’s performance in NAPLAN, including their national ranking compared to similar-age students, is made available to schools and students to monitor performance. At the time of analysis, only NAPLAN data for grade 3 were available for the LSAC B cohort. For students who were not in grade 3 at the time of testing, usually because they commenced school at a later age than the rest of the cohort, we imputed their values. The imputation process involved two steps. In the first step, we used the sample with observed grade 3 NAPLAN scores to estimate regression model relations between grade 3 NAPLAN scores and personal information (such as other interviewer-administered cognitive tests). In the second step, we used the results from the first step to generate predicted NAPLAN values for those with missing scores, that is, we applied the estimated regression model relations to the personal characteristics of those with missing NAPLAN scores41. Imputing these values instead of omitting them makes little difference to our analysis (see Table S1 of the online supporting material).
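The two-step imputation described above can be sketched as follows. This is a simplified illustration only: it assumes a single predictor (a hypothetical PPVT score) rather than the full set of personal characteristics, and the function names and toy values are not taken from the study.

```python
def fit_simple_ols(x, y):
    """Step 1: estimate intercept and slope on the cases with observed scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def impute_missing(predictor, outcome):
    """Step 2: replace missing outcomes (None) with regression predictions."""
    observed = [(xi, yi) for xi, yi in zip(predictor, outcome) if yi is not None]
    intercept, slope = fit_simple_ols(
        [xi for xi, _ in observed], [yi for _, yi in observed]
    )
    return [
        yi if yi is not None else intercept + slope * xi
        for xi, yi in zip(predictor, outcome)
    ]


# Hypothetical data: the last child has no grade 3 NAPLAN score.
ppvt = [60.0, 65.0, 70.0, 75.0, 68.0]
naplan = [400.0, 420.0, 440.0, 460.0, None]
completed = impute_missing(ppvt, naplan)
```

Observed scores are left untouched; only the missing value is replaced by the fitted prediction from the regression estimated on complete cases.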

Mediators

In this study we measure the extent to which any relation between cesarean birth and cognitive development is mediated by lower rates of breastfeeding and adverse child and maternal health outcomes. We selected these variables because they have been previously associated with cesarean birth and are available in the data9,10,11,12,13,14,15,16,17,18,19. Adverse child health outcomes include measures of obesity and care giver-reported diagnosis of asthma, ADD or ASD. While measures of asthma and obesity are available at the time of cognitive testing, measures for ADD and ASD are only available at ages 6–7 and 8–9 years. Because only around two-thirds of care givers answer questions related to ADD and ASD at age 6–7, we use information at age 8–9 years when the response rate is much higher. Obesity is measured by whether the child’s Body Mass Index (BMI) is above the upper limit of the normal range of 19.3 for those aged 4–7 years and 23 for those aged 8–9 years. Breastfeeding and maternal health measures are from the initial sample (6–12 months after birth). Breastfeeding is a binary measure of whether the child was breastfed at three months or not. Two self-reported postnatal maternal health measures were used: whether any depressive symptoms were experienced in the last 4 weeks (a score below 3 on a 6-point Kessler scale) and whether general health was reported as fair to poor (4 or 5 on a 5-point scale, where 1 is excellent health).
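The obesity indicator described above amounts to an age-specific BMI cutoff. A minimal sketch, assuming age is recorded in whole years and using a hypothetical function name, is:

```python
def is_obese(bmi, age_years):
    """Binary obesity indicator using the age-specific BMI cutoffs in the text."""
    if 4 <= age_years <= 7:
        cutoff = 19.3
    elif 8 <= age_years <= 9:
        cutoff = 23.0
    else:
        raise ValueError("cutoffs are only stated for ages 4 to 9")
    # "Above the upper limit" is read here as a strict inequality.
    return bmi > cutoff
```

Under these cutoffs, a BMI of 20 would be coded as obese for a 5-year-old but not for an 8-year-old.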

Controls

The analysis includes over 20 confounders grouped into two main categories (Table 1): those related to perinatal risk factors and those related to the socio-economic advantage associated with cesarean-born children in Australia. Perinatal risk factors include the taking of medication during pregnancy for blood pressure or diabetes (proxies for pre-eclampsia and gestational diabetes respectively); the taking of antibiotic medication (a proxy for bacterial infection, which may also affect the development of the infant’s gut microbiome); a dummy variable for low birth weight (coded 1 if less than 2.5 kg; 0 otherwise); weeks of gestation; maternal age at birth; a dummy variable for multiple infant pregnancy; length and head circumference of the baby (z-scores); a dummy variable for whether the baby was conceived using IVF treatment; and a gender dummy. We include taking antibiotic medication as a control because it has been associated with changes to the infant’s gut microbiome42 and possibly the risk of cesarean birth, which means failure to control for it will lead to bias due to unobserved confounding. However, we stress that our results are not sensitive to the inclusion of this control, or other child and maternal health risks in the data (refer to results for the ‘low-risk privately insured’ group in Table 2). Length and head circumference z-scores are based on Centers for Disease Control and Prevention (CDC) growth charts and are age and gender-adjusted. We also estimated a model with an alternative treatment of low birth weight (below 1.5 kg) and a model on a sub-sample of births with gestational ages of 37–40 weeks using binary dummy variable controls for each gestational week (excluding the reference category of 40 weeks). Results for these alternative models are much the same as those reported in Table 2 (available upon request from the corresponding author).

Table 2. OLS regression estimates and Oster lower-bound estimates of the relations between cesarean birth and child cognitive development (standard deviations).

Postnatal interventions such as the use of a ventilator and the use of intensive care were not included as controls because they may be considered an outcome of delivery mode. We also refrain from including prenatal risk factors available in the data that have a high rate of missing observations. These include an identifier for whether the mother is a regular smoker, maternal average consumption of more than two standard drinks a day and maternal body mass index outside of normal range (18.5 to 25). Excluding postnatal interventions and prenatal risk factors with high rates of missing observations makes little difference to the results (see Table S2 of the online supporting material for model results with these factors included as additional controls).

In general, descriptive statistics of the control variables presented in Table 1 show that cesarean birth is associated with higher perinatal risk factors and a socio-economic advantage, especially higher maternal education, fewer older siblings, lower rates of un-partnered birth and higher rates of private health insurance. Despite Australia’s universally available free public health system, many high income earners in Australia choose to hold private health insurance for two reasons. First, it provides a greater coverage of medical treatments, especially for allied health services; and second, it enables them to circumvent the payment of an income-contingent levy that helps meet the cost of the public health system. Important in the context of this study, maternal requested cesarean birth (without any medical risk factors) is not covered by the public health system. While mothers without private health insurance can still elect for cesarean birth in a public hospital, this is uncommon because they would incur all medical costs. It is much more common for elective cesareans to occur in private hospitals under the cover of private health insurance. This explains the 11 percentage point higher rate of private health insurance among cesarean-born children than among vaginally born children.

Statistical method

In our main analysis, we use OLS multivariate regression models for each of the cognitive measures to estimate the relation between cesarean birth and cognitive development. These models are of the following form:

$${Y}_{i}={\gamma }_{0}+{\gamma }_{1}C{S}_{i}+SE{S}_{i}{\gamma }_{2}+P{N}_{i}{\gamma }_{3}+{\upsilon }_{i}$$ (1)

where \(C{S}_{i}\) equals one if child i was cesarean-born and zero otherwise, \(SE{S}_{i}\) and \(P{N}_{i}\) are vectors of family socio-economic status and perinatal characteristics respectively (from Table 1) and \({\upsilon }_{i}\) is an error term.
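To make the structure of equation (1) concrete, the following minimal Python sketch estimates an OLS model by solving the normal equations. It is an illustration under simplifying assumptions: the helper names `solve_linear` and `ols` are hypothetical, a single socio-economic covariate stands in for the \(SE{S}_{i}\) and \(P{N}_{i}\) vectors, and the toy data are constructed without noise so that the true coefficients are known.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Return OLS coefficients [intercept, slopes...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(design[0])
    xtx = [[sum(d[a] * d[b] for d in design) for b in range(k)] for a in range(k)]
    xty = [sum(d[a] * yi for d, yi in zip(design, y)) for a in range(k)]
    return solve_linear(xtx, xty)


# Hypothetical data: each row is (cesarean indicator, one SES covariate).
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 3)]
# Outcome generated as 1.0 - 0.2 * CS + 0.5 * SES with no error term.
y = [1.0 - 0.2 * cs + 0.5 * ses for cs, ses in rows]
gamma = ols(rows, y)  # gamma[1] plays the role of gamma_1 in equation (1)
```

In applied work a statistical package would normally be used instead, since it also provides standard errors and handles survey weights.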

To try to explain the importance of possible channels, we measure the extent to which \({\gamma }_{1}\) is mediated by lower rates of breastfeeding and adverse child and maternal health outcomes in the data. Mediating effects are estimated using the product of the coefficients method43.
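The product of the coefficients method can be illustrated with a minimal Python sketch for a single mediator. This is a simplified version that omits all control variables and uses closed-form two-regressor OLS formulas; the function names and toy data are hypothetical. Here `a` is the coefficient from regressing the mediator on cesarean birth, `b` is the coefficient on the mediator when the outcome is regressed on both cesarean birth and the mediator, and the mediated (indirect) effect is their product.

```python
def centered_cross_product(u, v):
    """Sum of products of deviations from the means of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))


def product_of_coefficients(cs, mediator, outcome):
    """Decompose the total effect of cs into direct and mediated parts."""
    s_cc = centered_cross_product(cs, cs)
    s_mm = centered_cross_product(mediator, mediator)
    s_cm = centered_cross_product(cs, mediator)
    s_cy = centered_cross_product(cs, outcome)
    s_my = centered_cross_product(mediator, outcome)
    det = s_cc * s_mm - s_cm ** 2
    a = s_cm / s_cc                              # mediator on cs
    b = (s_my * s_cc - s_cm * s_cy) / det        # outcome on mediator, given cs
    direct = (s_cy * s_mm - s_cm * s_my) / det   # outcome on cs, given mediator
    total = s_cy / s_cc                          # outcome on cs alone
    return {"a": a, "b": b, "indirect": a * b, "direct": direct, "total": total}


# Hypothetical data: cesarean indicator, breastfed at three months, test score.
cs = [0, 0, 0, 1, 1, 1]
breastfed = [1, 1, 0, 1, 0, 0]
score = [0.3, 0.1, -0.2, 0.2, -0.4, -0.1]
effects = product_of_coefficients(cs, breastfed, score)
```

For linear models estimated by OLS on the same sample, the total effect equals the direct effect plus the indirect effect, which gives a convenient check on the calculation.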

Identification of the main parameter of interest, \({\gamma }_{1}\) in equation (1) and the mediating effects is complicated by potential unobserved confounding, which means that \({\gamma }_{1}\) may be biased by correlation between \(C{S}_{i}\) and \({\upsilon }_{i}\). The main potential sources of unobserved confounding may be missing controls for perinatal risk factors (such as oxygen deprivation during birth) and socio-economic advantage (such as greater household income) associated with cesarean-born children. The presence of the former will lead to an over-estimate of any true negative relation between cesarean birth and cognitive development; whereas the presence of the latter will lead to an under-estimate of any negative relation.

Without the possibility for randomization, a common approach for dealing with this form of bias is instrumental variables. This method relies on the presence of factors in the data that affect cognitive development only through altering the chances of cesarean birth. We are unaware of any strong candidates in our data and we instead concentrate on testing the sensitivity of our results to bias from the presence of selection on unobserved covariates. These are discussed in detail below.

Sensitivity to unobserved perinatal confounders

We use two approaches to test the sensitivity of our OLS estimates to bias from unobserved perinatal covariates. First, we re-estimate OLS relations (using equation (1)) on a sub-sample of 2,140 births that are free of any observed health risk that may lead to cesarean birth and that are privately insured, termed the ‘low-risk privately insured’ group. The idea is that by restricting the sample to those that are more similar on observed covariates, we are reducing bias by also restricting differences in unobserved covariates. The ‘low-risk privately insured’ sub-sample comprises births that are not low birth weight (above 2,500 grams), full-term (38–40 week gestation), singleton, conceived without IVF, whose mother took no blood pressure or diabetes medication during pregnancy and whose parents were privately insured in the year of birth. These models include all other covariates included in the full sample. We omitted those without private health insurance because they are more likely than those with private insurance to have a cesarean performed for medical reasons.
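The sample restriction described above can be sketched as a simple filter. The field names below are hypothetical and do not correspond to LSAC variable names; the birth weight condition follows the definition of low birth weight used for the controls (less than 2.5 kg).

```python
def is_low_risk_privately_insured(birth):
    """Return True if a birth record meets all of the restriction criteria."""
    return (
        birth["birth_weight_g"] >= 2500            # not low birth weight
        and 38 <= birth["gestation_weeks"] <= 40   # full-term
        and not birth["multiple_birth"]            # singleton
        and not birth["ivf"]                       # conceived without IVF
        and not birth["bp_medication"]             # no blood pressure medication
        and not birth["diabetes_medication"]       # no diabetes medication
        and birth["privately_insured"]             # insured in the year of birth
    )


# Two hypothetical records: the second fails on gestation and insurance.
births = [
    {"birth_weight_g": 3400, "gestation_weeks": 39, "multiple_birth": False,
     "ivf": False, "bp_medication": False, "diabetes_medication": False,
     "privately_insured": True},
    {"birth_weight_g": 3100, "gestation_weeks": 36, "multiple_birth": False,
     "ivf": False, "bp_medication": False, "diabetes_medication": False,
     "privately_insured": False},
]
subsample = [b for b in births if is_low_risk_privately_insured(b)]
```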

In the second approach, we use the Oster32 method, which like approaches proposed by Rosenbaum44 and Altonji45, bounds the relation under assumptions about selection on unobserved covariates (see Nghiema, Nguyen, Khanam and Connelly46 and Hanushek, Schwerdt, Woessmann and Zhang47 for recent applications of the Oster method). Under the Oster method, we estimate the lower bound of the OLS relation between cesarean birth and measures of child cognitive development using the following:

$${{\rm{\gamma }}}_{1,{\rm{lower}}}={\rm{f}}({{\rm{\gamma }}}_{1}-{{\rm{\gamma }}}_{1{\rm{u}}},\,{{\rm{R}}}^{2}-{{\rm{R}}}_{{\rm{u}}}^{2};\,{{\rm{R}}}_{{\rm{\max }}}^{2}-{{\rm{R}}}^{2},\,{\rm{\delta }}).$$ (2)

Variables \({{\rm{\gamma }}}_{1{\rm{u}}}\) and \({{\rm{R}}}_{{\rm{u}}}^{2}\) are results from the ‘uncontrolled’ model, which is equation (1) estimated without perinatal and socio-economic controls; \({{\rm{R}}}_{{\rm{\max }}}^{2}\) is the theoretical maximum R-squared, or the maximum proportion of variation in the outcome variable that can be explained by the model; and \({\rm{\delta }}\) is the coefficient of proportionality, or the ratio of selection on unobserved covariates to selection on observed covariates:

$$\delta =\frac{cov({W}_{U},CS)}{cov({W}_{O},CS)}\cdot \frac{var({W}_{O})}{var({W}_{U})},$$ (3)

where \({{\rm{W}}}_{{\rm{O}}},\,{{\rm{W}}}_{{\rm{U}}}\) are vectors of linear combinations of observed and unobserved covariates weighted by their true coefficients (or \({W}_{O}=\sum _{j=1}^{{J}_{O}}{\omega }_{j}^{O}{\gamma }_{j}^{O}\) and \({W}_{U}=\sum _{j=1}^{{J}_{U}}{\omega }_{j}^{U}{\gamma }_{j}^{U}\)). Following Oster conventions, we set \({\rm{\delta }}=1\), that is, we estimate \({{\rm{\gamma }}}_{1,{\rm{lower}}}\) under the conservative assumption that selection on unobserved covariates is equal to selection on observed covariates, and we set \({{\rm{R}}}_{{\rm{\max }}}^{2}=1.3{{\rm{R}}}^{2}\). Setting \({{\rm{R}}}_{{\rm{\max }}}^{2}=1.3{{\rm{R}}}^{2}\) contrasts with the closely related Altonji45 method that assigns \({{\rm{R}}}_{{\rm{\max }}}^{2}=1\). Oster32 argues that assuming a value of 1 is unreasonable in the presence of measurement error. Under equation (2), the higher the proportion of variation in the outcome variable explained by the model (\({{\rm{R}}}^{2}\)), the smaller the discrepancy in the estimates of \({\gamma }_{1}\) and \({{\rm{\gamma }}}_{1,{\rm{lower}}}\).

For any estimated negative relation between cesarean birth and cognitive development, the larger the discrepancy between estimates of \({{\rm{\gamma }}}_{1,{\rm{lower}}}\) and \({\gamma }_{1}\), the more unreliable \({\gamma }_{1}\) is as an estimate of the true relation.
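The exact form of f(·) in equation (2) is not spelled out here. As a rough illustration only, the sketch below uses the simple linear approximation to the bias-adjusted coefficient given by Oster, which may differ from the estimator used to produce the reported bounds. The function name and the numerical inputs are hypothetical.

```python
def oster_approx_bound(gamma_c, r2_c, gamma_u, r2_u, delta=1.0, rmax_multiplier=1.3):
    """Approximate bias-adjusted coefficient under proportional selection.

    gamma_c, r2_c: coefficient and R-squared from the controlled model.
    gamma_u, r2_u: coefficient and R-squared from the uncontrolled model.
    """
    # R-squared cannot exceed 1, so the multiplied value is capped.
    r2_max = min(rmax_multiplier * r2_c, 1.0)
    return gamma_c - delta * (gamma_u - gamma_c) * (r2_max - r2_c) / (r2_c - r2_u)


# Hypothetical inputs: adding controls moves the estimate from -0.02 to -0.10
# and raises R-squared from 0.01 to 0.20.
bound = oster_approx_bound(gamma_c=-0.10, r2_c=0.20, gamma_u=-0.02, r2_u=0.01)
```

With these inputs the adjusted estimate is roughly -0.125, further from zero than the controlled estimate because adding the observed controls moved the coefficient in that direction. Setting `delta` to zero returns the controlled estimate unchanged.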