We report here the 30-year results of a prospective follow-up study of a birth cohort of full-term, normal-weight newborns with hyperbilirubinemia (HB; bilirubin > 340 µmol/l) as their only birth risk factor. Their adult outcome is compared with that of healthy, typically developed controls who were followed up in the same way. Our aim was to study whether the cognitive and behavioral problems discovered in childhood persist into adulthood, and whether they affect educational, occupational, and social functioning as well as the health and wellbeing of the adult subjects.

These studies have mainly relied on IQ data and offer only limited outcome information. Intelligence scores are often insensitive to executive dysfunction, learning deficits and affective disorders (Johnson & Bhutani, 2011). An association between neonatal jaundice and neurodevelopmental syndromes such as ADHD has been suggested (Jangaard et al., 2008) but not proven (Kuzniewicz, Escobar & Newman, 2009). A link between HB and autism and other disorders of psychological development has also been reported (Jangaard et al., 2008; Maimburg et al., 2010). In a national cohort from Denmark, full-term children exposed to jaundice as neonates had a 56% to 88% greater risk of psychological development disorders than children not exposed to jaundice (Maimburg et al., 2010). The conclusion that HB is benign, based on intelligence test findings, may therefore be inaccurate. Very little is known about the vocational and social outcome of hyperbilirubinemia in adults.

In a Norwegian study reporting results of 39 male conscripts at 18 years of age, the mean intelligence scores were comparable to those of the total conscript cohort, but seven subjects with more severe hyperbilirubinemia had significantly lower scores than the national average (Nilsen et al., 1984). A study from Israel with a much larger sample (n = 1948) found no linear association between neonatal bilirubin levels and intelligence test scores or school achievement at 17 years of age, but the risk of low intelligence test scores was higher among full-term males (but not females) with serum bilirubin levels above 342 µmol/l (Seidman et al., 1991). The most recent study was from Denmark, where no association was seen between higher bilirubin levels and the risk of obtaining a neuropsychiatric diagnosis, or of scoring lower in the cognitive test used at conscription, in males with a median age of 18.8 years (Ebbesen et al., 2010). There was, however, a larger proportion of men found unfit for military service in the HB group. Also, only 25 subjects (6% of all hyperbilirubinemia cases) in the Danish study were exposed to bilirubin levels above 342 µmol/l, the cut-off value that has previously been regarded as significant.

Statistical analysis was performed using the STATISTICA analysis software (StatSoft, Inc., 2013). For continuous outcome variables, a multivariate analysis of variance (MANOVA) was used in three separate models that included variables for birth data, cognitive tests at 9 years of age, and ADHD symptom scores at 30 years. Missing data were replaced by the mean values of the relevant groups. Differences between the three groups were further tested by univariate analysis of variance (ANOVA) and pair-wise comparisons. Because of the multiple comparisons, we used the Bonferroni post-hoc test, which produces conservative estimates. The Kruskal–Wallis nonparametric analysis of variance was used for ordinal variables or when the distribution was skewed. For testing the significance of differences between two independent samples, the Mann–Whitney U or the Kolmogorov–Smirnov two-sample test was used. Frequencies were calculated from contingency tables and Fisher’s exact test was used to test the significance in 2 × 2 tables.
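The Bonferroni correction mentioned above can be illustrated with a minimal sketch. It assumes the common form of the correction, in which each raw p-value is multiplied by the number of comparisons and capped at 1; the function name `bonferroni_adjust` and the p-values are hypothetical illustrations, not output from STATISTICA or results of the study.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values.

    Each raw p-value is multiplied by the number of comparisons
    and capped at 1.0.
    """
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for the three pair-wise group comparisons
# (illustrative numbers only, not results from the study).
raw_p = {
    "affected-HB vs controls": 0.0004,
    "affected-HB vs unaffected-HB": 0.012,
    "unaffected-HB vs controls": 0.30,
}

adjusted = dict(zip(raw_p, bonferroni_adjust(list(raw_p.values()))))
for comparison, p_adj in adjusted.items():
    print(f"{comparison}: adjusted p = {p_adj:.4f}")
```

With three comparisons, a raw p-value has to fall below roughly 0.0167 to remain below 0.05 after adjustment, which is why the method is described as conservative.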

Of the cases fulfilling the criteria of the present study, 128 HB cases (54%) and 82 controls (57%) returned an adequately completed questionnaire at the age of 30 years. See Fig. 1 for the complete flow chart. We analyzed the possible attrition bias caused by the 54% response rate in the HB group by comparing the responders and the non-responders (data not shown). At birth, there was no difference between these groups in parity, the percentage of small-for-date infants, Apgar scores, gestational weeks, birth weight, head circumference, gender ratio, mother’s smoking habits during pregnancy, or the family’s socio-economic status. The developmental classification score at the age of 9 years was also similar in both groups.

Long-term outcome at the age of 30 years was assessed with an extensive questionnaire prepared for the purposes of this study. Educational outcome was measured by questions regarding school achievement (graduation from first-level obligatory school, the need for remedial teaching, the final school grades earned, and the highest degree completed in secondary or tertiary education). Occupational achievement was measured with a question about the type of present employment (full-time, part-time, student, maternity leave, unemployed). Social functioning was assessed with a combined score measuring social satisfaction (three items) and social contacts (two items), with higher values indicating more dissatisfaction. Life satisfaction, satisfaction with social relationships and satisfaction with social support received were measured on a five-point scale (from very satisfied to very unsatisfied), and the social contact items covered reported difficulty in making friends or maintaining social relationships (yes/no). General health was measured with a combined score of subjective health (rated on a five-point scale from very good to very poor), the number of doctors’ appointments (four-point scale), the use of medication for physical illnesses, and the presence of headaches or gastro-intestinal complaints (yes/no). The incidence of past traumas, fractures and concussions was also noted. Substance use (illicit drugs, alcohol consumption frequency, alcohol-related problems) was also included in the questionnaire as an indicator of health and well-being. In addition to categorical variables, an alcohol consumption score was constructed by combining the items of regularly drinking more than once a month (dichotomized from a six-point scale from never to daily), excessive use expressed by family members (yes/no), excessive use estimated by self (yes/no) and reported incidences of driving while intoxicated (yes/no).
Psychiatric status was measured with a combined score of reported psychiatric problems (presence or absence of depression, anxiety disorder, panic disorder, obsessive-compulsive traits, manic-depression, or sleep disorders) and the use of prescribed sedatives or hypnotics. The behavioral outcome was measured with the ADHD Current Symptoms Scale and the ADHD Childhood Symptoms Scale, in which each of the 18 DSM-IV (American Psychiatric Association, 1994) diagnostic symptom criteria is scored from 0 to 3 depending on the severity of symptoms (Barkley & Murphy, 1998). Cognitive outcome was measured with ongoing cognitive complaints (presence or absence of persisting subjective learning difficulties, writing difficulties, reading difficulties, perceptual problems, mathematical problems, speech problems, motor/dexterity problems).
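As an illustration of how an 18-item symptom scale of this kind can be turned into a single score, the sketch below sums the item ratings. The text only states that each of the 18 items is scored from 0 to 3; aggregating them into a total in the range 0 to 54 is an assumption made here for illustration, and the function name `adhd_total_score` and the example ratings are hypothetical.

```python
def adhd_total_score(item_scores):
    """Sum 18 symptom ratings, each an integer from 0 to 3.

    Assumption: the scale score is the plain sum of the items
    (range 0-54); the source does not specify the aggregation.
    """
    if len(item_scores) != 18:
        raise ValueError("expected 18 item ratings")
    for score in item_scores:
        if score not in (0, 1, 2, 3):
            raise ValueError(f"rating out of range: {score!r}")
    return sum(item_scores)


# Hypothetical ratings for one respondent (not study data).
example_ratings = [0, 1, 2, 3, 0, 1] * 3
total = adhd_total_score(example_ratings)
print(f"Total symptom score: {total} of a possible 54")
```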

Parents filled out a questionnaire when the child was 5 and again at 9 years of age regarding the parents’ work situation, existing developmental problems or disabilities within the family, the child’s home environment, development and skill acquisition, illnesses, school achievement and problem behaviors. The family problems probed, recorded as present or absent in other family members, were: delayed speech, delayed motor skills, clumsiness, reading and spelling difficulties, other school problems, inattention or short-temperedness, squint, impaired hearing in childhood, mental retardation, mental illnesses and seizures. In addition, the questionnaire at 5 years included 28 questions regarding the activity level, motor skills, communication and emotional affectivity of the child, rated on a three-point scale. At 9 years of age a questionnaire was also sent to the teachers concerning school achievement and the need for remedial teaching or a special class. Additionally, the teacher was asked to evaluate the pupil’s behavior and personality in class using 24 adjective pairs (e.g., observant vs. daydreaming) with a five-point rating between the extremes, as well as five open-ended questions for additional information.

The family’s social and economic status, maternal risk factors, genetic traits, and medical data about delivery and delivery complications were recorded prospectively. The socio-economic status was defined as the father’s occupational level (1 to 5, 1 being the highest category). A psychosocial distress score was formed to include poor housing conditions, divorces, relocations, alcohol abuse, unemployment, family conflicts, domestic violence, imprisonment, and severe diseases or mental problems in the family. Body mass index was calculated as an indicator of the general health and development of the child.

Ethical review has been conducted over the course of the longitudinal study, and the latest approval was obtained from the Ethical Review Board of the Helsinki and Uusimaa hospital district in May 2013 (number 147/13/3/00/2013). All participants gave their written consent to the study.

The control subjects, also prospectively studied from childhood, were born in the same hospital during the study period, attended the same schools, and were free of any perinatal risk factors. For the present study, those with gestational age at or above 37 weeks, birth weight at or above 2500 g and full assessment information from 9 years of age were included (n = 145).

For the present study, only full-term neonates with hyperbilirubinemia were included. For inclusion, at least two serum bilirubin values of 340 µmol/l (20 mg/100 ml) or more were required. Children who had received a blood exchange transfusion because of rapidly increasing serum bilirubin values were also included. Those who died during early childhood and those with severe disabilities (incl. kernicterus) were excluded from follow-up (Michelsson, Donner & Lindahl, 1988). Additional exclusion criteria for the present study were gestational age below 37 weeks or birth weight below 2500 g. There were 238 HB cases fulfilling these inclusion and exclusion criteria. All subjects were Caucasian and the families spoke Finnish.
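The equivalence of 340 µmol/l and 20 mg/100 ml given above can be checked with a short unit conversion. The sketch assumes the molar mass of bilirubin (C33H36N4O6) to be about 584.66 g/mol; the function names are illustrative. It also shows why some of the studies cited here use 342 µmol/l as the cut-off: that is 20 mg/100 ml converted without rounding down.

```python
# Assumed molar mass of bilirubin, C33H36N4O6, in g/mol.
BILIRUBIN_MOLAR_MASS_G_PER_MOL = 584.66


def umol_per_l_to_mg_per_dl(umol_per_l):
    """Convert a bilirubin concentration from µmol/l to mg/dl (mg/100 ml).

    µmol/l times g/mol gives µg/l; dividing by 1000 gives mg/l and by a
    further 10 gives mg/dl, hence the divisor 10_000.
    """
    return umol_per_l * BILIRUBIN_MOLAR_MASS_G_PER_MOL / 10_000


def mg_per_dl_to_umol_per_l(mg_per_dl):
    """Convert a bilirubin concentration from mg/dl (mg/100 ml) to µmol/l."""
    return mg_per_dl * 10_000 / BILIRUBIN_MOLAR_MASS_G_PER_MOL


print(f"340 µmol/l = {umol_per_l_to_mg_per_dl(340):.1f} mg/100 ml")
print(f"20 mg/100 ml = {mg_per_dl_to_umol_per_l(20):.0f} µmol/l")
```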

The birth risk cohort originates from a single maternity hospital in Helsinki, Finland. During the recruitment period 1971–74, there were 22,359 consecutive births in this hospital, out of which 1196 (5.4%) neonates had at least one predefined risk: Apgar score lower than 7 at 5 or 15 min (n = 372), birth weight under 2000 g (n = 317), jaundice with bilirubin > 340 µmol/l (n = 368), severe respiratory difficulties necessitating external ventilation (n = 161), neurological symptoms (n = 195), maternal diabetes (n = 95), infant hypoglycemia (n = 104), and severe infection (n = 36); 19% had more than one risk factor (Michelsson et al., 1978).

Results and Discussion

At 9 years of age, 57 HB cases had at least one of the neurobehavioral disabilities used for classification (the affected-HB group) and 71 had none (the unaffected-HB group). The controls were similarly classified into unaffected (n = 70) and affected (n = 12) groups. The odds ratio for a subject with HB belonging to the affected group, as compared with controls, was OR = 4.68 (95% confidence interval 2.21–10.11).

The result indicates that the odds of a subject with neonatal HB being in the affected group were nearly five-fold compared with the controls. As ADHD, dyslexia and other developmental disabilities are each found in about 5%–10% of the general population (Shaywitz & Shaywitz, 2005; Polanczyk et al., 2007; Willcutt et al., 2010), it was expected that individuals with, e.g., reading or writing impairment, motor difficulties, or attentional problems would also be found in our control group. HB, however, greatly increases this risk.
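The odds ratio reported above follows directly from the four group counts (57 affected and 71 unaffected HB cases, 12 affected and 70 unaffected controls). The sketch below recomputes it and adds a log-scale Wald confidence interval. The Wald interval is an assumption made for illustration: the paper does not state how its interval was obtained, so the limits printed here need not match the reported 2.21–10.11. The function name `odds_ratio_wald` is hypothetical.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% Wald interval for a 2 x 2 table.

    a: exposed and affected      b: exposed and unaffected
    c: unexposed and affected    d: unexposed and unaffected
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation (an assumption;
    # the paper does not state which interval method was used).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Group counts as given in the text.
or_hb, ci_low, ci_high = odds_ratio_wald(a=57, b=71, c=12, d=70)
print(f"OR = {or_hb:.2f}, Wald 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

The point estimate agrees with the reported 4.68; differences in the interval limits would be expected if an exact or otherwise different method was used in the original analysis.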

To study whether the neurobehavioral difficulties of childhood improve or disappear in adulthood, we continued to examine the affected and unaffected subgroups separately. The subgroups were based on categories that had been formed prospectively, independently of the present study. Because the aim was to compare the HB group with typically developed healthy subjects, controls affected with developmental disabilities were excluded from further analyses. The rationale was that a control group containing both affected and unaffected individuals would have created a bias in favor of the unaffected-HB cases and against the affected-HB cases, obscuring the potential differences.

The management of neonatal hyperbilirubinemia

Table 1 shows the perinatal background data in the three groups. A statistically significant difference was found in MANOVA for the continuous variables (Wilks’ Lambda p < 0.0001), but in the univariate ANOVAs only the 5 min Apgar score differed between the groups, with the control group actually having the lowest score. There was a slight but statistically non-significant overrepresentation of girls in the control group (57%) and of boys in the affected-HB (61%) and unaffected-HB (55%) groups.

| | Affected-HB (n = 57), mean (95% CI) | Unaffected-HB (n = 71), mean (95% CI) | Controls (n = 70), mean (95% CI) | Univariate test across the three groups |
| --- | --- | --- | --- | --- |
| Gestational age (weeks) | 39.5 (39.1–39.8) | 39.3 (38.9–39.6) | 39.8 (39.5–40.2) | ns |
| Birth weight (g) | 3505 (3376–3633) | 3469 (3354–3585) | 3538 (3428–3649) | ns |
| Head circumference (cm) | 34.6 (34.2–34.9) | 35.0 (34.7–35.3) | 35.0 (34.7–35.4) | ns |
| Apgar 1 min | 8.6 (8.4–8.8) | 8.5 (8.3–8.8) | 8.9 (8.7–9.2) | ns |
| Apgar 5 min | 9.8 (9.7–10.0) | 9.9 (9.7–10.0) | 9.6 (9.4–9.7) | p < 0.002 |
| Apgar 15 min | 9.9 (9.9–10.1) | 9.9 (9.8–10.0) | 10.0 (9.9–10.1) | ns |
| Maternal age (years) | 25.4 (24.3–26.5) | 26.0 (25.0–27.0) | 26.8 (25.4–28.1) | ns |
| Peak bilirubin value, distribution of cases | | | | |
| <300 | 2 | 1 | | |
| 300–339 | 5 | 3 | | |
| 340–399 | 29 | 32 | | |
| 400–449 | 14 | 23 | | |
| >450 | 3 | 3 | | |
| Gender m/f | 35/22 | 39/32 | 30/40 | ns |

DOI: 10.7717/peerj.294/table-1

No difference between the affected-HB and unaffected-HB groups was found in parity, the percentage of small-for-date infants, mother’s smoking habits during pregnancy, maternal diabetes, the number of X-ray investigations during pregnancy, past miscarriages, maternal blood pressure, mother’s weight gain during pregnancy, glycosuria, toxemia, mother’s marital status or the hereditary traits (diabetes, neurological, sensory deficits). Among all cases with HB, 54 received phototherapy and 79 had blood exchange transfusions (in six cases more than one transfusion). The proportion of treated subjects was similar in the affected-HB group (46 treated, 11 non-treated) and the unaffected-HB group (55 treated, 16 non-treated). Nor were there any significant differences in the start time, the duration of HB or the peak bilirubin value. ABO blood group incompatibility was present in 40 cases and Rh incompatibility in one. The treated and non-treated HB subjects were equal in terms of cognitive testing, school performance, and other assessments at 9 years of age, indicating that treatment decisions did not explain the measured outcome differences.

Medical events and development at birth, and 9 years of life

There were no differences in the children’s hospitalizations, traumas, or other somatic complaints and diseases in follow-up. The body mass index (BMI) of the child was similar in the three groups at age 9 years (see Table 2). The groups did not differ in the amount of reported past developmental problems within the family (44% in the affected-HB, 28% in the unaffected-HB and 34% in the control group).

| | Affected-HB (n = 57), mean (95% CI) | Unaffected-HB (n = 71), mean (95% CI) | Controls (n = 70), mean (95% CI) | Univariate test across the three groups |
| --- | --- | --- | --- | --- |
| Cognitive functioning | | | | |
| WISC VIQ | 104.8 (102.0–107.6) | 119.2 (116.6–121.7) | 120.0 (117.4–122.5) | p < 0.0001 |
| WISC PIQ | 109.1 (106.1–112.2) | 118.2 (115.5–121.0) | 122.4 (119.7–125.2) | p < 0.0001 |
| WISC FSIQ | 107.4 (104.6–110.3) | 121.0 (118.4–123.5) | 123.3 (120.8–125.9) | p < 0.0001 |
| ITPA total | 33.8 (33.0–34.7) | 37.7 (36.9–38.4) | 37.2 (36.5–38.0) | p < 0.0001 |
| Reading test | 8.3 (8.1–8.6) | 8.8 (8.6–9.0) | 8.8 (8.6–9.0) | p < 0.003 |
| Spelling test | 7.0 (6.8–7.3) | 7.9 (7.6–8.2) | 8.5 (8.3–8.8) | p < 0.0001 |
| Health and family situation | | | | |
| BMI | 29.8 (28.5–31.1) | 31.6 (30.5–32.7) | 31.5 (30.4–32.6) | ns |
| Social distress score | 4.3 (3.4–5.1) | 2.7 (2.0–3.4) | 3.1 (2.4–3.8) | p < 0.02 |
| Socio-economic status | 2.4 (2.1–2.7) | 1.9 (1.6–2.1) | 2.0 (1.8–2.3) | p < 0.04 |

DOI: 10.7717/peerj.294/table-2

The family’s socio-economic status was the highest and the social distress scores the lowest in the unaffected-HB group, differentiating it from the affected-HB group but not from the control group in pair-wise comparisons. The affected-HB group did not differ from the controls. The significance levels were low and, considering the multiple tests performed, the findings are not considered meaningful.

The performance of the groups in the Illinois Test of Psycholinguistic Abilities (ITPA), the reading and spelling tests and the Wechsler Intelligence Scale for Children (WISC) at the age of 9 years is presented in Table 2, demonstrating significant differences between the three groups (MANOVA Wilks’ lambda p < 0.0001). In pair-wise comparisons the affected-HB group differed from the controls in all tests conducted; the unaffected-HB group differed from the controls only in the spelling test (p < 0.003). The frequency of remedial instruction in preschool and school until nine years of age had been 86% in the affected-HB, 44% in the unaffected-HB, and 28% in the control group (p < 0.0001).

The results show that despite a similar medical history and growth pattern, many children in the HB group managed less well in the psychological tests than the control group. This is in line with a previous cross-sectional study of this cohort (Michelsson, Donner & Lindahl, 1988). In that study, where cases under 2500 g and 35–36 weeks at birth were also included, half of the HB group presented with writing difficulties at the age of 9 years (Helenius, 1987). They also had more problems at school, had lower grades, and more often attended special classes (Michelsson, Donner & Lindahl, 1988). In the present study, where the pre-term cases were excluded, cognitive tests showed the same result.
The mean IQs were above average in all groups but, as the controls performed at a high average range, possibly due to somewhat outdated norms and the so-called Flynn effect (Flynn, 1987), the differences were significant. As expected based on the classification performed, the affected-HB children also needed more special education and help in school. Thus, the negative effect of HB on cognition was not caused by the borderline cases of low weight and gestational age.

In contrast to our observations, studies reporting childhood outcomes of HB have mostly found no significant consequences (Culley et al., 1970; Newman & Klebanoff, 2002; Ip et al., 2004; Gamaleldin et al., 2011). In our cohort 55% of the HB subjects belonged to the unaffected group and these subjects were comparable to controls in most variables studied, supporting the general view. However, a smaller percentage was affected, resembling the cohort in Denmark (Maimburg et al., 2010), which emphasizes the need to identify separate subgroups. Conflicting findings may also depend on the different ages of the children studied. An important factor influencing the clinical picture is the maturation of the frontal cortices, which is not complete until after adolescence, perhaps between 15 and 20 years of age (Paus, 2005). The rate of this development is poorly known, and maturation rates may differ between genders, which should be considered when interpreting the results of follow-up studies. The association of cortical maturation with IQ scores is also unclear (Burgaleta et al., 2014). A third potential factor is the Hawthorne effect: very active treatment is likely to be given in centers where research is also done, which may skew results in the direction of less serious consequences.

Academic and occupational achievement reported at 30 years of age

All but two subjects in all groups combined had graduated from first-level education (the first obligatory nine years). However, 11% of the affected-HB subjects reported having required remedial teaching or a special class during those nine school years, in contrast to only one subject in the control group and none in the unaffected-HB group (p < 0.004). The proportion of subjects who had received the diploma at the end of secondary education (after 12 years of school) was 30% in the affected-HB group, 70% in the unaffected-HB group and 75% in the control group (p < 0.0001). Within the HB group, the relative risk of not completing secondary education was more than two-fold (RR = 2.49, 95% confidence interval 1.65–3.73, p < 0.0001) for a person with affected-HB compared with a person with unaffected-HB. At the point of school graduation the mean of all school marks (the total average of all subjects) was significantly different in the three groups, F(2,207) = 29.1, p < 0.0001, the affected-HB group having the lowest grades and differing from the other two groups in pair-wise comparison (p < 0.0001) (see Fig. 2).

Figure 2: Average school grades at school graduation. The average school grades (mean, SEM, 99% confidence intervals) of subjects with hyperbilirubinemia and controls.

The proportion of subjects who by the age of 30 had completed an academic degree was 11% in the affected-HB group, 32% in the unaffected-HB group and 31% in the control group (p < 0.0001). When other forms of tertiary education (applied universities) were also included, the relative risk of not completing tertiary education was still elevated (RR = 1.57, 1.13–2.13, p < 0.004) for a person with affected-HB compared with a person with unaffected-HB. At 30 years of age 86%, 93% and 90% of the groups (affected-HB, unaffected-HB and controls, respectively) were working or studying full time, showing a similar outcome. At the other end of the spectrum, however, 11% of the affected-HB group, 3% of the controls and none of the unaffected-HB group were unemployed (p < 0.006).

The results indicate that the affected-HB subgroup had poorer academic achievement, i.e., lower mean school grades, and was less likely to graduate from secondary or tertiary education. There was also a greater risk of later unemployment. Corresponding findings of academic and occupational failure have been reported for other causes of high birth risk, e.g., low birth weight and asphyxia (Hack & Klein, 2006; Arpino et al., 2010; Stuart, Otterblad Olausson & Källen, 2011). Academic achievement is widely regarded as an objective and sensitive measure, and it is influenced not only by general intelligence but also by memory, motivation, executive functions and social skills. A similar course of underachievement has been found, e.g., in adults with ADHD, who may have significantly less education than expected based on their IQ, and lower occupational levels than expected based on their education (Biederman et al., 2008).
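The relative risks in this section compare the proportion of affected-HB subjects who did not complete a given level of education with the same proportion among unaffected-HB subjects. A minimal sketch of that calculation follows, assuming a log-scale Wald interval (the paper does not state how its intervals were obtained). The function name `relative_risk` and the counts are hypothetical illustrations; the underlying counts are not reported in the text, so the example does not reproduce the study's figures.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Relative risk of group A versus group B with an approximate 95% CI.

    events_a, events_b: number of subjects with the outcome in each group
    n_a, n_b: group sizes
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    rr = risk_a / risk_b
    # Standard error of log(RR) under the Wald approximation (an assumption).
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts (not study data): 30 of 50 subjects in group A and
# 15 of 50 subjects in group B did not complete the educational level.
rr, rr_low, rr_high = relative_risk(events_a=30, n_a=50, events_b=15, n_b=50)
print(f"RR = {rr:.2f}, Wald 95% CI {rr_low:.2f}-{rr_high:.2f}")
```

Unlike the odds ratio, the relative risk is a ratio of proportions, so it reads directly as "how many times more likely" the outcome is in one group than in the other.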