Study sample and recruitment

A total of 105 boys ages 8 to 12 were included in the study (Table 1). The study was limited to boys to obviate any sex-related differences in facial phenotypes. The narrow 8- to 12-year-old age range was selected so that the boys were prepubertal but had completed 90% to 95% of head growth [33, 34] and brain growth [35] and were at the same stage of facial development, which is a continuous process through the seventh decade of life [36]. To ensure that our sample populations did not differ in age, we employed a two-sample t-test using diagnosis (that is, ASD or not) as the categorical variable and age as the continuous variable. We determined that age was not significantly different between the two groups (P = 0.827).
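The age comparison above can be illustrated with a minimal sketch of a two-sample t statistic. The source does not state whether a pooled-variance (Student) or unequal-variance (Welch) test was used; the pooled form is assumed here, the function name `pooled_t_statistic` is hypothetical, and the ages are invented for illustration rather than taken from the study.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for Student's two-sample t-test."""
    na, nb = len(a), len(b)
    # Pooled variance: assumes both groups share a common variance (an assumption).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, na + nb - 2

# Hypothetical ages in years, NOT the study data.
asd_ages = [8.5, 9.0, 10.2, 11.1, 11.8]
td_ages = [8.7, 9.4, 10.0, 10.9, 12.0]
t_stat, dof = pooled_t_statistic(asd_ages, td_ages)
print(f"t = {t_stat:.3f}, df = {dof}")
```

A t statistic near zero relative to its degrees of freedom corresponds to a large P value, which is the pattern reported for age in this sample.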

Table 1 Study sample and age range

Participants with ASD (n = 64) were recruited through the Thompson Center for Autism and Neurodevelopmental Disorders. All participants were screened before inclusion and met the following criteria: individuals were male; of Caucasian ethnicity; had not worn dental braces; were prepubertal (by parent report); were able to sit relatively still for picture-taking; had been diagnosed with Autistic Disorder, Asperger syndrome, or pervasive developmental disorder-not otherwise specified (PDD-NOS) according to the DSM-IV criteria prior to the day of the study; and had no additional syndrome diagnoses. Boys with fragile X syndrome and/or chromosomal disorders, including copy number variants (CNV), generalized dysmorphology or gestational age less than 35 weeks were excluded.

Of the 64 boys with ASD, 36 had completed the Simons Simplex Collection (SSC) protocol, which includes the Autism Diagnostic Interview, Revised (ADI-R) [37] and the Autism Diagnostic Observation Schedule (ADOS) [38], which were used in conjunction with the clinical judgment of one of the authors (JHM) to make the diagnosis of ASD. The 28 boys recruited through the Autism Medical Clinic were diagnosed on the basis of the DSM-IV criteria using a center-specific protocol based on the ADI-R together with the clinical judgment of the same author (JHM). The boys were assessed for generalized dysmorphology using the Autism Dysmorphology Measure [39].

TD boys (n = 41) were recruited from the Columbia, MO, USA, community via a notice published in the University of Missouri online information email and by word of mouth. Participants were screened using the same criteria described in the preceding paragraph, with the exception of a diagnosis of ASD. We chose TD boys as the control group, as our hypothesis was that development in ASD deviates from normal development. Samples were not matched for IQ, since this would not have allowed us to interpret how ASD deviates from the normal developmental trajectory. Recruitment and data collection procedures were carried out in accordance with approved Institutional Review Board protocols.

Three-dimensional stereophotogrammetric imaging

Three-dimensional images were acquired using the 3dMDcranial System (3dMD, Atlanta, GA, USA). Briefly, the 3dMDcranial System works by projecting random light patterns on the subject of interest (in our case, the human face). The subject is captured with multiple, precisely synchronized digital cameras configured in four modular units for a 360° full-head capture. Each unit contains three digital machine vision cameras. Because multiple cameras are used, there is no need for post-data capture "stitching" of multiple images into a single composite picture. Thus this technology removes a potential source of error by creating a valid three-dimensional representation of the subject at the time of data acquisition. Three-dimensional surface geometry and texture are acquired nearly simultaneously. Algorithms developed by 3dMD integrate the multiple images to produce a single three-dimensional image (Figure 2), which can be visualized and analyzed on a desktop computer using the 3dMDpatient software. A complete summary of the 3dMDcranial System is available online at http://3dMD.com/.

Figure 2 3dMD image acquisition and analysis. (A) Example of a 3dMD image acquired from an individual chosen at random from the study sample.

Prior to each session, all cameras were calibrated and tested to ensure that the images collected were consistent and usable. Individuals participating in the study were brought into the camera area and asked to sit as still as possible, look directly at one camera marked with a sticker and maintain a neutral expression (verbal instructions included closed mouth, no smile, no visible teeth and no raised eyebrows). Once the participant was comfortable and able to sit still, collection of the images began. Multiple pictures of each child were taken to ensure that the image used for analysis adequately captured all of the facial areas needed for landmarking.

Anthropometric landmark data collection

Anthropometry, the biological science of measuring the size, weight and proportions of the human body [40], provides objective characterization of phenotypic variation and morphology. Facial anthropometry is performed on the basis of measures taken between landmarks defined on surface features of the face. The anthropometric landmarks defined by Farkas [40] located on the soft tissue of the face and head are repeatable, biologically relevant anatomical points. The three-dimensional landmark coordinate data were collected for 17 landmarks on the three-dimensional images by two raters (IDG and JRA) using the 3dMDpatient software program (Figure 3 and Table 2). Previous studies have shown three-dimensional landmark data collected from 3dMD images to be highly precise and repeatable [41, 42]. All landmarks were checked for gross errors (for example, switching of right and left sides) prior to analysis.

Figure 3 Illustration of the anthropometric landmarks collected from the 3dMD images. Landmarks are defined in Table 2.

Table 2 Three-dimensional anthropometric landmarks acquired from 3dMD images

To determine the reliability of data collection, an error study was performed. Four trials of landmark coordinate data were collected from two 3dMD images by both raters. Coordinate data were converted to all possible linear distances among the landmarks (all linear distances between 17 landmarks, resulting in 136 linear distances). Means, standard deviations and values of standard deviations as percentages of the linear distances were calculated for each linear distance in each observer's trials. The results derived by both raters are presented.

The standard deviations, expressed as percentages of linear distance, ranged from 0.19% to 8.11% for rater 1 and from 0.19% to 14.3% for rater 2. Of the 136 linear distances evaluated, seven for rater 1 and eight for rater 2 had standard deviations greater than 5% of the mean, leaving 129 and 128 linear distances, respectively, with less than 5% error. Three of the linear distances with greater degrees of error were shared by the two raters.
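The error-study computation can be sketched as follows: landmark coordinates from repeated trials are converted to all unique inter-landmark distances, and the standard deviation of each distance is expressed as a percentage of its mean. This is a minimal illustration, not the authors' code; the function names and the three-landmark coordinates are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, stdev

def pairwise_distances(landmarks):
    """All unique inter-landmark Euclidean distances, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(landmarks, 2)]

def percent_error(trials):
    """SD as a percentage of the mean, per linear distance, across trials."""
    per_trial = [pairwise_distances(t) for t in trials]
    # zip(*per_trial) groups the same linear distance across all trials.
    return [100 * stdev(d) / mean(d) for d in zip(*per_trial)]

# 17 landmarks give 17 * 16 / 2 = 136 unique linear distances.
n_pairs = len(list(combinations(range(17), 2)))

# Hypothetical repeated digitizations of three landmarks (x, y, z), NOT study data.
trials = [
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 0.0)],
    [(0.1, 0.0, 0.0), (10.0, 0.1, 0.0), (0.0, 20.1, 0.0)],
    [(0.0, 0.1, 0.0), (9.9, 0.0, 0.0), (0.1, 20.0, 0.0)],
    [(0.0, 0.0, 0.1), (10.1, 0.0, 0.0), (0.0, 19.9, 0.0)],
]
errors = percent_error(trials)
# Distances whose SD exceeds 5% of the mean would be flagged as less reliable.
flagged = [e for e in errors if e > 5.0]
print(n_pairs, [round(e, 2) for e in errors], len(flagged))
```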

The results of this error study indicate that the landmark coordinate data and the linear distances calculated from them can be collected with a very low degree of error. Therefore, the data collected by the two raters in this study are highly precise and repeatable.

Morphometric data analysis

The landmark coordinate data collected using 3dMDpatient software were analyzed using Euclidean Distance Matrix Analysis (EDMA) [43], which is a linear distance-based morphometric method that does not rely on registration or fitting criteria [43, 44]. Linear distances calculated between all possible pairs of landmarks were compared across samples as ratios. EDMA represents the form of each individual as a form matrix (FM), which is the set of all possible linear distances between the facial landmarks. Average FMs for each sample, that is, ASD and controls, are compared as ratios of like linear distances. This set of ratios of corresponding linear distances is called a "form difference matrix" (FDM). If a ratio in a FDM is equal to 1, then the faces being compared do not differ for that discrete linear distance. If the ratio is above 1, the linear distance is greater in the face used as the numerator. Likewise, if the ratio is less than 1, the linear distance is greater in the face used as the denominator. We used a nonparametric bootstrapping algorithm to calculate confidence intervals for each discrete linear distance to test for the significance of localized form differences [45]. The null hypothesis is that each discrete linear distance is similar for the two samples. Individual linear distances were considered significantly different if the calculated two-tailed 90% confidence interval did not include 1.0. Evaluation of confidence intervals for differences in specific linear distances enables localization of differences to specific facial regions.

This test of empirical differences in shape between samples is based on marginal confidence intervals of the bootstrap estimates of the linear distances between unique pairs of landmarks. Bonferroni-type corrections are not needed for these marginal confidence intervals, because in this approach multiple tests of linear distance differences using the same data are not conducted. Instead, with each bootstrapping step, all measures are estimated for an individual and tested in a high-dimensional space where each dimension represents a unique linear distance. The low-dimensional projection of these results for each linear distance is reported (see [43, 45] for details).
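The form matrix, form difference matrix and bootstrap intervals described above can be sketched in simplified form. The sketch below averages form matrices arithmetically and uses a plain percentile bootstrap that resamples individuals within each group; both choices are assumptions made for illustration and may differ from the estimators implemented in the EDMA software [43, 45]. All function names and the toy landmark data are hypothetical.

```python
import math
import random
from itertools import combinations

def form_matrix(landmarks):
    """Form matrix stored as a flat list of all unique linear distances."""
    return [math.dist(p, q) for p, q in combinations(landmarks, 2)]

def mean_form(forms):
    """Element-wise average of a sample of form matrices."""
    return [sum(col) / len(forms) for col in zip(*forms)]

def form_difference(sample_a, sample_b):
    """Ratios of like mean linear distances, with sample_a as the numerator."""
    return [a / b for a, b in zip(mean_form(sample_a), mean_form(sample_b))]

def bootstrap_ci(sample_a, sample_b, n_boot=1000, alpha=0.10, seed=0):
    """Percentile confidence interval for each ratio (90% when alpha=0.10)."""
    rng = random.Random(seed)
    boots = []
    for _ in range(n_boot):
        # Resample whole individuals so all distances are re-estimated together.
        resample_a = [rng.choice(sample_a) for _ in sample_a]
        resample_b = [rng.choice(sample_b) for _ in sample_b]
        boots.append(form_difference(resample_a, resample_b))
    intervals = []
    for ratios in zip(*boots):
        ordered = sorted(ratios)
        lower = ordered[round(alpha / 2 * n_boot)]
        upper = ordered[round((1 - alpha / 2) * n_boot) - 1]
        intervals.append((lower, upper))
    return intervals

# Toy data, NOT study data: group A is exactly twice the size of group B.
unit = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
doubled = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
group_a = [form_matrix(doubled)] * 5
group_b = [form_matrix(unit)] * 5
intervals = bootstrap_ci(group_a, group_b, n_boot=200)
# A distance differs significantly when its interval excludes 1.0.
significant = [not (lo <= 1.0 <= hi) for lo, hi in intervals]
```

Because each bootstrap step resamples individuals rather than single distances, every linear distance in a given replicate comes from the same set of faces, which mirrors the joint estimation described in the text.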

This method has been used in numerous previous studies to compare facial morphologies (for example, see [42–49]) as well as the morphologies of other anatomic regions. A validation of this method for the data set in this study was performed to ensure that differences found in comparing the boys with ASD to TD boys using EDMA were not spurious. The group of TD boys was split into two randomly assigned age-equivalent groups. These two groups were then compared, and confidence intervals were calculated for each linear distance. In these analyses, three of the 136 linear distances compared (2.2%) were significantly different. These results show that fewer were statistically different than would be expected by chance (that is, 5%), demonstrating that this method is both sensitive and specific.

The Principal Coordinates Analysis (PCOORD) application of EDMA was then performed on the scaled data for all participants in both groups [43–45, 50]. This procedure is a form of clustering analysis that detects groups of forms with similar shapes and identifies linear distances that are influential in forming the defined clusters. In this procedure, the distribution of participants in multidimensional morphological space is examined. Unlike the form difference analyses described above, PCOORD compares individuals rather than samples. Axes are fitted through the shape space of this analysis such that the first axis accounts for the largest amount of variation, the second axis accounts for the second-largest amount of variation and so on. These axes are referred to as the "first principal axis," the "second principal axis," and so forth. The position of participants along these axes is defined in terms of the linear distances between landmarks most highly correlated with these axes. Therefore, participants who cluster along a particular axis are similar in terms of the linear distances correlated with that axis. This analysis was performed to determine whether participants clustered on the basis of facial morphology, to identify the metrics that contributed to determination of the clusters and to explore the nature of the clusters to formulate hypotheses about the pattern of differences in the development of the face in children with ASD. The PCOORD analyses were performed on data that were scaled for differences in size. To do this, the FM for each individual was scaled such that each linear distance was divided by the geometric mean of all linear distances within that individual's FM. Thus each participant's data were scaled using a unique scaling factor. The geometric mean was chosen as a surrogate for size [51–53].
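The size-scaling step can be sketched as follows: each linear distance in an individual's FM is divided by the geometric mean of all distances in that FM. This is a minimal illustration of the scaling only, not of the PCOORD analysis itself; the function name and the example distances are hypothetical.

```python
import math

def scale_by_geometric_mean(form):
    """Divide each linear distance by the geometric mean of the whole form."""
    # Geometric mean computed via logarithms; all distances must be positive.
    geometric_mean = math.exp(sum(math.log(d) for d in form) / len(form))
    return [d / geometric_mean for d in form]

# Hypothetical forms, NOT study data: the second is the first enlarged by 50%.
small_face = [1.0, 4.0, 2.0]
large_face = [1.5, 6.0, 3.0]
scaled_small = scale_by_geometric_mean(small_face)
scaled_large = scale_by_geometric_mean(large_face)
print([round(d, 3) for d in scaled_small])
print([round(d, 3) for d in scaled_large])
```

Two forms that differ only in overall size yield the same scaled values, so any remaining differences between individuals reflect shape rather than size.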

We analyzed all of the participants in the study to determine (1) whether there are aspects of facial morphology that distinguish the facial phenotypes of boys with ASD compared to TD boys and (2) whether there are facial subgroups within the ASD cohort that differ in their associated clinical and behavioral parameters.

Behavioral and medical data

Each of the boys was evaluated for characteristics of their ASD diagnosis (social function, verbal function, repetitive behavior and language level), behavioral problems (aggression, attention deficits and self-injurious behaviors), outcome measures (IQ, communication, daily living skills, socialization and Vineland Adaptive Behavior Scale composite scores), the clinical course of their disorder (age at onset and presence of regression at onset), medical and neurological variables (seizures, electroencephalogram results, hypotonia, hypertonia, clumsiness, vision or hearing problems, tics, enuresis, handedness, feeding difficulties in infancy and allergies), physical morphology (head circumference, height, weight and dysmorphology) and family history of autism and related neuropsychiatric disorders among first-degree relatives.

The tests administered to all or the majority of participants included the ADI-R [37], ADOS [38], Social Communication Questionnaire (SCQ) [54], Vineland Adaptive Behavior Scale II [55], Peabody Picture Vocabulary Test (PPVT) [56], Child Behavior Checklist (CBCL) [57], an age- and development-appropriate IQ test (Full Scale IQ (FSIQ), Verbal IQ (VIQ), Nonverbal IQ (NVIQ)) and the Autism Dysmorphology Measure [39]. Not all measures of IQ were available for a small number of boys. NVIQ was available for the entire sample, FSIQ was available for all but one boy and VIQ was missing for four of the boys. Comprehensive prenatal, perinatal, teratogen exposure, development, general health, neurological and family histories (including income and education) were obtained using either the SSC Medical History or the Thompson Center Medical History, which record similar information. All participants received complete medical and neurological examinations, including assessment of growth and dysmorphology.

Statistical comparisons of facial phenotypes with clinical and behavioral phenotypes

We compared clinical and behavioral traits to determine whether there were significant correlations between subgroup membership and the variables described in the preceding subsection. Continuous random variables were summarized by their mean, standard deviation and range. For categorical random variables, univariate comparisons of subgroup 1, subgroup 2 and the remainder were made using the χ2 test or Fisher's exact test. For continuous variables, comparisons were made using Student's t-test. Because the IQ score is skewed, comparisons of IQ scores were made using the Kolmogorov-Smirnov test and the t-test after log-transforming IQ.
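As an illustration of the categorical comparisons, the sketch below computes a two-sided Fisher's exact P value for a 2 × 2 table. It is a simplification: the study compared subgroup 1, subgroup 2 and the remainder, which can produce larger tables, and the authors' software is not specified in the source. The function name and the counts are hypothetical.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def table_probability(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)
    )

# Hypothetical counts, NOT study data: trait present/absent in two subgroups.
p_value = fisher_exact_2x2(3, 1, 1, 3)
print(f"P = {p_value:.4f}")
```

Fisher's exact test is typically preferred over the χ2 test when expected cell counts are small, which is likely with subgroups of modest size.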