Study population

The Developmental Neurophysiology Laboratory, under the direction of the first author, maintains a database of patients and research subjects that includes unprocessed (raw) EEG data in addition to referral information. Patients typically are referred in order to rule out epilepsy and/or sensory processing abnormalities by studies incorporating EEG and Evoked Potentials (EP).

Patients with ASD

The goal of the current study was to select only those patients whom experienced clinicians recognized and identified as patients on the autistic spectrum, while excluding children in the extremes of this entity, confounding neurological diagnoses that may present with autistic features, and other entities that might have independent impact upon EEG data.

Inclusion criteria required a diagnosis of ASD or Pervasive Developmental Disorder not otherwise specified (PDD-nos), hereafter referred to together as ASD, as determined by an independent pediatric neurologist, psychiatrist, or psychologist specializing in childhood developmental disabilities, including ASD, at CHB or at one of several other Harvard teaching hospitals. Diagnoses relied upon DSM-IV [1] and/or ADOS [34–36] criteria aided by clinical history and expert team evaluation.

Exclusion criteria included: (1) co-existing primary neurologic syndromes that may present with autistic features (for example, Rett's, Angelman's and Fragile X syndromes, tuberous sclerosis, or mitochondrial disorders); (2) clinical seizure disorders or results of EEG readings suggestive of an active seizure disorder or epileptic encephalopathy (note: patients with occasional EEG spikes were not excluded); (3) a primary diagnosis of global developmental delay (GDD), developmental dysphasia or high functioning autism and/or Asperger's syndrome; (4) expressed doubt by the referring clinician as to the diagnosis of ASD; (5) taking medication(s) at the time of the study; (6) other concurrent neurological disease processes that might induce EEG alteration, for example, hydrocephalus, hemiparesis or known syndromes affecting brain development; and (7) significant primary sensory disorders, for example, blindness and/or deafness. A total of 463 patients met the above study criteria and were designated as the study's ASD sample.

Healthy controls

The comparison group was drawn from normal children recruited and studied for developmental research projects. The goal was to select normally functioning children while avoiding creation of an exclusively 'super-normal' group. For example, subjects whose sole history was of prematurity or low birth weight, and who did not require medical treatment after discharge from the birth hospital (a Harvard-affiliated hospital), were included.

Necessary inclusion criteria were as follows: (1) living at home with and considered normal by the parents; and (2) identified as functioning within the normal range on standardized developmental and/or neuropsychological assessments performed during the respective research study.

Exclusion criteria were as follows: (1) diagnosed neurologic or psychiatric illness or disorder or expressed suspicion of such, for example, global developmental delay (GDD), developmental dysphasia, attention deficit disorder (ADD) and attention deficit with hyperactivity disorder (ADHD); (2) abnormal neurological examination as identified during the research study; (3) clinical seizure disorder or EEG reading suggesting an active seizure disorder or epileptic encephalopathy (note: subjects with rare EEG spikes were not excluded); (4) noted by the research psychologist or neurologist to present with autistic features; (5) newborn period diagnosis of intraventricular hemorrhage (IVH), retinopathy of prematurity, hydrocephalus, or cerebral palsy or other significant condition likely influencing EEG data; and/or (6) taking medication(s) at time of EEG study. A total of 571 subjects met the criteria for neuro-typical controls and were designated as the study's control (C) sample.

Institutional Review Board approvals

All control subject families, and subjects as age appropriate, gave informed consent in accordance with protocols approved by the Institutional Review Board (IRB) of Children's Hospital Boston. Subjects with ASD who had been referred clinically were studied under an IRB protocol that solely required de-identification of data without requirement of informed consent.

Measurements and data analysis

EEG data acquisition

Registered EEG technologists, naïve to the study's goals, and specifically trained and skilled in working with children within the study's age group and diagnostic range, obtained all EEG data by use of up to 32 gold-cup scalp electrodes applied with collodion after measurement. Analyses were subsequently restricted to the following 24 channels available for all subjects: FP1, FP2, F7, F3, FZ, F4, F8, T7, C3, CZ, C4, T8, P7, P3, PZ, P4, P8, O1, OZ, O2, FT9, FT10, TP9, TP10 (see Figure 1). EEG data were gathered in the awake and alert state, assuring that adequate periods of waking EEG were obtained. EEG data collected during EP formation were not utilized for the study. Data were primarily obtained from Grass™ (Grass Technologies Astro-Med, Industrial Park 600, East Greenwich Avenue, West Warwick, RI 02893 USA) EEG amplifiers with 1 to 100 Hz bandpass filtering and digitized at 256 Hz for subsequent analyses. All amplifiers were individually calibrated prior to each study. One other amplifier type was utilized for five patients with ASD (Bio-logic™, Bio-logic Technologies, Natus Medical Inc., 1501 Industrial Road, San Carlos, CA 94070 USA; 250 Hz sampling rate, 1 to 100 Hz bandpass) and one other amplifier type was utilized for 11 control subjects (Neuroscan™, Compumedics Neuroscan, 6605 West W.T. Harris Boulevard, Suite F, Charlotte, NC 28269 USA; 500 Hz sampling rate, 0.1 to 100 Hz bandpass). Data from these two amplifiers, sampled at other than 256 Hz, were interpolated to the rate of 256 Hz by the BESA 3.5™ software package. As the band-pass filter characteristics differed among the three EEG machines, frequency response sweeps were performed on all amplifier types so as to permit modification of data recorded with the Bio-logic and Neuroscan amplifiers to be equivalent to those gathered by the Grass amplifiers. This was accomplished by utilizing special software developed in-house by the first author using forward and reverse Fourier transforms [37].
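The resampling step can be illustrated with a minimal sketch. The interpolation method used by BESA is not described here, so the plain linear interpolation below is an assumption chosen for simplicity, and the function name `resample_linear` and the synthetic 2 Hz test signal are hypothetical.

```python
import numpy as np

def resample_linear(signal, fs_in, fs_out=256.0):
    """Linearly interpolate a 1-D signal from fs_in to fs_out (both in Hz).

    Assumption: linear interpolation stands in for whatever scheme the
    original software applied; it is adequate only when the signal is
    already band-limited well below half of fs_out.
    """
    signal = np.asarray(signal, dtype=float)
    t_in = np.arange(signal.size) / fs_in            # original sample times (s)
    n_out = int(np.floor(t_in[-1] * fs_out)) + 1     # output samples within the record
    t_out = np.arange(n_out) / fs_out                # target sample times (s)
    return np.interp(t_out, t_in, signal)

# Synthetic example: 4 s of a 2 Hz sine recorded at 250 Hz
fs_in = 250.0
t = np.arange(1000) / fs_in
x_250 = np.sin(2 * np.pi * 2.0 * t)
x_256 = resample_linear(x_250, fs_in)
```

With the 1 to 100 Hz bandpass reported above, the recorded signals lie below the 128 Hz Nyquist limit of the 256 Hz target rate, which is the condition this sketch relies on.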

Figure 1 Standard EEG electrode names and positions. Head in vertex view, nose above, left ear to left. EEG electrodes: Z, Midline; FZ, Midline Frontal; CZ, Midline Central; PZ, Midline Parietal; OZ, Midline Occipital. Even numbers, right hemisphere locations; odd numbers, left hemisphere locations. Fp, Frontopolar; F, Frontal; C, Central; T, Temporal; P, Parietal; O, Occipital. The standard 19 electrodes of the 10-20 system are shown as black circles. An additional subset of five 10-10 electrodes is shown as open circles.

Measurement issues and solutions

EEG studies are confronted with two major methodological problems. The first is the management of the abundant artifacts observed in young and behaviorally difficult-to-manage children (for example, eye movement, eye blink and muscle activity). It has been well established that even EEGs appearing clean by visual inspection may contain significant artifacts [38, 39]. Moreover, as shown in schizophrenia EEG research, certain artifacts may be group specific [40]. The second is capitalization upon chance, that is, applying statistical tests to too many variables and incorrectly reporting those that appear significant by chance as support for the experimental hypothesis [41]. The methods discussed below were designed specifically to address these problems.

Artifact management - Part 1: Unprocessed EEG signals

At the conclusion of each subject's data collection, digitized EEG data were inspected by the EEG technologist, who visually identified those EEG epochs that were recorded during breaks for relaxation or that showed movement artifact, electrode artifact, eye blink storms, drowsiness, epileptiform discharges, and/or bursts of muscle activity. Once identified, these epochs were marked in order to allow complete exclusion of all channels recorded during such epochs from subsequent analyses. Results were reviewed and confirmed and/or modified by an experienced pediatric electroencephalographer (first author). After such visual inspection and treatment, data were low pass filtered below 50 Hz with an additional 60 Hz mains rejection notch filter. Remaining eye blink and eye movement artifacts, which may be surprisingly prominent even during the eyes closed state, were removed by utilizing the source component technique [42, 43] as implemented in the BESA (BESA GmbH, Freihamer Strasse 18, 82116 Gräfelfing, Germany) software package. These combined techniques resulted in EEG data that appeared largely artifact free, with rare exceptions of low level temporal muscle artifact and persisting frontal and anterior temporal slow eye movement, which remain capable of contaminating subsequent analyses. The final reduction of such persisting contamination of processed variables (coherence) is discussed below under Artifact management - Part 2.

Calculation of spectral coherence variables

Approximately 8 to 20 minutes of awake state EEG data per subject were transformed by use of BESA software, which supplies an implementation of a spherical spline algorithm [44] to compute scalp Laplacian or current source density (CSD) estimates for surface EEG studies. The CSD technique was employed as it provides reference independent data that are primarily sensitive to underlying cortex and relatively insensitive to deep/remote EEG sources. Srinivasan et al. [29] point out that "EEG coherence is often used to assess functional connectivity in human cortex. However, moderate to large EEG coherence can also arise simply by the volume conduction of current through the tissues of the head... (and)... EEG coherence appears to result from a mixture of volume conduction effects and genuine source coherence. Surface Laplacian EEG methods minimize the effect of volume conduction on coherence estimates by emphasizing sources at smaller spatial scales than unprocessed potentials (EEG)."

Spectral coherence was calculated, using a Nicolet™ (Nicolet Biomedical Inc., 5225 Verona Road, Madison, WI 53711 USA) software package, according to the conventions recommended by van Drongelen [30] (pages 143-144, equations 8.40, 8.44). Coherency [45] is the ratio of the cross-spectrum to the square root of the product of the two auto-spectra and is a complex-valued quantity. Coherence is the squared modulus of coherency, taking on a value between 0 and 1. In practice, coherence is typically estimated by averaging over several epochs or frequency bands [30]; in the current project a series of two-second epochs was utilized over the total available EEG segments.
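As an illustration of the definition above, the following sketch estimates coherence between two channels by averaging cross- and auto-spectra over consecutive two-second epochs. It is not the Nicolet implementation: the function name `coherence`, the absence of windowing or epoch overlap, and the synthetic noise signals are all assumptions made for brevity.

```python
import numpy as np

def coherence(x, y, fs=256.0, epoch_sec=2.0):
    """Squared modulus of coherency, averaged over non-overlapping epochs.

    Assumptions: rectangular window, no overlap, no artifact rejection.
    """
    n = int(fs * epoch_sec)                       # samples per epoch (512 at 256 Hz)
    n_epochs = min(len(x), len(y)) // n
    xe = np.reshape(np.asarray(x, dtype=float)[:n_epochs * n], (n_epochs, n))
    ye = np.reshape(np.asarray(y, dtype=float)[:n_epochs * n], (n_epochs, n))
    X = np.fft.rfft(xe, axis=1)
    Y = np.fft.rfft(ye, axis=1)
    sxy = np.mean(X * np.conj(Y), axis=0)         # cross-spectrum
    sxx = np.mean(np.abs(X) ** 2, axis=0)         # auto-spectrum of x
    syy = np.mean(np.abs(Y) ** 2, axis=0)         # auto-spectrum of y
    coherency = sxy / np.sqrt(sxx * syy)          # complex-valued
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)        # 0.5 Hz resolution for 2 s epochs
    return freqs, np.abs(coherency) ** 2

# Synthetic check: 120 s of independent white noise on two "channels"
rng = np.random.default_rng(0)
x = rng.normal(size=256 * 120)
y = rng.normal(size=256 * 120)
freqs, coh_same = coherence(x, x)    # a channel with itself
_, coh_indep = coherence(x, y)       # two unrelated channels
```

A channel compared with itself yields coherence of 1 at every frequency, whereas unrelated signals yield values near 0 once enough epochs are averaged. With a single epoch the estimate is identically 1, which is why averaging is required.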

Furthermore, the quest for better measures of connectivity between brain regions has recently generated new techniques for connectivity assessment in MRI and EEG [46–48]. Such techniques involve partial coherence as the measure of functional connectivity and appear particularly useful when comparing connectivity across tasks. As this was not the case in the current study, partial coherence was not utilized for the current project.

Spectral coherence measures were derived from the 1 to 32 Hz range in 16 two-Hz-wide spectral bands, which results in 4,416 unique coherence variables. The 24 by 24 electrode coherence matrix yields 576 possible coherence values; the matrix diagonal has a value of 1 (each electrode to itself) and half of the 552 remaining values duplicate the other half, which results in 276 unique coherences per spectral band. Multiplication by the 16 spectral bands in turn results in 4,416 unique spectral coherence values per subject.
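The count can be checked directly by enumerating unordered electrode pairs. The channel list is taken from the EEG data acquisition section; the variable names are illustrative only.

```python
from itertools import combinations

# The 24 channels available for all subjects
channels = ["FP1", "FP2", "F7", "F3", "FZ", "F4", "F8", "T7", "C3", "CZ",
            "C4", "T8", "P7", "P3", "PZ", "P4", "P8", "O1", "OZ", "O2",
            "FT9", "FT10", "TP9", "TP10"]
n_bands = 16                                   # two-Hz-wide bands within 1 to 32 Hz

pairs = list(combinations(channels, 2))        # unordered pairs, diagonal excluded
n_unique = len(pairs) * n_bands                # (24 * 23 / 2) * 16
print(len(pairs), n_unique)
```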

Artifact management - Part 2: Coherence data

As has been recently discussed in a study of normal adults and adults with chronic fatigue syndrome [49], artifacts cannot be removed from an entire EEG data set by visual inspection alone and direct elimination of electrodes and/or frequencies where a particular artifact is most easily apparent. An established approach to further reduce any persisting artifact contamination of processed coherence data involves multivariate regression. Semlitsch et al. [50] demonstrated that by identifying a signal that is proportional to a known source of artifact, this signal's contribution to scalp recorded data (EEG and its derivatives, such as evoked potentials, and so on) may be diminished by statistical regression procedures. Persisting vertical eye movements and blinks produce slow EEG delta spectral signals in the frontopolar channels FP1 and FP2 and such artifactual contribution may be estimated by the average of the 0.5 and 1.0 Hz spectral components from these channels after EEG spectral analysis by Fast Fourier Transform (FFT) [37] of common average referenced data. Similarly, horizontal eye movements may be estimated by the average of the 0.5 to 1.0 Hz spectral components from the anterior temporal electrodes F7 and F8. Little meaningful information of brain origin is typically found at this slow frequency in these channels in the absence of extreme pathology. Muscle activity tends to peak at frequencies above those of current interest. Accordingly, 30 to 32 Hz spectral components were considered to be largely representative of muscle contamination, especially as recorded from the separate averages of prefrontal (FP1, FP2), anterior temporal (F7, F8), mid-temporal (T7, T8), and posterior temporal (P7, P8) electrodes. These electrodes are the ones most often contaminated by muscle as they are physically closest to the source of the artifact (frontal and temporal muscles).
The steps employed in this study involved, first, the fitting of a linear regression model where the dependent variables were those targeted for artifact reduction and the independent variables were those chosen as representative of remaining artifacts; second, the extracting of the residuals which now represent the targeted data with artifacts removed and, third, the use of these residuals in subsequent analyses. The six artifact measures, two very slow delta and four high frequency beta, were the ones submitted as independent variables to the multiple regression analysis (BMDP2007™-6R) [51], which was used to individually predict each of the coherence variables (see below), treated as dependent variables. Residuals of the dependent variables, now uncorrelated with the chosen independent artifact variables, were used in the subsequent analyses.
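The three steps described above (fit, extract residuals, carry residuals forward) can be sketched with ordinary least squares. This is an illustration rather than the BMDP-6R procedure itself; the function name `remove_artifact_variance` and the simulated data (200 subjects, 10 coherence variables, 6 artifact measures) are assumptions.

```python
import numpy as np

def remove_artifact_variance(coh, artifacts):
    """Regress each coherence variable on the artifact measures and
    return the residuals.

    coh:       (n_subjects, n_variables) dependent variables
    artifacts: (n_subjects, n_artifacts) independent variables
    """
    design = np.column_stack([np.ones(len(artifacts)), artifacts])  # intercept + predictors
    beta, *_ = np.linalg.lstsq(design, coh, rcond=None)             # one fit per column of coh
    return coh - design @ beta                                      # residuals

# Simulated data: artifact measures leak linearly into coherence variables
rng = np.random.default_rng(0)
n_subjects = 200
artifacts = rng.normal(size=(n_subjects, 6))      # 2 slow delta + 4 fast beta measures
clean = rng.normal(size=(n_subjects, 10))
mixing = rng.normal(size=(6, 10))
contaminated = clean + artifacts @ mixing
residuals = remove_artifact_variance(contaminated, artifacts)
```

By construction the residuals are orthogonal to every artifact measure in the sample, which is the sense in which they are "uncorrelated with the chosen independent artifact variables". Any artifact contribution that is not linearly related to the six measures would remain.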

Prevention of capitalization upon chance: Variable number reduction by creation of coherence factors

In order to facilitate subsequent statistical analysis, specifically in order to avoid capitalization on chance resulting from the use of too many variables, Principal Components Analysis (PCA) of the coherence data was employed as an objective technique to meaningfully reduce variable number [52]. The coherence data were first normalized (centered and scaled to have unit variance) so that eventual factors reflected deviations from the average. In order to avoid loss of sensitivity by a priori data limitation, an unrestricted form of PCA [53] was applied allowing all coherence variables per subject to enter analysis. By employment of an algorithm based upon singular value decomposition (SVD) [37, 54], a data set of uncorrelated (orthogonal) principal components or factors [52, 53] was developed in which a small number of factors, identified following Varimax rotation [55], describes an acceptably large amount of variance [56]. Varimax rotation enhances factor contrast yielding higher loadings for fewer factors while retaining factor orthogonality. Although not the only PCA method applicable to large, asymmetrical matrices (4,416 variables by 1,034 cases as in the current study), SVD, which may be used to solve under-determined and over-determined systems of linear equations [37], is among the most efficient techniques used for PCA [53]. This approach to variable number reduction has been successfully used in prior studies of EEG spectral coherence in infants [57] and adults [49, 53]. When total population size is over 200, as in the current study, coherence factor formation consistency by split-half replication becomes redundant (unpublished finding).
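A minimal sketch of SVD-based PCA on standardized data follows. It covers only the normalization and decomposition steps; Varimax rotation is deliberately omitted, and the function name `pca_svd`, the number of retained factors, and the random data are assumptions for illustration.

```python
import numpy as np

def pca_svd(data, n_factors):
    """PCA of a (n_subjects, n_variables) matrix via singular value decomposition.

    Returns factor scores, loadings (one column per factor), and the
    fraction of total variance described by every component.
    Note: no Varimax rotation is applied in this sketch.
    """
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)   # center, unit variance
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    scores = u[:, :n_factors] * s[:n_factors]                   # uncorrelated factor scores
    loadings = vt[:n_factors].T                                 # variable weights per factor
    return scores, loadings, explained

# Illustrative data: 100 cases by 30 variables
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 30))
scores, loadings, explained = pca_svd(data, n_factors=5)
```

The decomposition works equally well when variables outnumber cases, as with 4,416 variables by 1,034 cases, because `full_matrices=False` returns only as many components as the smaller dimension allows.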

Data analysis

Discrimination of subject groups by use of EEG spectral coherence variables

Two-group discriminant function analysis (DFA) [58–60] was used extensively in this study. It produced a new canonical variable, the discriminant function, which maximally separated the groups, based on a weighted combination of the entered variables. DFA defined the significance of a group separation, summarized the classification of each subject, and provided approaches to the prospective classification of subjects not involved in discriminant rule generation by means of the jackknifing technique [61, 62] or by classification of a new population. The BMDP2007™ (Statistical Solutions, Stonehill Corporate Center, Suite 104, 999 Broadway, Saugus, MA 01906 USA) statistical package [51] was employed for DFA (program 7M); it yields the Wilks' lambda statistic with Rao's approximation. For the estimation of prospective classification success, the jackknifing technique was used [61, 62]. In jackknifing for two-group DFA, as was undertaken in this study, the discriminant function was formed on all subjects but one. The left-out subject was subsequently classified. This initial left-out subject was then folded back into the group (hence "jackknifing"), another subject was left out, the DFA was performed again, and the newly left-out subject classified. This process was repeated until each individual subject had been left out and classified. The measure of classification success was then based upon a tally of the correct classifications of the left-out subjects. This technique is also referred to as the "leaving-one-out" process. Split-half analysis was also used. Instead of leaving out a single subject for each iteration, 50% of subjects were left out, that is, the analysis was performed on a randomly selected sample consisting of only half the number of subjects.
A random number generator within BMDP-7M (stepwise DFA) was employed to permit random assignment of each subject to a training-set (50% of the subjects - used to create the discriminant) and a test-set (remaining 50% of the subjects - used to estimate prospective classification success). The algorithm used by BMDP does not always provide a precise split; the exact ratio of control to experimental subjects within each selected sub-group reflects random chance. As a separate measure of classification success, two-group t-tests (BMDP-3D) were performed utilizing the canonical discriminant variable produced by a training-set test on the corresponding test-set.
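The two validation schemes can be sketched as follows. A plain two-group Fisher linear discriminant with a midpoint threshold stands in for the stepwise DFA of BMDP-7M, which is an assumption; the function names and the simulated, well-separated groups are likewise hypothetical.

```python
import numpy as np

def fit_lda(x, y):
    """Two-group Fisher discriminant: returns weights and a midpoint threshold."""
    x0, x1 = x[y == 0], x[y == 1]
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    pooled = ((len(x0) - 1) * np.cov(x0, rowvar=False)
              + (len(x1) - 1) * np.cov(x1, rowvar=False)) / (len(x) - 2)
    w = np.linalg.solve(pooled, m1 - m0)          # discriminant weights
    return w, w @ (m0 + m1) / 2.0                 # assumes equal prior probabilities

def predict_lda(model, x):
    w, threshold = model
    return (x @ w > threshold).astype(int)

def jackknife_accuracy(x, y):
    """Leave one subject out, fit on the rest, classify the left-out subject."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = fit_lda(x[keep], y[keep])
        hits += int(predict_lda(model, x[i:i + 1])[0] == y[i])
    return hits / len(y)

def split_half_accuracy(x, y, rng):
    """Fit on a random half (training set), classify the other half (test set)."""
    order = rng.permutation(len(y))
    half = len(y) // 2
    train, test = order[:half], order[half:]
    model = fit_lda(x[train], y[train])
    return float(np.mean(predict_lda(model, x[test]) == y[test]))

# Simulated groups: 60 subjects each, 3 variables, clearly separated means
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 1.0, (60, 3)), rng.normal(3.0, 1.0, (60, 3))])
y = np.repeat([0, 1], 60)
loo = jackknife_accuracy(x, y)
half = split_half_accuracy(x, y, rng)
```

As in the text, the random split does not fix the ratio of the two groups within each half; that ratio varies from run to run.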

Factor description; relationship of PCA outcome factors to input coherence variables

Individual outcome factors were each formed as linear combinations of all input variables, with the weight or loading of each coherence variable upon a particular factor determined by the PCA computation [58]. The meaning of outcome factors was discerned by inspection of the loadings of the input variables upon each individual factor [52, 58]. Factor loadings were treated as if they were primary neurophysiologic data and displayed topographically [63, 64]. Display of the highest 15% of coherence loading values was utilized [49, 53, 57] to facilitate an understanding of individual factors' meaning, as shown in Figure 2.

Figure 2 Graphic representation of 33 coherence factor loadings. Heads in top view, scalp left to image left, nose above. Factor number is above heads to left and peak frequency for factor in Hz is above to right. Lines indicate top 15% coherence loadings per factor: Red = increased coherence in ASD-group; Yellow = decreased coherence in ASD-group. Involved electrodes shown as small white circles. Uninvolved electrodes are not shown.

Age grouping

Given the wide age range (14 months to 18 years) of the subjects within the ASD- and C-groups and the well known age effects on EEG and spectral coherence data over this wide age range [65–67], analyses were restricted to the more limited age range of 2 to 12 years (ASD-group: n = 430; C-group: n = 554; total sample: n = 984, see Table 1). A high male (84%) to female (16%) ratio in the ASD-group reflects known male preponderance for this population [68]. A similar pattern in the C-group (male 88%, female 12%) reflects intentional bias as subject selection anticipated studies of autism and other studies from which the C-group was drawn (for example, dyslexia, learning disabilities, and behavior problems where males predominate) [69, 70]. Male to female ratios were not significantly different between the ASD- and C-groups. The effect of age was removed from the 40 coherence factors generated on the 2- to 12-year-old total sample by simple regression using age-at-study as the independent variable and the 40 coherence factors as dependent variables (BMDP-6R). Factors remained statistically uncorrelated after this regression procedure. In order to assure relatively even age distribution of subject numbers between ASD- and C-groups, group comparisons were also independently performed in three narrower age ranges, namely for 2- to 4-year-olds, 4- to 6-year-olds, and 6- to 12-year-olds.