Normative data

EEG data were extracted from the Brain Resource International Database and had been recorded at six laboratories (New York, Rhode Island, Nijmegen, London, Adelaide and Sydney). All participants were adults (mean age 43.38 y, SD 18.42; range 18–98 y; 47% males). Exclusion criteria were a personal or family history of mental illness, brain injury, neurological disorder, serious medical condition, or drug/alcohol addiction, or a first-degree relative with bipolar disorder, schizophrenia, or a genetic disorder. Institutional review board (IRB) approval was obtained for all sites (Nijmegen: Commissie Mensgebonden Onderzoek, Regio Arnhem-Nijmegen; CMO-nr: 2002/008), and informed consent was obtained from all subjects. All methods were performed in accordance with the relevant guidelines and regulations.

EEG recordings

EEG recordings were performed using a standardized methodology and platform (Brain Resource Ltd., Australia) for which full details have been published elsewhere7,47 as have the results of the across-site consistency and reliability of this methodology38,39.

Participants were seated in a sound- and light-attenuated room, controlled at an ambient temperature of 22 °C. EEG data were acquired from 26 channels: Fp1, Fp2, F7, F3, Fz, F4, F8, FC3, FCz, FC4, T3, C3, Cz, C4, T4, CP3, CPz, CP4, T5, P3, Pz, P4, T6, O1, Oz and O2 (Quikcap; NuAmps; international 10–20 electrode system; sampling frequency 500 Hz). Data were referenced to averaged mastoids with a ground at AFz. Horizontal eye movements were recorded with electrodes placed 1.5 cm lateral to the outer canthus of each eye. Vertical eye movements were recorded with electrodes placed 3 mm above the middle of the left eyebrow and 1.5 cm below the middle of the left bottom eyelid. Skin-electrode impedance was kept below 5 kOhm. A low-pass filter with an attenuation of 40 dB per decade above 100 Hz was employed prior to digitization. EEG data were recorded for two minutes with eyes open (EO), during which the participant was asked to fixate on a red dot on the screen, and for two minutes with eyes closed (EC), during which the participant was instructed to remain relaxed. Data were EOG-corrected using a regression-based technique similar to that used by Gratton, Coles and Donchin48 and stored in EDF format49.
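The idea behind a regression-based EOG correction can be sketched as follows. This is a plain least-squares regression of the EOG channels out of each EEG channel, not the Gratton, Coles and Donchin procedure itself nor the Brain Resource implementation; the function name `regress_out_eog` and the array layout are assumptions made for illustration.

```python
import numpy as np


def regress_out_eog(eeg, eog):
    """Remove the part of each EEG channel that is linearly predicted by the EOG.

    eeg: array of shape (n_eeg_channels, n_samples)
    eog: array of shape (n_eog_channels, n_samples), e.g. horizontal and vertical EOG
    (Both layouts are assumptions of this sketch.)
    """
    # Centre both signals so the regression has no intercept term.
    eeg = eeg - eeg.mean(axis=1, keepdims=True)
    eog = eog - eog.mean(axis=1, keepdims=True)
    # Least-squares propagation coefficients, shape (n_eog_channels, n_eeg_channels),
    # such that eeg is approximately coeffs.T @ eog.
    coeffs, *_ = np.linalg.lstsq(eog.T, eeg.T, rcond=None)
    # Subtract the EOG contribution from every EEG channel.
    return eeg - coeffs.T @ eog
```

By construction, the corrected channels are uncorrelated with the EOG channels that were regressed out.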

EEG data were down-sampled to 128 Hz and subsequently band-pass filtered between 0.5 and 25 Hz with a first order Butterworth minimum phase distortion filter. The EEG reference was kept unchanged (averaged mastoids). Of the 26 channels, 24 were kept (Fp1, Fp2, F7, F3, Fz, F4, F8, FC3, FCz, FC4, T3, C3, Cz, C4, T4, CP3, CPz, CP4, T5, P3, Pz, P4, T6, O1) and two were removed (O2 and Oz), so that the prime decomposition of the number of channels contains only small factors (24 = 3 × 2 × 2 × 2), which later allows the maximum number of pooling operations.
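A minimal sketch of this preprocessing chain, assuming NumPy and SciPy, is given below. The function name `preprocess` is hypothetical. The polyphase resampler (which applies its own anti-aliasing filter) and the causal application of the Butterworth filter via `sosfilt` are assumptions of this sketch, since the text does not specify how the down-sampling or the filtering was implemented.

```python
from math import gcd

import numpy as np
from scipy.signal import butter, resample_poly, sosfilt

# Channel order as listed for the 26-channel recording.
CHANNELS_26 = [
    "Fp1", "Fp2", "F7", "F3", "Fz", "F4", "F8", "FC3", "FCz", "FC4",
    "T3", "C3", "Cz", "C4", "T4", "CP3", "CPz", "CP4", "T5", "P3",
    "Pz", "P4", "T6", "O1", "Oz", "O2",
]
DROPPED = {"Oz", "O2"}


def preprocess(eeg, fs_in=500, fs_out=128, band=(0.5, 25.0)):
    """Down-sample, band-pass filter and reduce a (26, n_samples) recording to 24 channels."""
    # 500 Hz * 32 / 125 = 128 Hz
    g = gcd(fs_in, fs_out)
    down = resample_poly(eeg, fs_out // g, fs_in // g, axis=1)
    # First order Butterworth band-pass, applied causally (assumption).
    sos = butter(1, band, btype="bandpass", fs=fs_out, output="sos")
    filtered = sosfilt(sos, down, axis=1)
    # Keep every channel except Oz and O2.
    keep = [i for i, name in enumerate(CHANNELS_26) if name not in DROPPED]
    return filtered[keep]
```

For a two-minute recording at 500 Hz (60000 samples per channel), the result has 24 rows and 15360 samples per row.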

Deep Net Architecture

The architecture of the deep net was inspired by deep convolutional nets that have been designed for image classification37. The input for the net was a 24 (EEG-channels) × 256 (2 s × 128 Hz) matrix. For the filter sizes of the convolutional layers, we used minimal windows of 2 × 2 patches. The number of filters decreased from 300 within the first and second layer to 50 within all other layers. Rectified linear units (RELU) were used as activation functions. For the first four convolutional layers, a pooling function was applied, followed by a dropout function. The final classification was obtained by applying a dense layer with a softmax activation, resulting in a probability p for male or female sex. The various layers are summarized in Table 1.
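The role of the input dimensions in pooling can be illustrated with a small calculation. The sketch below assumes that one pooling operation halves a dimension exactly; it does not reproduce the layers of Table 1, and the helper `max_halvings` is hypothetical.

```python
def max_halvings(n):
    """Return how many times n can be halved exactly, and the remaining size."""
    count = 0
    while n > 1 and n % 2 == 0:
        n //= 2
        count += 1
    return count, n


# 24 channels = 3 x 2 x 2 x 2: three exact halvings, leaving 3
print(max_halvings(24))
# 26 channels = 13 x 2: only one exact halving, leaving 13
print(max_halvings(26))
# 256 samples = 2^8: eight exact halvings, leaving 1
print(max_halvings(256))
```

Under this assumption, reducing the montage from 26 to 24 channels raises the number of exact halvings along the channel axis from one to three.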

Table 1 Layers of the CNN. The input matrix for the net was a 24 (EEG-channels) × 256 (2 s × 128 Hz) matrix. RELU = rectified linear unit.

Training and testing

We trained the neural network using 40 non-overlapping EEG segments of 2 s duration with eyes closed from every subject. In total, EEGs from 1000 adults were used for the training set (40 epochs × 1000 subjects = 40000 epochs of 2 s, with 47% from males). Each segment received a one-hot label array indicating male or female. Training was done with a batch size of 70 for 150 runs, meaning that all 40000 epochs were presented to the network 150 times in chunks of 70 segments.
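The segmentation and labelling can be sketched as below. Two details are assumptions of this sketch, because the text does not state them: that the 40 segments are the first 40 consecutive 2 s windows of the recording, and that the one-hot columns are ordered (female, male). The function names are hypothetical.

```python
import math

import numpy as np


def segment_recording(eeg, fs=128, seg_seconds=2, n_segments=40):
    """Cut a (n_channels, n_samples) recording into non-overlapping segments.

    Returns an array of shape (n_segments, n_channels, fs * seg_seconds).
    """
    seg_len = fs * seg_seconds
    needed = seg_len * n_segments
    if eeg.shape[1] < needed:
        raise ValueError("recording too short for the requested segments")
    trimmed = eeg[:, :needed]
    # (channels, segments, samples) -> (segments, channels, samples)
    return trimmed.reshape(eeg.shape[0], n_segments, seg_len).transpose(1, 0, 2)


def one_hot_sex(is_male, n_segments=40):
    """One-hot labels per segment; column 0 = female, column 1 = male (assumed order)."""
    labels = np.zeros((n_segments, 2))
    labels[:, 1 if is_male else 0] = 1.0
    return labels


# 40000 segments in batches of 70: 571 full batches plus one batch of 30
batches_per_run = math.ceil(40 * 1000 / 70)
```

Each segment produced this way is a 24 × 256 matrix, matching the input size of the network.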

Training and testing were done on large, separate, independent datasets; cross-validation was therefore deemed unnecessary. The independent test set comprised 308 cases (49% males; 40 segments from each subject × 308 subjects = 12320 samples of 2 s). Classification by the final layer of the network was binary (male (1) or female (0)). During training, accuracy was computed after each run for all segments of the training set and for the test set. Training was finished after a) accuracy within the training set reached 100% or b) the loss function of the training set did not further decrease or c) 150 runs were finished. Final classification was dichotomous and was obtained by taking the mean probability over the 40 segments of 2 s for each subject; if p > 0.5, the EEG was classified as male.
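The subject-level decision rule amounts to averaging the segment probabilities and thresholding at 0.5. A minimal sketch, assuming the network outputs are already collected in an array of per-segment probabilities for "male" (the function names and array layout are hypothetical):

```python
import numpy as np


def classify_subjects(p_male_segments, threshold=0.5):
    """Average per-segment probabilities and threshold them per subject.

    p_male_segments: array of shape (n_subjects, n_segments), e.g. (308, 40).
    Returns predicted labels (1 = male, 0 = female) and the mean probabilities.
    """
    mean_p = p_male_segments.mean(axis=1)
    predicted = (mean_p > threshold).astype(int)
    return predicted, mean_p


def accuracy(predicted, true_labels):
    """Fraction of subjects whose predicted label equals the true label."""
    return float(np.mean(np.asarray(predicted) == np.asarray(true_labels)))
```

Averaging over segments means that a subject can be classified correctly even when some individual 2 s segments are misclassified.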

Visualizing deep layers

The procedure we used to visualize which features of the input data are mainly used by the CNN is similar to a technique called “deep-dreaming” and has been described elsewhere in more detail31,32,50. The essence of the method is that the network is activated “top-down”, meaning that from a desired output (e.g. 1 = male) from the last layer, the connections of the trained network are activated toward the input layer. The activity of the first layer, which normally receives the input matrices (i.e. the raw EEG data), then can be seen as an artificially generated input pattern that most likely would produce the desired output. During this process, the filter layers in between the input and the output are activated, representing archetypal features of the desired output.

We generated artificial input patterns by gradient ascent, propagating the gradients backward through the trained network model to the input. We repeated this for all filters of all layers and sorted the generated input patterns by the highest loss (i.e. the maximum activation of a specific filter in a particular layer32).
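The principle of generating an input pattern by gradient ascent can be illustrated with a toy example. The sketch below does not use the trained network: it maximizes the mean RELU activation of a single hypothetical 2 × 2 filter applied directly to a 24 × 256 input, with the gradient written out by hand in NumPy. Step size, number of steps and gradient normalization are arbitrary choices made for illustration.

```python
import numpy as np


def filter_response(x, w):
    """Valid 2-D cross-correlation of input x with a small filter w."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for a in range(kh):
        for b in range(kw):
            out += w[a, b] * x[a:a + oh, b:b + ow]
    return out


def activation_and_gradient(x, w):
    """Mean RELU activation of the filter and its gradient with respect to x."""
    pre = filter_response(x, w)
    activation = np.maximum(pre, 0.0).mean()
    # Derivative of the mean RELU: 1/N where a unit is active, 0 elsewhere.
    upstream = (pre > 0) / pre.size
    grad = np.zeros_like(x)
    kh, kw = w.shape
    oh, ow = pre.shape
    for a in range(kh):
        for b in range(kw):
            grad[a:a + oh, b:b + ow] += w[a, b] * upstream
    return activation, grad


def maximize_activation(w, x0, steps=50, step_size=0.1):
    """Gradient ascent on the input, starting from x0."""
    x = x0.astype(float).copy()
    for _ in range(steps):
        _, grad = activation_and_gradient(x, w)
        grad /= np.sqrt(np.mean(grad ** 2)) + 1e-8  # normalize the step
        x += step_size * grad
    return x
```

In a full network the same loop applies, with the hand-written gradient replaced by backpropagation from the chosen filter or output unit to the input.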

Estimating significance

To estimate significance, we randomly assigned sex to each subject in the test set (n = 308), using the prior sex distribution (47% males), and computed the resulting classification accuracy. To set the threshold for statistical significance at p < 10⁻⁵, we performed 100,000 such simulations in Matlab. The best classification accuracy reached in any simulation was 63%, which was subsequently taken as the significance threshold.
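The simulation can be sketched in Python (the original analysis was run in Matlab, and its code is not given here). The label vector below is a hypothetical stand-in with 151 males out of 308 (about 49%); the function name, chunk size and random seed are likewise assumptions. The threshold obtained depends on the random draws.

```python
import numpy as np


def chance_accuracy_threshold(true_is_male, p_male=0.47, n_sim=100_000,
                              chunk=10_000, seed=0):
    """Best accuracy reached by random sex assignment over n_sim simulations."""
    rng = np.random.default_rng(seed)
    true_is_male = np.asarray(true_is_male, dtype=bool)
    best = 0.0
    done = 0
    while done < n_sim:
        m = min(chunk, n_sim - done)
        # Assign "male" at random with probability p_male to every subject.
        random_is_male = rng.random((m, true_is_male.size)) < p_male
        # Accuracy of each random assignment against the true labels.
        acc = (random_is_male == true_is_male).mean(axis=1)
        best = max(best, float(acc.max()))
        done += m
    return best


# Hypothetical test-set labels: 308 subjects, 151 of them male (about 49%).
true_is_male = np.zeros(308, dtype=bool)
true_is_male[:151] = True
```

Calling `chance_accuracy_threshold(true_is_male)` runs the full 100,000 simulations. Because none of the simulated accuracies exceeds the returned value, a classifier that does exceed it has an estimated p below 1/100,000.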

Spectral features

Power spectra were estimated with Welch’s method, based on the Fast Fourier Transform, using half-overlapping epochs of 10 s, as implemented in Brain Vision Analyzer 2.1.0 (Gilching, Germany).
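An equivalent estimate can be sketched with SciPy in place of Brain Vision Analyzer. The sampling frequency of 128 Hz and the Hann window are assumptions of this sketch, as the text does not state which were used for the spectral analysis; the function name `power_spectrum` is hypothetical.

```python
import numpy as np
from scipy.signal import welch


def power_spectrum(signal, fs=128, epoch_seconds=10):
    """Welch power spectral density with half-overlapping epochs."""
    nperseg = int(fs * epoch_seconds)
    freqs, psd = welch(
        signal,
        fs=fs,
        window="hann",          # window type is an assumption
        nperseg=nperseg,        # 10 s epochs
        noverlap=nperseg // 2,  # 50% overlap
    )
    return freqs, psd
```

With 10 s epochs the frequency resolution is 0.1 Hz, and a two-minute recording yields 23 half-overlapping epochs to average over.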