Executive summary of methods

Since the brain is so highly recurrent, it had been assumed that specifying a global, directional hierarchy would not be feasible, since interconnectedness would, in effect, disguise it. Relating the spectrum of cognitive functions to global cortical architecture required developing new approaches to modeling structural hierarchies. Hierarchy is traditionally defined as: an arrangement, categorization, system, or series of objects (items, names, values, categories, people, groups, subsets, etc.) in which the nodes are ranked, organized, or arranged as being “above,” “below,” or “connected to” one another either directly or indirectly, either vertically or diagonally, in a graded or successive order or class, often according to importance, relevance, category inclusiveness, or according to similarities of structure or origin, all of which can be represented formally, usually by a diagram of connected nodes.

Our formal definition of hierarchy (or graph-theoretic network depth) simplified recurrence into a continuum of statistical connectivity to sensory inputs. This notion of hierarchy was merely probabilistic: depth was assumed to influence the probability of neuronal routing rather than to necessitate it. Our methods defined a structural pyramid of the human connectome, with the base of the pyramid defined as sensory inputs and the peak defined as the regions structurally deepest in the cortical network. We associated every region of interest (ROI) in the brain with a number identifying its integrated distance from the average of sensory cortex inputs based on rsfMRI and DTI data, defined below. This number defined the connectivity-distance of that region from the sensory outside world, or that region’s depth in the cortical network, termed network-depth.

Next, we generated a novel method of describing the distribution of behavioral fMRI activations via their relationship with any node-wise graph statistic of connectome data. Finally, with this method we rank-sorted cognitive functions from fMRI databases by their distribution over network-depth, with cognitive functions distributed shallow in the structural network at the top of the list and functions distributed deeply at the bottom of the list. As a proof of concept, we compared the physiologically-based ranking to a survey of humans’ ranking; around 500 participants were asked to rank behavioral elements by their judgment of whether they were “abstract” versus “concrete”. Further, we compared experimental orders to similar random control experiments and to meaningful control experiments which sorted functions by other node-wise graph statistics. Some of our methods themselves were novel contributions and were thus elaborated in full detail below.

Connectivity methods

Two modern methods are popularly used to map the human brain’s wiring diagram, the connectome, in a living person: 1) resting state functional magnetic resonance imaging (rsfMRI) measuring temporal correlation (functional connectivity) between regions; and 2) diffusion tensor imaging (DTI) measuring tract-based connectivity between regions through fiber bundles from which distances can be calculated. With rsfMRI, a seed region (starting reference point) is selected, revealing to what degree other regions respond in temporal synchronization with that reference region, thus allowing inference about the degree to which brain regions are connected. The rsfMRI data serve as a proxy for processing delay between regions or number of functional synapses between regions, though are limited to peri-second resolution. DTI does not share the aforementioned temporal limitation, yet is neither functionally based, nor evenly distributed in signal resolution across the brain. Network models derived from DTI and rsfMRI demonstrate similarities and strongly correlate25. Big data connectomics contributed modern approaches, including repositories of connectivity for human cortex such as the 1000 Functional Connectomes Project and the Human Connectome Project26. We extended and complemented the use of such connectivity data.

Building connectivity matrices from rsfMRI and DTI data

Our network analyses utilized rsfMRI and DTI connectivity matrices from the same group of individual participants from the NKI Rockland sample20 hosted in the 1000 Functional Connectomes Project, the International Neuroimaging Data-Sharing initiative and the UCLA Multimodal Connectivity Database (UMCD)25. Data processing for rsfMRI and DTI data followed procedures previously published20. Cortex, sub-cortical structures and brain-stem were detailed. The rsfMRI data included negative correlations, which still indicate functional connectedness; thus the absolute value of the full matrix was used. Each full connectivity matrix was normalized to range from 0 (least connected) to 1 (most connected); such normalization preserves order and therefore does not change later rank-ordering. Diagonals of matrices held the maximum value of the matrix, since regions are connected to themselves. The rsfMRI matrices were divided into 188 clusters (Regions-Of-Interest; ROIs), using a published parcellation process27, with the resulting clusters imposed upon DTI data to parcel both data sets identically.
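The matrix preparation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published pipeline: the function name `preprocess_connectivity` is hypothetical, the toy values are invented, and min-max rescaling is assumed as the normalization (the text specifies only the 0 to 1 range).

```python
import numpy as np

def preprocess_connectivity(raw):
    """Absolute value, rescale to [0, 1], then set the diagonal to the maximum."""
    # Negative correlations still indicate connectedness, so take magnitudes.
    m = np.abs(np.asarray(raw, dtype=float))
    lo, hi = m.min(), m.max()
    # Assumed min-max rescaling; it is monotonic, so rank order is preserved.
    m = (m - lo) / (hi - lo) if hi > lo else np.zeros_like(m)
    # Each region is treated as maximally connected to itself.
    np.fill_diagonal(m, m.max())
    return m

# Toy 3-region correlation matrix (illustrative values only).
raw = np.array([[ 1.0, -0.6, 0.2],
                [-0.6,  1.0, 0.4],
                [ 0.2,  0.4, 1.0]])
conn = preprocess_connectivity(raw)
```

Because the rescaling is monotonic, sorting regions by their preprocessed values gives the same order as sorting by absolute correlation.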

An increase in connectivity derived from rsfMRI + DTI is hereinafter termed connectivity-distance. Importantly, the connectivity-distance from any seed region may serve as a proxy for an increase in number of sequential synapses, processing time, or general connectedness from the seed region to any other region.

Defining sensory inputs

First, we specified inputs to cortex. Information from the outside world is transduced via the senses and information first reaches cortex in regions called “primary sensory cortices”, usually devoted to initial processing and perception of data within a single modality. These were hypothesized to be the sensors (inputs) to the information processing network of the cortex. The structure of sensory cortices differs from association cortices, for example in local/distant connectivity ratios28 and gross myelin distributions (Fig. 1a).

To identify known input regions, we performed extensive meta-analyses and employed the Harvard-Oxford cortical-subcortical probabilistic atlas included in FMRIB Software Library (FSL) and Brodmann’s labels from the BrainMap Talairach client29,30. Coordinates were determined for primary sensory input locations for visual, auditory, somatosensory, gustatory (taste), olfactory (smell), interoceptive (gut feelings) and vestibular (balance) modalities (Fig. 1). The vestibular modality can be subdivided into two modalities for detecting static head rotation (gravity) and dynamic momentum and the interoceptive modality is not always considered a single modality nor a modality at all; however, both were included as single modalities in these analyses. The primary sensory cortices were labeled upon the T1/T2 map representing myelin distribution (Fig. 1a) and coordinates (x, y, z) corresponding to the centers of several clusters for each modality were listed in (Fig. 1b). All processing employed Montreal Neurological Institute (MNI) space.

Vestibular

The sense of balance relays through the vestibular nuclei of the brain-stem, through thalamus, branching to an area of posterior parietal cortex overlapping with Brodmann area 5 and nestled near somatosensory cortex, superior temporal gyrus and central sulcus31,32. Vestibular cortex displays the greatest degree of lateralization between hemispheres when compared to other modalities33,34.

Gustatory

Taste input relays through the gustatory nucleus of the solitary tract complex in the medulla, to the ventral posterior complex and the ventral posterior medial nucleus of the thalamus, on the way to primary gustatory cortex. Primary gustatory cortex overlaps with anterior insula and the frontal operculum on the inferior frontal gyrus of the frontal lobe35,36,37 and in part Brodmann area 43. These regions are often referred to as AIFO (Anterior Insula, Frontal Operculum).

Olfactory

The sense of smell is unique, as it is the only modality that does not primarily relay through the thalamus on the way to the cortex; projections are received directly to cortex. Targets include pyriform cortex, olfactory tubercle, amygdala, periamygdaloid cortex and entorhinal cortex, roughly corresponding to Brodmann area 28 and 3438,39.

Interoceptive

Anterior insula is hypothesized to receive input from internal organs and to play a role in bodily awareness, emotional processing, physiological homeostasis, or more generally, interoceptive awareness40,41. This region corresponds to parts of Brodmann area 13.

Visual

The well-studied visual inputs relay through the lateral geniculate nucleus (LGN) of the thalamus to primary visual cortex (striate cortex or V1), which is traditionally defined as Brodmann area 17.

Auditory

Inputs for the sense of hearing transmit through brainstem (cochlear nucleus, superior olivary complex), inferior colliculi, then medial geniculate nucleus of the thalamus, before arriving at primary auditory cortex (A1). A1 roughly corresponds to superior temporal gyrus (STG), Heschl’s gyrus, as well as parts of planum polare and planum temporale. A1 overlaps with Brodmann area 41, some of Brodmann area 42 and is sometimes proposed to be in small part overlapping with Brodmann area 22.

Somatosensory

The well-studied inputs to cortex for the sense of touch relay through the brainstem to the ventral posterior nucleus of the thalamus before reaching the primary somatosensory cortex (postcentral gyrus). The primary somatosensory cortex overlaps primarily with Brodmann area 3a and 3b, though 2 and 1 may overlap partly as well.

Connectivity Data processing

Degree of connectivity to each input

For each sensory area independently, we defined a starting seed (reference) region and calculated its degree of connectivity to all other regions in cortex (Fig. 2b). The procedure was as follows. If primary sensory cortex input coordinates (Fig. 1) were fully within a cluster (out of 188), then the cluster was considered a primary seed region (Fig. 3). Then, we averaged connectivity profiles from the multiple seed clusters of each single modality (Fig. 3). This process revealed a continuum of regions, from most to least connected to a sensory input, for each modality. Specifically, it generated a single vector representation (one value for every brain ROI) indexing the degree to which every other region is connected to that sensory seed input group. For example, connectivity to primary visual cortex was defined by averaging the connectivity to multiple seed clusters with centers in primary visual cortex (Fig. 1). This provided gross network-depth directionality for each single modality, thus defining our “hierarchical” rank.
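The per-modality averaging can be illustrated with a short sketch. It assumes a symmetric ROI-by-ROI connectivity matrix already scaled to 0–1; the function name `modality_profile`, the random toy matrix and the seed indices are hypothetical stand-ins rather than the study's data.

```python
import numpy as np

def modality_profile(conn, seed_idx):
    """Mean connectivity of every ROI to the seed clusters of one modality."""
    # Select the seed columns and average them, giving one value per ROI.
    return conn[:, seed_idx].mean(axis=1)

rng = np.random.default_rng(0)
a = rng.random((6, 6))
conn = (a + a.T) / 2              # symmetric toy matrix standing in for 188 x 188
np.fill_diagonal(conn, 1.0)       # regions are maximally connected to themselves

visual_seeds = [0, 1]             # hypothetical seed cluster indices for one modality
v_visual = modality_profile(conn, visual_seeds)
order = np.argsort(-v_visual)     # ROI indices from most to least connected
```

The resulting vector has one entry per ROI, and sorting it yields the continuum from most to least connected regions for that modality.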

Figure 3 Defining generalized network-depth within cortex. Fully connected weighted symmetric matrices (DTI or rsfMRI) display sets of connectedness values (188 × the same 188), one for each major ROI. Connectivity for a single region was defined, in color, by one 1 × 188 row (or the identical symmetric column). The mean for a single modality was generated by averaging multiple columns corresponding to regions within a single sensory cortex, displayed by arrows below the matrix; e.g., column labels and arrows for visual in the bottom left grey box. Averaging resulted in one vector of length 188, representing the connectivity to a single modality, displayed as a wider column at the right of the matrix. To generate the cross-modality connectivity (global network-depth), the 7 modalities (each represented by a wide column at right) were integrated together (mean or max; bottom right grey boxes) to create one vector (far right, widest column) defining the connectivity of each brain region to all sensory inputs. This vector was used to sort the matrix along both dimensions. The cross-modality mean can be conceptualized as the average neuronal path length from any region to sensory cortices. Alternatively, the cross-modality maximum across all sensory columns was taken, representing the shortest neuronal path from each region to sensory seeds. Exemplar data illustrate rsfMRI with mean processing and sorting (plotted on the brain in Fig. 4 panel 1).

Importantly, all remaining figures are color-coded identically, in gradients of blue representing network-depth, with shallow defined as dark blue and deep defined as light blue.

Degree of connectivity to fused inputs

All modalities were examined in unison, since any behavior involves not only a single sensory input, but a combination of different senses. Above, we generated connectivity profiles for each modality independently first. Then, we integrated connectivity to all modalities (via max or mean), to provide a cross-modality connectivity dataset with equal weighting for each modality. The cross-modality seed was computed via either: 1) taking the mean across modalities, or 2) taking the maximum across modalities (Figs 3 and 4). Any ties during max comparison were broken by means. This aggregate seed composed of sensory regions illustrated connectivity of every region in cortex to sensory inputs, a proxy for global network-depth.
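A hedged sketch of the cross-modality integration follows, including the tie-breaking rule for the maximum. The function name `integrate_modalities` and the toy profiles are hypothetical; the only behavior taken from the text is that modalities are combined by mean or max with equal weighting and that ties under max are broken by the mean.

```python
import numpy as np

def integrate_modalities(profiles, how="mean"):
    """Combine per-modality profiles (n_modalities x n_rois) into one vector.

    Returns the integrated vector and ROI indices ordered from most to
    least connected; ties are broken by the cross-modality mean.
    """
    profiles = np.asarray(profiles, dtype=float)
    mean_c = profiles.mean(axis=0)
    v_c = mean_c if how == "mean" else profiles.max(axis=0)
    # np.lexsort treats the LAST key as primary; earlier keys break ties.
    order = np.lexsort((-mean_c, -v_c))
    return v_c, order

# Two toy modalities over four ROIs (illustrative values only).
profiles = np.array([[0.9, 0.2, 0.5, 0.9],
                     [0.1, 0.3, 0.5, 0.7]])
v_max, order_max = integrate_modalities(profiles, how="max")
v_mean, order_mean = integrate_modalities(profiles, how="mean")
```

In this toy case ROIs 0 and 3 tie under max (both 0.9), and ROI 3 is ranked first because its mean connectivity is higher.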

Figure 4 Network-depth via integrated sensory inputs displayed over whole-cortex. Brains display binned connectivity to an aggregate seed cluster of all grouped inputs in parallel, with DTI (brains on left) or rsfMRI (brains on right), using network mean (brains on top) or maximum (brains on bottom). In top left quadrant, key regions discussed in main text were labeled by respective color (e.g., insular cortex in dark blue text and inferior frontal in light blue).

Binning network-depth: discretizing generalized connectivity-distance from sensory inputs

For the purpose of statistical summary, we divided the sorted distribution of connectivity-distance. Specifically, we artificially discretized sorting of regions via connectivity by distributing clusters into 10 step-wise statistical bins based on the degree of connectivity-distance of each ROI to input cortices. We divided the brain into 10 equally sized, irregularly shaped, groups of regions, ranging from highly connected to least connected, defining 10 steps of regional depth within the cortical network. The first clustered bin of ROIs, bin 1 (dark blue in Figs 2, 3, 4), included sensory cortices and highly connected close regions, bin 2 included proximate regions slightly less connected… bin 5 (blue) included intermediately connected regions… and bin 10 (light blue) included regions least connected to sensory inputs and deep in the brain’s network. It is important to note that these bins were not hypothesized to be layers or functional divisions, but were merely a method of descriptive analysis.

This process was performed for each modality and integrations of modalities (mean/maximum). The process of parceling into 188 clusters was performed before binning, thus facilitating correct super-clustering into bins, since bin divisions were more likely to occur between natural boundaries than within. Binning was performed evenly in region space, rather than connectivity space; each bin has a similar size. Plotting of images upon cortical surfaces42 was performed in Connectome Workbench26. To display this binned connectivity, the entire coordinate set from the BrainMap database was allotted into 10 bins and plotted to generate color-coded brain images (Figs 2b and 4).
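The binning step (even in region space, not connectivity space) can be sketched as below. The function name `assign_bins` is hypothetical and the connectivity vector is random toy data; the sketch simply splits the sorted ROIs into 10 groups whose sizes differ by at most one.

```python
import numpy as np

def assign_bins(v_c, n_bins=10):
    """Return a 1-based bin label per ROI; bin 1 holds the most-connected ROIs."""
    order = np.argsort(-np.asarray(v_c), kind="stable")   # most connected first
    bins = np.empty(len(order), dtype=int)
    # array_split yields nearly equal chunks even when n_rois % n_bins != 0.
    for b, chunk in enumerate(np.array_split(order, n_bins), start=1):
        bins[chunk] = b
    return bins

rng = np.random.default_rng(1)
v_c = rng.random(188)              # toy stand-in for the integrated connectivity vector
bins = assign_bins(v_c)
sizes = np.bincount(bins)[1:]      # number of ROIs in bins 1..10
```

With 188 ROIs and 10 bins, each bin receives 18 or 19 regions regardless of how the connectivity values themselves are distributed.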

The brain’s network only receives sense information via these sensory cortices and nowhere else. Thus, this measure estimates global network-depth within the brain.

Behavioral fMRI methods

One popular method for investigating the localization of brain functions is fMRI, which measures changes related to blood flow and thus indirectly, neuronal activity. Various experimental designs allow for inferences regarding activity of brain regions during behavioral tasks. Since fMRI studies are notoriously statistically under-powered or biased43,44, databasing many studies has strengthened potential reliability of conclusions21,22,23,24,45,46. Building upon a successful history of discovery in cognitive neuroscience using fMRI, our global synthesis included multiple large databases of studies, enabling exploration of larger-scale organizational principles of brains and behavior.

Determining functional localization from large behavioral fMRI databases

We combined our newly-directional cortical network models with large-scale behavioral data. For a fully-blind synthesis enabling objective results, we chose two different large repositories of behavior-associated fMRI activation coordinates. These repositories contain thousands of experiments and included:

1. BrainMap is a database of manually entered fMRI publications including activation coordinates associated with tasks, among other meta-data categories21,22,23,24. To date, it included 2390 papers, 11353 experiments, 46366 subjects and 91039 coordinate locations. Studies are entered manually and are thus of higher quality than automated extractions. The database is around 20 years old and hosted by the Research Imaging Institute of the University of Texas Health Science Center San Antonio <http://www.brainmap.org>. We utilized two meta-data category labels. The Behavioral Domain label describes cognitive processes tested by fMRI contrasts in various experimental designs, composed of sub-elements such as: motion perception, anxiety and memory. The Paradigm Class label describes the type of task performed, composed of sub-elements such as: grasping, whistling and Stroop task (example data in Fig. 5a; complete lists available in Fig. 6a; Supplementary Data Tables S2,S3).

2. The Neurosynth database auto-extracts tabular fMRI activation coordinates and word frequencies from published studies45,46. To date, it contained 200,000 activation coordinates derived from 6,000 publications and is hosted at the University of Texas at Austin. Word frequencies (features in Neurosynth) auto-extracted from each article are associated with auto-extracted fMRI coordinates. This results in a set of features for each study, labeling a linked set of activation coordinates. For example, an article may use the word “faces” at a greater frequency than other articles, which is bound to coordinates reported by that article (a sample of Neurosynth word elements occurs in Fig. 6b; complete list in Supplementary Data Table S3). Neurosynth data can be obtained via <http://neurosynth.org>, or via NeuroDebian47 repositories. Auto-extracted data are representative enough of actual human brain functions that this project enabled the beginnings of cursory computational ‘mind-reading’46, entailing the ability to predict basic states of mind using brain activations by comparing these states of mind to the database of known behavior-activation mappings.

Figure 5 Binning behavioral-fMRI activation coordinates by physiological connectivity data. (a) Top table displays format of fMRI data, with behavioral elements tallied (at right of table) by sorted connectivity to inputs (at left of table). (b,c) Lower line-plots display examples of calculations performed for BrainMap and Neurosynth behavioral elements, with activation proportions illustrated from 0–1 on Y-axis and network-depth sorted bins 1–10 on X-axis. The small table immediately below the X-axes of the line-plots displays each database’s behavioral elements summarized using the number of voxels activated per bin and the relative activation proportions (in parentheses). Exemplar 1,197 and 976 activations were each derived from 151 and 53 experiments respectively. Binning was performed on sorted connectivity from inputs, with matching color codes for network-depth (Figs 2, 3, 4). For each behavioral element, a linear regression was calculated; the slope of activation proportions over bins provided a concise model of each behavioral element’s activation distribution relative to sensory inputs. Brains at right illustrate corresponding activations in brain space. Sensory-distributed behavioral element activations appeared more grouped, while deep-distributed appeared more dispersed.

Figure 6 BrainMap paradigm class and Neurosynth behavioral elements ranked in order of slope via network-depth, progressing from all inputs concurrently across cortex. In auto-sorted lists, sensory-related behavioral elements emerged at the top and aggregate symbolic tasks appeared near the list’s bottom, revealing an apparent relationship between behavioral element abstractness and network-depth. For example, compare the plots for the initial and final behavioral elements. (a) Every behavioral element from Paradigm Class auto-ranked in a sorted list. Interestingly, this linear regression merely approximated the actual distributions, which are all shown as small subsets next to each word. (b) Neurosynth behavioral elements auto-ranked; only terminal ends of list displayed (sensory and abstract ends). All other evaluated lists included in Supplementary Data Tables S2–4.

Each single fMRI study in these databases associated the topic(s) of study of a single paper (e.g., face processing, visual cognition) with a set of regions which differed in activity between conditions, after statistical thresholding, for that single paper. Both database quality controls and analyses confirm external validity21,46. Together, these repositories tabulated activations from a potential ~17,000 fMRI experiments, with each individual study typically ranging from 500 MB to 5 GB per subject of raw data, summarizing potentially ~600 terabytes (0.6 petabytes). Notably, human input only indirectly shaped the results reported here via the studies that went into the fMRI databases and the publication process itself. There is some limited documented overlap in study indexing between databases. Since the Neurosynth database includes auto-extracted data, it likely contains some limited non-task-fMRI data, for example by incidentally indexing voxel-based morphometry (VBM) studies. In the BrainMap or Neurosynth databases, any task-based behavioral fMRI data in Talairach brain space were converted into MNI space using the icbm2tal transform48,49.

Importantly, each repository distinctly categorizes studies by one or several of about 650 employed labels (hereinafter: behavioral elements). Behavioral elements were conceptualized here as brain functions, tasks, or behaviors, which through various experimental designs elicited activations at corresponding coordinates. The data for each behavioral element included many points of activation from many studies.

Synthesizing physiological connectome with behavioral fMRI, via generalized network-depth from sensory inputs

We synthesized the above structural data and functional task-fMRI databases, to sort all behavioral elements by network depth. We objectively graphed distributions of activations associated with each behavioral element, across 10 bins of connectivity-distance to primary sensory cortices (Fig. 5). For each bin of network-depth, we summed all fMRI activation coordinate locations associated with each behavioral element showing activations in that bin and calculated the proportion of activations out of the total. This was performed by dividing the number in each bin by the total in the brain, then normalizing, to generate a series of 10 proportions representing the relative brain activity corresponding to each behavior in each of the 10 bins (Fig. 5):

For example, one BrainMap behavioral element, ‘naming’, was associated with 78 activation coordinates in bin 1, 64 points in bin 5 … with 976 total brain activations; thus 78/976 = 0.08 for bin 1 (dark blue) … 64/976 = 0.07 for bin 5 (blue) … This process generated length-10 vectors describing distributions of activation proportion (Y-axis) over 10 bins (X-axis), for each behavioral element of the meta-data categories (Fig. 5). Linear regressions approximating these vectors were fitted; each line’s slope served as a concise model of the distribution of activations in sensory-connected bins (bin 1 …) versus activations in sensory-distant bins (… bin 10). Thus, negative slopes described behavioral elements with activation distributions closer to sensory inputs, horizontal lines described distributions spread equally across bins and positive slopes described distributions biased toward regions distant from inputs. In other words, slope is merely a numerical estimate of how connected to sensory inputs the entire set of activation coordinates for a single behavioral element was.
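The proportion-and-slope calculation can be illustrated with a small sketch. The per-bin counts below are invented toy values (not the ‘naming’ data, of which only two bins are reported), and the function name `bin_slope` is hypothetical; an ordinary least-squares line is fitted with `np.polyfit`.

```python
import numpy as np

def bin_slope(counts):
    """Convert per-bin activation counts to proportions and fit a line over bins."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()                 # proportion of activations per bin
    x = np.arange(1, len(p) + 1)              # bin indices 1..n
    slope, _intercept = np.polyfit(x, p, 1)   # least-squares line; keep the slope
    return p, slope

# Hypothetical count profiles over 10 bins (bin 1 = closest to sensory inputs).
sensory_like = [30, 25, 20, 15, 10, 8, 6, 4, 2, 1]
deep_like = sensory_like[::-1]
flat = [10] * 10

p_sens, slope_sens = bin_slope(sensory_like)
p_deep, slope_deep = bin_slope(deep_like)
p_flat, slope_flat = bin_slope(flat)
```

A profile concentrated in the first bins yields a negative slope, the mirrored profile a positive slope of equal magnitude, and a uniform profile a slope of approximately zero.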

BrainMap (Paradigm Class and Behavioral Domain) data processing differed from Neurosynth, in that when counting activation coordinates in the Neurosynth database46, each feature loading or significance value (all less than 0.05) was used to slightly weight the count of each activation, such that features with higher probability were tallied more strongly. This was performed by tallying the inverse of each loading value, ranging from [0.95, 1]. If this range-modulated counting had any significant effect, it would be to slightly improve the accuracy of such a tallying method. Due to normalization and proportional analysis, our method was robust to biases in number of behavioral elements and publication bias present in the behavioral databases. Such bias has been elegantly illustrated by the article titled: “What is the most interesting part of the brain?”43. To adjust for any remaining sample size effects (varying experiment numbers per behavioral element) we tested our method with SlopeCoefficient * γ ln(Sample Size) with γ such that 0 < γ ln(Sample Size) ≤ 1, to push small samples toward a normalized mean of 0, which had minimal impact.
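The two adjustments in this paragraph can be sketched as follows, under explicit assumptions. First, the “inverse” of a significance value p < 0.05 is interpreted here as 1 − p, which falls in the stated range of 0.95 to 1. Second, γ is chosen as 1 / ln(largest sample size), one simple choice that satisfies 0 < γ ln(Sample Size) ≤ 1 when every sample size exceeds 1; the source does not state how γ was set. Function names are hypothetical.

```python
import numpy as np

def weighted_tally(roi_idx, p_values, n_rois):
    """Tally activations per ROI, weighting each by (1 - p) (assumed reading)."""
    weights = 1.0 - np.asarray(p_values, dtype=float)   # p < 0.05 gives weights in (0.95, 1]
    return np.bincount(roi_idx, weights=weights, minlength=n_rois)

def size_corrected(slopes, sample_sizes):
    """Shrink slopes from small samples toward 0 via slope * gamma * ln(n)."""
    n = np.asarray(sample_sizes, dtype=float)           # assumes every n > 1
    gamma = 1.0 / np.log(n.max())                       # assumed choice of gamma
    return np.asarray(slopes, dtype=float) * gamma * np.log(n)

# Toy usage with illustrative values only.
tally = weighted_tally([0, 0, 2], [0.01, 0.04, 0.0], n_rois=4)
corrected = size_corrected([-0.02, -0.02], [10, 1000])
```

In the toy case, the element with only 10 experiments has its slope shrunk to one third of its raw value, while the element with the largest sample is left unchanged.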

One benefit of bins is similar to that of spatial smoothing in fMRI analysis: eliminating high-frequency noise before the linear regression is performed. This process emphasizes global trends, which were the goal of our hypotheses. Binning before a linear regression is optional. However, regressing over binned data is mathematically similar to regressing over unbinned data, and thus both would produce similar sorting above a reasonable number of bins (roughly 6–7). Higher numbers of bins would mathematically approach a standard linear regression, while lower numbers would begin to lose important resolution. Above a threshold of bins, binning before regression would at most only blur any high-frequency patterns, rather than obscure or bias large-scale results. We plotted results of the identical methods with 47 bins in Supplementary Fig. S4.

Algorithmic flow for data synthesis

The general algorithmic procedure for plotting and sorting the behavioral databases upon connectivity matrices can be outlined in five steps: (1) Compute connectivity order, (2) Compute behavioral element frequencies, (3) Bin connectivity ordering, (4) Regress behavioral element frequency across bins and (5) Corrections on the set of slope coefficients.

1. Compute connectivity order
   a. Import a list of seed regions of interest (labeled sensory input locations, MNI space) and map these to the closest points in the connectivity matrix
   b. Create an average (mean) of connectivities to input regions for each modality, resulting in one vector of dimension |M| × 1 per modality, named V_M
   c. Create an integration (mean or max) between each V_M, creating vector V_C, resulting in a single aggregate seed cluster for connectivity to all inputs
   d. Sort V_C and store its index for future ordering of behavioral elements

2. Compute behavioral element frequencies
   a. Import a behavioral database, D (Neurosynth or BrainMap), of points of activation and associated behavioral element labels and map coordinates to the closest points in the matrix
   b. For each behavioral element E ∈ D, generate a vector specifying the frequency of occurrence of coordinates associated with E, creating V_E
   c. For each coordinate entry in D, if the point corresponds to the current E, increment the corresponding index in V_E by the value 1 (BrainMap) or loading values ∈ [0.95, 1] (Neurosynth)

3. Bin connectivity ordering
   a. Index and sort the frequency table of V_E using the previously computed connectivity order of V_C
   b. Split V_E into 10 bins of approximately equal size, bin_n
   c. Compute the sum of frequencies for each bin_n
   d. Weight bins of different size proportionally

4. Regress behavioral element frequency across bins. For each E:
   a. Compute the proportion of activations for each bin_n
   b. Regress the proportion vs the bin index:

      $$\operatorname{logit}(p) = \ln\frac{p}{1 - p} = \alpha + \beta x + \varepsilon$$

      where x is the bin_n indices 1, 2, 3, …, 10, p is the proportion of activations, ε is the error term, α is the constant intercept term and β is the slope. As per [50, p. 316] we applied the logit transformation to the proportions, to obtain an appropriate distribution before regression.
   c. Use the slope β to sort the behavioral elements
   d. Plot the binned data and the regression

5. Corrections on the set of slope coefficients
   a. Conservative adjustment for behavioral element sample size variation via: SlopeCoefficient * γ ln(SampleSize) with γ such that 0 < γ ln(SampleSize) ≤ 1, while ensuring that the data are normalized before correction
   b. z-score normalization to facilitate cross-experiment and cross-method comparison and averaging
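Steps 4 and 5b of the procedure above can be sketched as follows, assuming the regression takes the form logit(p) = α + βx + ε over bin indices x = 1 … 10. The element names and proportions are hypothetical, and the small clipping constant `eps` is an added assumption so that bins with a proportion of exactly 0 or 1 do not produce infinite logits.

```python
import numpy as np

def logit_slope(proportions, eps=1e-6):
    """Fit logit(p) = alpha + beta * x over bin indices; return (alpha, beta)."""
    p = np.clip(np.asarray(proportions, dtype=float), eps, 1 - eps)  # assumed guard
    y = np.log(p / (1 - p))                    # logit transform of the proportions
    x = np.arange(1, len(p) + 1)               # bin indices 1..n
    beta, alpha = np.polyfit(x, y, 1)          # polyfit returns highest degree first
    return alpha, beta

def zscore(values):
    """Standardize a set of slopes to mean 0 and unit (population) variance."""
    v = np.asarray(values, dtype=float)
    return (v - v.mean()) / v.std()

# Hypothetical behavioral elements with proportions over 10 bins.
elements = {
    "element_sensory": [0.25, 0.20, 0.15, 0.12, 0.09, 0.07, 0.05, 0.04, 0.02, 0.01],
    "element_flat":    [0.10] * 10,
    "element_deep":    [0.01, 0.02, 0.04, 0.05, 0.07, 0.09, 0.12, 0.15, 0.20, 0.25],
}
betas = {name: logit_slope(p)[1] for name, p in elements.items()}
ranked = sorted(betas, key=betas.get)          # most sensory-biased element first
z = zscore(list(betas.values()))
```

Sorting by β places the sensory-biased element first and the deep-biased element last, mirroring the auto-ranked lists described in the text.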

Random experiments

We defined two random experiments elaborated in more detail below. Each of the random experiments applies randomness to a different point in our algorithm.

1. Create a random index in [Compute connectivity order].d above

2. Sort randomly the bin index (x) in [Regress behavioral element frequency across bins].b above
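The two random controls can be sketched as permutations applied at the two points named above. This is an illustration only: the function names, the toy proportions, the seed and the number of repetitions (1000) are assumptions, not values taken from the study.

```python
import numpy as np

rng = np.random.default_rng(42)

def random_connectivity_order(n_rois, rng):
    """Control 1: replace the sorted index of V_C with a random permutation."""
    return rng.permutation(n_rois)

def slope_with_shuffled_bins(proportions, rng):
    """Control 2: regress proportions against a randomly permuted bin index."""
    p = np.asarray(proportions, dtype=float)
    x = rng.permutation(np.arange(1, len(p) + 1))
    return np.polyfit(x, p, 1)[0]

order = random_connectivity_order(188, rng)

# Toy sensory-biased proportions over 10 bins (illustrative values only).
proportions = np.array([30, 25, 20, 15, 10, 8, 6, 4, 2, 1], dtype=float)
proportions /= proportions.sum()
observed = np.polyfit(np.arange(1, 11), proportions, 1)[0]

# Null distribution of slopes under shuffled bin indices.
null = np.array([slope_with_shuffled_bins(proportions, rng) for _ in range(1000)])
frac_extreme = np.mean(np.abs(null) >= abs(observed))
```

Slopes from the shuffled controls scatter around zero, so a strongly sloped observed profile should rarely be matched by chance.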

Human participants rank behavioral elements

To subjectively evaluate impressions of abstractness of behavioral elements, nearly 500 adults participated in our survey. Following a recent trend in psychological and social sciences, research participants are often recruited from online services such as Amazon Mechanical Turk, or others including MicroWorkers, ClickWorker, CloudCrowd and ClickChores. The population characteristics of these pools are similar to those of internet users generally and the data quality is considered reasonably reliable due to screening and vetting processes employed by these services. The subject population for this study was recruited both locally from the university population and from a pool of online research survey participants via the company SocialSci (Cambridge, MA, USA). Participants all stated they were at least 18 years of age and fluent in English, and additionally reported details of any second language experience. All procedures complied with departmental and university guidelines for research with human participants and were approved by the University of Massachusetts Amherst Institutional Review Board.

All participants completed informed consent. Then, participants were initially provided with extensive definitions of concrete, abstract and abstraction, and asked to sort experimental phrases based on their judgment of whether they seemed concrete versus abstract. Actual instructions were provided (Fig. S5). Two question formats were included: 1) ranking a single phrase from 1-concrete to 7-abstract on a Likert scale, or 2) sorting 7 phrases in order of abstractness. To eliminate any order effects or biases, each participant received random samples of questions of both types, presented in random order, and for sort-based questions received a random intra-question order. Participants had no knowledge of the goals of the study, or our results.

Human participants’ behavioral element order of judged abstractness was analyzed via mean abstractness rank given to each word by participants (ranked 1–7 for both question types). Since surveys from unsupervised online participants are likely to contain some false or erroneous data due to attrition or technical failure, standard outlier exclusion (outer fences) was performed for each behavioral element, excluding around 2–3 subjects each, a conservative elimination. Relationships between network-depth based sort and normalized survey data were quantified via Pearson’s product moment correlation coefficient and computed in the R core statistical environment. Aggregate ranks for each database-survey pairing were generated via mean behavioral element slope and order across all methods employed (DTI-Max, DTI-Mean, rsfMRI-Max and rsfMRI-Mean) for each database independently.
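The outlier exclusion and correlation can be illustrated as below. The analysis in the text was computed in R; this Python sketch is only an equivalent illustration, assuming Tukey's outer fences (Q1 − 3·IQR and Q3 + 3·IQR) with NumPy's default percentile interpolation. The ratings and the function names are hypothetical.

```python
import numpy as np

def drop_outer_fence_outliers(ratings):
    """Keep ratings within Tukey's outer fences: [Q1 - 3*IQR, Q3 + 3*IQR]."""
    r = np.asarray(ratings, dtype=float)
    q1, q3 = np.percentile(r, [25, 75])
    iqr = q3 - q1
    keep = (r >= q1 - 3 * iqr) & (r <= q3 + 3 * iqr)
    return r[keep]

def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length vectors."""
    return np.corrcoef(a, b)[0, 1]

# Toy ratings (1 = concrete ... 7 = abstract) for one behavioral element.
ratings = [1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 7]
kept = drop_outer_fence_outliers(ratings)
mean_rank = kept.mean()

# Toy comparison between a depth-based ordering and mean survey ratings.
r = pearson_r([1, 2, 3, 4], [1.5, 2.0, 3.5, 4.0])
```

In the toy ratings the quartiles are 1 and 2, so the fences are −2 and 5 and only the single rating of 7 is excluded before the mean is taken.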

For Paradigm Class and Behavioral Domain, all behavioral elements were provided to participants and included in analyses. For Neurosynth, some behavioral elements appeared to be artifacts of the automated extraction process. Thus, ~50 theoretically invalid features out of the 550 Neurosynth features were excluded, including those which were: statistics-related, experimental-design-related, subject-population-related, neurological terminology, drugs, inappropriate to rank, or highly ambiguous; e.g., we excluded the terms: resting-state fMRI, significant, elderly, gene polymorphism, pharmacology, drug names, etc., since these are not actually topics of study, but artifacts of the word-frequency derived content in the Neurosynth database. These excluded words were chosen before running the survey and we completed an identical analysis including these items; this conservative elimination did not appreciably impact statistical conclusions or sorts (Supplementary Data Tables S2–4).