Stimuli Selection

In a pre-experiment survey, three photographs of an actress, each expressing a basic emotional valence (positive, ambiguous or negative), were chosen as emotional cues for the current study (Fig. 1b). “Ambiguous” was defined as a rating that was neither positive nor negative in valence and arousal44. These stimuli were first developed for a previous behavioral study of emotional improvisation13 and were intended to represent an emotion with minimal distractions and without eliciting a strong emotional reaction from perceivers. The photos showed the actress from the collarbone upward, looking away from the camera, and were desaturated so that there would be no color cues. We developed these visual cues for emotions in order to avoid potential confounds of linguistic labels45.

Eleven males and nine females (mean age = 32 ± 17 s.d.) from the Johns Hopkins University community rated a selection of images on a visual analog scale based on Russell’s circumplex model36,44, and results were coded on a nine-point scale (0–9, Negative-Positive). We calculated a one-way ANOVA with the factor Emotion (Negative, Ambiguous, Positive). Tukey’s honestly significant difference criterion was used for post hoc comparisons. A significant main effect of Emotion [F(1, 2) = 110.87, p < 0.001] was observed. Mean ratings for the stimuli were: Negative, mean = 2.95, s.d. = 1.09; Ambiguous, mean = 4.3, s.d. = 0.80; Positive, mean = 7.5, s.d. = 1.05. A full description of our stimuli pre-testing is available in McPherson et al.13. Informed consent was obtained in writing from all subjects and all experimental procedures were approved by the Johns Hopkins University School of Medicine Institutional Review Board. All experimental procedures were carried out in accordance with the approved guidelines.

Musical Performance Analysis

We analyzed the MIDI (Musical Instrument Digital Interface) piano output obtained during fMRI scanning using measures of salient musical features including note density, note duration distribution, note maxima and minima, mode and key. These measures were compared for the chromatic scales and improvisations created in response to the different emotional targets. These results were calculated using the MIDI Toolbox46 and a complete explanation of the calculation of these features can be found in Eerola and Toiviainen, 2004.

Note density is a measure of notes per second and for monophonic compositions can be used as an indication of tempo (higher note densities generally correspond with faster tempos). Note maxima and minima indicate the highest and lowest pitch, respectively, played in a given musical segment. For note density, maxima and minima, we calculated a one-way ANOVA with the within-subject factor Emotion (Negative, Ambiguous, Positive) for both improvisation and chromatic scale trials. Tukey’s honestly significant difference criterion was used for post hoc comparisons.
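As a rough illustration of the three measures just defined, the sketch below computes them from a list of notes. The `Note` tuple, its field names and the example notes are hypothetical; the study itself used the MIDI Toolbox, whose exact computation may differ (for instance in how the segment length is determined).

```python
from typing import NamedTuple

class Note(NamedTuple):
    onset_s: float     # onset time in seconds (hypothetical field name)
    duration_s: float  # note length in seconds
    pitch: int         # MIDI pitch number (60 = middle C)

def note_density(notes, segment_s):
    """Notes per second over a segment of known length."""
    return len(notes) / segment_s

def pitch_range(notes):
    """Return (lowest, highest) MIDI pitch played in the segment."""
    pitches = [n.pitch for n in notes]
    return min(pitches), max(pitches)

# Illustrative notes only, not data from the study.
example = [Note(0.0, 0.5, 60), Note(0.5, 0.5, 64),
           Note(1.0, 1.0, 67), Note(2.0, 2.0, 55)]
density = note_density(example, segment_s=4.0)
lowest, highest = pitch_range(example)
```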

The duration distribution function of the MIDI Toolbox returns the percentage of notes that fall into nine different logarithmically organized bins (note length categories). Length categories are defined in units of beats. We set our MIDI tempo so that 1 beat = 0.5 s (quarter note = 120 beats per minute (BPM)). Therefore, bin 1 = 1/8 s, bin 3 = 1/4 s, bin 5 = 1/2 s, bin 7 = 1 s and bin 9 = 2 s. The relationship between bin 1 and bin 9 is thus proportional to the relationship between a sixteenth note and a whole note. We compared corresponding duration distribution bins using two-sample Kolmogorov-Smirnov tests.
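The conversion from beats to seconds can be sketched as follows. This is a minimal illustration that assumes the nine bin centers are spaced by a factor of √2 from 1/4 beat to 4 beats, which reproduces the values listed above; the function name is hypothetical, and the MIDI Toolbox documentation remains the authoritative source for the bin definitions.

```python
import math

SECONDS_PER_BEAT = 60.0 / 120.0  # quarter note = 120 BPM, so 1 beat = 0.5 s

def bin_centers_seconds(n_bins=9, first_center_beats=0.25):
    """Assumed log-spaced bin centers (factor sqrt(2) per bin), in seconds."""
    return [first_center_beats * math.sqrt(2) ** i * SECONDS_PER_BEAT
            for i in range(n_bins)]

centers = bin_centers_seconds()
for bin_number, seconds in enumerate(centers, start=1):
    print(f"bin {bin_number}: {seconds:.3f} s")
```

Under this assumption, bin 9 is sixteen times longer than bin 1, the same ratio as a whole note to a sixteenth note.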

Key (tonal center) and mode (major vs. minor) were calculated using the Krumhansl & Schmuckler (K-S) key-finding algorithm, which uses the pitch class distribution of a piece (weighted according to duration) to return a key profile for the piece46. We used the K-S key-finding algorithm to determine the best-fitting key for each entire 44 s improvisation. Mode and key calculations were confirmed by the authors through a visual inspection of the scores.
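The idea behind the K-S algorithm can be sketched in a few lines. The sketch below is not the MIDI Toolbox implementation: it correlates a duration-weighted pitch-class distribution with the 24 rotated Krumhansl-Kessler major and minor profiles and returns the best match. The function names and the example melody are hypothetical.

```python
import math

# Krumhansl-Kessler probe-tone profiles, indexed from the tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def find_key(notes):
    """notes: iterable of (midi_pitch, duration). Returns (tonic, mode, r)."""
    dist = [0.0] * 12
    for pitch, duration in notes:
        dist[pitch % 12] += duration  # weight each pitch class by duration
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            # Rotate so the tonic weight lines up with pitch class `tonic`.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            r = pearson(dist, rotated)
            if best is None or r > best[2]:
                best = (NAMES[tonic], mode, r)
    return best

# Hypothetical melody: a C major scale with long tonic and dominant notes.
melody = [(60, 2.0), (62, 1.0), (64, 1.0), (65, 1.0),
          (67, 2.0), (69, 1.0), (71, 1.0), (72, 2.0)]
tonic, mode, r = find_key(melody)
```

For this example melody the best-fitting key is C major.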

Functional Neuroimaging Testing

Subjects

Twelve professional jazz pianists (11 male, 1 female; mean age = 39.9 ± 15.8 years) participated in the study. All subjects had been performing piano professionally for over 5 years (mean years performing professionally = 18.35 ± 13.28). Subjects were recruited as they became available, without a priori regard to balancing by gender. None of the subjects reported histories of neurologic, auditory or psychiatric disorders. Informed consent was obtained in writing from all subjects and the research protocol was approved by the Johns Hopkins School of Medicine Institutional Review Board.

Experimental Design

A block-design imaging paradigm was used to assess the effect of emotional intent on musical creativity (Fig. 1a). Rest blocks were 16 seconds in duration and test blocks were 44 seconds in duration. While in the scanner, the pianists were shown the three selected photographs of an actress representing ‘Positive’, ‘Negative’ and ‘Ambiguous’ emotions. At the beginning of the presentation of each image, subjects were given a simultaneous matching visual and auditory cue instructing them to respond in a specific way to the image. This cue lasted three seconds. The cues were: “View”, “Chromatic Scale” and “Improvise”. Subjects were instructed to simply fixate on the image and keep their eyes open during the entire ‘View’ condition. The chromatic scale condition was designed to assess neural activity during a highly constrained, non-creative and non-emotional musical motor task. For the chromatic condition, subjects were told to play an ascending and descending chromatic scale over the entire range of the keyboard. Pianists were instructed to play this chromatic scale at the same tempo regardless of the picture they were viewing. Before scanning began, pianists were familiarized with the target tempo, approximately eighth note = 180 BPM (three notes per second), and were instructed to keep this tempo consistent between blocks. For the improvise condition, subjects were instructed to improvise a composition that they felt best represented the emotion expressed by the images. Improvisation was unrestricted melodically, harmonically and rhythmically, but the subjects were instructed to play monophonically (one note at a time) using their right hand. Pianists were restricted to using their right hand while improvising and playing chromatic scales due to space considerations on the scanner piano. Stimuli were presented in a pseudorandom order and there were 24 test blocks per subject, with 8 per emotion (4 Improvisation, 2 Chromatic and 2 View). Subjects were asked to keep their eyes open during the entire experiment, including rest blocks, and to refrain from moving their head or any part of their body other than their right hand.
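The block counts and durations above imply a total run length that can be checked with simple arithmetic. The sketch below assumes that each test block is paired with exactly one rest block, which the text does not state explicitly; the variable names are illustrative.

```python
REST_S, TEST_S = 16, 44  # block durations from the text, in seconds
EMOTIONS = ("Positive", "Negative", "Ambiguous")
BLOCKS_PER_EMOTION = {"Improvise": 4, "Chromatic Scale": 2, "View": 2}

# Enumerate every (emotion, task) test block; the actual order was pseudorandom.
test_blocks = [(emotion, task)
               for emotion in EMOTIONS
               for task, count in BLOCKS_PER_EMOTION.items()
               for _ in range(count)]

# Assumption: one rest block accompanies each test block.
total_seconds = len(test_blocks) * (TEST_S + REST_S)
print(len(test_blocks), "test blocks,", total_seconds, "s in total")
```

Under this assumption the total is 1440 s, which would be consistent with the 720 volumes at TR = 2000 ms reported under image acquisition.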

Procedure

During scanning, subjects used a custom-built non-ferromagnetic piano keyboard (MagDesign, Redwood, CA) with thirty-five full-size plastic piano keys. Subjects lay supine with the piano keyboard placed on their lap and their knees elevated with a bolster. A soft Velcro square was placed on Middle C of the piano, allowing subjects to orient their hand without viewing the keyboard. Subjects were visually monitored to ensure that they did not move their left (non-playing) hand, head, trunk, or other extremities during performance. Before the experiment commenced, subjects were given time to find a comfortable playing position and familiarize themselves with the keyboard and environment. Subjects also had a trial run to test sound levels, make final keyboard placement adjustments and practice the paradigm before scanning began. Data from these test blocks were discarded and the test block was repeated if the subject requested more time to become comfortable in the scanner.

The scanner keyboard had MIDI output, which was sent to a Macintosh MacBook Pro laptop computer running the Logic Express 8 sequencing environment (Apple Inc., Cupertino, CA). Piano sound output was routed back to the subject via in-ear electrostatic earspeakers (Stax, Saitama, Japan). In addition to the electrostatic earspeakers, subjects wore ear protection to minimize background scanner noise. For each subject, earspeaker volume was set to a comfortable listening level that could be easily heard over the background scanner noise. A double mirror mounted on the head coil above the subject's eyes allowed them to view a rear projection screen behind the scanner bore. The stimuli and instructions were presented with E-Prime47.

Image Acquisition

All scans were performed at the F.M. Kirby Research Center for Functional Brain Imaging at the Kennedy Krieger Institute of Johns Hopkins University. Blood oxygen level-dependent (BOLD) imaging data and T1-weighted anatomical images were acquired on a 3-Tesla whole-body scanner (Philips Electronics, Andover, MA) using a sixteen-channel head coil and a gradient-EPI sequence. The following scan parameters were used: TR = 2000 ms, TE = 30 ms, flip angle = 75 degrees, field of view = 216 × 128 × 240 mm, 32 parallel axial slices covering the whole brain, 4 mm slice thickness (3 × 3 mm in-plane resolution). A total of 720 volumes were acquired for each subject.

fMRI Analysis

Standard preprocessing steps were completed in SPM8, including realignment to the first volume of the run, coregistration with a participant's T1-weighted structural image, indirect normalization of the structural image to template space, propagation of normalization parameters to coregistered functional images and smoothing with an 8 mm FWHM kernel. A first-level general linear model was estimated for each subject using ten regressors: one for rest and one for each of the nine combinations of emotion (positive, negative, ambiguous; abbreviated Pos, Neg and Amb) and task (view, chromatic scale, improvisation; abbreviated View, Chrom and Improv). Each regressor was convolved with a standard hemodynamic response function. Design matrices also included covariates of non-interest, which consisted of motion parameters calculated during the realignment stage and mean signal intensity for the run. Between-emotion (e.g. [PosImprov > NegImprov]) and within-emotion (e.g. [PosImprov > PosChrom]) contrasts were estimated for each subject. Contrasts were then entered into a second-level random-effects model using a one-sample t-test. Random-effects analyses take into account inter-subject variability and therefore can be generalized to a broader population. Contrasts were thresholded at an uncorrected p value of 0.005 with a minimum cluster extent of 10 voxels. Analysis of average effect sizes was completed using the rfxplot toolbox48.
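To make the regressor and contrast structure concrete, the sketch below enumerates the ten condition regressors and builds contrast weight vectors for the two example contrasts. The regressor ordering and the helper name `contrast` are assumptions for illustration; they are not taken from the SPM8 setup used in the study, and the covariates of non-interest are omitted.

```python
EMOTIONS = ("Pos", "Neg", "Amb")
TASKS = ("View", "Chrom", "Improv")

# One regressor for rest plus one per emotion-by-task combination
# (the ordering here is an assumption).
REGRESSORS = ["Rest"] + [emotion + task
                         for emotion in EMOTIONS
                         for task in TASKS]

def contrast(positive, negative):
    """Weight vector for [positive > negative] over REGRESSORS."""
    weights = [0] * len(REGRESSORS)
    weights[REGRESSORS.index(positive)] = 1
    weights[REGRESSORS.index(negative)] = -1
    return weights

between_emotion = contrast("PosImprov", "NegImprov")
within_emotion = contrast("PosImprov", "PosChrom")
```

Each vector sums to zero, so the contrast tests a difference between two conditions rather than an overall activation level.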

PPI Analysis

Psycho-physiological interaction (PPI) analysis can be used to identify task-dependent changes in effective connectivity between a seed region and other regions in the brain49. We used the generalized psycho-physiological interaction (gPPI) toolbox50 to examine differences in the networks of brain areas that exhibited functional connectivity with emotion-specific brain areas during improvisation conditions with different emotional intentions. Seed regions for this analysis were derived from emotional condition contrasts estimated in the primary analysis. Significant shifts in functional connectivity in an emotional condition for each seed region were identified by applying inclusive masking for within-emotion (e.g. [PosImprov > PosChrom]) and between-emotion (e.g. [PosImprov > NegImprov]) contrasts, with a minimum cluster extent of 20 voxels and a significance threshold of p < 0.001, uncorrected.
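Inclusive masking amounts to keeping only those voxels that survive the threshold in both contrasts. The sketch below is a minimal illustration using hypothetical per-voxel p values rather than real statistical maps; the function name is an assumption, and the 20-voxel extent criterion is omitted because it requires spatial connectivity information.

```python
P_THRESHOLD = 0.001  # uncorrected threshold from the text

def inclusive_mask(p_main, p_mask, threshold=P_THRESHOLD):
    """Indices of voxels below threshold in both the main and masking contrast."""
    return [i for i, (p1, p2) in enumerate(zip(p_main, p_mask))
            if p1 < threshold and p2 < threshold]

# Hypothetical p values for five voxels; not data from the study.
p_within = [0.0002, 0.0400, 0.0005, 0.0009, 0.2000]
p_between = [0.0001, 0.0003, 0.0100, 0.0004, 0.0006]
surviving = inclusive_mask(p_within, p_between)
```

Here only the first and fourth voxels pass both thresholds.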