Participants

Sixty-two participants (37 females, mean age = 24.9 years, SD = 9.7) took part in the experiment. None of the participants was a professional musician; 38% reported no musical training, 49% reported less than five years, and 13% reported more than five years of extra-curricular music lessons. Participants were compensated with a gift card (80 NOK, about 10 USD). Signed informed consent was obtained from all participants, and the study was carried out according to the Declaration of Helsinki. Because the experimental procedure was non-invasive and did not involve the collection of any personal information that could be used to identify the participants, no further approval was required (according to the Norwegian Centre for Research Data; NSD).

Stimuli and material

The stimulus set consisted of six musical excerpts (Supplementary Table S3), selected based on a pilot study (for details see below). The stimuli were organized in three pairs, each pair consisting of one excerpt with a heroic and one excerpt with a sad emotional expression. The stimuli within each pair had equal tempo to ensure that any observed effects resulted from the different emotional expression of the music and not from its tempo. The tempo of the excerpts was “slow” (64 beats per minute; BPM) for one pair, “medium” (95 BPM) for another pair, and “fast” (115 BPM) for the third pair. The presentation order of the pairs, as well as whether the heroic or the sad excerpt within each pair was presented first, was counterbalanced across participants. This arrangement also ensured that the participants were not exposed to excerpts with the same emotional expression in succession. Each excerpt had a duration of two minutes, with a stable emotional expression from start to end. The stimuli were adjusted to the same loudness and had a 1.5-second fade-in and fade-out. None of the stimuli contained any lyrics, and all pieces were orchestral, neo-orchestral, or string-orchestral music. Stimuli were selected out of 15 relatively unknown musical pieces based on a pilot study with 43 participants (for details see Supplementary Text S2).
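The counterbalancing scheme described above can be sketched as follows. This is a minimal illustration, not the script used in the study: the helper name `build_sequences` and the tempo labels are hypothetical, and the sketch assumes that the heroic-first or sad-first choice is held constant across all three pairs for a given participant, which is what prevents two excerpts with the same emotional expression from following each other.

```python
from itertools import permutations

TEMPI = ("slow", "medium", "fast")  # the 64, 95 and 115 BPM pairs


def build_sequences():
    """Enumerate all counterbalanced excerpt orders (hypothetical helper)."""
    sequences = []
    for pair_order in permutations(TEMPI):   # 3! = 6 orders of the pairs
        for first in ("heroic", "sad"):      # which expression leads each pair
            second = "sad" if first == "heroic" else "heroic"
            seq = []
            for tempo in pair_order:
                seq.append((tempo, first))
                seq.append((tempo, second))
            sequences.append(seq)
    return sequences


sequences = build_sequences()
```

Under these assumptions there are 12 possible sequences of six trials each. How sequences were actually assigned to participants is not stated in the text.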

The music stimuli and questionnaires were presented using PsychoPy (version 1.85.3)42 on two Lenovo laptops (G500 and B560). All of the instructions and questions presented during the experiment were in Norwegian. The participants listened to the music through headphones (Sennheiser DT770 PRO or Sony MDR-1000X). They responded to the questionnaires using the numeric keys and used the whole keyboard for the free-response task. An electrocardiogram (ECG) was acquired from the extremity leads using a made-to-order ECG-device (Research and Transfer Center at the University of Applied Sciences Leipzig, Germany). The ECG served to assess heart rate as an objective measure of physiological arousal during the music exposure.

Procedure

Participants were seated in a comfortable chair with pillows for neck and lumbar support. The chairs were placed inside a semi-enclosed booth, so that the participants could not see the experimenters during the experiment. A footstool was provided to support and elevate their legs. The laptop was placed on a removable table over the participant’s lap.

Participants were first informed about the experiment and gave signed informed consent; they then answered a questionnaire concerning their musical background. Before the experiment, four electrodes for ECG-measurements were fitted on the extremities (left and right upper arm and left and right shinbone). The participants were then informed that the experiment was about music and relaxation, and instructed on how to use the keyboard and the different rating scales. They were asked to relax, follow the instructions on the screen, listen to the music passively, and remain still during the music session and the resting period to reduce artifacts in the ECG-recordings. Before the first musical excerpt was presented, participants answered the 10-item International positive and negative affect schedule short-form (I-PANAS-SF25; translated to Norwegian, and referred to as PANAS throughout this article) to acquire a baseline of the participants’ affective state.

The experiment encompassed six trials covering the three pairs of heroic and sad pieces. Before the presentation of each musical excerpt within a trial, the participants were instructed to sit back, relax and close their eyes to promote relaxation and perceptual decoupling. After two minutes of music-listening, a thought sampling probe was obtained, for which the participants were instructed to hold on to the last thought they had before the music stopped. They were then asked to answer items of a questionnaire designed to measure: if (or to what degree) the participants were mind-wandering; controlled focus of attention; meta-awareness; self-involvement; involvement of others; valence; arousal; temporal orientation of thoughts; relevance to current life-reality; as well as the constructiveness and motivational properties of the thought-content (the wording of all items is listed in Supplementary Table S5). While the first nine items were taken from our previous study20, the items on constructiveness and motivation were added to assess possible empowering effects of the heroic music on the contents of thoughts. Then, the participants were asked to write a short description of what they were thinking about (“free writing task”). Each trial ended with the 10-item PANAS, to examine (1) whether the different musical stimuli were able to elicit different affective states, and (2) whether thought-contents reflected the participant’s affective state. Heroic and sad excerpts alternated from trial to trial. The experiment finished with a five-minute resting period to measure mean heart rate and heart rate variability43, before the participants were debriefed and given contact information in case of any inquiries or concerns. The total duration of the experimental session was approximately 45 minutes.

Data analysis

Four different classes of analyses were carried out, assessing: (a) the occurrence of mind wandering, (b) physiological arousal (assessed using the mean heart rate), (c) thought content and (d) positive vs. negative affect. Before analyzing the data, the scales for the items assessing thought content (which initially ranged from 1 to 7) were recoded to range from −3 to +3 to better differentiate between negative, neutral and positive responses. For the scales assessing positive vs. negative affect, the baseline assessment of each scale (acquired before listening to the musical excerpts) was subtracted from the assessments after listening to each excerpt.
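The two preprocessing steps can be illustrated with a short sketch. It assumes that the recoding is a simple linear shift, so that the scale midpoint (4) maps to zero; the function names and the example scores are hypothetical and serve only as illustration.

```python
def recode_thought_item(rating):
    """Map a 1-7 rating onto -3..+3 so that the midpoint (4) becomes 0."""
    if not 1 <= rating <= 7:
        raise ValueError("rating must be between 1 and 7")
    return rating - 4


def baseline_correct(post_scores, baseline):
    """Subtract the pre-listening baseline from each post-excerpt score."""
    return [score - baseline for score in post_scores]


# Made-up values for illustration only.
recoded = [recode_thought_item(r) for r in (1, 4, 7)]
corrected = baseline_correct([12, 9, 15], baseline=11)
```

After recoding, negative values indicate responses below the scale midpoint and positive values responses above it. After baseline correction, a positive value indicates an increase relative to the state before listening.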

All data were analyzed using analyses of variance (ANOVA) employing the General Linear Model. Whenever these comparisons involved several items, a Bonferroni correction was applied. The threshold depended on the number of items: for the ten items assessing thought-content, the significance threshold was set to p = 0.0050 (p = 0.05/10); for the two subscales and the ten individual items assessing positive-negative affect, it was set to p = 0.0042 (p = 0.05/12). Because the correction lowered the significance thresholds by a factor of around 10, we report p-values to four decimal places. A repeated-measures ANOVA was used to assess the influence of the emotional expression and the tempo of the musical excerpts on the occurrence of mind-wandering. The model used the emotional expression (heroic vs. sad) and the tempo of the music (slow, medium, and fast) as within-subject factors, and the presentation order (heroic first vs. sad first) as between-subjects factor. A repeated-measures ANOVA with the same factors (emotional expression, tempo and presentation order) was used to explore the physiological arousal (assessed via the mean heart rate).
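The Bonferroni thresholds stated above follow directly from dividing the nominal alpha level by the number of tests. The sketch below reproduces that arithmetic; the function names are hypothetical.

```python
ALPHA = 0.05  # nominal significance level used in the text


def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Return the per-test significance threshold for n_tests comparisons."""
    return alpha / n_tests


def is_significant(p_value, n_tests):
    """Check a p-value against the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(n_tests)


thought_threshold = bonferroni_threshold(10)  # ten thought-content items
affect_threshold = bonferroni_threshold(12)   # two subscales + ten PANAS items
```

Rounded to four decimal places, these values correspond to the thresholds of 0.0050 and 0.0042 given in the text.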

Analyses evaluating the thought-content and PANAS items were restricted to trials in which mind-wandering occurred; responses from trials without mind-wandering were excluded. The items eliciting mind-wandering were reorganized so that each item represented one measurement point. This reorganization made it necessary to control for the participants’ mean, given that there were six measurements for each participant (two emotional expressions × three tempi). Therefore, in the ANOVAs assessing thought-content, “participant-ID” served as random factor of no interest. In the ANOVAs assessing positive vs. negative affect, the subtraction of the individual baseline measure served as control for the participants’ mean.

For the items assessing thought-content, two sets of models were used in the analyses, both using univariate ANOVAs. The first set of models used the within-subject factors emotional expression and tempo, while the participant-ID served as random factor of no interest. A second set of models evaluated whether the thought-content was modulated by physiological arousal. Given that tempo and physiological arousal covaried strongly, it was not possible to include both factors in the same model. Therefore, the second set of models employed the within-subject factors emotional expression and physiological arousal as well as participant-ID as random factor of no interest. For the items assessing positive vs. negative affect, only one set of models controlling for physiological arousal was used. These univariate ANOVAs used the within-subject factors emotional expression and physiological arousal.

Whenever a significant main effect for items assessing thought-content or affect (as measured with the PANAS) was obtained, follow-up analyses using one-sample t-tests were employed. For the items assessing thought-content, these tests evaluated whether values differed from a neutral response (i.e. zero), and for PANAS-items the tests evaluated whether values differed from the baseline values. For these models, appropriate Bonferroni-correction served to control for multiple comparisons. For items assessing thought-content (with four significant main effects × two conditions = 8 comparisons) the Bonferroni-corrected significance level was set to 0.0063 (0.05/8); for items assessing positive and negative affect (with seven significant main effects × two conditions = 14 comparisons) it was set to 0.0035 (0.05/14).
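For illustration, the test statistic of such a one-sample t-test can be computed as sketched below. The helper name and the example ratings are hypothetical, and the software used for the original analyses is not stated. The sketch assumes that the reference value is zero in both cases: zero is the neutral response for the recoded thought-content items, and the PANAS values have already had the baseline subtracted. Converting the t-value into a p-value requires the Student t distribution (available, for example, through `scipy.stats.ttest_1samp`), which is left out here to keep the example dependency-free.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, reference=0.0):
    """Return (t, df) for a one-sample t-test of `values` against `reference`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = stdev(values) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(values) - reference) / standard_error, n - 1


# Made-up recoded ratings (-3..+3) for illustration only.
ratings = [1, 2, 0, 3, 1, 2, 1, 2]
t_value, df = one_sample_t(ratings)
```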

Similar to our previous study20, a word-cloud was created depicting the most common words in the participants’ free responses44,45. To this end, the frequency of word occurrences in the texts of the free writing task was computed. Misspelled words were corrected, and synonyms or related terms were combined under a common term (e.g., “father”, “dad”, “mom”, and “mother” were combined into “parents”). Afterwards, words were removed if they were contained in the presented items or the instructions (e.g., “music”, “thinking”, “I”, “me”) or if they were non-content words (e.g., “this”, “with”). For the final list of words used to create the word cloud, only words with at least six occurrences were chosen. The list contained (a) the word, (b) its frequency of occurrence (determining the word size in Fig. 2; sum of both conditions), and (c) whether the word appeared more often in one than the other condition (occurrences after heroic music divided by the total occurrences, determining the word color in Fig. 2).
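The word-list construction can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the study: the lookup tables contain only the examples mentioned in the text, the function name `word_table` is hypothetical, tokenization is a plain whitespace split, and the manual correction of misspellings is omitted.

```python
from collections import Counter

# Hypothetical lookup tables holding only the examples named in the text.
SYNONYMS = {"father": "parents", "dad": "parents",
            "mom": "parents", "mother": "parents"}
EXCLUDED = {"music", "thinking", "i", "me", "this", "with"}
MIN_COUNT = 6  # minimum number of occurrences stated in the text


def word_table(heroic_texts, sad_texts, min_count=MIN_COUNT):
    """Return {word: (total count, share of occurrences after heroic music)}."""
    def count(texts):
        counts = Counter()
        for text in texts:
            for raw in text.lower().split():
                word = raw.strip(".,;:!?\"'()")
                word = SYNONYMS.get(word, word)  # merge related terms
                if word and word not in EXCLUDED:
                    counts[word] += 1
        return counts

    heroic, sad = count(heroic_texts), count(sad_texts)
    table = {}
    for word in set(heroic) | set(sad):
        total = heroic[word] + sad[word]  # drives the word size
        if total >= min_count:
            table[word] = (total, heroic[word] / total)  # share drives the color
    return table


# Made-up responses for illustration only.
table = word_table(["My dad and mom.", "Mother!", "father"],
                   ["dad", "mom with me"])
```

In this toy example only “parents” reaches the threshold of six occurrences, four of them after heroic music, so its share is 4/6.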