Participants

Thirty participants were recruited through university mailing lists. Those with previously diagnosed mental disorders and those currently under medication for any mental health or mood disorder were excluded. Participants were asked to report such disorders and/or medication at the beginning of the recruitment process, and again in the questionnaire they completed before the laboratory experiment. Of the sample, 77% were women, and the mean age was 29 years (SD = 5.84), ranging from 21 to 45 years. Participants received two movie tickets for their participation and provided informed consent prior to the study.

Stimuli

Stimulus selection strategies were designed to optimise ecological validity and experimental control. Participants were instructed to choose eight stimuli before the experiment: four pictures and four pieces of music. The emotion induction mechanism was manipulated by altering the stimulus selection principle to rely on (a) personal memory or (b) purely on stimulus properties. Valence was varied by asking participants to choose either (a) pleasurable or (b) unpleasant stimuli. Both requirements were applied to music and pictures. The stimulus instructions were therefore as follows:

Choose a piece of music…

that evokes pleasure which is based on your personal memories

that evokes pleasure which is based purely on how it sounds

that evokes unpleasantness/aversive emotions based on your personal memories

that evokes unpleasantness/aversive emotions based purely on how it sounds

Choose a picture…

that evokes pleasure which is based on your personal memories

that evokes pleasure which is based purely on how it looks

that evokes unpleasantness/aversive emotions based on your personal memories

that evokes unpleasantness/aversive emotions based purely on how it looks

The materials had to represent an object available to anyone, which excluded, for instance, personal photographs and unpublished musical productions. The difference between perceived and experienced emotions was explained to the participants before the stimulus selection and again at the beginning of the experiment; they were instructed to focus on their own experience rather than on what the music or picture represented. Participants were asked to give detailed information about the stimuli to ensure that the researchers could retrieve exactly the same items. Musical excerpts were downloaded by the experimenters and edited into 40-second segments; the segments were chosen by a music expert to be a representative and memorable section of the song. Visual materials were downloaded, and each picture was presented in the experiment for 40 seconds to eliminate confounds created by varying durations between the two modalities.

Stimulus self-reports validating the valence of the stimulus selection

In the experiment, participants verbally rated the valence of each stimulus during the breaks between stimulus presentations, using a scale ranging from −3 (extremely unpleasant) to 3 (extremely pleasant). The ratings were subjected to a repeated-measures ANOVA to check the assumed pleasantness and unpleasantness across Valence, Mechanism, and Modality. Significant effects of Valence (F(1,29) = 435.5, p < 0.001), Modality (F = 9.68, p < 0.05), and Mechanism (F = 9.00, p < 0.05) were observed. Ratings for positively valenced stimuli were higher (M = 1.98, SD = 0.83) than for negatively valenced stimuli (M = −1.47, SD = 1.14). These analyses served as a manipulation check, and the results corroborate that the stimulus selection principles operated as intended.
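For illustration, a repeated-measures ANOVA of this kind can be run with statsmodels; the sketch below is a minimal example assuming a hypothetical long-format file (ratings.csv) with one verbal rating per participant and stimulus presentation, and is not the original analysis script.

```python
# Minimal sketch of the manipulation check (hypothetical file and column names).
import pandas as pd
from statsmodels.stats.anova import AnovaRM

ratings = pd.read_csv("ratings.csv")   # columns: participant, valence, mechanism, modality, rating

# Average repeated ratings within each participant x condition cell so that
# each cell contributes exactly one value to the repeated-measures ANOVA.
cells = (ratings
         .groupby(["participant", "valence", "mechanism", "modality"], as_index=False)["rating"]
         .mean())

anova = AnovaRM(data=cells, depvar="rating", subject="participant",
                within=["valence", "mechanism", "modality"]).fit()
print(anova)   # F and p values for the Valence, Mechanism, and Modality effects
```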

Stimulus properties

The stimulus properties were analysed with computational models. The musical and acoustic qualities of the excerpts were estimated using the MIR toolbox68, focusing on a limited set of acoustic features (mean and standard deviation of dynamics, tempo, pulse clarity, register, major-minor, rhythmic fluctuation) that have been shown to be relevant for emotional expression (e.g.69). The extraction parameters were similar to those reported in past studies, and the windowed analyses of the features were aggregated using means and standard deviations; these were entered into a two-way repeated-measures ANOVA with Valence and Mechanism as the factors. Only one feature, spectral flux, showed significant main effects of Valence (F(1,29) = 4.34, p < 0.05) and Mechanism (F(1,29) = 4.47, p < 0.05). The excerpts chosen to represent the stimulus feature category (in contrast to memory) in the unpleasant condition were considerably higher in spectral flux; this was also evident in the types of music genres and bands chosen for the unpleasant and unfamiliar condition (typically heavy rock and related genres). The remaining features did not differ across the selections.
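The feature extraction itself was carried out with the MATLAB MIRtoolbox. As a rough illustration of the one discriminating feature, the following Python sketch computes a simple spectral-flux estimate for a single excerpt; the file name is hypothetical and the framing parameters are arbitrary rather than those of the original analysis.

```python
# Illustrative spectral-flux estimate for one 40-s excerpt (not the MIRtoolbox implementation).
import numpy as np
import librosa

def spectral_flux(path, frame_length=2048, hop_length=512):
    y, sr = librosa.load(path, sr=None, mono=True)                 # load the excerpt
    S = np.abs(librosa.stft(y, n_fft=frame_length, hop_length=hop_length))
    diff = np.diff(S, axis=1)                                      # frame-to-frame spectral change
    flux = np.sqrt((diff ** 2).sum(axis=0))                        # distance between successive spectra
    return flux.mean(), flux.std()                                 # aggregate over the windowed analysis

mean_flux, sd_flux = spectral_flux("excerpt01.wav")                # hypothetical file name
```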

The 120 images were analysed in terms of their overall colour profiles and their complexity. For this, perceptual hue, saturation, and brightness (HSV) were extracted from the RGB values. The HSV profiles of the images were summarised by the mean and entropy of the saturation and brightness histograms, which were binned into 50 values within the saturation and brightness distributions. The mean hue values were estimated by converting hue into degrees and taking the mean angle of the resulting distribution. The means and entropies of the profiles were subjected to a repeated-measures ANOVA across the Valence and Mechanism factors, yielding non-significant main effects (F < 2.06 for all six features). In sum, the surface characteristics of the images did not distinguish the stimulus categories.

Next, the images were subjected to an automatic content analysis in which tags describing the images were retrieved using an online service (Clarifai). The top 10 tags for each image were then subjected to sentiment analysis using a vocabulary approach70 based on 13,915 English words rated for Valence, Arousal, and Dominance. Each image obtained a score in each dimension based on the average of the matching keywords (8–11 keywords per image). A repeated-measures ANOVA for the Valence score revealed a significant effect of Valence (F(1,29) = 7.93, p < 0.01) but not of Mechanism (F = 0.46), with higher Valence scores for pleasant images (M = 6.32, SD = 0.44) than for unpleasant images (M = 6.05, SD = 0.54). The Arousal scores exhibited a similar pattern, with unpleasant images receiving significantly higher scores (F = 5.89, p < 0.05; M = 4.22, SD = 0.36) than pleasant images (M = 4.07, SD = 0.34). The Dominance scores did not differ across Valence and Mechanism (F < 1.7). Although this analysis provides an expected summary of the affective themes in the images, it was not sensitive to all culturally relevant meanings in the images.
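A minimal sketch of the colour-profile summary described above is given below, assuming standard Python imaging and scientific libraries; the file name and exact implementation choices are illustrative rather than the original analysis code.

```python
# Sketch: HSV colour profile of one image (mean/entropy of 50-bin saturation and
# brightness histograms, circular mean of hue in degrees).
import numpy as np
from PIL import Image
from matplotlib.colors import rgb_to_hsv
from scipy.stats import entropy

def colour_profile(path, n_bins=50):
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=float) / 255.0
    hsv = rgb_to_hsv(rgb)                                     # hue, saturation, value in [0, 1]
    hue_deg = hsv[..., 0].ravel() * 360.0                     # hue converted to degrees
    sat, val = hsv[..., 1].ravel(), hsv[..., 2].ravel()

    # Circular mean of hue: average the unit vectors, then take the resulting angle.
    mean_hue = np.degrees(np.angle(np.mean(np.exp(1j * np.radians(hue_deg))))) % 360.0

    def hist_stats(x):
        counts, _ = np.histogram(x, bins=n_bins, range=(0.0, 1.0))
        return x.mean(), entropy(counts / counts.sum())        # mean and Shannon entropy

    sat_mean, sat_entropy = hist_stats(sat)
    val_mean, val_entropy = hist_stats(val)
    return mean_hue, sat_mean, sat_entropy, val_mean, val_entropy

profile = colour_profile("image01.jpg")                        # hypothetical file name
```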

Measures

Questionnaire

Self-reports of evoked emotions were collected through a questionnaire prior to the laboratory experiment, using 10 emotion concepts for each picture and music excerpt. The selection of concepts was based on factors resulting from an analysis of both a pilot survey (N = 109) designed to explore the affective characteristics of everyday emotional responses to music and pictures, and previous studies on emotional responses to music71,72. Factor analyses were conducted for music and pictures separately, and factors that occurred for both modalities were included in the present study. The factors were labeled joy, strength, sadness, relaxation, tenderness, eroticism, melancholia, spiritualness, curiosity, and kinship. Participants were instructed to rate the felt intensity of each emotion on a 7-point Likert scale for each of the eight stimuli; higher values indicate greater intensity of emotional experience (Appendix 1).
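As an illustration of the concept-selection step, a factor analysis of the pilot ratings could be sketched as follows; the file names, rotation method, and number of factors are assumptions made for the example and are not the settings reported here.

```python
# Illustrative factor analysis of pilot-survey ratings, run separately per modality.
import pandas as pd
from sklearn.decomposition import FactorAnalysis

def emotion_factors(csv_path, n_factors=10):
    ratings = pd.read_csv(csv_path)                  # respondents x emotion-term ratings (numeric)
    fa = FactorAnalysis(n_components=n_factors, rotation="varimax")
    fa.fit(ratings.values)
    # Loadings (terms x factors) show which emotion terms cluster together.
    return pd.DataFrame(fa.components_.T, index=ratings.columns)

music_loadings = emotion_factors("pilot_music.csv")        # hypothetical file names
picture_loadings = emotion_factors("pilot_pictures.csv")
```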

EEG recordings and analyses

Continuous 64-channel EEG (BioSemi Active II amplifier system with ActiView 6.05 recording software) was recorded at a sampling rate of 500 Hz using active Ag/AgCl electrodes (BioSemi Headcap). As per the manufacturer's recommendations, electrode voltage offsets were kept below 25 mV. Analyses were executed with Brain Vision Analyzer software (Brain Products) and custom-written MATLAB scripts. Bad channels were removed, and the remaining EEG signals were first re-referenced to the average of all channels and then band-pass filtered from 0.5 to 30 Hz. A 60-s segment of spontaneous EEG from the early phase of the session was fed into independent component analysis (ICA) to recalculate the data and minimize the effects of eye blinks and eye movements. The ICA (Infomax algorithm) was set to produce as many components as there were channels. Among the first components there were, in most cases, two components corresponding to eye blinks (clear frontal distribution) and lateral eye movements (sinks and sources bilaterally in the anterior part of the head); their contribution to the data was removed. The obtained signal was then visually compared with the raw signal to verify that this procedure removed the stereotypical artefacts.
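The original preprocessing was carried out in Brain Vision Analyzer and MATLAB; the sketch below only illustrates the equivalent steps in MNE-Python under assumed file and component indices.

```python
# Illustrative MNE-Python re-implementation of the preprocessing steps (not the original pipeline).
import mne

raw = mne.io.read_raw_bdf("subject01.bdf", preload=True)   # hypothetical BioSemi recording, 500 Hz
bad_channels = []                                           # visually identified bad channels go here
raw.drop_channels(bad_channels)
raw.set_eeg_reference("average")                            # re-reference to the mean of all channels
raw.filter(l_freq=0.5, h_freq=30.0)                         # band-pass 0.5-30 Hz

# Infomax ICA fitted on a 60-s segment of spontaneous EEG from early in the session.
segment = raw.copy().crop(tmin=0.0, tmax=60.0)
ica = mne.preprocessing.ICA(method="infomax")               # components up to the data rank
ica.fit(segment)
ica.exclude = [0, 1]            # e.g. blink (frontal) and lateral eye-movement components
ica.apply(raw)                  # subtract their contribution from the continuous data
```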

The EEG signal was segmented based on the event types. A fast Fourier transform was applied to an EEG epoch consisting of the time window 2–28 s after stimulus onset. The Fourier-transformed signals were then averaged within each event type. For further analyses, frontomedial (C1, C2, Cz, F1, F2, FC1, FC2, Fz), left frontal (C3, C5, F3, F5, F7, FC3, FC5, FT7), and right frontal (C4, C6, F4, F6, F8, FC4, FC6, FT8) electrode pools were formed by averaging the frequency distributions of these signals. The mean magnitude of theta (4–8 Hz) and alpha (9–13 Hz) frequency band activity was calculated for each participant during each event type. The data of two participants were excluded due to problems with the EEG recordings.
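The band-magnitude computation can be sketched as follows, using the epoch window, electrode pools, and band limits given above; the array shapes, channel indices, and placeholder data are assumptions for illustration only.

```python
# Sketch: mean theta and alpha magnitude for one electrode pool and one epoch.
import numpy as np

FS = 500                                                     # sampling rate in Hz

def band_magnitudes(epoch, pool_idx):
    pooled = epoch[pool_idx]                                 # channels belonging to one pool
    spectrum = np.abs(np.fft.rfft(pooled, axis=1))           # magnitude spectra per channel
    freqs = np.fft.rfftfreq(pooled.shape[1], d=1.0 / FS)
    pool_spectrum = spectrum.mean(axis=0)                    # averaged frequency distribution of the pool
    theta = pool_spectrum[(freqs >= 4) & (freqs <= 8)].mean()
    alpha = pool_spectrum[(freqs >= 9) & (freqs <= 13)].mean()
    return theta, alpha

# Example: a 26-s epoch (2-28 s after stimulus onset) for the frontomedial pool.
epoch = np.random.randn(64, 26 * FS)                          # placeholder data, channels x samples
frontomedial = [0, 1, 2, 3, 4, 5, 6, 7]                       # hypothetical indices of C1, C2, Cz, F1, F2, FC1, FC2, Fz
theta, alpha = band_magnitudes(epoch, frontomedial)
```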

Procedure

All experimental protocols were approved by the University of Jyvaskyla Ethics Committee. The methods were carried out in accordance with the ethical principles of research in the humanities and social and behavioural sciences defined by the National Advisory Board on Research Ethics in Finland (TENK). Informed consent was obtained from all participants. Participants were given a personal identification code after they provided the stimulus materials; with this code they received access to the questionnaire, which they were asked to complete within the two days preceding the neurophysiological measurements. Upon arriving for the experiment, participants signed the consent form. They were then seated facing a monitor. Audio stimuli were presented through stereo speakers and visual stimuli on a PC monitor. All tasks and instructions were delivered through e-Prime 2.0 Professional software operated by the researcher, and EEG data were collected using ActiView software. The experiment comprised eight blocks containing 40-s presentations of each stimulus type; within each block the order of stimulus presentation was randomized. After each 40-s presentation, participants orally reported the level of valence evoked by the stimulus on a scale ranging from −3 (extremely unpleasant) to 3 (extremely pleasant); oral reports were used to minimize disruption of the ongoing EEG measurement. Each stimulus type was evaluated eight times during the experiment.

Appendix

The research data are openly available at Harvard Dataverse: https://doi.org/10.7910/DVN/ZZR7WX.