Four separate types of experiments were performed. In the first experiment, participants said /ɑ/ (the vowel sound in ‘saw’) for five seconds, followed by 15 seconds of nose breathing, repeated six times in succession. This procedure mimics previous experimental measurements of particle emission during vocalization (ref. 21), but here the participants also systematically repeated the experiment at different voice amplitudes. Representative raw data for a single participant performing a series of six successive /ɑ/ vocalizations, at approximately the same loudness, are shown in Fig. 1. The simultaneous microphone recording (Fig. 1A) and APS measurements (Fig. 1B) demonstrate that the dynamics of particle release are highly correlated with the vocalization. Prior to and between vocalizations, during nose breathing in which exhaled air is directed away from the APS, the particle count is negligible, as is expected for the HEPA-filtered air inside the laminar flow hood. Shortly after the vocalization commences, the number of particles rapidly increases and peaks, then decreases back to zero as the participant resumes nose breathing; the process then repeats at the next five-second vocalization. The approximately two-second lag between onset of vocalization and the observed increase in particle count is due to the time necessary for the released particles to reach the sensor in the APS. We emphasize that by design an APS does not measure 100% of the particles drawn into it, so the particle emission rates reported here do not represent the absolute number of particles emitted by the participant; the emission rates are best understood in relative terms, or in terms of the equivalent instantaneous concentrations of particles sampled from the funnel. As shown on the secondary axis of Fig. 1B, the instantaneous concentration of particles for this particular experiment was approximately 2 per cm³ of sampled air.

Figure 1 Representative raw data in which a participant (F4) said /ɑ/ for 5 seconds, followed by 15 seconds of nose breathing, repeated 6 times at approximately the same loudness. (A) The amplitude (arb. units) recorded by the microphone versus time. Magnification shows 13 ms of the waveform with fundamental frequency of F₀. (B) The corresponding number/concentration of particles measured by the APS versus time.

The six vocalizations shown in Fig. 1A were made, to the best of the participant’s ability, at the same loudness. Each participant then repeated a similar series of /ɑ/ vocalizations at different self-regulated voice amplitudes. Representative results for a single participant (F4) show that the particle emission rate (N), defined as the total number of particles emitted during a single vocalization divided by the measured duration (in seconds) of that vocalization, also correlates with the root mean square amplitude (A_rms) of the vocalization (Fig. 2A). In our set-up, A_rms = 0.45 corresponds to an extremely loud conversational voice, as loud as comfortable without yelling (~98 decibels measured 6.5 cm from the participant’s mouth, against background noise of approximately 65 decibels), while A_rms = 0.02 corresponds to a quiet vocalization just above whispering (~70 decibels; cf. Supplementary Fig. S1). As shown in Fig. 2A, the particle emission rate is linearly correlated with A_rms over this entire range of vocalization amplitudes, with the particle emission rate increasing from 6 particles per second at the quietest vocalization to 53 particles per second at the loudest.
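The two quantities defined above, and the linear fit between them, can be sketched in a few lines. This is a minimal illustration only, not the authors' analysis code: the function names are hypothetical, the fit is assumed to be an ordinary least-squares line, and the numbers are made-up values chosen to span the amplitude range quoted in the text.

```python
import math

def rms_amplitude(samples):
    """Root mean square of a sequence of microphone samples (arb. units)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def emission_rate(particle_count, duration_s):
    """Particles per second: total count in one vocalization / its duration."""
    return particle_count / duration_s

def linear_fit(x, y):
    """Ordinary least-squares (slope, intercept) for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up illustration values (NOT data from the study).
amplitudes = [0.02, 0.10, 0.20, 0.30, 0.45]   # A_rms per vocalization
counts = [30, 70, 130, 180, 265]              # particles counted per vocalization
durations = [5.0, 5.0, 5.0, 5.0, 5.0]         # measured duration in seconds

rates = [emission_rate(c, d) for c, d in zip(counts, durations)]
slope, intercept = linear_fit(amplitudes, rates)
```

Dividing by the measured duration rather than the nominal five seconds keeps N comparable between vocalizations that a participant held slightly shorter or longer.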

Figure 2 Particle emission rate/concentration while saying /ɑ/ at 8 different amplitudes, repeated 6 times at each amplitude. (A) Particle emission rate/concentration versus root mean square amplitude, A_rms (arb. units), for a representative participant (F4). Solid line is the best linear fit, with correlation coefficient ρ = 0.932 and Pearson’s p value = 5.9 × 10⁻²². (B) Corresponding particle size distribution for the data presented in (A). (C) Aggregated particle emission rate/concentration versus root mean square amplitude, A_rms (arb. units), for 10 participants, 5 males (denoted as M1 to M5) and 5 females (denoted as F1 to F5). There are 8 data points for each participant, each representing the average of repeating /ɑ/ six times at approximately the same voice amplitude (cf. Fig. 1). Solid line is a power law fit with exponent 1.004, correlation coefficient ρ = 0.774 and Pearson’s p value = 3.8 × 10⁻¹⁷.
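A power law fit of the kind quoted in this caption can be sketched as a straight-line fit in log-log space. The caption does not state how the fit was computed, so the log-log least-squares approach below is an assumption, and `power_law_fit` is a hypothetical helper name. A fitted exponent close to 1 means the emission rate is essentially proportional to the amplitude.

```python
import math

def power_law_fit(x, y):
    """Fit y = c * x**k by least squares on (log x, log y); returns (c, k).

    Assumes every x and y value is strictly positive.
    """
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x, mean_y = sum(log_x) / n, sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    k = sxy / sxx                        # exponent = slope in log-log space
    c = math.exp(mean_y - k * mean_x)    # prefactor = exp(intercept)
    return c, k

# Made-up, exactly proportional data (NOT from the study): rate = 40 * A_rms.
a_rms = [0.02, 0.05, 0.10, 0.20, 0.45]
rate = [40.0 * a for a in a_rms]
prefactor, exponent = power_law_fit(a_rms, rate)
```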

Although the particle emission rate increased with amplitude, the size distribution of the particles was not affected significantly (Fig. 2B), with the geometric mean particle diameter remaining near 1 μm regardless of voice amplitude (Supplementary Fig. S2A). Because the particle size remains similar regardless of amplitude, the increased particle counts shown in Fig. 2 indicate that the total volume of emitted respiratory fluid (i.e., the proteinaceous liquid droplets aerosolized from the serous and mucoid layers lining the respiratory tract) increases considerably with the vocalization loudness. Note that the characteristic time scale for evaporative drying of 1 μm diameter droplets is on the order of 100 milliseconds (ref. 26), which is much less than the time required for the particles to move from the participant’s mouth into the detection module within the APS, suggesting that the particles measured here had fully dried into droplet nuclei prior to measurement (see Methods and Supplementary Fig. S3).
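The step from particle counts to fluid volume can be written out explicitly. The relation below is a sketch under stated assumptions rather than a formula taken from the paper: the symbols \dot{V} (volume of fluid emitted per second) and d (droplet diameter at the moment of emission) are introduced here for illustration, the droplets are treated as spheres, and the dried nuclei measured by the APS are assumed to scale with the initial droplet size in the same way at every amplitude.

```latex
\dot{V}(A_\mathrm{rms}) \approx N(A_\mathrm{rms}) \, \frac{\pi}{6} \, \langle d^{3} \rangle ,
\qquad
\langle d^{3} \rangle \approx \text{const.}
\;\Longrightarrow\;
\dot{V} \propto N
```

Under these assumptions, an emission rate N that grows linearly with A_rms implies that the emitted volume grows linearly with A_rms as well.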

Experiments with multiple participants indicated that these trends are conserved over a larger sample size (Fig. 2C). The particle emission rate increased approximately linearly with A_rms for each of the study participants, although the absolute magnitude varied between individuals. One participant (F3) released as many as 200 particles per second at higher amplitudes; another (F2) released as few as 1 particle per second at lower amplitudes. Notably, the data from this cohort of non-elderly adults reveal no obvious trends with gender or age (Supplementary Figs S4A, B). Similarly, no clear correlation was observed with the body mass index (BMI) of the participants (Supplementary Figs S4C, D).

To more closely represent normal conversational speech, the participants read aloud a short passage of text in English at varied loudness (quiet, intermediate, or loud). Representative raw data for a single participant (F4) indicate that the particle emission rate also correlates with voice amplitude for normal speech (Fig. 3A,B). To quantify the loudness, we take A_rms here as the average over the entire approximately two-minute duration of the vocalization, excluding pauses between words. Aggregated data for 10 participants confirm that the particle emission rate for normal English speech correlates linearly with A_rms (Fig. 3C); speaking loudly yielded on average a 10-fold increase in the emission rate compared to speaking the same series of words quietly. Again, the size distributions (Fig. 3D) and geometric mean diameter of particles (Supplementary Fig. S2B) were insensitive to voice amplitude. The reading experiment was also repeated in different languages to test whether choice of language matters; the results (Supplementary Fig. S5) confirmed the increasing trend between particle emission rate and amplitude, but exhibited no significant difference in the particle emission rate among the languages tested (Supplementary Fig. S6). Likewise, we measured the temperature and humidity during the experiments, and found no significant impact of temperature or humidity on either the particle emission rate or the mean particle size (Supplementary Figs S7 and S8).

Figure 3 Particle emission rate/concentration while reading a passage of text aloud (the “Rainbow” passage), at three different loudness levels. (A) Superimposed representative recordings of amplitude (arb. units) for an individual (F4) reading the passage at three different voice amplitudes, and (B) the corresponding number/concentration of particles measured by the APS versus time. Color code same as in (A). (C) Particle emission rate/concentration as a function of root mean square amplitude, A_rms, for 10 participants. There are 3 points for each person, representing 3 voice amplitudes, color code same as Fig. 2C. Solid line is a power law fit with exponent 0.96, correlation coefficient ρ = 0.865 and Pearson’s p value = 6.8 × 10⁻¹⁰. (D) Representative particle size distribution for the one individual (F4).

A key recurring feature of the data is that some individual participants emitted many more particles than others. Because all participants spoke at slightly different amplitudes, we used linear regressions of the particle emission rate versus amplitude for each individual (cf. Fig. 2A) to calculate a normalized particle emission rate at a voice amplitude of 0.1 (approximately 85 dB). Using this approach, the results for 40 people show that the particle emission rate for different individuals follows a long-tailed distribution for both vocalization of /ɑ/ (Fig. 4A) and reading of English text aloud (Fig. 4B). At this loudness, the normalized particle emission rates ranged from approximately 1 to 14 particles per second between different individuals, with an average of approximately 4 particles per second. Notably, the rates have a sizeable standard deviation, and their distribution is well approximated by a lognormal fit (red curves in Fig. 4). In other words, although half of the participants emitted fewer than 3 particles per second, a small fraction of individuals (8 out of 40) emitted considerably more. These “speech superemitters,” whose individual particle emission rate exceeded the group mean by one standard deviation or more, consistently released an order of magnitude more particles than their peers. For vocalizing /ɑ/, Fig. 4A shows that 15% of the participants emitted 32% of the total particles, while Fig. 4B shows that, for reading aloud in English, 12.5% of the participants emitted 40% of the total particles. Supplementary Fig. S9A shows that 4 out of these 8 individuals are superemitters for both saying /ɑ/ and passage reading activities, while 2 of them are superemitters only while saying /ɑ/, and 2 of them only while reading a text passage. We repeated the passage reading experiment for two of the participants (M5 and F4) on three different days separated by several months (Supplementary Fig. S9B), and the results show that the particle emission rates remained almost unchanged for at least these two individuals (F4, a superemitter, and M5, a non-superemitter) despite the long time period between measurements.
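The normalization and the superemitter criterion described above can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names and participant labels are hypothetical, the rates are made-up values, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not say which was used.

```python
import statistics

def normalized_rate(slope, intercept, amplitude=0.1):
    """Evaluate one participant's linear fit (rate vs. A_rms) at a common amplitude."""
    return slope * amplitude + intercept

def find_superemitters(rates_by_id):
    """Return (ids, threshold) for rates at or above mean + one standard deviation."""
    values = list(rates_by_id.values())
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)   # assumption: sample standard deviation
    threshold = mean + sigma
    ids = sorted(pid for pid, r in rates_by_id.items() if r >= threshold)
    return ids, threshold

def share_of_total(rates_by_id, ids):
    """Fraction of the summed emission rate contributed by the given participants."""
    total = sum(rates_by_id.values())
    return sum(rates_by_id[pid] for pid in ids) / total

# Made-up normalized rates in particles/s (NOT data from the study).
rates = {"P1": 1.0, "P2": 2.0, "P3": 2.5, "P4": 3.0, "P5": 14.0}
super_ids, threshold = find_superemitters(rates)
super_share = share_of_total(rates, super_ids)
```

Evaluating every participant's fit at the same amplitude removes the effect of individuals simply speaking louder or quieter, so the remaining spread reflects differences between people rather than differences in loudness.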

Figure 4 Histogram of particle emission rate/concentration at voice amplitude of 0.1 (approximately 85 dB). (A) For saying /ɑ/, with median of M = 4.3 particles/s, mean of m = 4.8 particles/s and standard deviation of σ = 3.0 particles/s. (B) For reading an English passage (10 people read the “Rainbow” passage and 30 people read chapter 24 of “The Little Prince”) with median of M = 2.5 particles/s, mean of m = 3.4 particles/s and standard deviation of σ = 2.7 particles/s. Particle emission rates larger than m + σ are labeled superemitters. Red curves are lognormal fits found via nonlinear regression.

To help interpret our findings, we also compared the particle emission rates of four different types of breathing with speech at three levels of loudness using the same experimental set-up. The breathing experiments included nose breathing, mouth breathing, a “deep-fast” mode, and a “fast-deep” mode (see Methods for details). The results show that the particle emission rate for speech is significantly higher than for all types of breathing tested here (Fig. 5A). Furthermore, the corresponding geometric mean diameters of the particles generated during speech are slightly larger on average than those generated during breathing (Fig. 5B), consistent with prior work and the hypothesis that vocalization activates laryngeal particle generation (ref. 21). Note that in Fig. 5A the speech outliers correspond to a single participant who is a speech superemitter (F4), but this individual was not also responsible for the observed outliers of “fast-deep” and “nose” breathing activities. In other words, the “breathing high producers” as defined by Edwards et al. (ref. 15) are not necessarily also speech superemitters.