The rest of this article is organized as follows. In Section 2, we briefly present the relevant features of the study area and describe the convective thunderstorm events dataset. Section 3 reports the results of the statistical and graphical analyses, focusing on the spatial and temporal dimensions and on the thunderstorm life cycle. In Section 4, we discuss the results and draw some concluding remarks.

The dataset considered in this paper was acquired using the Thunderstorm Radar Tracking (TRT) algorithm developed by MeteoSwiss [ 35 36 ]. This algorithm couples traditional and innovative techniques and allows us to acquire the position of convective cells and follow their evolution through time. The result of the data acquisition process is a sequence of snapshots, one every five minutes, containing the convective cells whose radar reflectivity exceeds a given threshold. Once acquired, the dataset can be used by meteorologists to perform a visual analysis of the behavior of the convective cells and of the relationships among different weather features [ 37 ]. Visualizing the sequence of snapshots is useful, but in such a view the spatial and temporal dimensions of the process evolve at the same time. The main goal of our work is to synthesize general features of the behavior of the convective cells that occurred in the Lambro, Seveso, and Olona catchments in Northern Italy from 2012 to 2018. To do that, we performed statistical analyses that separate the spatial and temporal aspects in order to focus on a single dimension at a time. The spatial analysis allows us to visualize the high-risk areas, which could provide useful insights on how to deploy a meteorological monitoring network. It also allows us to obtain the climatology of the area of interest [ 38 39 ]. The temporal dimension, in turn, helps to visualize how the events are distributed across the months of the year or the hours of the day [ 24 ]. Coupling a visual and statistical exploration of the large volume of available data can open an innovative perspective on the meteorological aspects behind intense rain events [ 40 ].

Researchers are trying to improve the forecasting of convective rainfall events by integrating information extracted from currently available datasets, which combine traditional and innovative meteorological variables [ 28 29 ]. Other authors have proposed a different paradigm, switching from conventional physically based meteorological models to more recent machine learning techniques [ 30 34 ] that exploit the huge amount of available data. A better understanding of the physical processes and features characterizing convective events is important both to better define the equations of physically based models and to properly select the inputs and structure of machine learning tools.

Convective storms usually affect limited areas, and their duration is almost always less than two hours [ 25 ]. Nevertheless, they often reach higher intensities in less time than orographic and frontal storms, producing short but intense rainfall, which is the worst case for flood protection systems and other hydraulic infrastructure (e.g., sewers and subterranean urban streams) [ 26 27 ].

It is thus fundamental to gain a better understanding of the physical processes behind convective storms. This is usually done by investigating the causes and the atmospheric mechanisms that originate these phenomena [ 10 12 ] and by identifying the most critical areas through the monitoring of past events [ 13 15 ]. Among all the drivers that contribute to the formation of such convective events, temperature plays a key role [ 16 17 ]. For this reason, global warming caused by climate change [ 18 24 ] could strongly affect the behavior of such storms.

Intense convective storms can cause adverse effects such as floods, landslides, hail, and strong winds. These extreme natural events, which are expected to increase in frequency due to climate change [ 7 8 ], cause not only huge economic damage but also significant loss of human life. In the period 1994–2013, 6,863 natural disasters occurred worldwide, and more than 70% were strictly related to severe rainfall events: 2,937 floods and 1,942 storms. These phenomena affected 3 billion people, causing almost 310,000 deaths (30% of the human losses due to natural disasters) and more than US$1,500 billion in economic damage (60% of the total) [ 9 ].

Convective storms are a typical source of rainfall in many regions, especially during summer. The causes behind these phenomena are not completely known because of the complexity of the underlying physical processes, which jointly involve spatial and temporal aspects at different scales [ 1 ]. Several research studies have tried to improve the understanding of convective storm mechanisms by collecting and analyzing observations from the past few decades. Radar-based climatologies have been developed to clarify convective storm behaviors, distributions, and frequencies in different domains and environments [ 2 6 ].

For this reason, a well-designed data visualization approach, able to highlight different aspects of the ongoing process, is the key to retrieving useful information hidden in the data. This task was addressed using Python libraries for data visualization (Matplotlib [ 44 ], Seaborn [ 45 ]) and for spatial analysis (GeoPandas [ 46 ], Geoplot [ 47 ], Cartopy [ 48 ], GDAL [ 49 ], Shapely [ 50 ]), together with the QGIS software [ 51 ].

A viable alternative consists in showing all the center of gravity trajectories on the same map (Figure 3). This procedure could be useful for visualizing a limited number of events, but it turns out to be inappropriate here because of the high number of records stored in the dataset. The plot reported in Figure 3 does not help to synthesize any general features of the convective storm paths.

Adopting this technique, thunderstorms can be tracked very early during their growing phase as well as in the mature stage, and trajectories are created from a sequence of radar images [ 34 ]. For each convective cell and each time step, the dataset consists of a unique identification number, date and time of detection, coordinates of the center of gravity of the cell, velocity of the cell, area covered by the cell as a polygon feature, and mean and maximum reflectivity recorded.
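The record layout described above can be sketched as a small Python data structure. This is only an illustration: the field names, the `Point` alias, the example identifier, and the assumption that coordinates are projected and expressed in kilometers are all hypothetical and do not come from the TRT product specification.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in a projected system; km is an assumption


@dataclass(frozen=True)
class CellObservation:
    """One convective cell at one time step (field names are hypothetical)."""
    cell_id: str                 # unique identification number
    timestamp: datetime          # date and time of detection
    centroid: Point              # center of gravity of the cell
    velocity: Point              # velocity components of the cell
    outline: Tuple[Point, ...]   # polygon vertices of the area covered
    mean_dbz: float              # mean reflectivity
    max_dbz: float               # maximum reflectivity

    def area(self) -> float:
        """Polygon area from the shoelace formula, in squared coordinate units."""
        twice_area = 0.0
        n = len(self.outline)
        for i in range(n):
            x0, y0 = self.outline[i]
            x1, y1 = self.outline[(i + 1) % n]  # wrap around to close the ring
            twice_area += x0 * y1 - x1 * y0
        return abs(twice_area) / 2.0


# Synthetic example: a 4 km x 4 km square cell (values are made up).
obs = CellObservation(
    cell_id="example-0001",
    timestamp=datetime(2014, 6, 15, 12, 30),
    centroid=(2.0, 2.0),
    velocity=(5.0, 5.0),
    outline=((0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)),
    mean_dbz=40.0,
    max_dbz=45.0,
)
print(obs.area())
```

In practice the polygon would more likely be held as a Shapely geometry inside a GeoPandas table, which are the libraries named in the article; the plain tuple is used here only to keep the sketch dependency-free.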

The dataset employed in this study spans the seven years from 2012 to 2018. Data were recorded at a time resolution of five minutes and a spatial resolution of 2 km. The algorithm was designed to identify convective cells, and it is based on an adaptive thresholding scheme which allows the detection of convective cells depending on their development phase [ 34 ]. Convective cells are considered intense, and thus detected, if they reach an area of 16 km² at a minimum reflectivity of 36 dBZ and at least one pixel inside them attains 42 dBZ. This allows thunderstorms to be detected at an early stage of their life cycle at the lowest possible reflectivity threshold, i.e., 36 dBZ, as well as in the mature phase at a higher threshold [ 34 ].
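The detection criteria stated above (a connected area of at least 16 km² at 36 dBZ or more, containing at least one pixel at 42 dBZ or more) can be illustrated with a simplified fixed-threshold sketch. It is not the TRT algorithm itself, which uses an adaptive threshold; the 4-neighbor connectivity and the 4 km² pixel area (derived from an assumed 2 km × 2 km grid) are assumptions made for the example.

```python
from collections import deque

PIXEL_AREA_KM2 = 4.0   # assumption: 2 km x 2 km pixels
MIN_DBZ = 36.0         # lowest reflectivity threshold quoted in the text
CORE_DBZ = 42.0        # at least one pixel must reach this value
MIN_AREA_KM2 = 16.0    # minimum cell area quoted in the text


def detect_cells(grid):
    """Return detected cells as lists of (row, col) pixels.

    `grid` is a rectangular list of lists of reflectivity values in dBZ.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    cells = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < MIN_DBZ:
                continue
            # Breadth-first flood fill over 4-connected pixels >= MIN_DBZ.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                pr, pc = queue.popleft()
                pixels.append((pr, pc))
                for nr, nc in ((pr + 1, pc), (pr - 1, pc), (pr, pc + 1), (pr, pc - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and not seen[nr][nc] and grid[nr][nc] >= MIN_DBZ:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            area_km2 = len(pixels) * PIXEL_AREA_KM2
            peak_dbz = max(grid[pr][pc] for pr, pc in pixels)
            if area_km2 >= MIN_AREA_KM2 and peak_dbz >= CORE_DBZ:
                cells.append(pixels)
    return cells


# Synthetic reflectivity field: one valid cell, one region without a
# 42 dBZ core, and one isolated pixel that is too small.
example_grid = [
    [10, 10, 10, 10, 10, 10],
    [10, 37, 43, 10, 10, 10],
    [10, 38, 37, 10, 37, 10],
    [10, 10, 10, 10, 38, 10],
    [10, 40, 10, 10, 37, 10],
    [10, 10, 10, 10, 36, 10],
]
detected = detect_cells(example_grid)
print(len(detected))
```

Only the upper-left region satisfies both conditions; the vertical strip on the right is large enough but never reaches 42 dBZ, and the single 40 dBZ pixel covers only 4 km².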

The convective rain dataset analyzed in this paper has been sampled with the MeteoSwiss TRT algorithm. This algorithm is fed with data recorded by the Swiss radar network. The closest radar to the region of study is located at Monte Lema (1626 m a.s.l.) in the Canton Ticino. Due to the orography of the area of interest and the radar position, there are no limitations in terms of coverage and blockage [ 42 43 ].

According to the Köppen classification [ 41 ], the study area is characterized by a temperate climate with typical patterns of the mid-latitude regions (Cfa in the Köppen notation). It is dominated by the presence of the Italian Prealps in the northern part, the municipality of Milan in the southern part, and an almost flat area in the middle with marked urban sprawl. It is frequently affected by convective cells, especially during summer, due to its peculiar morphology. Quite often, humid air remains for long periods in the Po Valley and is lifted by an uplift mechanism triggered by solar heating of the surface. The northern part of the considered area is windier (foehn episodes are quite frequent in winter) than the southern part, which is characterized by frequent foggy days and strong thermal inversions.

An increase in the duration of convective events leads to an increase in the mean and maximum reflectivity and in the area involved in the convection process. The combined growth of the reflectivity, which is strictly related to the rainfall intensity, and of the areal extension of the convective cells strongly increases the rainfall volume. Consequently, the probability of facing dangerous situations, i.e., intense rainfall over a large area, is much higher in long-lasting storms than in short-lasting ones.
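To make the combined effect of reflectivity and area concrete, the following sketch converts reflectivity to rain rate and multiplies it by the cell area. The article does not state which reflectivity-rainfall relation applies, so the Marshall-Palmer relation Z = 200 R^1.6 is used here purely as an assumption, as is the simplification that a single reflectivity value represents the whole cell.

```python
# Marshall-Palmer Z-R coefficients: an assumption, not taken from the article.
MP_A = 200.0
MP_B = 1.6


def rain_rate_mm_h(dbz):
    """Rain rate in mm/h from reflectivity in dBZ, assuming Z = a * R**b."""
    z_linear = 10.0 ** (dbz / 10.0)  # dBZ -> linear reflectivity factor
    return (z_linear / MP_A) ** (1.0 / MP_B)


def volume_rate_m3_h(dbz, area_km2):
    """Rainfall volume per hour, treating the cell as uniformly reflective."""
    # 1 mm of rain over 1 km^2 corresponds to 1000 m^3 of water.
    return rain_rate_mm_h(dbz) * area_km2 * 1000.0


# Synthetic comparison: a weaker, smaller cell versus a stronger, larger one.
short_lived = volume_rate_m3_h(dbz=46.0, area_km2=40.0)
long_lived = volume_rate_m3_h(dbz=50.0, area_km2=100.0)
print(long_lived / short_lived)
```

Because the rain rate grows nonlinearly with reflectivity and the volume scales linearly with area, even a few dBZ of additional reflectivity combined with a larger footprint multiplies the rainfall volume several times over.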

Conversely, the area covered by convective cells (Figure 12, right) is highly dependent on the duration of the thunderstorm event. Another peculiar feature of this distribution is that it is not symmetric: the median is closer to the 0.25 quantile than to the 0.75 quantile. This asymmetry is a critical feature, because it implies a higher probability of an event with an unusually large areal extension than of one with an unusually small extension.
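The asymmetry described above can be quantified with a quartile-based skewness coefficient (Bowley's measure). The article does not use this coefficient; it is offered here only as an illustration, and the numbers in the example are synthetic.

```python
def bowley_skewness(q1, median, q3):
    """Quartile skewness: (Q3 + Q1 - 2 * median) / (Q3 - Q1).

    The value lies in [-1, 1]; it is positive when the median is closer
    to Q1 than to Q3, i.e. when the upper half of the box is longer.
    """
    spread = q3 - q1
    if spread <= 0:
        raise ValueError("q3 must be strictly greater than q1")
    return (q3 + q1 - 2.0 * median) / spread


# Synthetic quartiles of cell area (km^2) for one duration class.
skew = bowley_skewness(q1=10.0, median=12.0, q3=20.0)
print(skew)
```

A positive value corresponds to the situation reported in the text, where the median sits nearer the lower quartile and the distribution has a longer tail toward large areas.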

The trend of the maximum reflectivity (Figure 12, center) is quite different. Half of the data (between the 0.25 and 0.75 quantiles) are in the range 46–53 dBZ for short events (20–25 minutes). This interquartile range shifts to higher values for events that last longer. For instance, events lasting one hour usually have reflectivity values in the range 50–56 dBZ. The variability is not affected by the duration.
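The quartile ranges quoted above come from grouping events by duration and computing quantiles within each group. A minimal sketch of that grouping is given below, using only the Python standard library; the five-minute bin width, the input layout of `(duration, value)` pairs, and the choice of the "inclusive" quantile method are assumptions, and the example values are synthetic.

```python
from collections import defaultdict
from statistics import quantiles


def quartiles_by_duration(events, bin_width=5):
    """Map each duration bin (start minute) to (Q1, median, Q3) of the values.

    `events` is an iterable of (duration_min, value) pairs, where `value`
    could be the maximum reflectivity, the mean reflectivity, or the area.
    """
    groups = defaultdict(list)
    for duration_min, value in events:
        bin_start = int(duration_min // bin_width) * bin_width
        groups[bin_start].append(value)

    result = {}
    for bin_start, values in sorted(groups.items()):
        if len(values) < 2:  # quantiles() needs at least two observations
            continue
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        result[bin_start] = (q1, median, q3)
    return result


# Synthetic example: two well-populated bins and one singleton bin.
sample = (
    [(20, v) for v in (1, 2, 3, 4, 5)]
    + [(62, v) for v in (11, 12, 13, 14, 15)]
    + [(90, 7)]
)
summary = quartiles_by_duration(sample)
print(summary)
```

Bins with too few events are skipped rather than reported, in the same spirit as the article's restriction to the duration range where enough events are available.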

Other interesting insights into the life cycle of convective events can be obtained by looking at how the distributions of the spatial statistics of the thunderstorm (mean and maximum reflectivity, and area) vary as a function of the event duration. Note that we limit our analysis to durations between 20 and 150 minutes (95% of the events fall within this range) in order to guarantee the statistical significance of the results.

An additional analysis relative to the convective thunderstorm life cycle has been performed. Figure 11 reports the distribution of the events’ duration. The duration of the convective events has an exponential distribution. Ninety percent of the events last from 20 minutes to two hours.
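As an illustration of how such a duration distribution could be derived from five-minute snapshots, the sketch below computes each event's duration as the time between its first and last detection and then fits an exponential law by maximum likelihood. The duration convention, the `(cell_id, timestamp)` input layout, and the use of a known lower bound `shift` are assumptions; the article does not describe its fitting procedure.

```python
import math
from datetime import datetime, timedelta


def event_durations_min(records):
    """Duration in minutes per cell, from (cell_id, timestamp) records."""
    first, last = {}, {}
    for cell_id, ts in records:
        if cell_id not in first or ts < first[cell_id]:
            first[cell_id] = ts
        if cell_id not in last or ts > last[cell_id]:
            last[cell_id] = ts
    return {cid: (last[cid] - first[cid]).total_seconds() / 60.0 for cid in first}


def fit_shifted_exponential(durations, shift):
    """MLE of the rate when (duration - shift) is exponential and shift is known."""
    excess = [d - shift for d in durations]
    mean_excess = sum(excess) / len(excess)
    if mean_excess <= 0:
        raise ValueError("durations must exceed the shift on average")
    return 1.0 / mean_excess


def exceedance_probability(t, rate, shift):
    """P(duration > t) under the fitted shifted exponential."""
    return 1.0 if t < shift else math.exp(-rate * (t - shift))


# Synthetic track: cell "a" is seen in five consecutive snapshots.
start = datetime(2014, 7, 1, 12, 0)
track = [("a", start + timedelta(minutes=5 * k)) for k in range(5)]
durations = event_durations_min(track)

rate = fit_shifted_exponential([20.0, 30.0, 40.0], shift=20.0)
print(durations["a"], rate)
```

With a fitted rate, `exceedance_probability` gives a quick estimate of how rare long-lasting events are, which is the quantity of practical interest for the life-cycle analysis.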

This representation is particularly interesting from a meteorological point of view, since it suggests the main directions of the storm trajectories for each month. Most of the cells that enter the basin come from the South-West. It is interesting to notice that the peak number of events in the IN category occurs in August for sector 1, while for sector 2 it occurs in June. This seems to suggest that the circulation patterns causing these events are slightly different, probably due to a seasonal effect. Consistent with the findings in Figure 5, the majority of the events belonging to the OUT category cross sectors 3, 4, and 5. In particular, for sectors 3, 4, 5, and 6, the OUT peaks occur in June. As a general consideration, the number of events born inside the watershed is slightly lower than the number coming from outside. In sectors 1 and 2, the number of IN events is higher than the number of OUT events; in sectors 3 and 4, the number of OUT events is always higher than the number of IN events; while in sectors 5 and 6, the number of OUT events is larger than the number of IN events, except for April and May. This is probably due to the location of this sector with respect to the preferential direction of the convective thunderstorms. It is oriented toward the North-West where, additionally, the Prealps are located. When analyzing these mesoscale phenomena, local factors and the interaction between the cells themselves play a key role in the dynamics of cell displacement and triggering.
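The directional reading above can be cross-checked against the raw tracks with a simple heading computation. The sketch below derives the direction of travel of a cell from its first and last center of gravity and bins it into eight compass octants. It assumes projected coordinates with x pointing east and y pointing north, and it does not reproduce the article's sector definition, which is not given in this section.

```python
import math

OCTANTS = ("N", "NE", "E", "SE", "S", "SW", "W", "NW")


def heading_deg(first_xy, last_xy):
    """Direction of travel in degrees clockwise from north, or None if static."""
    dx = last_xy[0] - first_xy[0]  # eastward displacement
    dy = last_xy[1] - first_xy[1]  # northward displacement
    if dx == 0 and dy == 0:
        return None
    return math.degrees(math.atan2(dx, dy)) % 360.0


def heading_octant(first_xy, last_xy):
    """Compass octant toward which the cell moves, or None if static."""
    bearing = heading_deg(first_xy, last_xy)
    if bearing is None:
        return None
    # Each octant spans 45 degrees centered on its nominal direction.
    return OCTANTS[int((bearing + 22.5) // 45) % 8]


# A cell moving toward the North-East, i.e. arriving from the South-West.
print(heading_octant((0.0, 0.0), (10.0, 10.0)))
```

Counting octants per month over all tracks would give a compact numerical counterpart to the sector plots, with the caveat that a straight line between the first and last position ignores any curvature of the path.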

The effect of temperature on the occurrence of convective cells is also suggested at the daily scale, as reported in Figure 9. Most of the convective cells occur in the afternoon, with a peak between 14:00 and 16:00 local time, when the highest number of convective events was recorded (about 400). Given that, in the considered area, the daily surface temperature usually reaches its maximum between 13:00 and 14:00, the time lag between the temperature peak and the peak in the number of convective events could be related to the thermal inertia of the soil surface. Moreover, once the temperature is high enough to allow the breaking of the boundary layer inversion, thermals can reach the lifting condensation level, which represents the limit over which the atmosphere becomes unstable [ 24 ].
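The hourly distribution described above amounts to counting events by the hour of their first detection and locating the busiest window. A minimal sketch follows; it assumes the timestamps are already expressed in local time, and the example times are synthetic.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(start_times):
    """Number of events starting in each hour of the day (0-23)."""
    return Counter(ts.hour for ts in start_times)


def busiest_window(counts, width=2):
    """Start hour and total of the `width`-hour window with the most events.

    Windows do not wrap around midnight; ties keep the earliest window.
    """
    best_start, best_total = 0, -1
    for start in range(0, 24 - width + 1):
        total = sum(counts.get(hour, 0) for hour in range(start, start + width))
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total


# Synthetic start times for seven events on the same day.
starts = [datetime(2014, 7, 1, h, 10) for h in (14, 14, 15, 15, 15, 9, 16)]
counts = hourly_counts(starts)
print(busiest_window(counts, width=2))
```

If the source timestamps are in UTC, they would need to be converted to local time before counting, otherwise the peak would appear shifted by the UTC offset.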

The half-violin plot in Figure 8 confirms the seasonality of convective events but shows that the distribution across years is quite different. It sometimes has a quasi-bimodal shape. The first peak is in late spring/early summer, while the second is at the end of the summer. In other cases, the distribution is unimodal, as in 2014 and 2016, with a high number of convective events lumped in a single period in the middle of summer. The reasons behind these behaviors should be investigated in future research making use of a longer time series.

As expected, the convective events also exhibit a seasonal periodicity (Figure 7). Convective events occur only during spring and summer (from April to September), suggesting that they are strongly related to high temperatures. It is during these two seasons, especially summer, that solar radiation reaches its peak. Daytime heating during these months strongly increases the surface temperature, which in turn warms the lower atmospheric layer. This is a typical unstable situation favorable to convection, in which colder air lies above warmer air. Other research on convective rain events shows similar results for other locations [ 53 56 ].

While considering meteorological phenomena, it is also important to evaluate the temporal aspect of their dynamics. A visual analysis would help to spot some temporal structures, such as daily and annual periodicity, and possible increasing/decreasing trends. Figure 6 reports the annual distribution of convective events. The maximum number of convective cells (1119) was recorded during 2014, while values for the remaining years span from 600 to 900.
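A first numerical check for increasing or decreasing trends is to count events per year and fit a straight line to the counts. The sketch below does this with an ordinary least-squares slope; it is only an illustration with synthetic numbers, and with seven yearly values any such slope would carry a wide uncertainty.

```python
from collections import Counter


def events_per_year(start_years):
    """Number of events per calendar year."""
    return Counter(start_years)


def ols_slope(xs, ys):
    """Least-squares slope of ys against xs (events per year, per year)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic example, not the article's data.
yearly = events_per_year([2012, 2012, 2013, 2013, 2013, 2014])
years = sorted(yearly)
slope = ols_slope(years, [yearly[y] for y in years])
print(yearly, slope)
```

A slope close to zero relative to the year-to-year scatter would be consistent with the absence of a visible trend, but it would not by itself demonstrate that no trend exists.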

The primary focus of this work was to visualize the key characteristics of the data and eliminate misleading interpretations. With this aim, the temporal and spatial scales were considered separately. In the following two subsections, a couple of different sets of figures are presented: the first deals with the spatial distribution of the data, and the second focuses on the temporal dimension of these phenomena. The last subsection provides an analysis of the storm’s life cycle considering the duration of the events and how it is related to the reflectivity and the area of the convective thunderstorms.

4. Discussion

In this paper, we performed a comprehensive analysis of the convective rainfall events that occurred in the seven years from 2012 to 2018 over the Seveso, Olona, and Lambro river basins. The convective events composing the considered dataset were detected and tracked by the MeteoSwiss TRT algorithm, which is fed with radar data collected by the Swiss radar network. The huge amount of information sampled had to be visualized in a way that points out the relevant properties of the dataset. We thus decoupled the geospatial and temporal aspects of the convective phenomena, proposing several visual and statistical techniques in order to explore the available data extensively. Further analysis was dedicated to the thunderstorm life cycle.

The visualization of the geospatial distribution allowed us to highlight areas where convective thunderstorms were more frequent. This static distribution is highly correlated with the orography of the region: frequencies were higher near the Prealps and lower in the southern area of the basin, characterized by a flat landscape. Additional analysis of the trajectories followed by the convective cells’ centers of gravity shows that the preferential direction is oriented from South-West to North-East. This is due to the meteorological phenomena characterizing the region. South-West winds carry air masses rich in water vapor originating from the Mediterranean Sea, generating favorable conditions for convection processes. Conversely, the presence of the Alps prevents a considerable number of events coming from the North from crossing the region.

The analyses of the temporal patterns confirm that the occurrence of convective events is strongly correlated with the surface temperature. The latter can thus be considered one of the main drivers of the atmospheric mechanism generating convective cells. This is supported by the distribution of the occurrences across months and by the average intraday pattern. The events take place only from April to September, with a daily peak after midday following the temperature peak. The annual distributions of the convective events differ from year to year, and there is no evidence of an increasing trend in the number of events occurring per year. In this regard, the seven-year dataset available is probably too limited to robustly investigate the presence or absence of increasing trends, which could be induced by climate change.

Lastly, we performed an analysis of the main features of the convective cells’ life cycle. The duration of the storm life follows an exponential distribution. The events usually last less than 30 minutes, and only a limited number of cells persist on the basin for more than 2 hours. Long-lasting cells usually reach higher values of reflectivity (the average reflectivity is slightly higher, while its maximum exhibits a more substantial growth) and cover more extended areas. As a consequence, the impact of long-lasting cells is notably greater than that of short-lasting ones, due to the combined effect of reflectivity (intimately connected with the rainfall intensity) and areal extension of the convective cells on the rainfall volume.

The visual and statistical analyses performed in this paper experimentally confirm well-known characteristics of convective phenomena and provide new findings regarding the mesoscale and local processes behind these events, showing the high potential of radar-sampled thunderstorm data. Several applications can, therefore, benefit from the results presented. For instance, the convective parameterization scheme of physically based models can be improved, and the feature selection process necessary for machine learning applications can be enhanced. Moreover, potentially dangerous patterns could be identified and exploited by the meteorological units of civil protection agencies to try to mitigate the impacts of these events. Early warning systems could also benefit, together with the planning and design of hydraulic infrastructure and the deployment of meteorological monitoring sensors. This work could also constitute the starting point for improving the knowledge of severe thunderstorms in the region of study. Considering only severe events, it is possible to identify distinctive factors that contribute to their severity. A dataset covering a longer time period would provide the opportunity to evaluate more reliably the effects of climate change on the occurrence, intensity, spatial distribution, and duration of convective rain phenomena.

The framework proposed in this paper can be applied to any area where a time series of snapshots recorded by a radar storm tracking algorithm is available. The statistical analysis and the visualizations obtained can be used as a general-purpose support tool for both nowcasting and climatological studies at the basin scale.