Mae Fah Luang University Campus on March 2019. (Photo by MFU Photoclub with permission)

In the previous blog, I looked at the winter air pollution in Bangkok. The main source of pollution comes from particles smaller than 2.5 micrometer (PM 2.5 particles). These particles are smaller than the width of a human hair and can easily enter our bodies, even making their way into our blood. Last week (March 17, 2019), many provinces in the northern part of Thailand had the worst Air Quality Index (AQI) in the world due to particle pollution. So far, no long term solution has been proposed because the source of the PM 2.5 particle pollution has not been clearly pinpointed. In this notebook, I identify the sources of high PM 2.5 particles in Bangkok through a machine learning model. The code can be found in my GitHub page.

High PM2.5, Who Are the Culprits ?

There are three major theories regarding the source of air pollution in Bangkok: (1) The temperature inversion effect where cold air along with pollution is trapped close to the surface of the Earth. This theory was proposed by the government at the beginning of the 2019 winter season. The government blamed emission from old diesel engines for the pollution. (2) Agricultural burning, either locally or from surrounding provinces. During winter, a lot of open agricultural burning occurs throughout the country. Some officials have tried to tackle the air pollution problem by reducing open agricultural burning. (3) Pollution from other provinces or countries. Some NGOs blamed the pollution on near by power plants.

My analysis procedure is as follows: Build a machine learning model(ML) to predict the air pollution level in Bangkok using environmental factor such as weather, traffic index, and fire maps. Include date-time features such as local hour, and weekday versus weekend in the model to capture other effects from human activities. Identify dominant sources of pollution using the feature of importance provided by the ML model.

If the source of the pollution is local, then the AQI will depend on factors such as weather patterns (wind speed, humidity, average temperature), local traffic, and hour of day. If the pollution is from agricultural burning, the AQI will depend on active fires with some time lag to account for geographical separation. Fire activities are included based on the distance from Bangkok. On the other hand, if the pollution not correlated with the fire map, then the model should put more weight on weather patterns, such as wind direction and wind speed.

Here are a list of features I considered and their data sources:

Active fire information from NASA’s FIRMS project

Weather pattern: temperature, wind speed, humidity, and rain, scraped from the Weather Underground website

Traffic index from Longdo Traffic

Date time features: hour of day, time of day, and holiday patterns (explored in the Part I blog post)

Let me first walk through all the features included in the model.

Agricultural Burning is a Major Problem !

Farmers in Southeast Asia pick January — March as their burning season. For the north and northeastern provinces in Thailand, these burning activities are large enough to make these provinces among the most polluted places in the world during this time. For Bangkok, one might argue that because the region is heavily industrial rather than agricultural, it may not be affected as much by agricultural burning. But this is not the case.

Because of the tiny size of PM 2.5 particles, they remain suspended in the atmosphere for prolonged periods and can travel over very long distances. From the weather data, the average wind speed is 10 km/hour. The reported PM 2.5 level is a rolling average over 24 hours. A rough estimate is that the current PM 2.5 reading may be from sources as far as 240 km away. The picture below shows the fire map measured by NASA’s satellites, indicative of agricultural burning, on Jan 8, 2018 and on Feb 8, 2018. The yellow circle indicates the area within 240 km of Bangkok. The number of fires on Jan 8, which has an acceptable level of pollution, is much lower than the number of fires on Feb 8, which has an unhealthy level of pollution.

Fire spots from NASA’s satellites

In fact, the fire pattern closely aligns with the PM 2.5 pattern.

The number of fires aligns with spikes in PM 2.5 levels

Weather Patterns

The temperature inversion effect often occurs during winter because the temperature is cooler near the ground. The hotter air on top traps the cool air from flowing. This stagnant atmospheric condition allows the PM 2.5 particles to remain suspended in the air for longer. On the other hand, higher humidity or rain will help remove particles from the atmosphere. This is one reason why in the past when the air pollution was high, the government has sprayed water in the air. Unfortunately, this mitigation does not appear to be effective, since the volume of water is minuscule compared to actual rain. How much influence does weather pattern have on air pollution? Let’s compare the weather in winter versus other seasons.

compare the weather pattern in winter and other seasons

Temperature, wind speed and humidity are all lower in winter, but not by a large amount. Now, let’s look at the relationship of each of these with the PM 2.5 level.

Effect of temperature, wind speed, and humidity on PM 2.5 level in winter

Higher temperature (which disrupts the temperature inversion effect), wind speed and humidity have a negative correlation with the pollution level.

Effect of wind on PM 2.5 level in winter

On windy days, the pollution is clearly better. The median of the distribution for PM 2.5 levels is lower on windy days compared to on days without wind.

In fact, the pollution level also depends on the wind direction, as seen in this plot. I selected only four major wind directions for simplicity.

PM2.5 relationship with the wind direction in winter

On the days where the wind comes from the south, the pollution level is lower likely because the Thai gulf is to the south of Bangkok. The clean ocean wind improves the air quality. Wind from the other three directions pass overland. However, having any wind is better than the stagnant atmospheric conditions on calm days.

The shift in the median PM 2.5 level is smaller between rainy days and days with no rain. There are fewer rainy days during the winter season, so the data is somewhat noisy, but a difference can be observed in the cumulative density function.