The system for near-real time air pollution monitoring over cities based on the Sentinel-5P satellite data

Introduction. Air pollution heterogeneity and rapid urbanization impose numerous constraints on available near-surface air quality monitoring. Effective warning requires the integration of different data, including remote sensing. Satellite data cannot answer whether dangerous pollution levels are observed; however, it provides a complete picture and may detect air pollution transport towards or away from cities. The possibilities for effective near-real time (NRTI) monitoring have significantly improved with the launch of the Sentinel-5P satellite. The study aimed to describe the developed system for NRTI air pollution monitoring over Kharkiv, Kryvyi Rig, Kyiv, and Odesa based on NO2 and CO data derived from the Sentinel-5P satellite. Data and methodology. The NRTI System was developed for tropospheric NO2 and total CO column number densities based on the Sentinel-5P NRTI products. The NRTI System starts working 2-3 hours after the satellite scans Ukrainian territory. It is fully automatic, and its modules were written using Python, VB.NET, and batch-scripting. Results. The NRTI System includes four main phases: preparatory, source data downloading, processing, and post-processing with visualization, archiving, and result distribution among users. Source data filtering with a quality assurance index and downscaling with linear kriging interpolation were developed. The output of the NRTI System is data in regular grids with a spatial resolution of 0.02°×0.02°. Based on the NRTI System work during October-December 2021, we conducted preliminary analyses to understand the possibilities of data usage. Higher NO2 content was observed in Kyiv and Kharkiv, where traffic emissions play a crucial role in air quality worsening. The use of daily time series allowed the detection of an increase in NO2 variance during the heating season, as well as plume distribution from cities to rural areas due to the prevailing wind.
CO content is more homogeneous; however, higher values were observed in industrial Kryvyi Rig and in Odesa. The strong impact of shipping CO emissions on air quality in Odesa is emphasized. Temporal averaging of the NRTI System output allowed us to identify the most polluted districts within the cities of interest. We intend to continue developing the presented NRTI System and to apply the same algorithms to all cities with populations greater than 500 000 people in order to provide operational air pollution monitoring based on satellite data.

Introduction. The problem of atmospheric air pollution in cities and industrial regions requires activities in two main directions: 1) the development of effective monitoring by using data integration; and 2) the implementation of measures for air quality improvement. Atmospheric air pollution monitoring should be based on near-surface observations, which enable us to determine whether or not dangerous levels of air pollution exist. However, the spatial distribution of chemical substances in the urban atmosphere is highly heterogeneous [1,2]. Consequently, a ground-based air quality network cannot provide a full picture of the processes that might cause elevated air pollution episodes [3], particularly if the emission sources are located outside of cities or if transboundary air pollution is present. Rapid growth of urban areas complicates the situation, and ground-based sensors at fixed locations very often become unable to capture the influence of new emission sources.
The limitations of point-based ground-level measurements can be addressed by integrating remote sensing data from satellites that observe atmospheric composition. Satellite data provide us with close to full global coverage [4][5][6][7]. Cloudiness remains challenging [8], but its negative influence can be minimized with temporal averaging. Near-real time remote sensing is more sensitive to cloudiness. Nevertheless, gaps in the cloud cover are observed very often; hence, the overall picture is not significantly distorted.
Remote sensing of atmospheric composition plays a significant role in our knowledge of emission sources [9][10][11], air pollution transport [12,13], and the dispersion and accumulation of chemicals [14][15][16], as has been shown at regional and synoptic scales. However, the coarse spatial resolution of the satellite missions operating before 2017 did not allow any downscaling for analysis at the sub-urban scale [4,17]. At the end of 2017, a new satellite mission started its operation after the launch of the TROPOspheric Monitoring Instrument (TROPOMI) onboard the Sentinel-5 Precursor (Sentinel-5P) [18]. The spatial resolution was improved to 7.0×3.5 km (and later to 5.5×3.5 km [19]), allowing for detailed exploration at the sub-city scale. As Sentinel-5P covers the entire Ukrainian territory every day, its data can be used not only for scientific research based on offline (OFFL) information but also to provide near-real time (NRTI) air pollution monitoring.
Wide opportunities for Sentinel-5P data usage in Ukraine were discussed in studies [20][21][22][23][24]. The test version of a fully automatic atmospheric air quality monitoring system was developed by the Ukrainian Hydrometeorological Institute of the State Emergency Service of Ukraine and the National Academy of Sciences of Ukraine (UHMI) in 2020. The first assessment was conducted during the wildfire events in April 2020 in the northern part of Ukraine [23]. The system was used by the State Emergency Service of Ukraine to analyze air pollution transport for decision-making. Since May 2020, the System has been processing Sentinel-5P data every day in the NRTI regime, providing interested users with detailed information and data on nitrogen dioxide (NO2), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3), formaldehyde (HCHO), and cloudiness over the territory of Ukraine [21]. At the end of 2020, the System was extended with test programming modules for NRTI NO2 data processing and downscaling over the cities of Kyiv and Kryvyi Rig [21].
After being successfully tested in the first half of 2021, the System for atmospheric air quality monitoring over Ukraine in the NRTI regime was ready for further improvements. Now it is being developed in several branches. One of the branches is the system for NRTI air pollution monitoring over cities presented in this study.
The aim of this study is to describe the developed system for NRTI air pollution monitoring over the cities of Kharkiv, Kryvyi Rig, Kyiv, and Odesa based on NO2 and CO data derived from Sentinel-5P (the NRTI System).
This study is the first stage of a project within the framework of ERA-PLANET/UA [25]; the overall aim is to develop the NRTI System for all Ukrainian cities with a population of more than 500 000 inhabitants (Dnipro, Donetsk, Kharkiv, Kryvyi Rig, Kyiv, Lviv, Mykolaiv, Odesa, and Zaporizhzhia).
Data and key methodological aspects. Source Sentinel-5P data is represented at different levels, which correspond to different levels of data processing [26]. 0-level contains the initial signal data from four spectrometers. This data is not available to the public. 1-level contains calibrated irradiance data for the following spectral bands: 270-300 nm, 300-320 nm, 320-405 nm, 405-500 nm, 675-725 nm, 725-775 nm, 2305-2345 nm, and 2345-2385 nm [27]. Moreover, calibrated solar irradiance data of 270-775 nm, and 2305-2385 nm is available. 1-level data is available to the public.
2-level data contains information about the atmospheric composition for the following chemical species: total and tropospheric column of ozone (O3), nitrogen dioxide (NO2), sulfur dioxide (SO2), carbon monoxide (CO), methane (CH4), formaldehyde (HCHO), and aerosol [27]. Using NO2 and CO, the influence of the majority of anthropogenic emission sources can be estimated [28]. Moreover, the relationship between the two species can be used as a predictor of combustion efficiency (and, as a result, of gasoline type) [29][30][31]. Therefore, NO2 tropospheric column density and CO total column density were selected as the basic pollutants for the NRTI System described in this study.
Sentinel-5P data is available as OFFL and NRTI products. OFFL data is more accurate, but its late ingestion makes operational air quality monitoring impossible. NRTI data was chosen for the NRTI System because it is available within 2-3 hours of sensing. Sentinel-5P covers the territory of Ukraine at 12:30-14:00 EEST; hence, the final output of the developed NRTI System (considering the source data ingestion, downloading to local servers, full processing, and visualization) can be distributed among interested users in Ukraine at 17:30-18:00 EEST.
The NRTI System works with the initial spatial resolution of the source data, which is crucial for NRTI monitoring and further downscaling. The coordinates of pixels change from day to day, forming a 16-day cycle. NRTI data is ingested in separate blocks, each stored in its own file. As Ukrainian territory is rather large, up to 10 files are needed to cover it every day. Processing all of these files is too time-consuming for NRTI monitoring and operative warnings concerning elevated pollution levels, and the time needed to download the files exceeds the time needed to process them. Preliminary analyses revealed that the most effective method was to download the 2-5 files that together cover the entire territory, depending on the date. All other blocks overlap with these and can be omitted to save time.
The NRTI System was written using Python, VB.NET, and batch-scripting.

Results. Preparatory phase and data downloading.
Despite the simplicity of the preparatory phase, it is the most time-consuming stage. The NRTI System starts working at 15:40 EEST with a basic check of local computer availability. As the NRTI System is fully automatic and works every day, unforeseen events may happen, for example, a power cut that drains the uninterruptible power supply (UPS) and results in a computer shutdown. If the computer is not available after the shutdown, the NRTI System automatically restarts it. This procedure ensures everyday work.
Some unforeseen events may occur during the data processing, e.g., a prolonged internet disconnection during the data downloading. In such cases, a number of temporary files may not be removed. Therefore, at 16:00 EEST, the NRTI System checks for temporary files left over from the previous run. If any are found, the NRTI System removes them and prepares working directories for the new run.
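The cleanup step described above can be illustrated with a minimal Python sketch. The function name, the working-directory layout, and the file patterns are hypothetical and chosen only for illustration; the source does not state which module or language performs this step.

```python
from pathlib import Path


def clean_working_dir(work_dir, patterns=("*.nc", "*.xml")):
    """Delete leftover temporary files and return their names (sorted).

    `patterns` is an assumed set of temporary-file masks, not the
    authors' actual configuration.
    """
    work_dir = Path(work_dir)
    # Make sure the working directory exists before the new run starts.
    work_dir.mkdir(parents=True, exist_ok=True)
    removed = []
    for pattern in patterns:
        for path in work_dir.glob(pattern):
            if path.is_file():
                path.unlink()
                removed.append(path.name)
    return sorted(removed)
```

Calling the function on an already clean directory returns an empty list, so it can run unconditionally before every download.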
At 16:20 EEST, the NRTI System launches the modules that are responsible for Sentinel-5P data downloading. All the data is stored at the Sentinel-5P Pre-operations Data Hub [32]. Data downloading requires a special UUID (Universally Unique IDentifier) code, which consists of 32 randomly generated symbols. The UUID is taken from the output file returned after a special request to the Hub. The output list is limited to 50 files. Considering the large area of Ukraine, the total number of files may exceed the limit. Therefore, the request must be carefully constructed. To meet the limit, our request is based on the Sentinel-5P sensing time, which was calculated in advance for each day to within one minute. The NRTI System reads the file with the sensing time (from hh.mm to hh.mm) and creates a request to the Hub:
https://s5phub.copernicus.eu/dhus/odata/v1/Products?$select=[sensing time]
The request is sent using the "wget.exe" utility (https://www.gnu.org/software/wget/). The output is an XML file with a list of available data files. The list contains descriptions and UUIDs. It must be noted that the NRTI System downloads data for the entire Ukrainian territory, not just over the cities of interest. Knowing the UUIDs, the data files are downloaded with "wget" using the following address:
https://s5phub.copernicus.eu/dhus/odata/v1/Products('" & [UUID] & "')/$value
After downloading, each NetCDF file is renamed as "XXX_I.nc", where "XXX" is the chemical formula of the pollutant and "I" is the file index (e.g., "NO2_2.nc"). These files are temporary and are removed after data processing.
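As an illustration of the two steps above (reading UUIDs from the Hub listing and forming the download address and temporary file name), a minimal Python sketch follows. It only reproduces the address template and naming rule quoted in the text. The helper names are hypothetical, the exact layout of the XML listing is not assumed (UUID-shaped strings are simply searched for in the text), and the actual download, which the source performs with "wget", is not shown.

```python
import re

# Hub address as quoted in the text.
HUB_ROOT = "https://s5phub.copernicus.eu/dhus/odata/v1"

# A UUID in its usual textual form: 32 hexadecimal digits in 8-4-4-4-12 groups.
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def extract_uuids(listing_text):
    """Return unique UUID-shaped strings in order of first appearance."""
    seen = []
    for match in UUID_RE.findall(listing_text):
        uuid = match.lower()
        if uuid not in seen:
            seen.append(uuid)
    return seen


def download_url(uuid):
    """Build the product address following the template given in the text."""
    return f"{HUB_ROOT}/Products('{uuid}')/$value"


def temp_name(pollutant, index):
    """Temporary file name of the form XXX_I.nc, e.g. NO2_2.nc."""
    return f"{pollutant}_{index}.nc"
```

Deduplication matters because the same UUID may appear more than once in a listing; whether it does in the Hub's actual response is an assumption here.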
Data processing. During its development, the NRTI System algorithm was divided into several branches for more effective work. After the data has been downloaded, the branches start working independently. A separate branch was developed for the purpose of NRTI air pollution monitoring over cities.
First and foremost, the data are filtered in order to remove statistically unreliable values. Cloudiness has the most negative impact on remote sensing retrievals. The filtering uses a quality assurance index (QA), which is dimensionless and varies from 0 to 1. QA=0 corresponds to totally unreliable data, whereas QA=1 corresponds to totally reliable data. A QA value is given for each pixel of satellite data. Although values with QA>=0.5 can be used, it is better to filter Sentinel-5P data with QA>=0.75, keeping only those NRTI values that are the most statistically significant. Therefore, if a certain value is characterized by QA<0.75, it is removed from the datasets.
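The filtering rule above amounts to a simple threshold test. A minimal Python sketch is given below; the tuple layout (longitude, latitude, value, QA) and the function name are assumptions made for illustration, not the structure of the authors' modules.

```python
QA_THRESHOLD = 0.75  # threshold stated in the text


def filter_by_qa(pixels, threshold=QA_THRESHOLD):
    """Keep only pixels whose QA index is at least `threshold`.

    `pixels` is an iterable of (lon, lat, value, qa) tuples; this layout
    is a hypothetical choice for the example.
    """
    return [p for p in pixels if p[3] >= threshold]
```

A pixel with QA exactly equal to 0.75 is kept, consistent with the QA>=0.75 criterion.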
After the filtering, all values are binned by longitude/latitude, which allows us to arrange the data into regular grids. This procedure is necessary because the area covered by Sentinel-5P pixels differs from day to day. As mentioned above, the same pixel layout recurs only once every 16 days. Binning is not necessary for NRTI monitoring itself; however, regular grids are needed to build time series. The binning procedure is described in [21]. For the sub-city scale, the regular grid is 0.02°×0.02°, which is finer than the original spatial resolution of TROPOMI. Our regular grids are designed for downscaling. All centers of source TROPOMI pixels are overlaid on the regular grid. If the center of a pixel falls into a grid cell, that cell is filled with the original pollutant content value. Cells that do not coincide with any pixel center remain blank at this phase. Linear kriging interpolation is used to fill in the blank cells [33]. As a result, the maxima of air pollutants over the city are more clearly visible. The regular grid is divided into particular domains for certain cities (Table 1). After filtering and binning, the NRTI System generates new NetCDF files with localized NO2 and CO content over the cities.
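The assignment of pixel centers to cells of the regular grid can be sketched as follows. This is an illustration under stated assumptions, not the procedure of [21]: the domain origin and size are arbitrary parameters (the city domains of Table 1 are not reproduced), the function name is hypothetical, and if two pixel centers fall into the same cell, the later one simply overwrites the earlier one. The subsequent kriging step that fills the blank cells is not shown.

```python
import math


def bin_pixel_centers(pixels, lon_min, lat_min, n_lon, n_lat, step=0.02):
    """Place pixel values on a regular lon/lat grid by pixel-center location.

    `pixels` is an iterable of (lon, lat, value). Cells without a pixel
    center stay None (blank) and would be filled by interpolation later.
    """
    grid = [[None] * n_lon for _ in range(n_lat)]
    for lon, lat, value in pixels:
        col = math.floor((lon - lon_min) / step)
        row = math.floor((lat - lat_min) / step)
        # Ignore pixel centers that fall outside the city domain.
        if 0 <= col < n_lon and 0 <= row < n_lat:
            grid[row][col] = value
    return grid
```

Because a 0.02° cell is smaller than a TROPOMI pixel, most cells remain blank after this step, which is why the interpolation stage is needed.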
Post-processing phase. The output visualization is based on Python modules. Each city and each pollutant has its own post-processing module, and these modules are combined into a joint block. As a result, there are 8 separate visualization modules (2 pollutants in 4 cities). Figures 1-4 illustrate an example of the NRTI System output.
After the visualization, the NRTI System removes all temporary files that were created during the run. While it is finishing, another module distributes the results among users. Currently, the NRTI System output is distributed via e-mail; however, other distribution channels could be implemented, e.g., FTP, Telegram, etc. Fig. 5 represents the whole scheme of the NRTI System's work.
First results derived from the NRTI System. The NRTI System for selected cities was launched in October 2021. Further improvements and corrections were not critical for the NRTI System's work; hence, the total period for preliminary analyses covers three months until the end of 2021. Overall, the NRTI System reflects NO2 and CO variability rather well, and its output is consistent with pollutant emissions. Very often, we observed plumes carried from the cities to rural areas by the prevailing wind, which allowed us to estimate potentially endangered areas (examples of plume distribution can be seen in Figs. 1-3). The NRTI System output also shows the spatial distribution of pollutants during unfavorable meteorological conditions (e.g., air temperature inversions and low wind speeds).
The highest NO2 content during October-December 2021 was observed in Kyiv and Kharkiv, reaching 7.9·10⁻⁵ mol/m² and 7.1·10⁻⁵ mol/m², respectively (Fig. 6B). Traffic emissions in these cities play a significant role in the formation of higher NO2 levels. Despite huge industrial emissions in Kryvyi Rig, NO2 content there was lower, reaching 5.7·10⁻⁵ mol/m², which is comparable to the values observed in Odesa (Fig. 6B). This suggests that traffic currently plays an important role in urban air quality and that the consequences of traffic emissions could be more serious than those of industrial emissions.
The heating season impacts the day-to-day NO2 variability in cities. The NO2 time series in Fig. 6A show that the variance increases in November-December, after the heating season starts.
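A comparison of variance before and after the start of the heating season can be made by grouping a daily time series by month. The sketch below assumes a hypothetical input of (date string, value) pairs with missing days marked as None; it is not the authors' analysis code, and the sample variance from the Python standard library is used as one possible choice of estimator.

```python
from collections import defaultdict
from statistics import variance


def monthly_variance(series):
    """Return {'YYYY-MM': sample variance} for a daily time series.

    `series` is an iterable of ('YYYY-MM-DD', value) pairs; None values
    (e.g. days removed by the QA filter) are skipped, and months with
    fewer than two valid values are omitted.
    """
    by_month = defaultdict(list)
    for day, value in series:
        if value is not None:
            by_month[day[:7]].append(value)
    return {
        month: variance(values)
        for month, values in sorted(by_month.items())
        if len(values) >= 2
    }
```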
In contrast to NO2, CO content is mostly formed under the influence of industrial emissions. The highest CO average values were observed in Odesa and Kryvyi Rig, reaching 3.39·10⁻² mol/m² and 3.38·10⁻² mol/m², respectively (Fig. 7B). CO is emitted into the atmosphere when solid fuel is burned, and solid fuel is still used in the majority of industrial processes in Ukraine. Moreover, active shipping causes SO2 and CO emissions. Increased CO content in Odesa is frequently caused by shipping emissions during sea-to-land wind, which transports polluted air masses towards the city. CO content in Kyiv and Kharkiv was lower, reaching 3.37·10⁻² mol/m² and 3.36·10⁻² mol/m², respectively (Fig. 7B). The variance of CO values was low during October-December 2021 (Fig. 7A). However, this period coincides with the lowest CO content in the annual cycle in Ukraine [24]; hence, the day-to-day variability is expected to become more heterogeneous at the end of the cold season.
The NRTI System allowed us to identify the most polluted districts in the cities during October-December 2021. In Kyiv, three maxima were observed, in the Shevchenkivskyi, Pecherskyi, and Solomianskyi Districts; in Kharkiv, maxima were observed in the Novobavarskyi, Slobidskyi, and Osnovianskyi Districts; in Kryvyi Rig, a maximum was observed in the Inguletskyi District. No clear maximum was observed in Odesa.
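The temporal averaging behind this result can be sketched as a cell-by-cell mean over the daily regular grids, followed by a search for the cell with the highest mean. The code below is an illustration under assumptions: grids are represented as nested lists with None for blank cells, the function names are hypothetical, and relating a grid cell to an administrative district (which requires district boundaries) is not covered.

```python
def temporal_mean(grids):
    """Average a list of equally shaped daily grids cell by cell.

    Blank cells (None) are ignored; a cell that is blank on every day
    stays None in the result.
    """
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    mean = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            values = [g[r][c] for g in grids if g[r][c] is not None]
            if values:
                mean[r][c] = sum(values) / len(values)
    return mean


def argmax_cell(grid):
    """Return (row, col, value) of the largest non-blank cell, or None."""
    best = None
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is not None and (best is None or value > best[2]):
                best = (r, c, value)
    return best
```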

Conclusions.
A system for NRTI air pollution monitoring was developed for the cities of Kharkiv, Kryvyi Rig, Kyiv, and Odesa based on NO2 and CO data derived from Sentinel-5P. The algorithm of the NRTI System includes four main phases: preparatory, source data downloading, data processing, and post-processing with visualization. The NRTI System output is statistically reliable and provides more detailed information than the source data due to the implementation of data filtering and downscaling. The NRTI System was integrated into the general, fully automatic system for air quality monitoring previously developed for the entire Ukrainian territory. It began providing daily NRTI data in October 2021, based on fully automated algorithms. The NRTI System output allowed us to make preliminary analyses of NO2 and CO spatio-temporal distribution over Kyiv, Kharkiv, Odesa, and Kryvyi Rig. Days with maximal emissions were detected, as well as plumes carried out of the cities by the prevailing wind. NO2 content is higher in Kyiv and Kharkiv, with more intense traffic emissions, whereas increased CO content was observed in Odesa and Kryvyi Rig, with prevailing shipping and industrial emissions, respectively. An increase of variance in the NO2 time series was detected as a response to the heating season. Temporal averaging enabled the identification of the most polluted districts within cities. Overall, the NRTI System showed acceptable results and the possibility for operative monitoring. The NRTI System will be expanded to cover all cities with populations greater than 500 000 inhabitants in order to provide detailed operational monitoring of air pollution.
Acknowledgements. This study is supported by the project 0121U111519 "Development of the system for atmospheric air pollution operational monitoring over Ukrainian cities based on the satellite data" (2021-2023) within the National Academy of Sciences of Ukraine Target programme for research "Aerospace Environment Observations towards sustainable development and safety (ERA-PLANET/UA project)". Some parts of this study are linked to two national governmental projects: 0121U109319 "Current trends in the spatio-temporal distribution of pollutants in the atmosphere over the territory of Ukraine based on the integration of measurement data" (2021-2023) by order of the State Emergency Service of Ukraine; and "Development of multi-purpose geo-portal for environmental monitoring and prediction" (2021-2025) by order of the National Academy of Sciences of Ukraine.