ALLERGY, vol. 79, pp. 2173-2185, 2024 (SCI-Expanded)
Background: There is evidence that global anthropogenic climate change may be impacting floral phenology and the temporal and spatial characteristics of aero-allergenic pollen. Given the extent of current and future climate uncertainty, there is a need to strengthen predictive pollen forecasts.
Methods: The study uses CatBoost (CB) and deep learning (DL) models to predict the daily total pollen concentration up to 14 days in advance for 23 cities, covering all five continents. The model inputs include projected environmental parameters, recent pollen concentrations (1, 2 and 4 weeks), and past environmental explanatory variables together with their future values.
Results: For the 7th forecast day, the best pollen forecasts include Mexico City (R²(DL_7) ≈ 0.7) and Santiago (R²(DL_7) ≈ 0.8), while the weakest are made for Brisbane (R²(DL_7) ≈ 0.4) and Seoul (R²(DL_7) ≈ 0.1). Globally, the five most important environmental variables in determining the daily total pollen concentration are, in decreasing order: the past daily total pollen concentration, future 2 m temperature, past 2 m temperature, past soil temperature at 28-100 cm depth, and past soil temperature at 0-7 cm depth. City-related clusters, which group cities with the most similar distributions of feature importance values across the environmental variables, change only slightly on consecutive forecast days for Caxias do Sul, Cape Town, Brisbane, and Mexico City, while they change often for Sydney, Santiago, and Busan.
Conclusions: This new knowledge of the ecological relationships behind the most important variables in pollen forecast models, broken down by cluster, city and forecast day, is important for developing airborne pollen forecasts and improving their accuracy.
CatBoost is preferable for short-term forecasts and Deep Learning for longer ones, but no single model is better for every forecast day or city. Past pollen trends are strong indicators of future pollen concentrations. CatBoost can be used to determine the importance of environmental variables in forecasting daily total pollen concentration.
Abbreviations: 2mT, 2 m temperature; CB, CatBoost; DL, Deep Learning; DOY, day of the year; ERA5, the fifth generation ECMWF (European Centre for Medium-Range Weather Forecasts) atmospheric reanalysis dataset; pevap, potential evapotranspiration; st, soil temperature.
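
As an illustration of two ideas the abstract relies on, the following minimal sketch builds lagged pollen inputs and computes the coefficient of determination (R²) used to score the forecasts. It is not the authors' code. The lag lengths of 7, 14 and 28 days are one possible reading of "recent concentrations (1, 2 and 4 weeks)", and the names `LAGS_DAYS`, `lag_features` and `r_squared` are hypothetical.

```python
from typing import List, Optional, Sequence

# Assumption: "1, 2 and 4 weeks" is read here as point lags of 7, 14 and 28 days.
# The abstract does not say whether point values or weekly aggregates were used.
LAGS_DAYS = (7, 14, 28)


def lag_features(
    series: Sequence[float], lags: Sequence[int] = LAGS_DAYS
) -> List[Optional[List[float]]]:
    """For each day t, return [series[t - lag] for each lag].

    Days without enough history for the longest lag get None.
    """
    longest = max(lags)
    rows: List[Optional[List[float]]] = []
    for t in range(len(series)):
        if t < longest:
            rows.append(None)  # not enough past observations yet
        else:
            rows.append([series[t - lag] for lag in lags])
    return rows


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are identical")
    return 1.0 - ss_res / ss_tot
```

Under this reading, each row of lagged values would be joined with the past and projected environmental variables before being passed to a model. An R² near 0.8 then means the forecast explains most of the day-to-day variance in observed concentrations, while a value near 0.1 means it explains very little.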