Article type: Research article
Introduction
Accurate next-day (1-day lead) rainfall forecasts underpin water resources operations, smart agriculture, and early warning of hydro-meteorological hazards. Yet short-lead prediction of daily precipitation remains difficult because rainfall emerges from multiscale, nonlinear interactions that single-source datasets capture only in part. Machine learning (ML) can learn such relationships directly from data, but the relative value of surface observations, upper-air information, and their combination has not been systematically assessed for Mashhad, Iran. This study addresses that gap by benchmarking five ensemble ML algorithms across three data scenarios and by applying feature selection to balance predictive skill and model simplicity.
Data and Study Area
We used two data sources for the Mashhad synoptic station during 2000–2023: (i) surface observations (maximum, minimum, and mean temperature; maximum, minimum, and mean relative humidity; wind speed; mean sea level pressure; sunshine hours; and daily rainfall), and (ii) ERA5 upper-air reanalysis at the 700, 500, and 300 hPa pressure levels, including geopotential height, temperature, specific humidity, relative humidity, horizontal wind components (u, v), and vorticity. All predictors were used with a one-day lag to forecast next-day precipitation. The dataset was split chronologically into training (2000–2017) and testing (2018–2023) periods to enable out-of-sample evaluation.
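The lag-and-split design described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout, the function names `make_lagged_pairs` and `split_by_year`, the decision to assign each pair to a period by its target date, and the toy values are all assumptions. Only the one-day lag and the 2000–2017 / 2018–2023 periods come from the text.

```python
from datetime import date, timedelta


def make_lagged_pairs(records):
    """Pair the predictors observed on day t with the rainfall on day t+1.

    `records` maps a date to a dict of daily values (hypothetical layout).
    Day-t rainfall stays among the predictors, assuming that is what listing
    daily rainfall among the surface observations implies.
    Days whose following day is missing are skipped.
    """
    pairs = []
    for day in sorted(records):
        target_day = day + timedelta(days=1)
        if target_day not in records:
            continue
        features = dict(records[day])          # values known on day t
        target = records[target_day]["rain"]   # rainfall to forecast
        pairs.append((target_day, features, target))
    return pairs


def split_by_year(pairs, last_train_year=2017):
    """Chronological split on the target date (no shuffling)."""
    train = [p for p in pairs if p[0].year <= last_train_year]
    test = [p for p in pairs if p[0].year > last_train_year]
    return train, test


# Toy values for illustration only; they are not station data.
records = {
    date(2017, 12, 30): {"tmean": 4.1, "rh_mean": 61.0, "rain": 0.0},
    date(2017, 12, 31): {"tmean": 3.2, "rh_mean": 78.0, "rain": 1.4},
    date(2018, 1, 1): {"tmean": 2.5, "rh_mean": 83.0, "rain": 5.0},
    date(2018, 1, 2): {"tmean": 1.9, "rh_mean": 70.0, "rain": 0.2},
}
pairs = make_lagged_pairs(records)
train, test = split_by_year(pairs)
```

Splitting on the target date keeps every test-period rainfall value out of training, which is the point of an out-of-sample evaluation.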
Methodology
We designed three scenarios: S (surface only), U (upper air only), and S&U (combined). In scenarios S and U, each dataset was supplied separately to five ensemble learning algorithms: Random Forest, AdaBoost, XGBoost, CatBoost, and LightGBM. Before model fitting, the Variance Inflation Factor (VIF) was computed to diagnose multicollinearity among predictors, and variables with VIF values above the acceptable threshold were excluded to limit collinearity and improve model stability. In the combined scenario, surface and upper-air variables were merged into a unified feature matrix. To curb dimensionality, remove redundancy, and limit overfitting, we applied Sequential Forward Floating Selection (SFFS), using five-fold cross-validated R² as the selection criterion. The six features retained by SFFS were v700 and v500 (meridional wind at 700 and 500 hPa), Spe_Hum500 (specific humidity at 500 hPa), Rel_Hum300 (relative humidity at 300 hPa), and two surface indicators (Umax and nm, representing near-surface wind and sunshine hours). Models were evaluated on the test set using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), R², and Adjusted R².
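The four evaluation metrics named above can be written out as a short sketch. The definitions are the standard textbook ones, assumed here because the text does not spell out its formulas. The function names and toy values are illustrative, and the test-set size used in the consistency check is an assumption.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(r2_value, n_samples, n_features):
    """Penalize R² for the number of predictors used."""
    return 1.0 - (1.0 - r2_value) * (n_samples - 1) / (n_samples - n_features - 1)


# Toy values for illustration only.
y_true = [0.0, 0.0, 2.0, 10.0]
y_pred = [0.5, 0.0, 1.5, 6.0]
toy_scores = (mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))

# Consistency check for the combined scenario: R² = 0.190 with six predictors.
# n = 2191 assumes one sample per calendar day in 2018-2023; the text does not
# state the actual test-set size.
adj_combined = adjusted_r2(0.190, n_samples=2191, n_features=6)
```

Under that assumed sample size, `adj_combined` rounds to 0.188. With only six predictors and roughly two thousand test days the penalty is small, so R² and Adjusted R² differ only in the third decimal place.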
Results and Discussion
In the surface-only scenario (S), CatBoost performed best on the test data, with an R² of 0.171, an Adjusted R² of 0.168, and the lowest RMSE (2.309). AdaBoost, however, achieved the lowest MAE (0.767), making it the best model by mean absolute error, even though its R² and Adjusted R² were lower and its RMSE higher than those of CatBoost. In the upper-air-only scenario (U), CatBoost was again the top-performing model, with an R² of 0.182, an Adjusted R² of 0.175, and the lowest RMSE (2.294). Notably, every algorithm performed better on all metrics in scenario U than in scenario S. These results highlight the importance of upper-air dynamics for reducing error and increasing explained variance. In the combined scenario (S&U), which integrates surface and upper-air data, CatBoost achieved the highest R² (0.190), an Adjusted R² of 0.188, and the lowest RMSE (2.283) on the test data. Its MAE of 0.795 was the second lowest, behind only AdaBoost, making it the best-performing model overall across the metrics considered. This suggests that combining surface and upper-air data with CatBoost provides the best balance of accuracy and model simplicity for predicting next-day rainfall.
Time series comparisons show that the selected CatBoost model accurately reproduces the sequence and magnitude of many light to moderate rainfall days, closely tracking day-to-day fluctuations. However, like most data-driven approaches trained on imbalanced samples dominated by zero or low rainfall, the model tends to underestimate peaks during very heavy events. Two factors likely contribute: (i) class imbalance, which downweights extremes in the loss landscape, and (ii) the one-day lag design, which limits access to multi-day precursors (e.g., moisture build-up and synoptic persistence). Despite these limitations, the combined-data approach delivers stable performance with a favorable accuracy–complexity trade-off and demonstrates the utility of integrating thermodynamic and dynamic information from different atmospheric layers.
Conclusion and Implications
The experiments support three key takeaways. First, upper-air reanalysis fields provide distinct, complementary information to surface observations for next-day rainfall forecasting in Mashhad. Second, fusing surface and upper-air predictors and then pruning with SFFS yields a compact feature set that preserves, or even improves, skill while enhancing parsimony, as reflected in Adjusted R². Third, tree-based gradient boosting methods, particularly CatBoost in the combined scenario, offer a practical balance between performance and simplicity for operational use.
Future work should target heavy-rainfall underestimation by (a) enriching temporal context (multi-day lags, moving averages, recent accumulated rainfall, dry/wet spell counters), (b) incorporating spatial context from neighboring stations and regional reanalysis tiles, (c) adopting two-stage pipelines (occurrence classification followed by conditional amount regression) to mitigate zero inflation, and (d) testing sequence models (e.g., LSTM/GRU) or hybrid ML–NWP ensembles. Such extensions could improve extreme-event fidelity without sacrificing interpretability or operational feasibility.
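As an illustration of item (a), the sketch below derives a moving sum, a moving average, and dry/wet spell counters from a daily rainfall series. It is a hypothetical example rather than part of the study: the function name `temporal_features`, the three-day window, and the 0.1 wet-day threshold are assumptions chosen for readability.

```python
def temporal_features(rain, window=3, wet_threshold=0.1):
    """Build temporal-context features from rainfall up to and including day t.

    `window` and `wet_threshold` are illustrative assumptions; the text does
    not prescribe values for either.
    """
    rows = []
    dry_spell = 0  # consecutive days below the wet threshold, ending at day t
    wet_spell = 0  # consecutive days at or above the wet threshold
    for t, amount in enumerate(rain):
        if amount >= wet_threshold:
            wet_spell += 1
            dry_spell = 0
        else:
            dry_spell += 1
            wet_spell = 0
        # Shorter window at the start of the series, where history is missing.
        recent = rain[max(0, t - window + 1): t + 1]
        rows.append({
            "rain_sum": sum(recent),
            "rain_mean": sum(recent) / len(recent),
            "dry_spell": dry_spell,
            "wet_spell": wet_spell,
        })
    return rows


# Toy series for illustration only.
rain = [0.0, 0.0, 3.0, 1.0, 0.0]
features = temporal_features(rain)
```

Each row uses only information available on day t, so it could be paired with day t+1 rainfall in the same one-day-lag setup without leaking the target.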