An explainable two-stage machine learning approach for precipitation forecast
- 1. Middle East Tech Univ, Dept Civil Engn, Ankara, Turkiye
- 2. Middle East Tech Univ, Dept Comp Engn, Ankara, Turkiye
- 3. Natl Univ Sci & Technol, Islamabad, Pakistan
Description
A common post-processing approach to improving precipitation forecasts is to use machine learning models such as artificial neural networks (more specifically, multi-layer perceptrons) as black-box systems. These models combine different sources of observations or predictors to generate a forecast that is improved in terms of the desired metrics. However, most existing studies employ a single-stage regression model without considering explainability. The few studies with two-stage models that combine classification and regression use binary classification and still lack explainable artificial intelligence. Therefore, this study proposes a precipitation prediction system that (i) is composed of two stages for better predictions, (ii) compares the utility of binary and multi-class classification over the regression, and (iii) is explainable, unlike prior studies, in that individual predictions of machine learning-based forecasts are interpretable by humans. The proposed two-stage model first estimates the precipitation intensity category using binary or multi-class classification (the first stage) and then uses this category information in a regression model (the second stage) to obtain the daily precipitation magnitude. The approach is made human-interpretable (i.e., explainable) by providing insight into the model-wide importance of predictors and into how individual predictions are generated (instance-level explanation). The proposed two-stage approach is compared against single-stage and black-box approaches in terms of prediction quality and explainability, with daily station-based observations used as ground truth.
Experiments show that the proposed two-stage approach yields significant improvement (on average, RMSE reduced by 10.50%, and the correlation between numerical precipitation estimates and observed precipitation values increased by 7.5%) compared to the best-performing physical predictor (ECMWF). Analysis of explainability provides insights into the decisions of our two-stage approach, e.g., the usefulness of seasonality-related parameters, multi-class precipitation intensity classification as a first stage, and the predictors for each task (regression or classification).
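The two-stage idea described above (classify the precipitation intensity category first, then regress the daily amount using that category) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the study's implementation: the thresholds in `THRESHOLDS`, the single predictor (a raw forecast value), the nearest-centroid classifier, the per-category linear regression used as a simple way to feed the category into the second stage, and the synthetic data.

```python
import math
import random

# Hypothetical intensity thresholds in mm/day (assumed, not from the study):
# dry < 1.0 <= light < 10.0 <= heavy
THRESHOLDS = (1.0, 10.0)


def intensity_class(precip_mm):
    """Map a daily precipitation amount to a category index 0, 1, or 2."""
    return sum(precip_mm >= t for t in THRESHOLDS)


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        return 0.0, mean_y  # constant predictor: fall back to the mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


class TwoStageModel:
    """Stage 1: nearest-centroid classifier. Stage 2: one line per class."""

    def fit(self, forecasts, observed):
        labels = [intensity_class(y) for y in observed]
        self.centroids = {}
        self.lines = {}
        for c in set(labels):
            xs = [x for x, lab in zip(forecasts, labels) if lab == c]
            ys = [y for y, lab in zip(observed, labels) if lab == c]
            self.centroids[c] = sum(xs) / len(xs)  # stage 1 parameters
            self.lines[c] = fit_line(xs, ys)       # stage 2 parameters
        return self

    def predict_class(self, x):
        # Pick the category whose mean training forecast is closest to x.
        return min(self.centroids, key=lambda c: abs(x - self.centroids[c]))

    def predict(self, x):
        a, b = self.lines[self.predict_class(x)]
        return max(0.0, a * x + b)  # precipitation cannot be negative


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


if __name__ == "__main__":
    # Synthetic data: observed amounts and a biased, noisy raw forecast.
    random.seed(0)
    observed = [
        random.choice([0.0, 0.0, random.uniform(1, 10), random.uniform(10, 40)])
        for _ in range(300)
    ]
    forecasts = [max(0.0, 0.7 * y + random.gauss(1.0, 1.5)) for y in observed]

    model = TwoStageModel().fit(forecasts, observed)
    preds = [model.predict(x) for x in forecasts]
    print("RMSE raw forecast:", round(rmse(forecasts, observed), 3))
    print("RMSE two-stage   :", round(rmse(preds, observed), 3))
    # The per-class slope and intercept are directly readable, which gives
    # a very simple form of model-wide explanation in this toy setting.
    for c, (a, b) in sorted(model.lines.items()):
        print(f"class {c}: amount = {a:.2f} * forecast + {b:.2f}")
```

In this toy setting an instance-level explanation is simply the predicted category together with the slope and intercept applied to that input; a real system with many predictors would need a dedicated attribution method.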