Author(s): Karen Schulz; Andre Niemann
Linked Author(s): Karen Schulz
Keywords: Water resources management; Sensor data; Imbalanced data; Precipitation; Data fusion; Data imputation; Sub-daily
Abstract: Data quality is fundamental to innovative, data-driven applications. Current advances in artificial intelligence have not yet been fully exploited for data quality assurance in the water sector. Machine learning could enable a higher degree of automation for labor-intensive quality assurance tasks. This study compares individual (single-station) and generalized (catchment-wide) models for correcting sub-daily, highly imbalanced rain gauge data over a 1,500 km² catchment. It uses an XGBoost model as a standard machine learning method. The generalized model was found to be capable of superior performance. Our study therefore indicates that catchment models may be preferable to single-station models, which can benefit practical applicability.
DOI: https://doi.org/10.64697/978-90-835589-7-4_41WC-P1877-cd
Year: 2025