Author(s): Karen Schulz; Thorsten Mietzel; Andre Niemann
Linked Author(s):
Keywords: Water resources management; Anomaly detection; Error detection; Unsupervised; Time series
Abstract: Data quality is fundamental to innovative, data-driven applications. Conventional data quality control methods often operate unsupervised and are considered reliable without the need for reference data. Machine learning techniques have the potential to significantly improve data quality control in the water sector. A key challenge with current machine learning methods is their reliance on reference data to build trust, and such data can be difficult to obtain. In this work, we propose the concept of synthetic control of model accuracy to tackle this issue and apply it to a real-world example dataset derived from low-cost water level sensors. Nevertheless, depending on the application scenario, the question remains as to whether this concept is sufficient for deploying models in practice without reference sensors.
DOI: https://doi.org/10.64697/978-90-835589-7-4_41WC-P2118-cd
Year: 2025