Author(s): Anastasios Stamou; Aikaterini Vourka; Peter Rutschmann; Nikolaos Skoulikidis; Christos Theodoropoulos
Keywords: Habitat models; Random Forests; Fuzzy Bayesian; Fuzzy logic; Boosted Regression Trees; Environmental flows
Abstract: Although various methods are currently available for modelling the habitat preferences of aquatic biota, studies comparing the performance of data-driven habitat models are limited. In this study, we assembled a benthic-macroinvertebrate microhabitat-preference dataset and used it to evaluate the predictive accuracy of regression-based univariate Habitat Suitability Curves (HSC), Boosted Regression Trees (BRT), Random Forests (RF), fuzzy-logic-based models using the weighted average (FLWA), maximum membership (FLMM), mean of maximum (FLM) and centroid (FLC) defuzzification algorithms, and fuzzy rule-based Bayesian inference (FRB). The results show that the BRT model was the most accurate, closely followed by RF, FRB, FLM and FLMM, while the FLC and FLWA algorithms had the lowest performance. However, owing to the imbalanced nature of the dataset, the HSC, BRT and RF models, unlike the fuzzy rule-based algorithms, failed to accurately predict habitat suitability in low-scored microhabitats. We conclude that, given balanced datasets, BRT and RF can be effectively used in habitat suitability modelling. For imbalanced datasets, a properly adjusted RF model can be applied; however, when the input dataset is large enough to provide sufficient data-driven IF–THEN rules to train an FRB, FLMM or FLM algorithm, these models will produce the most accurate predictions.
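
The four defuzzification algorithms named in the abstract (weighted average, maximum membership, mean of maximum and centroid) can be illustrated with a minimal Python sketch. It uses common textbook definitions, which may differ in detail from the formulations used in the study. All function names, class labels, class centres and membership values below are hypothetical and chosen only for illustration; they are not data from the paper.

```python
from typing import Sequence


def centroid(xs: Sequence[float], mu: Sequence[float]) -> float:
    """Centroid: membership-weighted mean over a discretized output axis."""
    total = sum(mu)
    if total == 0:
        raise ValueError("all memberships are zero")
    return sum(x * m for x, m in zip(xs, mu)) / total


def mean_of_maximum(xs: Sequence[float], mu: Sequence[float]) -> float:
    """Mean of maximum: average of the output values where membership peaks."""
    peak = max(mu)
    at_peak = [x for x, m in zip(xs, mu) if m == peak]
    return sum(at_peak) / len(at_peak)


def weighted_average(centres: Sequence[float], strengths: Sequence[float]) -> float:
    """Weighted average: class centres weighted by their activation strengths."""
    total = sum(strengths)
    if total == 0:
        raise ValueError("no class was activated")
    return sum(c * w for c, w in zip(centres, strengths)) / total


def maximum_membership(labels: Sequence[str], strengths: Sequence[float]) -> str:
    """Maximum membership: return the label of the most strongly activated class."""
    best = max(range(len(strengths)), key=lambda i: strengths[i])
    return labels[best]


# Hypothetical aggregated fuzzy output on a 0-1 habitat suitability axis.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
mu = [0.1, 0.2, 0.8, 0.8, 0.1]

# Hypothetical suitability classes, their centres and activation strengths.
labels = ["low", "moderate", "high"]
centres = [0.0, 0.5, 1.0]
strengths = [0.1, 0.8, 0.3]

flc = centroid(xs, mu)
flm = mean_of_maximum(xs, mu)
flwa = weighted_average(centres, strengths)
flmm = maximum_membership(labels, strengths)
print(flc, flm, flwa, flmm)
```

In this sketch the centroid and weighted average return a crisp score that blends all activated classes, whereas mean of maximum and maximum membership respond only to the most strongly activated part of the output.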