Enhancing marine ecosystem monitoring using machine learning: addressing autocorrelation in cetacean feeding behavior prediction
Saccotelli L.; Cipriano G.; Dimauro G.; Carlucci R.; Maglietta R.
2025-01-01
Abstract
Understanding and predicting the feeding distribution of top marine predators is crucial for spatial planning and conservation, especially in dynamic and anthropogenically impacted ecosystems such as the Mediterranean Sea. To be reliable in such contexts, ecological models must explicitly address autocorrelation, because environmental predictors often exhibit spatial and temporal dependencies. This study proposes a two-step modeling framework to predict the distribution of feeding activities of the common bottlenose dolphin (Tursiops truncatus) in the Northern Ionian Sea (Central-eastern Mediterranean). First, it assesses spatial and temporal autocorrelation in oceanographic predictors. Then, it implements three machine learning algorithms (Random Forest, XGBoost, and RUSBoost) to model presence and pseudo-absence data. To assess robustness in dynamic environments, model performance is evaluated using both random k-fold and temporally structured cross-validation. The latter accounts for temporal variability, enabling a more realistic evaluation of generalizability. Results reveal significant temporal autocorrelation in all the environmental predictors and spatial clustering in deeper layers, highlighting the need for structured validation. XGBoost and RUSBoost achieve the highest accuracy (72%) under temporal validation, indicating that time-aware evaluation better captures ecological patterns in the data. The proposed approach enhances the robustness of species distribution models targeting conservation-relevant behaviors and has the potential to support more informed marine spatial planning.
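
The contrast between random k-fold and temporally structured cross-validation can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the temporal folds were built, so the contiguous-block scheme below is an assumption, the predictor is a synthetic AR(1)-like series, and all function names (lag1_autocorrelation, random_kfold, temporal_blocked_folds, neighbour_leak_fraction) are hypothetical. The sketch uses only the Python standard library and shows why an autocorrelated predictor makes random folds optimistic: most test samples then have an immediate temporal neighbour in the training set.

```python
import random


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a time-ordered sequence."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return num / denom


def random_kfold(n_samples, k, seed=0):
    """Random k-fold: shuffle the indices, then deal them into k test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::k]) for i in range(k)]


def temporal_blocked_folds(timestamps, k):
    """Temporally structured folds: each test fold is one contiguous time block."""
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    size, rem = divmod(len(order), k)
    folds, start = [], 0
    for f in range(k):
        stop = start + size + (1 if f < rem else 0)
        folds.append(order[start:stop])
        start = stop
    return folds


def neighbour_leak_fraction(test_fold, n_samples):
    """Share of test samples with an immediate temporal neighbour in training.

    Assumes sample indices are positions in time order.
    """
    test = set(test_fold)
    leaked = sum(
        1
        for i in test_fold
        if any(0 <= j < n_samples and j not in test for j in (i - 1, i + 1))
    )
    return leaked / len(test_fold)


# Synthetic, slowly varying predictor (an assumption, not data from the study).
rng = random.Random(42)
series = [0.0]
for _ in range(199):
    series.append(0.9 * series[-1] + rng.gauss(0, 1))
timestamps = list(range(len(series)))

print("lag-1 autocorrelation:", round(lag1_autocorrelation(series), 3))
for name, folds in [
    ("random k-fold", random_kfold(len(series), 5)),
    ("temporal blocks", temporal_blocked_folds(timestamps, 5)),
]:
    leaks = [neighbour_leak_fraction(fold, len(series)) for fold in folds]
    print(name, "mean neighbour leak:", round(sum(leaks) / len(leaks), 3))
```

With blocked folds, only the samples at block edges border the training period, so a score computed this way is closer to how a model would perform on an unseen period. In practice a gap between training and test blocks could be added to reduce the remaining edge leakage.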


