Aim: To define reef benthic habitat states and explore their spatial and temporal variability at a global scale using an innovative clustering pipeline.
Location: Transects surveyed on shallow (< 20 m) reef ecosystems across the globe.
Time period: Transects sampled between 2008 and 2021.
Major taxa studied: Macroalgae, sessile invertebrates, hydrozoans, seagrass, corals.
Methods: Percentage cover was estimated for 24 functional groups of sessile biota and substratum from annotated underwater photoquadrats taken along 6,554 transects by scuba divers contributing to the Reef Life Survey dataset. A clustering pipeline combining a non-linear dimension-reduction technique (UMAP) with a density-based clustering approach (HDBSCAN) was used to identify benthic habitat states. Spatial and temporal variation in the distribution of habitat states was then explored across ecoregions.
Results: The UMAP-HDBSCAN pipeline identified 17 distinct clusters representing different benthic habitats and gradients of ecological state. Certain habitat states displayed clear biogeographic patterns, occurring predominantly in either temperate or tropical waters. By contrast, some reef states dominated by turf algae were found across all latitudinal zones. Transition zones between temperate and tropical waters emerged as spatial hotspots of habitat state diversity. Temporal analyses revealed changes in the proportions of certain states over time, notably an increase in the occurrence of turf algae.
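The spatial and temporal summaries reported in the Results could be computed along the following lines once each transect has a habitat state label. This is a hedged sketch: the record layout, the function names and the use of the Shannon index as the diversity measure are all assumptions for illustration, since the abstract does not state which metric was used. The records are invented toy data.

```python
from collections import Counter, defaultdict
from math import log


def state_proportions(records, key):
    """Proportion of transects in each habitat state, grouped by `key`.

    records: iterable of dicts with a 'state' entry plus the grouping
    field (for example 'year' or 'ecoregion'). Hypothetical layout.
    """
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[key]][rec["state"]] += 1
    return {
        group: {state: n / sum(c.values()) for state, n in c.items()}
        for group, c in counts.items()
    }


def shannon_diversity(proportions):
    """Shannon index H = -sum(p * ln p) over habitat state proportions."""
    return -sum(p * log(p) for p in proportions.values() if p > 0)


if __name__ == "__main__":
    # Invented toy records, not data from the study.
    records = [
        {"year": 2010, "ecoregion": "A", "state": "turf"},
        {"year": 2010, "ecoregion": "A", "state": "coral"},
        {"year": 2010, "ecoregion": "B", "state": "coral"},
        {"year": 2010, "ecoregion": "B", "state": "kelp"},
        {"year": 2020, "ecoregion": "A", "state": "turf"},
        {"year": 2020, "ecoregion": "A", "state": "turf"},
        {"year": 2020, "ecoregion": "B", "state": "coral"},
        {"year": 2020, "ecoregion": "B", "state": "kelp"},
    ]
    by_year = state_proportions(records, "year")
    by_region = state_proportions(records, "ecoregion")
    for year in sorted(by_year):
        print(year, by_year[year].get("turf", 0.0))
    for region in sorted(by_region):
        print(region, shannon_diversity(by_region[region]))
```

Grouping by year gives the change in the share of a state such as turf over time, while grouping by ecoregion and applying a diversity index gives one way to flag regions where many states co-occur.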
Main Conclusions: The UMAP-HDBSCAN clustering pipeline effectively characterised fine-scale benthic habitat states at a global scale, confirming known broader biogeographic patterns, including the importance of temperate-tropical transition zones as hotspots of habitat state diversity. This fine-scale yet broadly scalable habitat classification could be applied as a standardised template for tracking benthic habitat change across space and time at a global scale. The UMAP-HDBSCAN pipeline proved to be a powerful and versatile approach for analysing complex biological datasets and could be applied in other ecological domains.