Workflow for the generation of expert-derived training and validation data: a view to global scale habitat mapping

Roelfsema, Chris M., Lyons, Mitchell, Murray, Nicholas, Kovacs, Eva M., Kennedy, Emma, Markey, Kathryn, Borrego-Acevedo, Rodney, Ordoñez Alvarez, Alexandra, Say, Chantel, Tudman, Paul, Roe, Meredith, Wolff, Jeremy, Traganos, Dimosthenis, Asner, Gregory P., Bambic, Brianna, Free, Brian, Fox, Helen E., Lieb, Zoe, and Phinn, Stuart R. (2021) Workflow for the generation of expert-derived training and validation data: a view to global scale habitat mapping. Frontiers in Marine Science, 8. 643381.

PDF (Published Version)
Available under License Creative Commons Attribution.


Our ability to completely and repeatedly map natural environments at a global scale has increased significantly over the past decade. These advances stem from the delivery of a range of on-line global satellite image archives and global-scale processing capabilities, along with satellite imagery of improved spatial and temporal resolution. Accurately training and validating these global-scale mapping programs from what we will call “reference data sets” is challenging due to a lack of coordinated financial and personnel resourcing, and of standardized methods to collate reference data sets at global spatial extents. Here, we present an expert-driven approach for generating training and validation data on a global scale, with the view to mapping the world’s coral reefs. Global reefs were first stratified into approximate biogeographic regions; then, for each region, reference data sets were compiled that include existing point data or maps at various levels of accuracy. These reference data sets were compiled from new field surveys, literature review of published surveys, and individually sourced contributions from coral reef monitoring and management agencies. Reference data were overlaid on high spatial resolution satellite image mosaics (3.7 m × 3.7 m pixels; Planet Dove) for each region. Additionally, thirty to forty satellite image tiles (20 km × 20 km) were selected for which reference data and/or expert knowledge was available and which covered a representative range of habitats. The satellite image tiles were segmented into interpretable groups of pixels, which were manually labeled with a mapping category via expert interpretation. The labeled segments were used to generate points to train the mapping models, and to validate them or assess their accuracy. The workflow for desktop reference data creation that we present expands and up-scales traditional approaches of expert-driven interpretation for both manual habitat mapping and map training/validation. We apply the reference data creation methods in the context of global coral reef mapping, though our approach is broadly applicable to any environment. Transparent processes for training and validation are critical for usability, as big data provide more opportunities for managers and scientists to use global mapping products for science and conservation of vulnerable and rapidly changing ecosystems.
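
The final step described in the abstract, turning expert-labeled segments into training and validation points, can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `sample_reference_points`, the grid representation of segments, the class names, the per-class point count and the `train_fraction` split are all hypothetical choices made for illustration; the abstract does not state how points were drawn or divided.

```python
import random
from collections import defaultdict


def sample_reference_points(segment_grid, segment_labels, points_per_class,
                            train_fraction=0.7, seed=0):
    """Draw random pixel locations from labeled segments and split them.

    segment_grid   -- 2D list of segment ids, one id per pixel (assumed layout)
    segment_labels -- dict mapping segment id -> expert-assigned class name
    Returns (train, validation), each a list of (row, col, label) tuples.
    """
    rng = random.Random(seed)

    # Group pixel coordinates by the class of the segment they fall in.
    pixels_by_class = defaultdict(list)
    for row, line in enumerate(segment_grid):
        for col, segment_id in enumerate(line):
            label = segment_labels.get(segment_id)
            if label is not None:  # pixels in unlabeled segments are skipped
                pixels_by_class[label].append((row, col))

    train, validation = [], []
    for label in sorted(pixels_by_class):
        pixels = pixels_by_class[label]
        # Sample without replacement so no pixel is used twice.
        chosen = rng.sample(pixels, min(points_per_class, len(pixels)))
        n_train = round(len(chosen) * train_fraction)
        train += [(r, c, label) for r, c in chosen[:n_train]]
        validation += [(r, c, label) for r, c in chosen[n_train:]]
    return train, validation


# Toy 4 x 4 "tile": segments 1-3 carry hypothetical labels, segment 0 has none.
grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 0],
    [3, 3, 3, 0],
]
labels = {1: "coral/algae", 2: "sand", 3: "rubble"}
train, validation = sample_reference_points(
    grid, labels, points_per_class=4, train_fraction=0.5
)
```

Sampling an equal number of points per class is only one possible design. In this sketch, training and validation points can come from the same segment, so a real workflow might instead split by segment or by tile to limit spatial autocorrelation between the two sets.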

Item ID: 70488
Item Type: Article (Research - C1)
ISSN: 2296-7745
Keywords: Allen Coral Atlas, calibration, coral reefs, habitat mapping, training, validation
Copyright Information: Copyright © 2021 Roelfsema, Lyons, Murray, Kovacs, Kennedy, Markey, Borrego-Acevedo, Ordoñez Alvarez, Say, Tudman, Roe, Wolff, Traganos, Asner, Bambic, Free, Fox, Lieb and Phinn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Funders: Vulcan Philanthropy, Australian Research Council (ARC)
Projects and Grants: ARC DE190100101
Date Deposited: 29 Nov 2021 05:13
FoR Codes: 31 BIOLOGICAL SCIENCES > 3103 Ecology > 310305 Marine and estuarine ecology (incl. marine ichthyology) @ 40%
46 INFORMATION AND COMPUTING SCIENCES > 4611 Machine learning > 461199 Machine learning not elsewhere classified @ 10%
41 ENVIRONMENTAL SCIENCES > 4104 Environmental management > 410401 Conservation and biodiversity @ 50%
SEO Codes: 18 ENVIRONMENTAL MANAGEMENT > 1802 Coastal and estuarine systems and management > 180201 Assessment and management of coastal and estuarine ecosystems @ 30%
18 ENVIRONMENTAL MANAGEMENT > 1805 Marine systems and management > 180501 Assessment and management of benthic marine ecosystems @ 70%
Downloads: Total: 774
Last 12 Months: 66