Sinergise

Summary

The International Institute for Applied Systems Analysis (IIASA) and Sinergise are collaborating on a public initiative to improve Earth Observation (EO) training datasets for better cloud detection and land cover algorithms by using ESA's Sentinel-2 satellite imagery and engaging public participation to curate a large database of training samples.

Abstract

IIASA and Sinergise have embarked on a project to enhance the accuracy of cloud detection in satellite imagery used for land cover change detection. Recognizing the limitations of existing algorithms, particularly their false positives, they aim to develop algorithms better suited to this purpose. To achieve global applicability, they are using the Sentinel Hub services to access the complete archive of Sentinel-2 data and have created a simple application that lets the public contribute by classifying different types of clouds. This crowdsourced approach not only accelerates the collection of curated cloud classification samples but also sets the stage for gathering other datasets in the future, such as manually curated land cover classification datasets, which can serve as training data for machine learning algorithms. The data collected will be available for exploration and download through the Geopedia portal.

Opinions

  • The initiative acknowledges the importance of accurate cloud detection for land cover change detection and the inadequacy of simple algorithms due to high false-positive rates.
  • There is a strong emphasis on the need for a large, global database of training samples to improve EO algorithms, which has led to the decision to involve the public in data curation.
  • The use of Sentinel Hub services reflects a strategic choice for efficient access to a vast amount of satellite imagery data, facilitating the project's global scope.
  • The project is designed to be user-friendly, allowing participants to easily classify clouds using a paintbrush-like interface, indicating a focus on accessibility and user engagement.
  • The initiative is forward-thinking, with plans to extend the crowdsourcing model to other datasets, demonstrating a commitment to scalable and efficient data collection methods for various applications in Earth observation.

Crowdsourcing EO training datasets to improve cloud detection

The International Institute for Applied Systems Analysis (IIASA) has joined Sinergise in a public initiative built around ESA’s Sentinel-2 satellite imagery. Together we want to improve EO training datasets and, with them, cloud detection and land cover algorithms.

Many algorithms for distinguishing clouds in multispectral satellite data are available. When they are used as part of land cover change detection, simple (but fast) algorithms often fail because they produce many false positives, which can have a significant impact on the end result. The ability to discriminate cloudy pixels is crucial for any automatic or semi-automatic solution that detects land cover change. Therefore, we decided to develop algorithms better suited to this purpose. Because we want our services to be available globally, and because that requires a very large database of training samples, we decided to ask the public for help.
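To make the false-positive problem concrete, here is a minimal sketch of how a detector's output could be scored against manually curated labels. The function name and the toy masks are assumptions made for illustration; they are not taken from the project's code.

```python
def false_positive_rate(predicted, reference):
    """Share of truly clear pixels that the detector flagged as cloud.

    Both arguments are equally sized 2D lists of booleans (True = cloud).
    """
    false_pos = 0
    clear_total = 0
    for pred_row, ref_row in zip(predicted, reference):
        for pred, ref in zip(pred_row, ref_row):
            if not ref:              # only clear reference pixels can be false positives
                clear_total += 1
                if pred:
                    false_pos += 1
    return false_pos / clear_total if clear_total else 0.0


# Toy 2 x 4 example: the reference has 6 clear pixels, 3 of which are flagged as cloud.
reference = [[True, False, False, False],
             [True, False, False, False]]
predicted = [[True, True, False, False],
             [True, True, True, False]]
rate = false_positive_rate(predicted, reference)
```

In this toy case half of the clear pixels are wrongly masked, which is the kind of error that would hide real land cover change behind a supposed cloud.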

To obtain a large data resource of curated cloud classification samples, we used a number of tools developed at IIASA together with Sentinel Hub services, which provide fast access to the entire global archive of Sentinel-2 data.

How does it work?

It’s actually really simple. The application provides an image (e.g. 64x64 pixels), on which you delineate different types of clouds (opaque, thick, and thin clouds) in a paintbrush-like UI. The rest of the image is implicitly treated as cloud-free.
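As a rough illustration of this labelling scheme, the sketch below turns painted strokes into a per-pixel label mask in which every unpainted pixel defaults to cloud-free. The class codes, the `paint` helper, and the square brush are assumptions made for this example, not the application's actual data format.

```python
from enum import IntEnum


class CloudClass(IntEnum):
    # Hypothetical class codes; the real application may encode labels differently.
    CLEAR = 0
    THIN = 1
    THICK = 2
    OPAQUE = 3


def empty_mask(size=64):
    """Return a size x size mask where every pixel starts as cloud-free."""
    return [[CloudClass.CLEAR] * size for _ in range(size)]


def paint(mask, row, col, label, radius=1):
    """Label a square brush footprint centred on (row, col), clipped to the image."""
    size = len(mask)
    for r in range(max(0, row - radius), min(size, row + radius + 1)):
        for c in range(max(0, col - radius), min(size, col + radius + 1)):
            mask[r][c] = label


def class_counts(mask):
    """Count how many pixels carry each label."""
    counts = {label: 0 for label in CloudClass}
    for line in mask:
        for value in line:
            counts[value] += 1
    return counts


mask = empty_mask()
paint(mask, 10, 10, CloudClass.OPAQUE, radius=2)  # 5 x 5 footprint
paint(mask, 0, 0, CloudClass.THIN, radius=1)      # clipped to 2 x 2 at the corner
counts = class_counts(mask)
```

Starting from an all-clear mask means the user only has to paint the clouds; everything left untouched still ends up with a valid label.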

Help us collect the data and start using the application! The resulting data will be made available through the Geopedia portal, both for exploring and downloading.

Collecting other datasets

The approach will also allow us to collect other datasets rapidly and efficiently in the future. For example, with a slightly modified configuration, a similar workflow could be used to obtain a manually curated land cover classification dataset, which could serve as training data for machine learning algorithms.
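One way to picture such a "slightly modified configuration" is a small task description that fixes the tile size and the set of labels a user can paint. The sketch below is purely hypothetical: the `LabelingTask` type, its field names, and the land cover classes are illustrative assumptions, not the application's real configuration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LabelingTask:
    # Hypothetical configuration object; field names are illustrative only.
    name: str
    tile_size: int      # edge length of the image shown to the user, in pixels
    classes: tuple      # labels the user can paint
    default_class: str  # label implied for every unpainted pixel


cloud_task = LabelingTask(
    name="cloud-classification",
    tile_size=64,
    classes=("opaque", "thick", "thin"),
    default_class="cloud-free",
)

# Same workflow, different label set. The land cover classes are made-up examples.
land_cover_task = replace(
    cloud_task,
    name="land-cover-classification",
    classes=("water", "forest", "cropland", "built-up"),
    default_class="unlabelled",
)
```

Only the label set and the default change between the two tasks; the tile size and the painting workflow carry over unchanged.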

Manual cloud classification app

Originally published at sentinel-hub.com.
