The crowdsourcing campaign was organized as a competition, with prizes offered to those who contributed the most based on a combination of quality and quantity. The campaign was run on Geo-Wiki (www.geo-wiki.org), a web platform dedicated to engaging citizens in environmental monitoring. A customized user interface was prepared for the campaign (Fig. 2), in which participants were shown a random location in the tropics (here broadly defined as the area between 30 degrees of latitude north and south of the equator, i.e., including part of the subtropics), with a blue 1 × 1 km box marking the area to be visually interpreted. The Global Forest Change (GFC) tree loss map (v1.7)10 was overlaid on the imagery to show all areas where tree loss was detected at any point between 2008 and 2019. The tree loss areas were shaded in red, and the map itself was aggregated to 100 m for fast rendering.
Fig. 2 Customized Geo-Wiki interface for the ‘Drivers of Tropical Forest Loss’ crowdsourcing campaign, showing: (a) tools available to participants, such as the NDVI and Sentinel time-series profiles, viewing the location on Google Earth, exploring the imagery time series, reviewing the quick-start guide, exploring examples to identify specific drivers of forest loss, and contacting IIASA staff via chat or email; (b) the country and continent of the location, as well as the dates of the imagery shown; (c) campaign statistics; (d) available background imagery; and (e) tasks to be undertaken by the participants, along with buttons to submit or skip the location.
The year 2008 was selected as the start date because the Renewable Energy Directive (RED) states that date as the cut-off year for conversion from high-carbon areas, i.e., forest, to other land uses7. In order to capture the main drivers of forest loss, but also include potential additional drivers such as the existence of roads as precursors of deforestation, the participants were asked to complete three steps: 1) to select the predominant tree loss driver visible inside the tree loss pixels in the blue box from a list of nine specific drivers; 2) to select all other tree loss drivers visible inside the tree loss pixels in the blue box from a list of five more general drivers; and 3) to mark whether roads, trails, or buildings were visible in the blue box. The list of specific and general drivers, together with their definitions, is shown in Table 1. The Geo-Wiki interface allowed participants to switch between different background imagery, such as ESRI, Google Maps, and Bing Maps, as well as Sentinel-2 satellite imagery. These different sources allowed the participants to see the location at different resolutions and at different points in time. The interface also provided information about the country and continent of the location, as well as the dates of the background imagery. Furthermore, it provided links for displaying NDVI and Sentinel time series, and for viewing the location and exploring historical imagery on the Google Earth platform. All of these tools were meant to ease the identification of forest loss drivers by allowing participants to examine the locations at different times and at different spatial resolutions.
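Purely as an illustration, one participant's three-step response for a single location could be captured in a record like the following Python sketch (the class and field names are hypothetical; the campaign's actual data model is not specified here):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Interpretation:
    """One participant's answers for one 1 x 1 km location (hypothetical schema)."""
    location_id: int
    predominant_driver: str                                  # step 1: one of the nine specific drivers (Table 1)
    other_drivers: List[str] = field(default_factory=list)   # step 2: any of the five general drivers (Table 1)
    roads_trails_or_buildings: bool = False                  # step 3: visible inside the blue box?
```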
At the beginning of the campaign, each participant was shown a quick-start guide to the interface and the requested tasks. As shown in Fig. 2, this quick-start guide could be accessed again at any point during the campaign. Figure 2 also shows that the interface had buttons for four further functions. The first was to open the gallery of examples, with access to pre-loaded video tutorials and example images describing each driver of forest loss and how to visually interpret and select it (available at https://application.geo-wiki.org/Application/modules/drivers_forest_change/drivers_forest_change_gallery.html); Figure S1 illustrates the gallery of examples shown to participants. The second was to ask experts for help, which automatically sent IIASA experts an email regarding a specific location. The third was to join the expert chat, which led participants to a dedicated chat interface on the Discord messaging platform, where they could pose questions and interact with staff and other participants directly. Finally, there was a button to see the leader board as well as the aims, rules, and prizes of the campaign (available at https://application.geo-wiki.org/Application/modules/drivers_forest_change/drivers_forest_change.html). When participants started the campaign, they were shown 10 initial practice locations, where they could try out the user interface (UI) on control points that demonstrated how to identify the different drivers of forest loss. These videos, images, and training points, together with the gallery of examples, were developed to train the participants before and during the campaign.
Campaign set-up and data quality
As the aim of the campaign was to determine the drivers of tree loss across the tropics, the sample locations were selected from the GFC tree loss layer10 for the tropics (between 30 degrees north and south of the equator). No stratification was used, since a completely random sample across the tropics was deemed the fairest representation of tree loss and its corresponding drivers. The previous map of deforestation drivers6 used a sample of 5,000 10 × 10 km grid cells to produce a global map. Here, the sample size was largely driven by the estimated capacity of the crowd; hence, we aimed to validate ca. 150,000 1 × 1 km locations across the tropics, a considerably larger sample than that of Curtis et al.6. In order to reduce noise, the GFC tree loss layer10 was first aggregated from its original 30 m resolution to 100 m, and 150,000 centroids were then randomly selected. From these, a subsample of 5,000 random locations was selected for visual interpretation by six IIASA experts (with backgrounds in remote sensing, agronomy, forestry, and geography). Due to time constraints, only 2,001 locations were evaluated by at least three different experts. For these locations, the interpretations were discussed and, once a consensus was reached, they became the final control or expert data set. The control locations were then used to produce quality scores for each participant as the campaign progressed, in order to rank them and determine the final prize winners. The list of prizes offered to the top 30 participants is shown in Table S1 in the Supplementary Information (SI), and a ranked list of motivations mentioned by the participants is shown in Figure S2 in the SI.
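A minimal sketch of this sampling step, assuming the aggregated loss layer is available as a single-band raster (the file name is hypothetical, and rasterio/numpy are illustrative tool choices, not necessarily those used in the campaign):

```python
import numpy as np
import rasterio

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility

with rasterio.open("gfc_loss_tropics_100m.tif") as src:   # hypothetical file
    loss = src.read(1) > 0                                 # tree loss detected at 100 m
    rows, cols = np.nonzero(loss)                          # all 100 m pixels with loss
    pick = rng.choice(rows.size, size=150_000, replace=False)
    # centroid coordinates of the sampled 100 m pixels
    xs, ys = rasterio.transform.xy(src.transform, rows[pick], cols[pick])

# 5,000 of the 150,000 locations were set aside for expert (control) interpretation
expert_pick = rng.choice(150_000, size=5_000, replace=False)
```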
The control locations were randomly shown to the participants at a ratio of approximately 2 control locations for every 20 non-control locations visited. If participants correctly selected the predominant tree loss driver (step 1), they were awarded 20 points; if they selected the wrong answer, they lost 15 points. If participants confused pasture with commercial agriculture, or wildfire with other natural disturbances, they lost only 10 points instead of 15. Furthermore, they could win 8 additional points by selecting the correct secondary drivers in step 2. If a mixture of correct and incorrect answers was provided in step 2, the participants gained 2 points for every correct choice and lost 2 points for every incorrect one, with the step 2 total floored at 0 points (i.e., it could not reduce the overall score). Finally, participants could earn 2 additional points by correctly reporting the existence of roads, trails, or buildings in step 3. The scoring system was based on experiences from previous Geo-Wiki campaigns and was designed to focus attention on the primary driver selection. The points were used to produce a leader board with the total number of points per participant. Additionally, a relative quality score (RQS) was derived from the score received by each user and the potential score that could have been obtained if all control points had been correctly interpreted, as shown in Eq. 1.
$$\mathrm{RQS}=\frac{\mathrm{NCP}\times 15+\mathrm{SumScore}}{45\times \mathrm{NCP}}\qquad(1)$$
where RQS ranges between 0 and 1, NCP is the number of control points visited, and SumScore is the total number of points obtained at those control points. The constant 45 normalizes the per-point score range, which runs from the minimum of -15 (wrong primary driver) to the maximum of 30 points (20 + 8 + 2 for all three steps correct).
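For concreteness, the scoring rules and Eq. 1 could be implemented as in the following Python sketch (an illustrative reading of the rules above, not the campaign's actual code; the step 2 floor and the confusable driver pairs follow the description given):

```python
# The two driver pairs with a reduced penalty, as stated above.
CONFUSABLE = {
    frozenset({"pasture", "commercial agriculture"}),
    frozenset({"wildfire", "other natural disturbances"}),
}

def step1_points(chosen: str, correct: str) -> int:
    """Step 1: +20 for the correct predominant driver, -15 otherwise,
    softened to -10 for the confusable pairs."""
    if chosen == correct:
        return 20
    return -10 if frozenset({chosen, correct}) in CONFUSABLE else -15

def step2_points(chosen: set, correct: set) -> int:
    """Step 2: +8 if the secondary drivers are all correct; otherwise
    +2 per correct and -2 per incorrect choice, floored at 0."""
    if chosen == correct:
        return 8
    return max(0, 2 * len(chosen & correct) - 2 * len(chosen - correct))

def step3_points(reported: bool, actual: bool) -> int:
    """Step 3: +2 for correctly reporting roads, trails or buildings."""
    return 2 if reported == actual else 0

def relative_quality_score(sum_score: int, ncp: int) -> float:
    """Eq. 1: maps the per-point score range [-15, 30] onto [0, 1]."""
    return (ncp * 15 + sum_score) / ncp / 45

# A participant who answers one control point perfectly scores 30 points
# and reaches the maximum RQS of 1.0.
assert relative_quality_score(30, 1) == 1.0
```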
The RQS was crucial in understanding how each participant performed in terms of the quality of their visual interpretations, as it is independent of the number of locations interpreted. Once the campaign ended, a minimum average RQS was used as a criterion for participants to receive a prize, independent of their position on the leader board. Additionally, all users who submitted a substantial number of interpretations, i.e., more than 1,000 with the minimum required RQS, were invited to become co-authors of the current manuscript, independent of whether they received a monetary prize or not. All of these co-authors additionally contributed to the editing and revision of this manuscript. Furthermore, future users of the data set can use the RQS as a key data quality indicator.
After the campaign, data post-processing included eliminating interpretations made by users who broke any of the competition rules. Additionally, during the campaign, some users communicated with IIASA staff via the “Ask Experts” button and pointed out that some control points were mistaken; the points lost at those locations were consequently added back to the final scores of the affected participants. A total of 18,742 validations from one participant were removed before the end of the campaign, and the user was disqualified, since their account was deemed to be shared across several people and computers, which was not allowed. Validations from another user (38,502 out of 40,828) were also removed due to inconsistencies, but that user remained in the competition. Before the prizes were awarded to the top 30 users, a questionnaire was administered to all users to gather information about participant characteristics and to gauge their motivations; completing it was mandatory for the top 30 users. A summary of the participant backgrounds is provided in Figure S3 in the SI.
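The post-processing described above amounts to a filtering step over the raw submissions; a minimal sketch under assumed column names (user_id, validation_id, and location_id are hypothetical) might look as follows:

```python
import pandas as pd

raw = pd.read_csv("campaign_submissions.csv")   # hypothetical export of all submissions

disqualified_users = {"user_a"}                 # placeholder id: the shared account (18,742 validations)
inconsistent_ids: set = set()                   # placeholder: the 38,502 flagged validations

clean = raw[~raw["user_id"].isin(disqualified_users)]
clean = clean[~clean["validation_id"].isin(inconsistent_ids)]

# Control points confirmed as mistaken via "Ask Experts": points lost
# there would be re-credited when recomputing the final scores.
mistaken_controls: set = set()                  # placeholder location ids
refunds = clean["location_id"].isin(mistaken_controls)
```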