Models of seasonal bird occurrence from citizen science big data

GIMA
M-GEO
STAMP
Topic description

Mapping the presence or abundance of an animal or plant species is a non-trivial challenge. It often involves elaborate field surveys with strict data collection protocols, and usually requires a deep understanding of the species' phenology. Timing is often important: plants are much harder to find and identify in winter, and birds become much quieter outside the breeding season, or may migrate and not be present at all. The classical protocols for scientific surveys have matured over the last decades and are used in various atlas projects. While they provide trustworthy data for good-quality mapping outcomes, an important disadvantage is that they are labour-intensive.

In the last 20 years or so, the rise of the internet and mobile telephony has spurred the development of observer communities that record and share their observations. This is now leading to substantial data collections. These observations have typically not been collected under strict scientific protocols, but they are so voluminous that their sheer quantity may compensate for the lack of protocol. A natural problem to address with this type of data is the derivation of occurrence or even abundance maps. Machine learning is an obvious technique to apply, as it handles large amounts of data well.

The Netherlands presents an interesting case: its population density is very high, and it has a well-developed economy with good IT infrastructure. The country also has an exceptional range of open, ancillary spatial data available that helps to characterize landscapes and land use. Consequently, the density of well-equipped observers is high, and this has led to the development of a number of citizen science portals that collect nature observations. Waarneming.nl (WNL) is the biggest of these, measured by data volume collected annually. The premise under which we shall work in this proposed MSc project is that WNL data is good enough to reliably derive range maps from.

We propose that a machine learning approach be developed on the basis of previous work. Vijayudu Kondi [Kondi 2021] developed an interesting and quite successful approach that is worthy of study and follow-up. We feel it can be improved. The problems for which he aimed to develop solutions can be characterized in the following way:

  • the data only provides species occurrence information, and not species absence information
  • without a prescribed data collection protocol, one cannot know whether the observer has reported all plants and animals that s/he found on a day trip, or only those that were reported because of personal preferences
  • no information is available on species detectability, which is known to be seasonally variable
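
To make the first two problems concrete, the following minimal sketch groups raw records into "visits" and labels each visit for one target species. A visit in which the target was not reported is treated as a pseudo-absence, and the number of species reported in the visit is kept as a rough indicator of how complete the observer's list was. This is a common way to handle presence-only records and is not taken from [Kondi 2021]; the record layout `(observer_id, date, cell_id, species)` and the function name are assumptions made for illustration only.

```python
from collections import defaultdict


def label_visits(records, target_species):
    """Group records into visits and label each visit for one species.

    records: iterable of (observer_id, date, cell_id, species) tuples
             (hypothetical layout, not the actual WNL export format).
    Returns a dict mapping (observer_id, date, cell_id) to a tuple
    (label, list_length), where label is 1 if the target species was
    reported during the visit and 0 otherwise (a pseudo-absence).
    """
    species_per_visit = defaultdict(set)
    for observer_id, date, cell_id, species in records:
        species_per_visit[(observer_id, date, cell_id)].add(species)

    return {
        visit: (int(target_species in seen), len(seen))
        for visit, seen in species_per_visit.items()
    }
```

A pseudo-absence from a visit with a short species list is weak evidence, since the observer may simply not have reported everything; list length could therefore serve as a covariate or a weight in a later model.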

Kondi did not make use of information about species abundance, although it is known that common species are reported less often, relative to how frequently they are encountered, than rare species.

The candidate student will first study previous work, and then write a critique of that work. S/he will next develop a machine learning strategy that addresses the criticism and aims to improve on the [Kondi 2021] results. All illustrations used in the topic description come from that work.

The results can be validated through comparison with the recently published Dutch Bird Atlas [Sovon 2019].
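
One simple way such a comparison could be set up is to treat both the model output and the atlas as sets of occupied grid cells and compute agreement measures between them. The sketch below assumes that both sources have been rasterized to the same grid and reduced to presence/absence per cell; the function name and the input format are hypothetical and do not reflect how the atlas data is actually distributed.

```python
def compare_with_atlas(predicted_cells, atlas_cells, all_cells):
    """Compare predicted presence cells with atlas presence cells.

    All three arguments are iterables of grid-cell identifiers on a
    shared grid (an assumption of this sketch). Cells outside
    all_cells are ignored.
    """
    universe = set(all_cells)
    predicted = set(predicted_cells) & universe
    atlas = set(atlas_cells) & universe

    tp = len(predicted & atlas)             # present in both
    fp = len(predicted - atlas)             # predicted only
    fn = len(atlas - predicted)             # atlas only
    tn = len(universe - predicted - atlas)  # absent in both

    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "jaccard": tp / (tp + fp + fn) if tp + fp + fn else nan,
    }
```

Because the atlas itself is based on surveys with limited detectability, disagreement is not necessarily a model error; these scores are best read as measures of agreement rather than accuracy.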

Topic objectives and methodology

Via the citizen science nature observation platform waarneming.nl, we have acquired millions of bird observations, collected since 2005. The research objective of this project is to develop a spatial machine-learning workflow that derives good-quality seasonal range maps for arbitrary species.
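
As a point of reference for such a workflow, a naive seasonal baseline can be computed directly from the data: the fraction of visits to a grid cell, per season, in which the target species was reported. The sketch below is only an illustrative baseline under stated assumptions, not the proposed method. The visit layout `(cell_id, month, species_seen)`, the function name, and the use of meteorological seasons are choices made here for illustration.

```python
from collections import defaultdict

# Meteorological seasons by month number (an assumption of this sketch;
# a species-specific phenological calendar may be more appropriate).
SEASONS = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def seasonal_reporting_rate(visits, target_species):
    """Return the reporting rate of a species per (cell, season).

    visits: iterable of (cell_id, month, species_seen) tuples, where
            species_seen is the set of species reported in that visit.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for cell_id, month, species_seen in visits:
        key = (cell_id, SEASONS[month])
        totals[key] += 1
        if target_species in species_seen:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}
```

A machine-learning workflow would be expected to improve on this baseline by correcting for observer effort, reporting preferences and seasonally varying detectability, and by predicting for cells with few or no visits.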

References for further reading
  • Vijayudu Kondi. A Big Data Approach to Model Bird Occurrence from Crowd-sourced Data. M.Sc. thesis, ITC, University of Twente, August 2021.

  • Kelling, S., Johnston, A., Fink, D., Ruiz-Gutierrez, V., Bonney, R., Bonn, A., Fernandez, M., Hochachka, W., Julliard, R., Kraemer, R., & Guralnick, R. Finding the signal in the noise of citizen science observations. bioRxiv, 2018.

  • Isaac, N., & Pocock, M. Bias and information in biological records. Biological Journal of the Linnean Society, 115, 522–531.

  • Sovon Vogelonderzoek Nederland. Vogelatlas van Nederland. Broedvogels, wintervogels en 40 jaar verandering. Kosmos Uitgevers, Utrecht/Antwerpen, 2019.