doi:10.1038/nindia.2013.152 Published online 20 November 2013
Evolutionary ecologist Suhel Quader first saw the potential of crowdsourced data when he participated in the UK's Breeding Bird Survey, in which amateur bird watchers were assigned one-kilometre squares in a grid spanning the country. They were asked to walk transects, count all visible birds and submit the data. Ecologists then compared the annual bird counts from these surveys and were able to measure the change in bird populations from one year to the next.
Quader, who collated and analyzed the numbers, was impressed with the kind of data amateurs gathered. "The detail of information was remarkable," he told Nature India. The experience inspired him to begin, in 2007, an Indian crowdsourcing project to record bird migration times. Named Migrant Watch, it involves volunteers submitting data on where and when they spotted a bird for the first time in a season. "We weren't sure if it would work. We were testing the waters," he says.
Migrant Watch threw up some interesting data. For one, the folklore of the Pied Cuckoo in India heralding the monsoon was proved true, although the interval by which the bird's arrival preceded the monsoon varied. The ecologist was convinced enough to start another crowdsourcing project, Season Watch, to collect data on the fruiting and flowering patterns of trees across India. Quader is part of a growing tribe of Indian ecologists and conservationists using crowdsourcing to involve lay people in gathering data on biodiversity.
The benefits of this approach are many: amateurs often cover areas unreachable by a lone ecologist and can survey large areas at a fraction of the cost. But the biggest advantage is the richness of data since amateurs are omnipresent.
At the same time, crowdsourced data comes with inherent challenges. It may be unreliable unless the amateurs are trained. Not many Indian crowdsourcing projects employ trained volunteers yet, Quader says, rendering their data unsuitable for use in policy decisions. In contrast, data from the UK's Breeding Bird Survey forms a part of listings such as Wild Bird Indicators, an input in environmental policy decisions.
In 2008, five institutions led by Bangalore's Ashoka Trust for Research in Ecology and the Environment (ATREE) launched a major crowdsourcing initiative, the Indian Biodiversity Portal (IBD). Today, IBD has close to 3000 users who upload photographs and location data for sightings of various species of birds, mammals, fish, molluscs and other organisms. Recently, this website was integrated with the Department of Biotechnology's biodiversity database, the Indian Bio-resource Information Network (IBIN).
IBD has surprised nature lovers several times with its records of rare sightings. Prabhakar Rajagopal, director of bioinformatics firm Strand Life Sciences and one of the people who built IBD, says that in June this year, a user from Kerala posted a photograph of a blue-throated bee-eater, the first recorded sighting of the Southeast Asian species in India.
Another source of data for ecologists is social media. For example, ATREE researchers Roshmi Rekha Sharma and Neelvara Ananthram Aravind analysed the distribution of frogs in the Western Ghats, with 40 per cent of their data coming from Facebook, Flickr and email groups. They found a concentration of frog species above and below the Palghat Gap, where there are few protected areas.
Vidya Athreya, an ecologist who studies leopard-human conflict at the Bangalore-based Centre for Wildlife Studies, wanted to find out how widespread wild carnivores were in human-inhabited regions. Since ecologists typically do not track carnivores in human-use areas, she created a website for lay people to report carnivore sightings.
Ecologists admit that unless crowdsourced data is collected in an extremely structured format, using trained volunteers, it will have gaps. Biases are common; for example, crowdsourced data is often concentrated in urban and semi-urban regions, where amateurs have access to cameras and the internet. Amateurs also often identify species wrongly or enter inaccurate location data.
Kotiganahally Narayana Ganeshaiah, who heads IBIN, says this is a major problem in integrating data from IBD into IBIN. IBD users enter data in different formats, and no software exists to standardise these. If a user spells the name of a place differently, it is recorded as a different place altogether. Due to such data heterogeneity, nearly 60 per cent of the data on IBD cannot be analysed for trends. "It's not a great challenge, but something we have to tackle," says Ganeshaiah.
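The place-name problem can be illustrated with a minimal Python sketch. It is not a description of anything IBD or IBIN actually uses: the function names, the three-entry gazetteer and the 0.8 similarity cutoff are all assumptions made for the example. The idea is to clean up each submitted name and then match it against a list of accepted spellings, so that close variants are recorded as the same place.

```python
import difflib
import re
import unicodedata

# Hypothetical list of accepted place names (an assumption for this example).
GAZETTEER = ["Bangalore", "Palghat", "Kerala"]

def normalise(name: str) -> str:
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.casefold())
    return " ".join(text.split())

def canonical_place(name, gazetteer=GAZETTEER, cutoff=0.8):
    """Return the closest accepted spelling, or None if nothing is close enough.

    The cutoff is an arbitrary illustrative threshold on difflib's
    similarity ratio (0 to 1); a real project would need to tune it.
    """
    lookup = {normalise(entry): entry for entry in gazetteer}
    matches = difflib.get_close_matches(
        normalise(name), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None

if __name__ == "__main__":
    for raw in ["Bangalor", "  KERALA. ", "Mumbai"]:
        print(repr(raw), "->", canonical_place(raw))
```

A sketch like this only catches spelling variants. Alternative names for the same place would still need a hand-built alias table, and names that match nothing are best flagged for human review rather than guessed at.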
Another problem with crowdsourced data is that amateurs may not report less charismatic or common species, focusing only on rare ones, says ATREE's Aravind.
Inspired by these initial successes, though, researchers are putting together more structured projects involving trained volunteers.
Kaberi Kar Gupta, an assistant professor of biology at California State University, has launched a survey to map the change in distribution of Slender Lorises in Bangalore across the years. The survey will assess if the animals can act as an indicator species. Since Lorises require very specific habitats with continuous canopies and connectivity, their dwellings could be home to many other species as well.
Her systematic approach will involve dividing the city into a grid and asking volunteers to walk transects in assigned locations. The data generated is expected to be policy-friendly. Quader's Season Watch will also provide basic training material to volunteers, and require more structured inputs than Migrant Watch did.
If ecologists are just beginning to explore the advantages of crowdsourcing, pharmaceutical and biotech industries are already reaping its benefits. This month, the SBV Improver, an initiative by IBM Research and Philip Morris International, released the results of an international crowdsourcing project to understand how far research into rodent models could be extrapolated to biological processes in humans. Six Indian teams were among the 28 that participated in this project, which found that rodent models could predict biological processes in humans better than what could be expected based on species similarity alone.