The United Nations has called on the world to protect 30 percent of the planet from human activity to help protect ecosystems and slow down climate change. But conservation areas are often vulnerable to illegal logging, poaching, mining, and other activities that threaten biodiversity. How can land managers detect these kinds of human impacts on protected ecosystems? Scientists are applying machine learning to identify human influence on the environment by literally listening to the environment — that is, by monitoring forest “soundscapes.”
Every ecosystem has its own distinctive collection of sounds that change with the season and even the time of day. According to Bryan Pijanowski, soundscape ecologist and director of Purdue University’s Center for Global Soundscapes, “Sounds are part of the ecosystem, and they are signatures of that ecosystem.” The unique sound environment of an ecosystem is known as a soundscape, the aggregate of all the sounds — biological, geophysical, and anthropogenic — that make up a place.
Sound has long been used by soundscape ecologists to assess biodiversity and other metrics of ecosystem health. Pijanowski has his own, informal rule of thumb: “If I can tap my foot to a soundscape, I know it’s fairly healthy,” he says, because it means “the rhythmic animals — the frogs and the insects, the base of the food chain — are there.”
New research published in the Proceedings of the National Academy of Sciences applies tools from machine learning to these soundscapes to get a better picture of ecosystem health and human activity. The researchers built algorithms that taught themselves to predict habitat quality in different environments across the world, ranging from rainforests in Borneo and the Republic of Congo to temperate forests in New York, based only on sound data.
Detecting human activity that impacts ecosystem health, like illegal logging and poaching, has long been a challenge for land managers and scientists, often requiring expensive and time-consuming surveys in which specialists manually identify species. But this new method requires only basic audio equipment that allows for remote monitoring of the soundscape, which can be done in real time, and a machine learning algorithm that listens for sounds that aren’t typical in a forest environment. “Say that there’s weird things going on or illegal activity, like guns being shot, or chainsaws from illegal logging,” explained Sarab Sethi, a mathematician at Imperial College London and the lead author of the new paper. “We work under the assumption that illegal activity contains a lot of anomalous sounds that are different from whatever usual sounds are in the ecosystem.”
How does the computer identify strange sounds? The key is unsupervised machine learning — machine learning that doesn’t require human input to “train” the model on pre-identified data. “The way that we measure similarities and differences in sound is really the technical advance from our work,” Sethi told Grist. The new method uses a neural network to compare the “fingerprints” of sounds — not only their frequencies, but the structure of how their frequencies change over time — to one another. “Once we’ve got a fingerprint, like a bird calling — a bird calling is more similar to a different species of bird calling, in this fingerprint, than it is to, say, a gunshot,” says Sethi. The neural network learns which sounds are typical of a healthy forest environment, and which ones are out of the ordinary.
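The core idea — score a new sound by how far its fingerprint sits from the fingerprints of a healthy baseline — can be sketched in a few lines. This is an illustration only, not the paper’s implementation: the real system derives fingerprints from a neural network, whereas here the fingerprints are hypothetical hand-made vectors, and `anomaly_score` is a simple nearest-neighbor distance introduced for demonstration.

```python
import math

def euclidean(a, b):
    # Distance between two fixed-length sound "fingerprints".
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def anomaly_score(fingerprint, baseline):
    # Score a new sound by its distance to the nearest fingerprint
    # from the healthy baseline; large distances mean "unusual".
    return min(euclidean(fingerprint, b) for b in baseline)

# Hypothetical 3-dimensional fingerprints (real embeddings are far larger).
baseline = [
    [0.9, 0.1, 0.0],  # bird call at dawn
    [0.8, 0.2, 0.1],  # bird call at dusk
    [0.7, 0.3, 0.0],  # insect chorus
]

bird_like = [0.85, 0.15, 0.05]  # resembles the baseline
chainsaw = [0.0, 0.1, 0.95]     # unlike anything in the baseline

print(anomaly_score(bird_like, baseline))  # small: typical of the forest
print(anomaly_score(chainsaw, baseline))   # large: flag for review
```

No labels are needed: the baseline is just whatever the microphones recorded during normal conditions, which is what makes the approach unsupervised.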
The unsupervised technique requires less work from humans to identify sound; it’s also more robust than so-called supervised machine learning. Unsupervised, the algorithm detects anomalous sounds on its own, without requiring a fallible human researcher to teach it what gunshots and chainsaws sound like. “If you use a supervised approach, your whole approach succeeds or fails based on how good your training data is, so how well labeled that data is,” said Sethi. “You don’t have that sort of reliance in unsupervised methods.”[…]