Decoding animal communication using AI

After writing so much about AI, I’m rarely surprised or, let’s say, inspired. But after listening to Aza Raskin of the Earth Species Project (https://www.earthspecies.org/), I was so moved that I decided to write a piece on it.

Earth Species Project is a non-profit dedicated to using artificial intelligence to decode non-human communication. They believe that understanding non-human languages will transform our relationship with the rest of nature.

In today’s blog, we will peek into the future and look at how we might use AI to talk to our pets and to the rest of nature around us. As of now, we know more about the lunar surface than about animal communication.

Introduction

No one knows how many more doors AI will open, some good and exciting, others dangerous and horrifying. From creating deadly viruses to understanding animal communication, the potential is limitless; all we need is a strong ethical framework to ensure that we sustainably move towards the future.

Let’s start with a small audio clip. Listen to what a bearded seal sounds like; it’s almost alien-like. This is just one example; nature is filled with wonderful voices, and I’m sure that each of them tells us something, but we have yet to discover their meaning.

Unlocking communication with nature can unlock so many doors that enrich human lives and completely transform how we understand and look at life on this planet.

Take another example: gelada monkeys. They are believed to have the largest vocabulary of all the primates except humans.

When we think about experiences, we only talk about human experiences; we have almost no idea how flora and fauna perceive the world around us. The human experience is limited and can differ vastly from that of plants and animals. We might not be able to understand or even grasp the extent of animal perception. Below is an image showing the difference between human and bee vision. Vision is only one part of perception; we have little idea how different animals perceive the world. It can be so different that it is almost unrecognizable to us. So, the nature of reality depends on the perception of the perceiver, and there is no correct or incorrect perception; it is just finer or coarser.

Another study showed that dolphins can be trained to do something they have never done before. Trainers give a pair of dolphins a gesture that tells them to do something new, together. The dolphins go underwater, somehow coordinate, and then perform the same new behavior in sync when they come out of the water. To perform the same action, they must be able to communicate in some way. I’m not saying that they are using words, but they must communicate in some way that we still do not understand.

Okay, we get it that animals can talk in a sense and perceive the world differently than us, but you mentioned plants also; what about them?

Researchers from Tel Aviv University conducted a unique study using an evening primrose flower to investigate the impact of various sounds on it. They exposed the flower to a range of auditory stimuli, including traffic noises, low and high-frequency sounds, bat noises, and, notably, the sounds of an approaching pollinator. Among these, only the sound of an approaching pollinator, and artificial sounds at similar low frequencies, elicited a response from the flower. Remarkably, after hearing the pollinator, the flower reacted by producing sweeter nectar within minutes. This fascinating reaction suggests that the flowers can “hear” the bees buzzing through their petals and respond by sweetening their nectar, showcasing a profound interaction between flora and fauna.

In other experiments, they showed that when plants are cut or dehydrated, they emit sounds at 50–60 kHz (well above the range of human hearing). Thus, we can say that, in a sense, even plants can speak. Nor is their perception limited to bee sounds: even when the roots of a plant are sealed in plastic bags, the roots still grow towards the sound of water. This is really fascinating and showcases the richness of life around us.

And before we move on to the AI part, here is one more fascinating study on plants. There is this vine called Boquila trifoliolata, and it can mimic its host trees; what’s fascinating about this is that it mimics the tree even in the case of an artificial tree. So what would you call this if not eyes for the plants in some sense that we don’t understand?

Examples of Boquila trifoliolata mimicking the leaves of other plants. Yellow arrows point to leaves of Boquila trifoliolata that are mimicking other species in A-G. Image H shows what Boquila trifoliolata leaves look like when they’re not mimicking other plants. (Img Src)

Where and how does AI fit in?

I hope you are fascinated and surprised by the different species’ communications. But how does AI help in all of this? To understand the role of AI, we need to talk a little about the language and communication itself.

Every language has some semantic relationships; if it doesn’t, it’s not a language. Language has some form of structure, neither very rigid like binary nor very loose like DNA. The diagram below shows how structured different types of data streams are.

Rigid or flexible structures in data streams (Img Src)

But how does one learn a language that no one speaks anymore? We learned to read Ancient Egyptian hieroglyphs by using the Rosetta Stone, a stone inscribed with the same text in three scripts: hieroglyphic, Demotic, and Ancient Greek. But the problem with animal communication is that we don’t have any such Rosetta Stone. Then how do we decode animal communication?

So here’s a cool idea:

What if we could map all the different types of input streams (languages or sounds) to a geometrical space? And then use the patterns found in the geometrical space to map out the words from different input streams without needing a rosetta stone?

Let’s understand what we are talking about in more detail.

Word embedding

Word embeddings are a type of word representation that allows words to be represented as vectors in a continuous vector space. A word’s position within the vector space is learned from text and is based on the words that surround it when it is used. Word embeddings can capture semantic relationships between words in a computationally efficient way.
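To make the idea that a word’s position is “based on the words surrounding it” a little more concrete, here is a minimal sketch that simply counts neighbouring words. The corpus, the window size, and the function names are all made up for illustration; real embedding models learn dense vectors rather than raw counts, so treat this only as the underlying intuition.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (hypothetical, for illustration only).
corpus = [
    "the dog chased the cat",
    "the dog chased the ball",
    "the puppy chased the cat",
    "the puppy chased the ball",
    "the car needs fuel",
    "the truck needs fuel",
]


def cooccurrence_vectors(sentences, window=2):
    """For every word, count the words that appear within `window` positions of it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


def shared_context(counts, a, b):
    """Crude similarity: how many neighbour counts two words have in common."""
    return sum((counts[a] & counts[b]).values())


vectors = cooccurrence_vectors(corpus)
print("dog vs puppy:", shared_context(vectors, "dog", "puppy"))
print("dog vs truck:", shared_context(vectors, "dog", "truck"))
```

Words used in similar contexts (“dog” and “puppy”) end up sharing more neighbours than unrelated ones (“dog” and “truck”), which is the signal that embedding models compress into a few hundred dimensions.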

Let’s consider a simple example to explain this concept further. Assume we have word embeddings in a hypothetical 2-dimensional space for easier visualization, although word embeddings often have hundreds of dimensions in practice.

Suppose we have vectors for the words ‘King’, ‘Queen’, ‘Man’, and ‘Woman’, and they are placed in this 2D space as follows:

King: (3, 1)
Queen: (3, 2)
Man: (2, 1)
Woman: (2, 2)

Word embeddings showcasing different semantic relationships (Img Src)

One of the properties of word embeddings is that semantically similar words are placed close to each other in this vector space. Additionally, relationships between words can often be captured by vector arithmetic.

In our example, we can capture the gender relationship by subtracting the vector for ‘Man’ from ‘King’ and adding the vector for ‘Woman’, which lands on ‘Queen’:

King - Man + Woman = (3, 1) - (2, 1) + (2, 2) = (3, 2) = Queen

This equation essentially captures the relationship “King is to Man as Queen is to Woman” in terms of vector arithmetic.
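The same arithmetic can be checked in a few lines of Python. This is only a sketch using the toy 2D coordinates from this post; the helper names (`add`, `sub`, `nearest`) are my own and not from any library.

```python
import math

# Toy 2D embeddings from the example above (not real model output).
embeddings = {
    "King": (3, 1),
    "Queen": (3, 2),
    "Man": (2, 1),
    "Woman": (2, 2),
}


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def nearest(vec, table):
    """Return the word whose vector is closest (Euclidean distance) to `vec`."""
    return min(table, key=lambda word: math.dist(vec, table[word]))


result = add(sub(embeddings["King"], embeddings["Man"]), embeddings["Woman"])
print(result, nearest(result, embeddings))  # (3, 2) Queen
```

With real embeddings the result rarely lands exactly on a word, which is why the last step looks up the nearest vector instead of testing for equality.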

Similarly, the spatial relationship between words can be observed in this vector space. In our simplified example, ‘King’ and ‘Man’ share the same second coordinate (1), which here plays the role of a gender axis, just as ‘Queen’ and ‘Woman’ share the value 2. The first coordinate separates royalty (3) from non-royalty (2).

In higher-dimensional spaces, these relationships can capture more complex semantic relationships, and the distances between vectors (often measured using cosine similarity) can provide a measure of semantic similarity between words. Word embeddings are a powerful tool for many natural language processing tasks, providing a dense, continuous, and semantically meaningful representation of words.
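Cosine similarity compares the direction of two vectors and ignores their length. Here is a small sketch of it, again on the toy 2D vectors from this post; the function name is my own.

```python
import math


def cosine_similarity(u, v):
    """cos(angle between u and v) = (u . v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy 2D embeddings from the example above (not real model output).
toy = {"King": (3, 1), "Queen": (3, 2), "Man": (2, 1), "Woman": (2, 2)}

for word in ("Man", "Queen", "Woman"):
    print("King vs", word, round(cosine_similarity(toy["King"], toy[word]), 3))
```

Because all the toy coordinates are positive, every pair scores fairly high here; what matters is the ranking. In real embeddings with hundreds of dimensions, unrelated words typically score much lower.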

Word embeddings for top-10k English words. Also, showcasing the word embeddings closest to DOG. (Img Src)

But what’s so special about these embeddings?

Now, if you know a bit about AI, then I’m positive you already know about word embeddings. But do you know these word embeddings preserve similar structures when we change the languages? This is the core idea behind animal communication decoding.

Semantic relationships of dog in different languages (Img Src)

The images above show that even across different languages, a word’s semantic relationships with other words are preserved in the geometric space, which is remarkable. All the researchers had to do was rotate and stretch these shapes until they aligned. If you know a little about languages, you know that they differ vastly in semantic relationships, grammatical gender, verbs, and how each word affects the likelihood of others. Yet they somehow take on a similar shape when converted into a geometric representation. This has been reported for many language pairs. For instance, English and Japanese have wildly different structures, yet somehow, they have similar patterns in geometric space.
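To give a feel for what turning one language’s shape onto another means, here is a toy 2D sketch. It is not the method used by any particular research group: I assume the two spaces differ only by a rotation about the origin, I invent the “Spanish” coordinates by rotating the toy English ones, and I use known word pairs to find the angle (a least-squares rotation fit, the 2D case of what is often called orthogonal Procrustes). The truly hard case, with no known pairs at all, is not shown here.

```python
import math


def rotate(point, angle):
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def best_rotation(source, target):
    """Angle that best rotates `source` points onto `target` points (least squares)."""
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(source, target))
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(source, target))
    return math.atan2(cross, dot)


# Toy English embeddings from this post.
english = {"King": (3, 1), "Queen": (3, 2), "Man": (2, 1), "Woman": (2, 2)}

# Hypothetical "Spanish" space: the same shape, rotated by an angle we pretend not to know.
hidden_angle = math.radians(40)
pairs = [("King", "Rey"), ("Queen", "Reina"), ("Man", "Hombre"), ("Woman", "Mujer")]
spanish = {es: rotate(english[en], hidden_angle) for en, es in pairs}

# Recover the rotation from the paired points.
angle = best_rotation([english[en] for en, _ in pairs],
                      [spanish[es] for _, es in pairs])
print(round(math.degrees(angle), 1))  # 40.0


def translate(word):
    """Rotate an English vector into the Spanish space and return the nearest word."""
    mapped = rotate(english[word], angle)
    return min(spanish, key=lambda es: math.dist(mapped, spanish[es]))


print(translate("Queen"))  # Reina
```

Once the rotation is known, every other word in one space can be mapped into the other, even words that were never part of the paired list. That is what makes a shared geometric shape so useful.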

If you want to know more about how we convert words into word embedding, you can look at this wonderful article: CLICK HERE

Now, the hypothesis is that even animal communication will preserve some of these structures, and using a similar technique, we can map out different animal sounds. This alone will not allow us to talk to animals, because there will be large corpora of sound that we cannot map onto any human language, but whatever can be mapped will be very interesting.

Now, there will be two parts to animal communication:

The one that maps with human language and is translated.
The one that is completely different from the semantic structure found in human language.

Now, the task is largely solved if we find that most animal communication maps onto human language. But if most of it lies in the second category, then we will know how little we understand about animal behavior and communication.

Humans have passed on culture for at most 200,000 years, whereas whales have passed on culture for 34 million years.

The animal experiences

The reality animals experience is vastly different from ours. One of the ways animal behavior is studied is by testing whether animals can recognize themselves in a mirror. Many animals can not only recognize themselves but also have a very strong sense of who’s who in a group, as in a pack of wolves or a pod of whales.

Going a step further, studies suggest that a few of these animals even seek an altered state of consciousness, as we do through drugs and alcohol: a state of transcendence, sometimes reached by hanging upside down from a vine and spinning rapidly, and other times by eating psychedelic plants or insects.

Grief is another emotion that many animals display when their young ones die. The entire animal kingdom is full of experiences and perceptions that we barely understand or even know about.

What’s the status of animal communication?

Researchers have already started collecting and mapping more and more samples, but still, 97% of the data is thrown out because of noise. We do not yet know how to reliably separate the different sources and signals in animal recordings. Given below are the embeddings of Beluga whale communication.

Beluga whale communication embeddings (Img Src)

From an AI point of view, these are the main challenges in animal communication:

Detection
Classification
Source separation
Denoising
Mapping to human languages
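To give a flavour of the first item, detection, here is a deliberately simple sketch that flags the loud parts of a recording by their energy. Everything in it is an assumption made for illustration: the synthetic “call”, the sample rate, the frame length, and the threshold. Real bioacoustic detectors are far more sophisticated and usually learned from data.

```python
import math
import random


def rms(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def detect_events(signal, sample_rate, frame_len, threshold):
    """Return (start_s, end_s) spans where the frame RMS exceeds `threshold`."""
    events, start = [], None
    n_frames = len(signal) // frame_len
    for k in range(n_frames):
        frame = signal[k * frame_len:(k + 1) * frame_len]
        loud = rms(frame) > threshold
        t = k * frame_len / sample_rate
        if loud and start is None:
            start = t                      # a call begins
        elif not loud and start is not None:
            events.append((start, t))      # the call has ended
            start = None
    if start is not None:                  # call runs to the end of the recording
        events.append((start, n_frames * frame_len / sample_rate))
    return events


# Synthetic 3-second recording: faint background noise plus a 100 Hz "call"
# between 1.0 s and 1.5 s (all values are made up).
SAMPLE_RATE = 1000
rng = random.Random(0)
signal = []
for n in range(3 * SAMPLE_RATE):
    noise = rng.uniform(-0.05, 0.05)
    call = math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) if 1000 <= n < 1500 else 0.0
    signal.append(noise + call)

events = detect_events(signal, SAMPLE_RATE, frame_len=100, threshold=0.2)
print(events)  # [(1.0, 1.5)]
```

A fixed threshold like this breaks down quickly in real recordings, where the noise is often louder than the animal, which is one reason so much collected data is currently discarded.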

Until recently, we used different AI models for different tasks. But with the introduction of Transformers, we can treat almost everything as a form of language and develop a truly multi-modal capacity to translate any modality into any other: from fMRI to text, from text to video, and so on.

The image below shows fMRI data being converted into images. We are, in effect, making a movie of what people are thinking from their brain signals.

Converting brain signals to an image (Img Src)

Getting more animal data

Researchers are already working on models that complete an animal sound from a short audio clip. But more than that, given the motion of an animal, they are asking models to predict what kind of sound the animal will make.

Converting motion into sound for whales (Img Src)

We already have many models that complete a piano piece from a short piano prompt, or that take 3 seconds of someone’s voice and continue speaking as if they were the same person. We can do the same thing for animal voices and sounds. The possibilities are endless.

Using AI to generate synthetic animal voices (Img Src)

Caution

I know this sounds very exciting, but we must tread extremely carefully. It is quite possible that we will be able to communicate with animals in the next 30 years, but it is also possible that we will damage cultures that are millions of years old by introducing sounds that disturb them, leaving the animals unable to hunt and survive. I’m excited for the future and a bit scared; we are already getting into science fiction, and no one truly knows how it will unfold.

This marks the end of this blog; this has been one of my favorite blogs I’ve written to date. Writing such articles is very time-consuming, so show love and respect by clapping and sharing the article. Happy learning ❤

REFERENCES

[1] https://www.youtube.com/watch?v=3tUXbbbMhvk

[2] https://arxiv.org/abs/2305.10740