Salvatore Raieli


Deep learning can tell if you are above the drinking limit

A new algorithm that can measure your alcohol consumption from your speech


Artificial intelligence has exploded in medicine in recent years, with most studies focusing on cancer, but there are other potential applications as well.

A group of researchers at La Trobe University (Melbourne, Australia) has developed an algorithm that can recognize whether a person is over the legal drinking limit by listening to a 12-second audio clip. Why is this important, and how does it work?

Alcohol consumption

mosaic depicting the vintage (image source)

Alcohol consumption has always played a social role (both in ancient Egypt and in Roman times). Alcohol consumption has been associated with traditional festivals and convivial times.

Consumed in moderate amounts, alcohol is generally not a problem. Excess alcohol, on the other hand, harms health, is linked to crime and traffic accidents, and can lead to addiction. Globally, excessive alcohol consumption causes 2.8 million premature deaths per year.

Alcohol consumption per person in 2016 (image source)

In the long run, excessive alcohol consumption can lead to liver cirrhosis (which can develop into cancer) and to neuropathy. Alcohol dependence can also lead to dementia, psychosis, anxiety, and other serious disorders. Blood alcohol concentrations above a certain limit can cause coma and death.

Alcohol intoxication is one of the leading causes of fatal accidents on the roads. The CDC estimates that one person every 45 minutes (11,000 a year) dies from accidents that are related to people driving under the influence of alcohol. In fact, consuming alcohol before driving alters perceptions and physical abilities.

It is important for a person to be able to judge whether he or she is fit to drive. There are both empirical tests (odor of alcohol, pupil dilation or constriction, ability to stand and walk, and speech) and instrumental tests, such as measuring blood alcohol concentration (BAC) with breathalyzers.

Drunk peasants (image source)

Although the law varies by country, in most cases penalties are triggered at a BAC above 0.05%. A BAC above this limit significantly impairs cognitively demanding psychomotor tasks, so a person should test themselves and not drive when over it.

As the authors note:

Unfortunately, BAC measurement using breathalysers is costly and time and labour intensive since it requires trained personnel, and the purchase and regular recalibration of expensive equipment to obtain accurate measurements. In addition, there is a possibility that residual alcohol left in the mouth will affect the BAC measurement of the individual when tested immediately after the consumption — original article

Thus they noted that there is a need for alternative methods:

Therefore, alternative fast, consistent and affordable methods based on observable behavioural signs of intoxication such as red eyes, distortions of speech, impaired walking or gait, etc. (Rubenzer, 2011) are required. — original article

photo by Kobby Mendez at Unsplash.com

The model

The researchers used the German Alcohol Language Corpus as their dataset: 12,360 audio files from 162 individuals, each recorded twice (once sober and once after consuming alcohol). In addition, each speaker's level of intoxication was measured by both breath and blood analysis.

The audio was then converted into images (spectrograms), because the researchers wanted to apply deep learning algorithms designed for images. They then used a convolutional neural network.
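
To make the audio-to-image step concrete, here is a minimal sketch of how a waveform can be turned into a log-magnitude spectrogram with a short-time Fourier transform. It is not the authors' pipeline: the function name, the frame length, the hop size, the 16 kHz sample rate, and the synthetic 440 Hz tone are all assumptions chosen for illustration.

```python
import numpy as np

def log_spectrogram(signal, frame_len=400, hop=160):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One FFT per frame gives the frequency content over time.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    # log1p compresses the dynamic range; transpose so rows are frequencies.
    return np.log1p(magnitudes).T

# Synthetic stand-in for speech: one second of a 440 Hz tone (assumed 16 kHz).
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = log_spectrogram(tone)
print(spec.shape)  # (201, 98): 201 frequency bins, 98 time frames
```

The resulting 2D array can be treated like a grayscale image, which is what lets an image model such as a CNN be applied to sound.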

Typical CNN architecture (image source)

Note that the recordings varied in length, but the authors used only 12 seconds of each (for both sober and intoxicated speech). In addition, they added Gaussian noise to ‘dirty’ the audio (a data augmentation technique).
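
The two preprocessing ideas above (a fixed 12-second window and Gaussian-noise augmentation) can be sketched as follows. This is only an illustration under stated assumptions: the article does not say how shorter clips were handled or how much noise was added, so the zero-padding, the `snr_db` parameter, the 16 kHz sample rate, and both function names are hypothetical.

```python
import numpy as np

def fix_length(signal, sample_rate, seconds=12.0):
    """Crop to a fixed duration; zero-pad if shorter (padding is an assumption)."""
    target = int(sample_rate * seconds)
    if len(signal) >= target:
        return signal[:target]
    return np.pad(signal, (0, target - len(signal)))

def add_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise at a chosen signal-to-noise ratio in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Synthetic stand-in for a 15-second recording (assumed 16 kHz sample rate).
rng = np.random.default_rng(0)
sample_rate = 16_000
t = np.arange(15 * sample_rate) / sample_rate
recording = np.sin(2 * np.pi * 220.0 * t)

clip = fix_length(recording, sample_rate)               # 12 seconds of audio
noisy = add_gaussian_noise(clip, snr_db=20.0, rng=rng)  # augmented copy
```

Training on both the clean and the noisy copies is a common way to make a model less sensitive to recording conditions.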

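Because the dataset is unbalanced, the researchers used Unweighted Average Recall (UAR) instead of accuracy, which might overestimate the algorithm’s capabilities. UAR is the mean of the recall obtained on each class, so every class counts equally regardless of its size.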
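
A small worked example shows why accuracy can flatter a model on unbalanced data while UAR does not. The labels below are made up for illustration (they are not from the paper), and the function names are my own.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls; each class weighs the same."""
    recalls = []
    for label in sorted(set(y_true)):
        # Predictions made for the samples whose true class is `label`.
        preds_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
        recalls.append(preds_for_label.count(label) / len(preds_for_label))
    return sum(recalls) / len(recalls)

# Toy data: 8 sober (0) and 2 intoxicated (1) samples.
y_true = [0] * 8 + [1] * 2
# A useless classifier that always answers "sober".
y_pred = [0] * 10

print(accuracy(y_true, y_pred))                   # 0.8
print(unweighted_average_recall(y_true, y_pred))  # 0.5
```

The always-sober classifier looks decent by accuracy (80%) but scores exactly chance level on UAR (50%), because it never detects the minority class.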

In 68% of cases, the model was able to identify from 12-second audio whether the speaker is intoxicated or not. The authors describe the results:

In this paper, we developed ADLAIA, a deep learning algorithm that can identify inebriated individuals (BAC > 0.05%) with an UAR of 68.09% and accuracy of 67.67%, based on a 12-seconds audio clip of their speech. This is important because, despite the various factors (such as age, fatigue, alcohol tolerance level) that can influence the speech of an individual, ADLAIA’s performance is greater than the average human discrimination rate of 63.1% — which was computed on the same dataset — original article

Beyond the interesting technical result, such a model could be useful in several settings. It could be built into a smartphone application and used for rapid screening of people.

Such an application maybe useful in environments such as emergency rooms, sports stadiums, night clubs, restaurants, and bars, in which an instant identification of inebriation is useful, but breathalysers are often always available as backup devices. — original article

Such an application could also be useful for collecting statistics, both for later studies and for personal interest. A person could monitor his or her alcohol consumption, potential health risks, and so on.

“A test that could simply rely on someone speaking into a microphone would be a game changer.” — Alberto Bonela, the first author (source)

As the authors note, the model is not yet perfect. It should be tested with other datasets (perhaps unbalanced) that also consider different conditions (environmental, other languages, greater inclusion, different alcohol concentrations, and so on).

Conclusions

This model shows how an AI algorithm can recognize, after only a few seconds of speech, whether a person is over the limit. It is another demonstration of how algorithms can quickly identify patterns in biological data. It is also interesting that transforming sounds into spectrograms makes it possible to apply a convolutional network to data that are typically complex to analyze. In addition, there has been an explosion of algorithms applied to the medical domain in recent years, and 2023 promises further advances.

What do you think about it? Let me know in the comments

