Challenges in Machine Learning Applications

Machine Learning adoption is widespread across industries. Its ability to perform specific tasks better than humans creates tremendous opportunity to improve economic productivity. The global machine learning market is projected to grow at a Compound Annual Growth Rate (CAGR) of 43% between 2020 and 2024, from $7.3 billion to $30.6 billion. As practitioners of ML applications, we are all incentivized to focus on the benefits of this technology in order to procure investment and resources.
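As a quick sanity check on those figures, the growth rate can be recomputed from the two market sizes. This is only a sketch: the `cagr` helper is a name introduced here, and it assumes 2020 to 2024 counts as four compounding periods.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.43 means 43%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of dollars; 2020 -> 2024 is treated as four periods.
rate = cagr(7.3, 30.6, years=4)
print(f"{rate:.1%}")
```

Under that assumption the result lands very close to the quoted 43%, so the two numbers in the projection are consistent with each other.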

However, I think we need to pause and reflect on its shortcomings just as much as we tout its benefits. Machine Learning can affect lives at the press of a button, at large scale and at great speed, so we have a responsibility to make thoughtful decisions about how it is applied. Discussing the shortcomings of the models we build raises awareness and focuses our resources on building better, not necessarily more, artificial intelligence. After all, our children will live in this world we are creating.

In this post, I’d like to broadly cover the use cases of Machine Learning, while also providing examples of where ML has gone wrong and the lessons we should learn from them.

Anomaly Detection

Successful anomaly detection hinges on the ability to accurately analyze time series data in real time. Each point in a time series is typically a pair of items: a timestamp for when a metric was measured, and the value of that metric. To spot anomalies in real time, a potential outlier is compared to its peers, to a trend or to a baseline.
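The comparison against a baseline can be sketched in a few lines. The example below is one simple approach among many, not a production detector: it assumes the series is a list of `(timestamp, value)` pairs, uses the previous `window` points as the baseline, and flags a point whose distance from the baseline mean exceeds `threshold` standard deviations. The function name, the window size and the threshold are all illustrative choices.

```python
from statistics import mean, stdev


def find_anomalies(series, window=5, threshold=3.0):
    """Return the (timestamp, value) pairs that deviate from a trailing baseline.

    series: list of (timestamp, value) pairs ordered by time.
    """
    anomalies = []
    for i in range(window, len(series)):
        # Baseline = values of the `window` points just before the current one.
        baseline = [value for _, value in series[i - window:i]]
        mu, sigma = mean(baseline), stdev(baseline)
        timestamp, value = series[i]
        # Skip flat baselines (sigma == 0) to avoid dividing by zero.
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            anomalies.append((timestamp, value))
    return anomalies


# Hypothetical CPU-usage readings: one spike at timestamp 6.
readings = [(0, 50.0), (1, 51.0), (2, 49.0), (3, 50.0),
            (4, 52.0), (5, 51.0), (6, 95.0), (7, 50.0)]
anomalies = find_anomalies(readings)
print(anomalies)  # [(6, 95.0)]
```

Note that once the spike enters the trailing window it inflates the baseline's spread, which is one reason real systems often use more robust baselines such as medians.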

Examples:

Detecting the appearance of an anomaly on earth’s surface or atmosphere using satellite imagery
Banks identifying unusual transactions and requesting customer verification
IT performance monitoring of resource consumption or network traffic

Challenges: Anomalous data is scarce and hard to define. Both properties make it difficult to collect enough training data to improve a model’s ability to identify true anomalies.

Chatbot

Chatbots apply sophisticated Pattern Matching methods (e.g., Natural Language Understanding, Natural Language Processing, Speech Recognition, and Sentiment Analysis) to understand user needs and provide a personalized reply.
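To make the idea of matching a message to a need concrete, here is a deliberately simple sketch. It uses plain regular expressions, which is far cruder than the language-understanding methods named above; the intents, patterns and replies are all hypothetical.

```python
import re

# Hypothetical intents: each maps a keyword pattern to a reply template.
INTENT_PATTERNS = {
    "greeting": re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE),
    "order_status": re.compile(r"\b(order|package|delivery)\b", re.IGNORECASE),
}
REPLIES = {
    "greeting": "Hello {name}, how can I help?",
    "order_status": "Let me look up your order, {name}.",
    "fallback": "Sorry {name}, I did not understand that.",
}


def reply(message: str, user_name: str) -> str:
    """Return the reply for the first intent whose pattern matches."""
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(message):
            return REPLIES[intent].format(name=user_name)
    return REPLIES["fallback"].format(name=user_name)


print(reply("Where is my package?", "Ana"))
```

A learned model replaces the hand-written patterns with ones inferred from past conversations, which is exactly where biased training data can enter.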

Examples:

Messaging apps such as WeChat
Virtual Assistants such as Amazon Alexa or Baidu Xiaodu
Individual organizations’ apps such as Microsoft’s chatbot Tay

Challenges: Chatbots can be trained on biased historical data. Microsoft’s chatbot Tay was an experiment that learned from the people it interacted with on Twitter and the messaging apps Kik and GroupMe. Within 24 hours, it started making racist remarks.

Classification

Classification assigns a data instance to one of a set of known groups. It does this by training a model on historical data with known labels, and then applying that model to automatically find categories for new data. In order to best map examples of input data to specific class labels, the training dataset must be sufficiently representative of the problem and have many examples of each class label, especially the minority classes.
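The train-then-apply loop can be illustrated with a nearest-centroid classifier, chosen here only because it fits in a few lines; it is not the method behind any system mentioned in this post. Training averages the labeled examples of each class, and prediction picks the class whose average is closest. The data and label names are made up.

```python
from collections import defaultdict
from math import dist


def fit_centroids(features, labels):
    """Training: average the feature vectors of each labeled class."""
    grouped = defaultdict(list)
    for x, y in zip(features, labels):
        grouped[y].append(x)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, x):
    """Prediction: choose the class whose centroid is nearest to x."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


# Hypothetical two-feature training set with known labels.
features = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (9.0, 8.5)]
labels = ["negative", "negative", "negative", "positive", "positive"]

centroids = fit_centroids(features, labels)
print(predict(centroids, (2.0, 2.0)), predict(centroids, (7.0, 9.0)))
```

With only two "positive" examples, that class's centroid is an average of very little evidence, which hints at why minority classes need enough examples.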

Examples:

Image classification for disease diagnosis in healthcare
Text classification
Sentiment analysis on customer reviews

Challenges: One of the primary concerns with classification is that the training sample may not be adequately representative, giving too little weight to minority cases. Amazon tried building an artificial-intelligence tool to help with recruiting, but found that the tool was unfavorable toward female candidates because the résumés it had learned from came predominantly from men.
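One way this problem hides is behind a healthy-looking overall accuracy. The sketch below uses synthetic labels (95 majority cases, 5 minority cases) and a hypothetical model that always predicts the majority class; it is not based on data from any real system.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that the model labeled correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Synthetic, imbalanced ground truth and a model that ignores the minority.
y_true = ["majority"] * 95 + ["minority"] * 5
y_pred = ["majority"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
print(accuracy, recall)  # 0.95 {'majority': 1.0, 'minority': 0.0}
```

The model is 95% accurate yet never identifies a single minority case, which is why per-class metrics are worth checking alongside accuracy.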

Computer Vision

Computer Vision extracts information from images and recognizes specific concepts. It can therefore perform tasks such as recognizing faces, reading characters in an image, or locating an object in an image. It does this by learning from a visual database that has been annotated manually, and then running new images through a neural network composed of successive layers of neurons, which associate the new image with familiar images.
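The phrase "successive layers of neurons" can be made concrete with a toy forward pass. Everything below is an assumption made for illustration: the "image" is four pixels (a 2x2 grid, top row first), the class names are invented, and the weights are hand-picked rather than learned from annotated images as they would be in a real network.

```python
from math import exp


def dense(inputs, weights, biases):
    """One layer: each neuron is a weighted sum of its inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hand-picked, hypothetical weights; real networks learn these from data.
HIDDEN_W = [[1.0, 1.0, -1.0, -1.0], [-1.0, -1.0, 1.0, 1.0]]
HIDDEN_B = [0.0, 0.0]
OUTPUT_W = [[2.0, 0.0], [0.0, 2.0]]
OUTPUT_B = [0.0, 0.0]
CLASSES = ["bright_top", "bright_bottom"]


def classify(pixels):
    """Run a 2x2 image (flattened to 4 pixels) through two layers."""
    hidden = relu(dense(pixels, HIDDEN_W, HIDDEN_B))
    probs = softmax(dense(hidden, OUTPUT_W, OUTPUT_B))
    return CLASSES[probs.index(max(probs))], probs


label, probs = classify([1.0, 1.0, 0.0, 0.0])
dim_label, dim_probs = classify([0.1, 0.1, 0.0, 0.0])
print(label, max(probs), dim_label, max(dim_probs))
```

The dimmer copy of the same image gets the same label but with noticeably lower confidence, a small-scale version of how lighting can degrade recognition.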

Examples:

Authorized login, surveillance
Measuring headcount or displayed emotions
Driverless cars

Challenges: Image factors such as illumination, occlusion or low resolution, among others, can keep facial recognition from working optimally. In Michigan, a faulty facial recognition match led to a man’s arrest for a crime he did not commit. There have been several cases in the U.S. where an algorithm-based bot has mistakenly identified a regular citizen as a criminal and automatically suspended their driver’s license. Similar cases have also occurred at airports all over the world, where innocent people have been “recognized” as terrorists. In another case, a Tesla in Autopilot mode collided with a tractor-trailer because it could not distinguish the color of the vehicle from the sky behind it quickly enough.

Recommendation Systems

Recommendation systems use history and preferences to predict what users like. The recommendation function takes information about the user such as demographics, preferences and behavior (inferred through clicks, views and purchases) to predict the rating a user might assign to a product.

Examples:

Advertisements when surfing a website
Videos on YouTube, movies on Netflix or courses on Safaribooks
Social media posts

Challenges: The less data a recommendation system has about a user, the more likely it is to incorrectly assume their preferences. We have all been recommended content which is not personalized to us but simply the most popular.
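Both the rating prediction and the fallback to popular content can be sketched with a small user-based collaborative filter. This is one common approach, not the method of any service named above; the users, items and ratings are invented, and falling back to the item's average rating is an illustrative choice.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict_rating(user_ratings, item, others):
    """Similarity-weighted average of other users' ratings for `item`.

    Falls back to the plain item average when the user shares no history
    with anyone, and returns None if nobody has rated the item.
    """
    weighted, total, item_ratings = 0.0, 0.0, []
    for other in others:
        if item not in other:
            continue
        item_ratings.append(other[item])
        similarity = cosine(user_ratings, other)
        weighted += similarity * other[item]
        total += similarity
    if total > 0:
        return weighted / total
    if item_ratings:
        return sum(item_ratings) / len(item_ratings)
    return None


# Hypothetical rating histories on a 1-5 scale.
ana = {"film_a": 5, "film_b": 4, "film_c": 1}
ben = {"film_a": 4, "film_b": 5, "film_d": 5}
cy = {"film_a": 1, "film_c": 5, "film_d": 2}

personalized = predict_rating(ana, "film_d", [ben, cy])
cold_start = predict_rating({}, "film_d", [ben, cy])
print(personalized, cold_start)
```

Ana's prediction leans toward Ben's rating because their histories agree, while the brand-new user with no history simply receives the average of everyone's ratings.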

Segmentation

Segmentation splits data into different clusters based on shared characteristics. Clustering uses algorithms to identify how different types of data are related and creates new segments when appropriate. Similarity between data is determined through distance measurements.
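Distance-based clustering can be illustrated with k-means, used here as a representative algorithm rather than the one any particular product relies on. The sketch assumes Euclidean distance, a fixed number of clusters `k`, and a deterministic start from the first `k` points; the customer data is made up.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Group points into k clusters by repeatedly assigning and averaging."""
    centroids = list(points[:k])  # deterministic start: the first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers described by (visits per month, average basket size).
customers = [(1.0, 2.0), (9.0, 8.0), (1.5, 1.8),
             (8.5, 9.0), (1.2, 2.2), (9.2, 8.4)]
centroids, clusters = kmeans(customers, k=2)
print(clusters)
```

The result depends on the starting centroids and on the choice of `k`, so practical use usually involves several restarts and some way of validating the segments.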

Examples:

Biomedical image processing of radiography, thermography or ultrasound to identify diseases
Segmenting businesses based on industry, size and profits
Targeting marketing at customers segmented by behavior, demographic information and lifestyle

Challenges: Because biomedical image processing has few training examples and the images themselves are complex, segmentation can be incorrect and lead to a wrong diagnosis. Biomedical images that have been segmented by ML almost always have to be verified by a certified professional.

The gravity of a single mistake in these applications varies. In cases where human lives are involved — arresting someone, identifying a terrorist, diagnosing a disease — responsible treatment of data and accurate results are of utmost importance. As AI increases its capabilities, we must ensure ethical standards for research, diverse data and proper checks.