Christian Lauer

Summary

Google BigQuery ML now supports Explainable AI, allowing users to understand and explain the predictions made by their machine learning models, which is crucial for transparency, bias detection, and compliance with non-discrimination laws.

Abstract

The integration of Explainable AI into Google BigQuery ML provides users with insights into how individual features contribute to the predictions of their machine learning models. This feature is particularly useful for classification and regression tasks, enabling users to verify that models behave as expected, identify potential biases, and find ways to enhance model performance and data quality. The article emphasizes the importance of explainability in machine learning, not only for improving models but also for meeting legal requirements to prevent discrimination, such as in lending algorithms. Examples and tutorials are provided to guide users through the process of explaining model predictions, including a SQL-based approach for generating explanations from a linear model.

Opinions

  • The author views Explainable AI as a "super feature" that enhances the comprehensibility of models and aids in identifying areas for model improvement.
  • There is an appreciation for the legal necessity of explainability in machine learning models to avoid discrimination.
  • The article suggests that the ability to explain model predictions strengthens the trend of integrating machine learning directly with data.
  • The author highlights the practicality of Explainable AI through a tutorial on predicting penguin weight, showcasing how the feature can be applied in real-world scenarios.

Using Explainable AI in BigQuery ML

Google BigQuery now supports Explainable Artificial Intelligence for your Models

Photo by Universal Eye on Unsplash

BigQuery ML makes it easy to design machine learning models using SQL. You can find a tutorial here [1]. With Explainable AI, you can now also have the results of those models explained.

Why do you want to know?

When you design machine learning models, you naturally want to know why they behave the way they do. Often, this is also required by law, for example to make sure that no discrimination occurs. One example is lending, where an algorithm might discriminate against a gender or ethnicity.

Metrics of a BigQuery ML Model— Image by Author

How to use the Feature

The feature helps you understand the results that your predictive machine learning model generates for classification and regression tasks by showing how each feature in a row of data contributed to the predicted result. This information can be used to verify that the model is behaving as expected, to recognize biases, and to find ways to improve your model and your training data [2].

After you have

  • created your model,
  • evaluated it, and
  • used it for prediction,

you can have its results explained. To explain a linear model, you would use the following example statement:

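-- Explain each prediction of the linear model
SELECT
  *
FROM
  ML.EXPLAIN_PREDICT(MODEL `mydataset.mymodel`,
    (
    -- Input rows to predict on and explain
    SELECT
      label,
      column1,
      column2,
      column3,
      column4,
      column5
    FROM
      `mydataset.mytable`),
    -- Return only the three features with the largest attributions per row
    STRUCT(3 AS top_k_features))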
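For context, the create, evaluate, and predict steps that come before the explanation might look roughly like the sketch below. It reuses the placeholder names from the statement above (`mydataset.mymodel`, `mydataset.mytable`, `label`, `column1` to `column5`). The choice of `linear_reg` as the model type is an assumption based on the linear model mentioned in the text; your own options may differ.

```sql
-- 1. Create: train a linear regression model (assumed model type)
CREATE OR REPLACE MODEL `mydataset.mymodel`
OPTIONS (
  model_type = 'linear_reg',
  input_label_cols = ['label']
) AS
SELECT
  label,
  column1,
  column2,
  column3,
  column4,
  column5
FROM
  `mydataset.mytable`;

-- 2. Evaluate: inspect the model's evaluation metrics
SELECT
  *
FROM
  ML.EVALUATE(MODEL `mydataset.mymodel`);

-- 3. Predict: score rows with the trained model
SELECT
  *
FROM
  ML.PREDICT(MODEL `mydataset.mymodel`,
    (
    SELECT
      column1,
      column2,
      column3,
      column4,
      column5
    FROM
      `mydataset.mytable`));
```

In practice you would usually predict on new rows rather than on the training table; the same table is used here only to keep the placeholders consistent.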

You can find a full example that predicts penguin weight and then applies Explainable AI in Google's documentation [3].

You would then get the top feature attributions (in this case three) for each row of the table:

Example — Image by Google [3]
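The attributions come back as a nested array per row, which can be awkward to read. A minimal sketch for flattening them into one line per feature, assuming the placeholder names from the example statement and a label column called `label` (so that the prediction column is named `predicted_label`):

```sql
-- Flatten the nested attributions: one output row per (input row, feature)
SELECT
  predicted_label,            -- prediction for the input row
  baseline_prediction_value,  -- starting point before feature contributions
  attr.feature,               -- name of the contributing feature
  attr.attribution            -- how much this feature shifted the prediction
FROM
  ML.EXPLAIN_PREDICT(MODEL `mydataset.mymodel`,
    (
    SELECT
      label,
      column1,
      column2,
      column3,
      column4,
      column5
    FROM
      `mydataset.mytable`), STRUCT(3 AS top_k_features)),
  UNNEST(top_feature_attributions) AS attr
```

Because only the top three attributions are requested, the listed attributions will generally not add up to the full difference between the baseline and the prediction.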

This is a good way to find out how strongly each attribute affects your model's predictions.
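If you want a rough overall ranking rather than a row-by-row view, one option is to average the absolute attributions per feature. This is only a sketch under assumptions: it reuses the placeholder names from the example statement, and it raises `top_k_features` to 5 (the number of input columns in that example) so that every feature is counted in every row.

```sql
-- Rank features by their average absolute attribution across all rows
SELECT
  attr.feature AS feature,
  AVG(ABS(attr.attribution)) AS mean_abs_attribution
FROM
  ML.EXPLAIN_PREDICT(MODEL `mydataset.mymodel`,
    (
    SELECT
      label,
      column1,
      column2,
      column3,
      column4,
      column5
    FROM
      `mydataset.mytable`), STRUCT(5 AS top_k_features)),  -- assumed: all 5 features
  UNNEST(top_feature_attributions) AS attr
GROUP BY
  feature
ORDER BY
  mean_abs_attribution DESC
```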

Summary

All in all, this is a super feature. On the one hand, it makes models comprehensible; on the other, it reveals potential for improvement. In addition, checking the results may even be legally necessary, for example to avoid discrimination. It makes BigQuery and BigQuery ML a bit more powerful, and it strengthens the trend of "Bring Machine Learning to the Data".

Sources and Further Readings

[1] Christian Lauer, Bring Machine Learning to the Data (2021)

[2] Google, BigQuery Explainable AI Overview (2022)

[3] Google, Using BigQuery ML to predict penguin weight (2022)
