Christianlauer

Summary

ChatGPT can assist in data analysis tasks, such as generating BigQuery SQL scripts, potentially alleviating the shortage of data professionals.

Abstract

The article discusses the potential of ChatGPT, a chatbot developed by OpenAI, to assist in data analysis tasks using BigQuery SQL. The author explores the possibility of ChatGPT replacing some jobs, but also emphasizes its potential to enrich companies and perform tasks that are currently difficult for humans. The author provides examples of ChatGPT generating scripts for pivoting data and creating random numbers in BigQuery SQL, demonstrating its usefulness for data analysts and business users. The article concludes that AI technologies like ChatGPT can make work easier, solve problems, and aid in learning programming languages.

Bullet points

  • ChatGPT is a chatbot developed by OpenAI, built on top of the GPT-3.5 family of large language models.
  • The author discusses the potential of ChatGPT to replace some jobs, but also emphasizes its potential to enrich companies and perform tasks that are currently difficult for humans.
  • The author provides examples of ChatGPT generating scripts for pivoting data and creating random numbers in BigQuery SQL.
  • ChatGPT can assist data analysts and business users in performing data analysis tasks.
  • AI technologies like ChatGPT can make work easier, solve problems, and aid in learning programming languages.

Can ChatGPT eliminate the Shortage of Data Professionals?

How you and Companies can benefit from AI — Use case with BigQuery SQL

Photo by Alexander Shatov on Unsplash

This is not an article about how ChatGPT will make you rich or destroy the world, but a practical example of how such AI systems can benefit companies in particular and do genuinely useful things.

ChatGPT is a chatbot launched by OpenAI in November 2022. It is built on top of OpenAI’s GPT-3.5 family of large language models, and is fine-tuned with both supervised and reinforcement learning techniques. — Wikipedia [1]

I had read about ChatGPT before and knew what it was about, but a video by the well-known psychologist Jordan Peterson made me even more curious. He predicts that in the coming years ChatGPT will turn the world upside down and cost many people their jobs. How much of that is really true? I think AI will replace some jobs, and we can already see that happening, but the whole upside-down scenario may be a bit exaggerated.

In addition, it must be said that AI can take over many tasks that people currently cannot cover. So I simply took a data analysis use case with BigQuery SQL. Can ChatGPT really provide me with useful scripts here? It would be no bad thing if data analysts, or even people from business departments, could perform better data analysis with AI support. There are many unfilled positions in this field, and there are people who want to analyze data but sometimes just need instant help instead of waiting for answers on Stack Overflow or Reddit. Maybe ChatGPT is a better alternative? By the way, I chose BigQuery SQL because I work with it the most. Other programming languages, frameworks, and so on are of course just as usable with this technology.

Use Case with BigQuery SQL

So, let’s say that you are a business user and would like to analyze data with BigQuery. Companies are becoming more and more data-driven, so people from all areas are equipped with (self-service) BI or even advanced services like Google BigQuery. Here, it may be the case that you know the question but just can’t think of the appropriate code, so let’s ask ChatGPT if it can help us with some things.

First Question:

How to pivot in BigQuery SQL — can you generate a dummy script?

That’s what ChatGPT gives me back:

WITH data AS (
 SELECT 'A' AS category, 1 AS value UNION ALL
 SELECT 'A', 2 UNION ALL
 SELECT 'A', 3 UNION ALL
 SELECT 'B', 4 UNION ALL
 SELECT 'B', 5 UNION ALL
 SELECT 'C', 6
)
SELECT * FROM data
PIVOT (
 SUM(value)
 FOR category IN ('A' AS A, 'B' AS B, 'C' AS C)
)
Result of the generated Query — Image by Author

So that’s quite impressive, because it really works without errors, and it also gives me a very good description of the script and the PIVOT operator. It is especially cool that it even highlights the code accordingly, so I don’t have to do this myself anymore :)

This script creates a sample data set with three categories (A, B, and C) and corresponding values. The PIVOT operator is then used to pivot the data and sum the values for each category. The resulting table will have three columns, one for each category, and will contain the sum of the values for each category.

To use the PIVOT operator, you'll need to specify the following:

The input table: In this example, the input table is data. …..— Generated by ChatGPT [2]
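To make it clearer what the PIVOT operator does with this sample data, here is a minimal sketch of the same result written with conditional aggregation. This variant is my own illustration and not part of the ChatGPT output; it only assumes the same dummy data as above.

```sql
-- Sketch: the pivot from above, written without the PIVOT operator.
-- Each output column sums only the rows of one category.
WITH data AS (
  SELECT 'A' AS category, 1 AS value UNION ALL
  SELECT 'A', 2 UNION ALL
  SELECT 'A', 3 UNION ALL
  SELECT 'B', 4 UNION ALL
  SELECT 'B', 5 UNION ALL
  SELECT 'C', 6
)
SELECT
  SUM(IF(category = 'A', value, NULL)) AS A,
  SUM(IF(category = 'B', value, NULL)) AS B,
  SUM(IF(category = 'C', value, NULL)) AS C
FROM data;
-- Returns a single row: A = 6, B = 9, C = 6
```

The PIVOT version is shorter once there are many categories, while the conditional aggregation form can be easier to read for someone who is new to SQL.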

Second Question:

Can you create me a BigQuery SQL script which creates random numbers for me?

This is what ChatGPT gives me back:

For example, to generate 10 random numbers between 1 and 100, you could use the following query:

CREATE TEMPORARY TABLE random_numbers AS
SELECT CAST(RAND() * (100 - 1 + 1) + 1 AS INT64) AS random_number
FROM UNNEST(GENERATE_ARRAY(1, 10))
WHERE RAND() <= 1;

To use the script, you’ll need to specify the following parameters:

min: The minimum value of the random number range.

max: The maximum value of the random number range.

num: The number of random numbers to generate.

probability: The probability that a random number will be generated (this can be used to generate a subset of the numbers in the range). — Generated by ChatGPT [2]
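The parameters described above can also be made explicit. The following is a minimal sketch of how that could look as a BigQuery script; the variable names min_value, max_value and num are my own assumptions and not part of the ChatGPT output. It uses FLOOR before the cast, because CAST from FLOAT64 to INT64 rounds to the nearest integer in BigQuery, so a bare cast can occasionally land one above the intended maximum.

```sql
-- Sketch with hypothetical variable names (min_value, max_value, num).
-- Run as a multi-statement query (BigQuery scripting).
DECLARE min_value INT64 DEFAULT 1;    -- lower bound, inclusive
DECLARE max_value INT64 DEFAULT 100;  -- upper bound, inclusive
DECLARE num INT64 DEFAULT 10;         -- how many random numbers to generate

SELECT
  -- RAND() returns a value in [0, 1), so FLOOR(...) yields 0 .. (max_value - min_value)
  min_value + CAST(FLOOR(RAND() * (max_value - min_value + 1)) AS INT64) AS random_number
FROM UNNEST(GENERATE_ARRAY(1, num));
```

Changing the three DECLARE defaults is enough to get a different range or a different number of rows, which is probably the easiest entry point for a business user.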

And again a clean script with a great description, which helps me in my work.

Conclusion

As you can see above, this technology really does work and actually produces working code. Of course, this also works with other languages and systems. My goal with this article was not only to show what ChatGPT can do and that it can produce code, but rather what this means for people's work. Namely, it can support me in creating scripts, in this case for evaluating data. Especially in times when professionals are scarce and people may want to educate themselves in this area, systems like ChatGPT can be extremely helpful. They enable people to analyze data even without strong programming skills. The technology is also versatile: not only people from the field of Data Science can work with ChatGPT, but other departments such as Marketing can benefit from it as well. They can, for instance, use it to get the right CSS for adapting the company website.

So you can really say that such systems will make work easier in the future, can solve problems, and are great for learning programming languages.

Sources and Further Readings

[1] Wikipedia, ChatGPT (2022)

[2] ChatGPT (2022)
