Valentina Alto

Summary

This article introduces the Assistants API, a feature from OpenAI that allows developers to create AI assistants within their applications, and provides examples of using the Code Interpreter tool with Azure OpenAI.

Abstract

The Assistants API is a feature from OpenAI that enables developers to build AI assistants within their applications. These assistants can understand and respond to user queries using natural language. The article explains the components of an assistant, such as threads, messages, runs, and run steps. It also covers the three types of tools supported by the Assistants API: Code Interpreter, Function calling, and Retrieval. The Code Interpreter tool lets the assistant write and run Python code in a safe environment, while Function calling lets developers define custom functions for the assistant to call. The article then provides three examples of building an assistant with the Code Interpreter tool in Azure OpenAI: a finance assistant, a market analyst, and a quality checker, each working on structured data.

Opinions

  • The Assistants API is a powerful tool for developers to create AI assistants within their applications.
  • The Code Interpreter tool is particularly useful for solving math problems, analyzing data, and generating charts and graphs.
  • The Function calling tool allows developers to perform calculations, conversions, or other tasks.
  • The article provides clear examples of using the Code Interpreter tool with Azure OpenAI, demonstrating its capabilities on three structured datasets.
  • The article emphasizes the importance of understanding the components of an assistant, such as threads, messages, runs, and run steps, to build powerful conversational AI systems.

Getting Started with Code Interpreter in Azure OpenAI

An example with the new Assistants API

🥁Drumroll🥁

After months of waiting, the OpenAI Assistants API is now available on Azure! You can now build your own AI assistants seamlessly, combining all the necessary building blocks (system message, your data, plug-ins and function calling) in one environment: Azure OpenAI.

But what exactly is the Assistants API, and why was there so much hype around it?

Introduction to Assistants API

The Assistants API is a feature first released by OpenAI that lets you create and run AI assistants within your own applications. An AI assistant is like a smart helper that can understand and respond to user queries using natural language.

Assistants are made of different components:

  1. Assistant: The assistant is the heart of the system: an AI model designed to understand and generate human-like responses. It’s the brain behind the scenes, capable of handling various tasks, answering questions, and engaging in conversations. You can configure your assistant with a name, instructions, an underlying LLM, and tools.
  2. Thread: A thread represents a conversation session between the assistant and a user. Imagine it as a chat window or a dialogue container. When you start a conversation, you create a thread. All subsequent messages exchanged during that conversation are part of the same thread. Threads help maintain context and keep track of the ongoing discussion.

Note. To prevent the conversation from becoming too long for the model’s context, threads automatically handle truncation. When the messages exceed the model’s context window, the thread keeps as many of the most recent messages as fit and drops the oldest ones, so the context passed to the LLM stays within its token limit.

  3. Message: A message is a unit of communication within a thread. It can be a user query, an instruction, or any content exchanged between the user and the assistant. Messages build up the conversation history. Each message contributes to the context, allowing the assistant to understand and respond appropriately.
  4. Run: A run is a single invocation of the assistant on a thread. When you create a run, the assistant reads the thread’s messages and, following its instructions, calls the model and any enabled tools to produce a response, which is appended to the thread as new messages. A run starts in a queued state and ends when it completes, fails, is cancelled, or expires.
  5. Run Step: Within a run, we have run steps. Each step records one action the assistant took during the run, such as calling a tool (for example, executing code with Code Interpreter) or creating a message. Inspecting the run steps shows how the assistant arrived at its final answer.
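To picture the truncation described in the note above, here is a minimal sketch in plain Python. It is not the service's actual algorithm: the `fit_to_context` helper, the whitespace-based token count, and the budget of 14 tokens are all assumptions made for illustration.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_context(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit in max_tokens, dropping the oldest first."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so that recent context wins.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


# Hypothetical conversation history, oldest message first.
history = [
    "Hi, I uploaded a sales file.",
    "Which country had the highest profit?",
    "Plot profit by month as a line chart.",
]
# With a budget of 14 "tokens", only the two most recent messages survive.
print(fit_to_context(history, max_tokens=14))
```

A real thread counts tokens with the model's tokenizer, but the idea is the same: the newest messages are kept and the oldest are the first to go.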

In summary, an assistant operates within threads, communicates through messages, and executes runs composed of run steps. Understanding these components helps developers build powerful conversational AI systems!
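In code, these components map onto a handful of SDK calls. The following is a minimal sketch, assuming the `openai` Python package (v1.x) and its `AzureOpenAI` client. The environment variable names, the assistant name and instructions, and the `ask_assistant` and `make_client` helpers are my own assumptions, not something prescribed by the service.

```python
import time


def make_client():
    """Build an Azure OpenAI client from environment variables (variable names are an assumption)."""
    import os
    from openai import AzureOpenAI  # requires the `openai` package, v1.x

    return AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version=os.environ["OPENAI_API_VERSION"],  # must be a version that supports Assistants
    )


def ask_assistant(client, model: str, question: str, poll_seconds: float = 1.0) -> str:
    """Create an assistant and a thread, post one message, run it, and return the reply text."""
    assistant = client.beta.assistants.create(
        name="Finance Assistant",  # hypothetical name
        instructions="You are a helpful data analyst.",  # hypothetical instructions
        tools=[{"type": "code_interpreter"}],
        model=model,  # on Azure, this is your deployment name
    )
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(thread_id=thread.id, role="user", content=question)
    run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)

    # A run is asynchronous: poll until it leaves the in-flight states.
    while run.status in ("queued", "in_progress"):
        time.sleep(poll_seconds)
        run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    if run.status != "completed":
        raise RuntimeError(f"Run ended with status {run.status!r}")

    # Messages are listed newest first by default, so the first one is the reply.
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    reply = messages.data[0]
    return "".join(part.text.value for part in reply.content if part.type == "text")
```

With credentials in place, `ask_assistant(make_client(), model="<your-deployment-name>", question="What is 12 * 7?")` would return the assistant's answer as plain text. In a real application you would create the assistant once and reuse it, rather than creating a new one per question.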

You can customize your AI assistant by giving it instructions, choosing a model, and enabling tools.

Definition

A tool is what enables an LLM to perform a specific task or function. For example, the Code Interpreter tool allows the assistant to write and execute Python code, the Retrieval tool allows it to search and find information from various sources, and the Function calling tool allows it to use custom functions that you define. Tools help the assistant provide more useful and relevant responses to the user.

As of today, the Assistants API supports three types of tools:

  • Code Interpreter: This tool lets the assistant write and run Python code in a sandboxed environment. You can use it to solve math problems, analyze data, or generate charts and graphs. This is extremely powerful when you need to analyze structured data such as Excel files.
  • Function calling: This tool lets you describe custom functions to the assistant, which decides when to call them and with which arguments; your application executes the function and returns the result. You can use it to perform calculations, conversions, or other tasks.
  • Retrieval (coming soon): This tool lets you search and retrieve information from documents or web pages. You can use it to answer questions, provide facts, or suggest solutions.
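When you work through the SDK rather than the playground, tools are passed to the assistant as a list of small dictionaries. Here is a sketch of what that list can look like; the `convert_currency` function and its parameters are hypothetical and only illustrate the shape of a function definition.

```python
import json

# Built-in tools are enabled by type alone.
code_interpreter_tool = {"type": "code_interpreter"}

# A function tool describes a function that *your* code implements; the model
# only decides when to call it and with which JSON arguments.
convert_currency_tool = {
    "type": "function",
    "function": {
        "name": "convert_currency",  # hypothetical function name
        "description": "Convert an amount from one currency to another.",
        "parameters": {  # JSON Schema describing the arguments
            "type": "object",
            "properties": {
                "amount": {"type": "number"},
                "from_currency": {"type": "string", "description": "ISO 4217 code, e.g. USD"},
                "to_currency": {"type": "string", "description": "ISO 4217 code, e.g. EUR"},
            },
            "required": ["amount", "from_currency", "to_currency"],
        },
    },
}

# This list is what you would pass as `tools=` when creating the assistant.
tools = [code_interpreter_tool, convert_currency_tool]
print(json.dumps(tools, indent=2))
```

The better the description and parameter schema, the easier it is for the model to pick the right function and fill in valid arguments.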

The Assistants API makes it easy for you to manage the conversation state and context. You can create a Thread for each user and add Messages to it as the user asks questions. The Assistants API will automatically call the relevant tools and generate responses for you.
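One simple way to keep one thread per user is a small registry in your application. The sketch below assumes a client from the `openai` Python package; the `ThreadRegistry` class is a hypothetical helper, and a real application would persist the mapping in a database instead of in memory.

```python
class ThreadRegistry:
    """Maps each application user to one thread so follow-up questions keep their context."""

    def __init__(self, client):
        self._client = client
        self._thread_ids: dict[str, str] = {}  # in-memory only; an assumption for brevity

    def thread_id_for(self, user_id: str) -> str:
        # Create the thread lazily on the user's first message, then reuse it.
        if user_id not in self._thread_ids:
            thread = self._client.beta.threads.create()
            self._thread_ids[user_id] = thread.id
        return self._thread_ids[user_id]

    def add_user_message(self, user_id: str, text: str):
        # Every message from the same user lands in the same thread.
        return self._client.beta.threads.messages.create(
            thread_id=self.thread_id_for(user_id), role="user", content=text
        )
```

After adding a message, you would create a run on the same thread to get the assistant's reply.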

In this article, we are going to cover the Code Interpreter capabilities.

Building your first Assistant with Code Interpreter

Now that you know what the Assistants API is, let’s build your first assistant. In this section, we are going to cover three examples, all based on structured data: a finance assistant, a market analyst, and a quality checker.

Prerequisites

To create your first assistant, you will need the following:

  • An Azure subscription — Create one for free.
  • Access granted to Azure OpenAI in the desired Azure subscription (you can fill out the form here)
  • An Azure OpenAI resource with a gpt-4 (1106-preview) model deployed.

Finance Assistant

Data Used: Financial Sample Excel workbook

In this scenario, we are going to use Code Interpreter as a tool. You can enable it with the toggle:

You can then upload your .csv file by clicking “Add files”:

And that’s it! Let’s now ask some questions to our Finance Assistant.

We can also ask for the code the Assistant ran to answer the request:

We can also generate a line plot:

Cool! It worked perfectly on our structured table.
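Everything above was done in the playground, but the same setup can be scripted. Here is a rough sketch, assuming the `openai` Python package and a client built with `AzureOpenAI`. The `create_finance_assistant` helper, the assistant name, and the instructions are my own placeholders. Note that how files are attached has changed between API versions, so check which form your SDK and API version expect.

```python
def create_finance_assistant(client, csv_path: str, model: str):
    """Upload a data file and create a Code Interpreter assistant that can read it."""
    # Files meant for assistants are uploaded with purpose="assistants".
    with open(csv_path, "rb") as fh:
        uploaded = client.files.create(file=fh, purpose="assistants")

    return client.beta.assistants.create(
        name="Finance Assistant",  # hypothetical name
        instructions=(
            "You are a financial analyst. Use the uploaded file to answer "
            "questions, and write Python code when a calculation is needed."
        ),  # hypothetical instructions
        tools=[{"type": "code_interpreter"}],
        model=model,  # on Azure, this is your deployment name
        # Newer API versions attach files per tool via tool_resources;
        # the first preview used file_ids=[uploaded.id] instead.
        tool_resources={"code_interpreter": {"file_ids": [uploaded.id]}},
    )
```

This mirrors the two playground actions: enabling Code Interpreter and clicking “Add files”.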

Let’s now challenge the Assistant even more with another example.

Market Analyst

Data: Bitcoin Price (USD)

This is the configuration I’ve used:

Now, let’s start by asking for some graphs:

We can also ask for some basic statistics, like the day when the highest volume of transactions occurred (including the code needed to retrieve this information):

Again, the assistant was able to run the proper Python code to produce the answer to our query.
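Besides asking in the chat, you can read the code the assistant ran directly from the run steps. The sketch below assumes the `openai` Python package and an existing thread and run; `code_run_by_assistant` is a hypothetical helper name.

```python
def code_run_by_assistant(client, thread_id: str, run_id: str) -> list[str]:
    """Return the Python snippets that Code Interpreter executed during one run."""
    # Only the first page of steps is read here; long runs may need pagination.
    steps = client.beta.threads.runs.steps.list(
        thread_id=thread_id, run_id=run_id, order="asc"  # oldest step first
    )
    snippets = []
    for step in steps.data:
        if step.type != "tool_calls":
            continue  # skip steps that only created a message
        for call in step.step_details.tool_calls:
            if call.type == "code_interpreter":
                snippets.append(call.code_interpreter.input)
    return snippets
```

This gives you the code exactly as it was executed, which is handy when you want to review or log it rather than rely on the assistant's own summary.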

Quality checker

Data used: Apple Quality

In this last example, we will see how an Assistant can be configured to act as a professional quality checker. For this purpose, we will use a binary classification dataset about apples’ quality, based on their features.

Let’s start by asking for some basic statistics about this dataset.

We can also see some basic graphs about those statistics:

Now that we have an idea of the content of the dataset, let’s see whether the assistant can predict an apple’s quality, given some independent variables (the apple’s features in the dataset):

Cool! The assistant was able to forecast our apple’s quality. However, we might want to delve deeper into the reasoning behind it. Let’s ask the assistant to show us the code behind its response:

As you can see, the assistant was able to tell us the forecasted quality of our apple, using a logistic regression ML model (a sensible choice, since the task is a binary classification one). Plus, it was able to provide us with the code it ran to get to the result (as a data scientist, I would always double check whether the model leveraged the proper algorithm with the proper configuration!).
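If you want to double check what a logistic regression does under the hood, here is a minimal sketch written from scratch in plain Python. It is not the code the assistant generated (that would typically rely on a library): the two feature names and the six rows are made up for illustration and do not come from the Apple Quality dataset.

```python
import math


def sigmoid(z: float) -> float:
    """Squash a real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this row: predicted probability minus true label.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        # Step against the average gradient.
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x) -> int:
    """Return 1 ('good') if the predicted probability is at least 0.5, else 0 ('bad')."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return int(sigmoid(z) >= 0.5)


# Made-up toy rows: [sweetness, crunchiness] on a standardized scale; 1 = good, 0 = bad.
rows = [[2.0, 1.5], [1.5, 2.0], [1.0, 1.0], [-1.0, -1.5], [-2.0, -1.0], [-1.5, -2.0]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic_regression(rows, labels)
print(predict(weights, bias, [1.8, 1.2]), predict(weights, bias, [-1.7, -0.8]))
```

On this toy data the model labels the first apple as good (1) and the second as bad (0). When reviewing the assistant's own code, the things I would look at are the same as here: which features go in, how the label is encoded, and how the decision threshold is applied.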

Conclusions

The Assistants API empowers developers to create intelligent and responsive AI assistants within their own applications. These assistants can understand user queries, leverage powerful models, and provide context-aware responses. Here are the key takeaways:

  1. Versatility: The Assistants API supports three essential tools — Code Interpreter, Retrieval, and Function calling — allowing developers to tailor their assistants’ capabilities to specific use cases.
  2. Customization: Developers can define an Assistant’s behavior through instructions, choose from various GPT-3.5 and GPT-4 (also with vision) models, and even enable fine-tuned models in the future.
  3. Integration Flow: The typical integration involves creating an Assistant, starting a Thread for user conversations, adding Messages to the Thread, and running the Assistant to trigger responses. The API is actively evolving, with more features on the horizon.
  4. Playground Exploration: Developers can experiment with the Assistants API using the Assistants playground, which provides a no-code environment for learning and testing.

In summary, the Assistants API opens up exciting possibilities for building human-like conversational agents, whether you’re creating a math tutor, a customer support chatbot, or any other interactive application.

Stay tuned to see what will come next!
