FS Ndzomga

Summary

The article discusses the creation of a ReAct (Reasoning and Acting) AI agent using Python, focusing on improving the reasoning abilities of LLMs.

Abstract

The article begins by explaining what AI agents are and their potential impact on daily life. It then discusses the challenges with current LLM-based agents, such as reasoning and reliability, and introduces the ReAct strategy to address these issues. The author explains the steps involved in the ReAct chain, including thinking, choosing an action and a tool, determining the input for the action, taking action and observing the result, and determining whether to end the reasoning chain or not. The article provides code snippets and examples for implementing the Agent class, using OpenAI's LLM as the brain, and utilizing the anonLLM library to generate structured JSON outputs. The author also discusses the importance of tools for the agent to take action and receive feedback, providing examples of tools like Wikipedia search and calculation. Finally, the article concludes with references to further reading on the topic.

Opinions

  • The author believes that AI agents will significantly impact daily life, with the potential to delegate workflows and achieve faster results.
  • The author suggests that the ReAct strategy can improve the reasoning abilities of LLM-based agents by combining reasoning and acting, which can help ground LLM's responses.
  • The author emphasizes the importance of structured JSON outputs for communication between different components of the Agent stack, using the anonLLM library to handle this task.
  • The author provides several examples of tools that can be added to the agent, highlighting the versatility of the Agent class and its potential applications.

Building a ReAct (Reasoning and Acting) AI Agent From Scratch Using Python

Photo by Possessed Photography on Unsplash

An AI agent is a system or software that can autonomously perform actions or tasks to achieve specific objectives, based on the environment it operates within and the inputs it receives. AI agents are designed to make decisions and act upon them.

AI agents are probably the next innovative way AI will change our daily lives. With AI agents, we can completely delegate some workflows and get results faster.

The challenge with AI agents is reasoning and reliability. Most AI agents that have emerged recently are built on LLMs, but LLMs are not reasoning engines per se. Only GPT-4 is reasonably good at faking reasoning.

To improve the reasoning abilities of LLMs, several prompting strategies have been invented, including, but not limited to, Chain-of-Thought, Tree of Thoughts, and ReAct (Reasoning and Acting).

In this article, I will show you how to implement a ReAct AI agent in Python from first principles.

ReAct is a strategy that combines reasoning and acting to increase the overall reliability and factuality of LLM-based agents, since acting and getting feedback from actions can help ground the LLM’s responses.

Creating the Agent

The first step is to create the agent class. The agent is defined by an LLM that acts as its brain. The agent should also have access to a set of tools that help it take action and receive feedback from those actions. Finally, the agent solves every task by going through a “ReAct chain” made of the following steps: thinking, choosing an action and a tool, determining the input for the action, taking the action and observing the result, and deciding whether to end the reasoning chain.
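The five steps can be sketched as a plain loop. This is only an illustration under assumptions: `ScriptedBrain`, `run_react_chain`, the prompt wording, and the `FINAL:` convention are hypothetical names I introduce here, and the scripted brain replaces a real LLM so the example runs without any API call.

```python
from typing import Callable, Dict, List, Tuple


class ScriptedBrain:
    """Hypothetical stand-in for an LLM: returns canned replies in order."""

    def __init__(self, replies: List[str]) -> None:
        self._replies = list(replies)

    def ask(self, prompt: str) -> str:
        return self._replies.pop(0)


def run_react_chain(
    request: str,
    brain: ScriptedBrain,
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Repeat the five ReAct steps until the brain signals a final answer."""
    trace: List[str] = []
    for _ in range(max_steps):
        # 1. Think about the request, given everything observed so far.
        thought = brain.ask(f"Request: {request}\nHistory: {trace}\nThink.")
        trace.append(f"Thought: {thought}")
        # 2. Choose an action, i.e. the name of a tool.
        tool_name = brain.ask(f"Pick one tool from {sorted(tools)}.")
        # 3. Determine the input for that tool.
        tool_input = brain.ask(f"Give the input for {tool_name}.")
        trace.append(f"Action: {tool_name}({tool_input!r})")
        # 4. Take the action and observe the result.
        observation = tools[tool_name](tool_input)
        trace.append(f"Observation: {observation}")
        # 5. Decide whether to stop (assumed convention: "FINAL: <answer>").
        verdict = brain.ask(f"Observation: {observation}\nIs this enough?")
        if verdict.startswith("FINAL:"):
            return verdict[len("FINAL:"):].strip(), trace
    return "No answer within the step limit.", trace


tools = {"word_count": lambda text: str(len(text.split()))}
brain = ScriptedBrain([
    "I need to count the words.",      # step 1: thought
    "word_count",                      # step 2: tool choice
    "reason then act",                 # step 3: tool input
    "FINAL: The phrase has 3 words.",  # step 5: verdict
])
answer, trace = run_react_chain("How many words in 'reason then act'?", brain, tools)
print(answer)
```

The `max_steps` cap is a design choice of this sketch: it keeps the chain from looping forever when the model never decides it has enough information.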

ReAct Chain

In the Agent class definition below, you will notice that I use an OpenAI LLM as the Brain of my agent. To do so, I use the anonLLM library that I developed a few months ago. The core goal of anonLLM is to hide personally identifiable information from LLM APIs, but here I mainly use it as a wrapper around the Python client for the OpenAI API. The advantage of using anonLLM is its ability to generate structured JSON outputs.

Why is it important? Because we need a way to pass information between the steps of a ReAct chain. For example, when defining the parameters of an action, you want to be able to pass these parameters directly to the action function. But an LLM generates text with unpredictable formatting, which makes the results hard to extract reliably with techniques such as regex parsing. Generating structured JSON outputs streamlines the communication between the different components of our Agent stack. The anonLLM library will handle that for us.

To generate a structured JSON output using anonLLM, we will just have to define and pass a BaseModel Class as output_format, like this:
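The idea behind an output format can be illustrated without calling anonLLM at all. The sketch below is an assumption-laden stand-in, not anonLLM's API: it uses a standard-library dataclass in place of a Pydantic BaseModel so that it runs without dependencies, `parse_structured` is a hypothetical helper, and the two `ToolChoice` fields are invented for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ToolChoice:
    # Hypothetical fields; a Pydantic BaseModel would play this role with anonLLM.
    tool_name: str
    reason_of_choice: str


def parse_structured(raw: str, output_format):
    """Parse raw LLM text as JSON and build an instance of output_format.

    Raises ValueError if the text is not a JSON object, or if a field is
    missing or has the wrong type.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    kwargs = {}
    for field in fields(output_format):
        if field.name not in data:
            raise ValueError(f"missing field: {field.name}")
        if not isinstance(data[field.name], field.type):
            raise ValueError(f"wrong type for field: {field.name}")
        kwargs[field.name] = data[field.name]
    return output_format(**kwargs)


raw_reply = '{"tool_name": "calculator", "reason_of_choice": "The request is arithmetic."}'
choice = parse_structured(raw_reply, ToolChoice)
print(choice.tool_name)
```

Once the reply is validated into a typed object, the next step of the chain can read `choice.tool_name` directly instead of searching free text for it.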

Here is how I first defined the Agent class. The agent basically loops through a ReAct chain until it finds the answer to the user request. This is a first implementation; I will make it more robust in future articles.

The Agent class is designed to interact with a language model (referred to as Brain), process requests, and utilize various tools to generate responses. Here's a breakdown of its functionality:

Initialization (__init__):

  • The agent is initialized with an API key and a model specification (defaulting to ‘gpt-4’).
  • self.brain is an instance of Brain, the language model interface (here, an OpenAI model accessed through anonLLM).
  • The agent maintains a list of tools (instances of the Tool class), messages for tracking conversation history, and manages tokens to adhere to API limits.

Adding Tools (add_tool):

  • Tools are added to the agent using this method. Each tool is an instance of the Tool class, encapsulating a specific functionality (like Wikipedia search, calculation, etc.).

Message Handling (append_message):

  • This method adds messages to the agent’s history, managing the token count to stay within limits. Older messages are removed if the token limit is exceeded.
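A minimal sketch of this trimming behaviour, under assumptions: `MessageHistory` and `estimate_tokens` are hypothetical names, and the token count is a crude one-token-per-word estimate rather than a real tokenizer.

```python
from typing import Dict, List


def estimate_tokens(text: str) -> int:
    """Crude estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


class MessageHistory:
    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.messages: List[Dict[str, str]] = []
        self.token_count = 0

    def append_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        self.token_count += estimate_tokens(content)
        # Drop the oldest messages until the history fits the budget again,
        # always keeping at least the newest message.
        while self.token_count > self.max_tokens and len(self.messages) > 1:
            oldest = self.messages.pop(0)
            self.token_count -= estimate_tokens(oldest["content"])


history = MessageHistory(max_tokens=8)
history.append_message("user", "What is the capital of France?")
history.append_message("assistant", "Thought: search Wikipedia first")
print([message["role"] for message in history.messages])
```

With a budget of 8, the second message pushes the estimate to 10, so the oldest message is dropped and only the assistant message remains.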

Reacting to Input (react):

  • This is the entry point for interacting with the agent. It receives an input request, appends it to the messages, sets the request, and initiates the thinking process.

Think Process (think):

  • The agent generates a thought process based on the current request and background information (previous messages). It uses self.brain to generate a response to the prompt.

Action Choice (choose_action):

  • After thinking, the agent chooses an action (a tool to use) based on the request and background info. This choice is made by generating a response using self.brain, formatted as a ToolChoice.

Executing Action (action):

  • The agent executes the chosen action by calling the act method of the selected tool. It first determines the required inputs for the tool function by inspecting its parameters.
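Inspecting a function's parameters can be done with the standard `inspect` module. The following is a sketch under assumptions: `required_inputs`, `call_with_inputs`, and the `power` example are hypothetical helpers, and the `inputs` dictionary stands in for whatever the LLM produced.

```python
import inspect
from typing import Any, Callable, Dict, List


def required_inputs(func: Callable[..., Any]) -> List[str]:
    """Return the names of the named parameters that have no default value."""
    named_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.default is inspect.Parameter.empty and param.kind in named_kinds
    ]


def call_with_inputs(func: Callable[..., Any], inputs: Dict[str, Any]) -> Any:
    """Call func with the provided inputs, ignoring keys it does not accept."""
    missing = [name for name in required_inputs(func) if name not in inputs]
    if missing:
        raise ValueError(f"missing inputs: {missing}")
    accepted = inspect.signature(func).parameters
    return func(**{key: value for key, value in inputs.items() if key in accepted})


def power(base: float, exponent: float = 2.0) -> float:
    return base ** exponent


print(required_inputs(power))
print(call_with_inputs(power, {"base": 3.0, "note": "ignored"}))
```

The list returned by `required_inputs` is what the agent would ask the LLM to fill in before calling the tool.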

Observation and Response (observation):

  • After performing an action, the agent observes the results and decides whether it has enough information to provide a final answer. If not, it goes back to the thinking process.

Tools

After defining the agent, I needed to define some tools for it to use. A tool is basically a function that represents the action to be taken, for example performing a calculation, searching Wikipedia, or retrieving the current date.

Here is the definition of the Tool class:
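One possible shape for such a class, as a hedged sketch: the `act` method name comes from the description above, while the `name`, `description`, and `func` fields and the error-handling behaviour are assumptions of mine.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    name: str                 # identifier the LLM uses to pick the tool
    description: str          # short text shown to the LLM when it chooses
    func: Callable[..., Any]  # the Python function that performs the action

    def act(self, **kwargs: Any) -> str:
        """Run the wrapped function and return the result as text."""
        try:
            return str(self.func(**kwargs))
        except Exception as exc:
            # Return the error as an observation instead of crashing the chain.
            return f"Error while running {self.name}: {exc}"


def add(a: float, b: float) -> float:
    return a + b


adder = Tool(name="add", description="Add two numbers.", func=add)
print(adder.act(a=2, b=3))
print(adder.act(a=2))
```

Returning errors as text is a design choice of this sketch: the failure becomes an observation the agent can reason about in its next thought.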

Here is how I concretely defined tools and added them to an agent before testing it:
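Two of the tools mentioned above can be sketched with the standard library only. Everything here is illustrative: `calculate`, `current_date`, and the `toolbox` dictionary are hypothetical names, the dictionary stands in for registering tools on an agent, and the Wikipedia search tool is left out because it needs network access.

```python
import ast
import datetime
import operator

# Arithmetic operators the calculator accepts; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression: str) -> float:
    """Evaluate a basic arithmetic expression without using eval()."""

    def evaluate(node: ast.AST) -> float:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval").body)


def current_date() -> str:
    """Return today's date in ISO format (YYYY-MM-DD)."""
    return datetime.date.today().isoformat()


# Stand-in for adding tools to an agent: name -> (description, function).
toolbox = {
    "calculator": ("Evaluate an arithmetic expression.", calculate),
    "date": ("Return today's date.", current_date),
}

description, function = toolbox["calculator"]
print(function("(2 + 3) * 4"))
```

Walking the syntax tree instead of calling `eval` matters here because the expression comes from an LLM, so it should be treated as untrusted input.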

And finally, here are the reasoning traces of the agent:

Not bad, is it?

Now, a lot more can be done to strengthen the reasoning abilities of this AI agent, like implementing RAG inside the agent, giving it access to the internet and more. The goal of this article was just to give you a taste of what can be done to improve the reasoning capabilities of AI agents.

Happy coding!

Full code

References for the extra curious

[1] J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, D. Zhou, Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022), arXiv:2201.11903

[2] S. Yao, D. Yu, J. Zhao, I. Shafran, T.L. Griffiths, Y. Cao, K. Narasimhan, Tree of Thoughts: Deliberate Problem Solving with Large Language Models (2023), arXiv:2305.10601

[3] S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, Y. Cao, ReAct: Synergizing Reasoning and Acting in Language Models (2022), arXiv:2210.03629
