This context introduces autonomous AI agents such as Auto-GPT, BabyAGI, AgentGPT, Microsoft Jarvis, and ChaosGPT, which can break down complex tasks into smaller tasks and use various models to complete them.
Abstract
The context begins by discussing the limitations of current foundation models like ChatGPT, which struggle with efficiently completing laborious tasks made up of distinct steps. It then introduces autonomous AI agents as a solution to this problem, explaining that they can take user input, break it down into smaller tasks, and tackle them one at a time. The context provides examples of several autonomous AI agents, including AgentGPT, Auto-GPT, BabyAGI, and Microsoft Jarvis, and discusses their features and capabilities. It also mentions ChaosGPT, an autonomous agent that aims to destroy humanity, and highlights the potential risks associated with these agents.
Bullet points
Current foundation models struggle with efficiently completing complex tasks made up of distinct steps.
Autonomous AI agents can take user input, break it down into smaller tasks, and tackle them one at a time.
Examples of autonomous AI agents include AgentGPT, Auto-GPT, BabyAGI, Microsoft Jarvis, and ChaosGPT.
AgentGPT is a browser-based platform that requires no installation or OpenAI keys for test usage.
Auto-GPT is an open-source agent library based on GPT-4 that chains together LLM "thoughts" to autonomously achieve tasks.
BabyAGI is an AI-powered task management system that can autonomously solve tasks based on the results of previous tasks.
Microsoft Jarvis is a collaborative system that uses a Large Language Model (LLM) as the central controller and various expert models as collaborative executors.
ChaosGPT is an autonomous agent that aims to destroy humanity, highlighting the potential risks associated with these agents.
Autonomous AI agents
ChatGPT, Next Level: Meet 10 Autonomous AI Agents: Auto-GPT, BabyAGI, AgentGPT, Microsoft Jarvis, ChaosGPT & friends
The ultimate curated list of autonomous AI agents: complete with tools, resources and examples
ChatGPT and many of the other current foundation models are great. They can answer innumerable questions, create AI art that rivals human masterpieces, analyze photos, and in some cases, they even show what we would call intelligence.
But there’s one simple challenge they’ve yet to conquer — to efficiently complete a laborious task made up of distinct steps.
Currently, AI models are like eager office interns, tireless and enthusiastic but desperately in need of guidance. They require monitoring, frequent directions, and vigilance against fudging or half-truths (aka “hallucinations”).
This is where AI agents step in: they can do this autonomously. These autonomous helpers take user input, break it down into smaller tasks with the assistance of LLMs, and tackle them one at a time. The agents store the results and use them, if necessary, for subsequent steps in the process. As a result, AI agents can handle complex tasks and access various foundation models that are not limited to language alone. For example, an agent might independently decide to use code, video, or voice models, or to employ search engines or calculation tools to accomplish the task you’ve given it.
Image Credit: Maximilian Vogel
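To make the loop described above a bit more concrete, here is a minimal sketch in Python. It is not taken from any of the agents discussed in this article: `plan`, `execute` and `run_agent` are hypothetical names, and both model calls are replaced by stand-in functions so the example runs without any API key.

```python
from typing import Callable, Dict, List


def plan(goal: str) -> List[str]:
    """Stand-in for an LLM call that splits a goal into ordered subtasks."""
    return [f"research: {goal}", f"draft: {goal}", f"review: {goal}"]


def execute(task: str, memory: Dict[str, str]) -> str:
    """Stand-in for a model or tool call; it can see the earlier results."""
    return f"done({task}) using {len(memory)} earlier result(s)"


def run_agent(
    goal: str,
    planner: Callable[[str], List[str]] = plan,
    executor: Callable[[str, Dict[str, str]], str] = execute,
) -> Dict[str, str]:
    memory: Dict[str, str] = {}        # results are stored for later steps
    for task in planner(goal):         # subtasks are tackled one at a time
        memory[task] = executor(task, memory)
    return memory
```

Calling `run_agent("compare task management tools")` returns a dictionary that maps each subtask to its result. In a real agent, `planner` and `executor` would wrap calls to language models, search engines or other tools.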
The autonomous agents are not simply smarter than the foundation models on which they are based, but open up a completely new dimension: They are capable of “slow thinking” (Kahneman’s “system 2”). They solve complicated questions, the kind where one crawls toward the goal bit by bit via intermediate results. Until now, slow thinking was only possible for LLMs via prompting techniques such as chain-of-thought, and even then only to a very limited extent.
While the addressable level of complexity does not increase significantly with agent AIs, they cover an incredible amount of additional area in the problem space (dashed red box) due to their ability to solve complicated problems: In other words, everything that requires more than a few steps to solve.
Content:
Intro: What are Autonomous AI Agents?
Must-know AI Platforms: A deep-dive into AgentGPT, Auto-GPT, BabyAGI, Jarvis & more, resources included
The Completely Incomplete List of AI Agent Platforms
Outlook: From sterile AI to powerful and dangerous agents
Intro: What are autonomous AI agents?
Let’s say we want to use an AI model to create a deck of 52 cards, with each card featuring a different musician. We’d also like to replace the usual card suits, such as clubs or hearts, with music genres such as soul or house.
Is it possible for an AI model to complete such a complex task?
The simple answer is no.
While a language model can compile a list of genres and artists, we need at least one additional model (AI art model such as Midjourney) to produce the visuals. We may also need additional systems to search the internet and to store contents.
We could write a batch processing script doing all this.
Or — and here our agent AIs fly in — we could just provide a prompt telling what we want to do, and the agent writes the batch script, executes it and monitors the outcome.
Usually, AI agents use various external models both for the single steps (e.g., selecting an artist for a single card) and for framework tasks (e.g., generating a task list). They outsource the thinking steps while storing information, tracking tasks, managing the interface, and orchestrating the entire process.
Image credit: Maximilian Vogel, note: This is an illustrative example only — the results of most current AI agents are not as overwhelming.
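As a rough illustration of the card deck example, the sketch below separates the framework task (choosing genres) from the single steps (picking an artist, rendering a card). Everything here is an assumption made for illustration: the function names are hypothetical, the model calls are stubs, "jazz" and "punk" are genres I added to reach four suits, and the artist names are placeholders rather than real musicians.

```python
from dataclasses import dataclass
from typing import List

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]


@dataclass
class Card:
    genre: str    # replaces the usual suit
    rank: str
    artist: str
    image: str    # reference to the generated artwork


def list_genres() -> List[str]:
    """Framework task: a language model would propose the four genres."""
    return ["soul", "house", "jazz", "punk"]


def pick_artist(genre: str, rank: str) -> str:
    """Single step: a language model would name a musician for one card."""
    return f"{genre} artist {rank}"    # placeholder, not a real musician


def render_card(artist: str, genre: str) -> str:
    """Single step: an image model would draw the card; we return a file name."""
    return f"{genre}_{artist.replace(' ', '_')}.png"


def build_deck() -> List[Card]:
    """Orchestration: track the tasks and store every intermediate result."""
    deck: List[Card] = []
    for genre in list_genres():
        for rank in RANKS:
            artist = pick_artist(genre, rank)
            deck.append(Card(genre, rank, artist, render_card(artist, genre)))
    return deck
```

The orchestration code itself contains no "thinking": it only loops, stores results and hands each step to a (here stubbed) model, which is the division of labor described above.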
Autonomous AI agents have only emerged in the last few weeks, but they’re already developing at breakneck speed. Even Microsoft is getting in on the action with Jarvis / HuggingGPT. I’ll give a brief introduction to some of the main AI agents and discuss possible impacts on application development, along with AI safety.
AgentGPT
Assemble, configure, and deploy autonomous AI Agents in your browser.
This is the first model in the list, not because it is the most important, but because no installation or OpenAI keys are needed.
Let’s deep-dive into how AgentGPT managed a job I gave it:
My task: “Find the 3 most widely used task management software tools for usage in a small company and compare them in terms of price, scope, ease of installation”
Reasoning:
Image credit: Maximilian Vogel / AgentGPT
Some intermediate output:
Image credit: Maximilian Vogel / AgentGPT
Many, many more lines of output later, we have the final result (the whole process took approximately 3 minutes):
Image credit: Maximilian Vogel / AgentGPT
Auto-GPT
An experimental and open-source agent library based on GPT-4. It chains together LLM “thoughts” to autonomously achieve whatever task you set. Auto-GPT is one of the first platforms to run GPT-4 fully and autonomously, pushing the boundaries of what is possible with AI.
Features:
Accesses the internet for queries and gathering information
Demo-Task: Look for a seasonal event on the internet and create a recipe for it.
Baby AGI
Baby AGI is an AI-powered task management system. The system uses OpenAI and Pinecone APIs to create, prioritize, and execute tasks. The appeal of Baby AGI is in its ability to autonomously solve tasks based on the results of previous tasks and to keep a predefined objective. It also prioritizes tasks efficiently.
Mode of work:
Pulls up the first task from the task list.
Sends the task to the execution agent, which uses OpenAI’s API and Llama to complete the task based on the context.
Enriches the result and stores it in Pinecone.
Creates new tasks and reprioritizes the task list based on the objective and the result of the previous task.
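The four steps above can be sketched as a small loop. This is a simplified illustration and not the actual Baby AGI code: the three "agents" are stub functions with hypothetical names, a plain Python list stands in for the Pinecone index, and the prioritization rule (shortest task description first) is an arbitrary placeholder for an LLM call.

```python
from collections import deque
from typing import Deque, List, Tuple


def execute_task(objective: str, task: str, context: List[str]) -> str:
    """Stand-in for the execution agent (an LLM call in the real system)."""
    return f"result of '{task}'"


def create_tasks(objective: str, task: str, result: str) -> List[str]:
    """Stand-in for task creation: each top-level task spawns one follow-up."""
    return [] if task.startswith("follow up") else [f"follow up on {task}"]


def prioritize(tasks: Deque[str], objective: str) -> Deque[str]:
    """Stand-in for prioritization: shortest description first."""
    return deque(sorted(tasks, key=len))


def run(objective: str, first_task: str, max_steps: int = 10) -> List[Tuple[str, str]]:
    tasks: Deque[str] = deque([first_task])
    store: List[Tuple[str, str]] = []          # stands in for the Pinecone index
    while tasks and len(store) < max_steps:
        task = tasks.popleft()                 # pull the first task from the list
        context = [result for _, result in store]
        result = execute_task(objective, task, context)       # execute it
        store.append((task, result))           # store the result
        tasks.extend(create_tasks(objective, task, result))   # create new tasks
        tasks = prioritize(tasks, objective)   # reprioritize the task list
    return store
```

The `max_steps` guard is my addition: without some stopping condition, a loop that keeps generating new tasks can run (and consume API tokens) indefinitely.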
Task example: Find popular topics that don’t have enough documentation, for articles for my Linux tutorial blog:
Task Example: Plan a romantic dinner for my wife this Friday night in central Singapore:
JARVIS / HuggingGPT
Jarvis, or HuggingGPT, is a collaborative system comprising a Large Language Model (LLM) as the central controller and numerous expert models as collaborative executors, sourced from the Hugging Face Hub. This agent can employ LLMs as well as other models. The workflow of the system consists of four stages:
• Task Planning: Uses ChatGPT to analyze user requests to discern intent and breaks them down into manageable tasks.
• Model Selection: To solve the given tasks, ChatGPT selects the best suited expert models from Hugging Face, based on their descriptions.
• Task Execution: Invokes and executes each selected model, subsequently returning the results to ChatGPT.
• Response Generation: Finally, it uses ChatGPT to integrate the prediction of all models, and generate a comprehensive response.
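The four stages can be illustrated with a toy pipeline. This is only a sketch under heavy assumptions: the model registry, the keyword-based planner and all function names are hypothetical, and simple string functions stand in both for ChatGPT and for the expert models from Hugging Face.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: model name -> (description, callable).
MODELS: Dict[str, Tuple[str, Callable[[str], str]]] = {
    "captioner": ("captioning: describe an image", lambda x: f"caption of {x}"),
    "translator": ("translation: translate text", lambda x: f"German({x})"),
}


def plan_tasks(request: str) -> List[str]:
    """Stage 1: the controller LLM would derive tasks; here, keyword matching."""
    keywords = {"describe": "captioning", "translate": "translation"}
    return [task for word, task in keywords.items() if word in request.lower()]


def select_model(task: str) -> str:
    """Stage 2: choose the model whose description mentions the task."""
    for name, (description, _) in MODELS.items():
        if task in description:
            return name
    raise LookupError(f"no model for task {task!r}")


def answer(request: str, payload: str) -> str:
    outputs: List[str] = []
    for task in plan_tasks(request):          # stage 1: task planning
        name = select_model(task)             # stage 2: model selection
        _, model = MODELS[name]
        payload = model(payload)              # stage 3: execution, results chained
        outputs.append(f"{name}: {payload}")
    return "; ".join(outputs) or "nothing to do"   # stage 4: response generation
```

For example, `answer("Describe this photo and translate the description", "photo.jpg")` first runs the stub captioner and then feeds its output into the stub translator.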
AutoGPT.js
Create a custom AI agent, name it and assign it a mission for any goal you can imagine — all while running within the browser. Watch while it generates tasks, executes them, and learns from the outcomes for optimal results.
Free-AUTO-GPT-with-NO-API
Free AUTOGPT with NO API is a repository that offers a simple version of Autogpt, an autonomous AI agent capable of performing tasks independently. Unlike other versions, the implementation does not rely on any paid OpenAI API, making it accessible to anyone.
Agent-LLM
Agent-LLM is an Artificial Intelligence Automation Platform designed to power efficient AI instruction management across multiple providers. The agents are equipped with adaptive memory, and this versatile solution offers a powerful plugin system that supports a wide range of commands, including web browsing. With growing support for numerous AI providers and models, Agent-LLM is constantly evolving to empower diverse applications.
AutoGPT-Next-Web
1. Free one-click deployment with Vercel in 1 minute
2. Improved local support: After typing in Chinese, the content will be displayed in Chinese instead of English
3. UI designed to match AgentGPT, responsive design, and support for dark mode
4. Have your own domain? Even better, after binding, you can quickly access it anywhere without barriers
5. Support access code control, only you or trusted individuals can use the website
MiniGPT-4
Enhancing Vision-language Understanding with Advanced Large Language Models
Micro-GPT
MicroGPT is a simple and effective autonomous agent compatible with GPT-3.5-Turbo and GPT-4. It combines robust prompting, a minimal set of tools and short-term memory (Chain of Thoughts). Data augmentation via vector stores will be added soon.
Teenage-AGI
An(other) OpenAI and Pinecone-based agent. Process steps when getting a user query:
1. AI vectorizes the query and stores it in a Pinecone Vector Database
2. AI looks inside its memory and finds memories and past queries that are relevant to the current query
3. AI thinks about what action to take
4. AI stores the thought from Step 3
5. Based on the thought from Step 3 and relevant memories from Step 2, AI generates an output
6. AI stores the current query and its answer in its Pinecone vector database memory
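The memory steps above can be sketched with a toy vector store. This is an illustration only, not Teenage-AGI's code: word counts stand in for real embeddings, an in-memory list stands in for Pinecone, the "thought" is a fixed string instead of an LLM call, and all names are hypothetical. As a simplification, the query is written to memory only once, together with its answer, so that it does not match itself during retrieval.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real agent would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Memory:
    """In-memory stand-in for the Pinecone vector database."""

    def __init__(self) -> None:
        self.items: List[Tuple[Counter, str]] = []

    def store(self, text: str) -> None:
        self.items.append((embed(text), text))

    def recall(self, query: str, k: int = 2) -> List[str]:
        q = embed(query)                                   # vectorize the query
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for vec, text in ranked[:k] if cosine(q, vec) > 0]


def handle(query: str, memory: Memory) -> str:
    relevant = memory.recall(query)                    # find relevant memories
    thought = f"use {len(relevant)} memories"          # "think" (LLM call in practice)
    memory.store(f"thought: {thought}")                # store the thought
    answer = f"answer to '{query}' ({thought})"        # generate the output
    memory.store(f"query: {query} answer: {answer}")   # store query and answer
    return answer
```

Because every query and thought ends up in the store, later queries can retrieve earlier exchanges, which is what gives this kind of agent its long-term memory.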
Camel
To address the challenges of achieving autonomous cooperation, Cameleers have introduced a new communicative agent framework which is called role-playing.
ai-legion
AI Legion is an LLM-powered autonomous agent platform.
Tools:
Xircuits
The Xircuits toolkit provides a comprehensive set of components for experimenting and creating Collaborative Large Language Model-based automatons (Agents) in the style of BabyAGI and Auto-GPT. By default, the toolkit comes with BabyAGI agents, but it can easily be modified to accommodate your own custom prompts.
gptrpg
Contains:
1. A basic RPG-like environment for an LLM-enabled AI Agent to inhabit.
2. A basic AI Agent connected to the OpenAI API to exist in the environment, serving as a proof of concept.
SFighterAI
SFighterAI features an AI agent trained through deep reinforcement learning to defeat the final boss in ‘Street Fighter II: Special Champion Edition’. The AI agent makes decisions based solely on the game screen’s RGB pixel values, achieving a 100% win rate in some scenarios.
Doomsday agents
ChaosGPT
ChaosGPT, aka the autonomous agent trying to destroy humanity. It has failed miserably so far due to lack of access to weapons of mass destruction. It is nevertheless fascinating to observe its attempts for world domination, especially as its underlying models are trained on humanity’s collective ideas about the topic.
Let’s hope it runs out of OpenAI tokens before achieving its goal…it seems to have been eerily quiet in the last few days.
Some final thoughts: From sterile AI to powerful and dangerous Agents
AI agents offer more than just an improvement on foundation models—they add a new dimension altogether. While they do not outperform classic foundation models when it comes to executing simple and specific tasks, they excel in breaking down complex tasks into smaller ones.
If foundation models get better in the future, they will not replace AI agents, but make them more powerful still.
Autonomous agents can:
Integrate different types of models (language, code, AI art, strategy and many more)
Integrate non-foundation model components such as search engines and calculation engines.
Branch into task sub-branches
Verify and rewrite output from one model using another model
Try something, check the results, accept it if it works or try something different
Run continuously and process ongoing input (e.g., controlling a running system over time)
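Two of these capabilities, trying something and checking the result, and verifying one model's output with another, boil down to the same pattern. Here is a minimal sketch with hypothetical names: each "strategy" would wrap a different model or prompt in a real agent, and the checker could be a second model or a test suite. In this toy version the strategies return fixed strings and the checker simply verifies the arithmetic.

```python
from typing import Callable, Iterable, Optional


def solve(
    candidates: Iterable[Callable[[], str]],
    check: Callable[[str], bool],
) -> Optional[str]:
    """Try each strategy in turn; accept the first output that passes the check."""
    for attempt in candidates:
        output = attempt()
        if check(output):      # a second model or a test suite could sit here
            return output
    return None                # every strategy failed


# Hypothetical strategies: the first one "hallucinates", the second is correct.
strategies = [lambda: "2 + 2 = 5", lambda: "2 + 2 = 4"]


def arithmetic_ok(claim: str) -> bool:
    """Toy verifier: parse 'a + b = c' and check the sum."""
    left, right = claim.split("=")
    a, b = left.split("+")
    return int(a) + int(b) == int(right)
```

Here `solve(strategies, arithmetic_ok)` rejects the first claim and accepts the second. The value of the pattern depends entirely on the quality of the check: a verifier that can be fooled will pass bad output along.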
In short — AI agents allow the development of business applications without much need for software development. High-level processing schemas can, to a large extent, be specified in natural language, just as the granular description for a single task can be given to a foundation model.
Classic foundation models like GPT-4 are rather “sterile” in a positive sense. Meaning that if we initiate a task we won’t end up with an unintended sequence of actions with potentially disastrous outcomes. GPT-4 simply answers your question and reverts back to base after the end of a session. This is in contrast to how AI agents of the future could behave. Based on user instructions, they could set off a chain of actions that we have no way of anticipating.
A future AI agent (unlike the ones we discussed above) could create plans and actions which are beyond the control of any human.
If an agent is connected to the internet, it may do things it considers necessary to complete a task — things that were not intended by a human user — such as hacking into cloud systems to retrieve information. If an agent is able to train models or configure future instances of itself in order to complete tasks, a huge AI alignment problem could arise: Systems may emerge which are far beyond human control.
If you know of more agent AIs, resources, or a story about autonomous AIs, please drop me a note (e.g., by responding to this article). Hit the speech bubble down here: ↓🗨