Workshop 5: Build AutoGen Collaborative Autonomous Agents

What is AutoGen

AutoGen is a framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen agents are customizable, conversable, and seamlessly allow human participation. They can operate in various modes that employ combinations of LLMs, human inputs, and tools.

AutoGen agents
  • AutoGen enables building next-gen LLM applications based on multi-agent conversations with minimal effort. It simplifies the orchestration, automation, and optimization of complex LLM workflows. It maximizes the performance of LLMs and helps overcome their weaknesses.

  • It supports diverse conversation patterns for complex workflows. With customizable and conversable agents, developers can use AutoGen to build a wide range of conversation patterns concerning conversation autonomy, the number of agents, and agent conversation topology.

  • It provides a collection of working systems with different complexities. These systems span a wide range of applications from various domains and complexities. This demonstrates how AutoGen can easily support diverse conversation patterns.

  • AutoGen provides enhanced LLM inference. It offers utilities like API unification and caching, and advanced usage patterns, such as error handling, multi-config inference, context programming, etc.
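
To make the last point concrete, here is a minimal sketch of those inference utilities, assuming the pyautogen package and an OAI_CONFIG_LIST file like the one created later in this workshop. The fallback behavior across config entries is my understanding of how AutoGen uses a multi-entry config list; treat the details as an assumption rather than a definitive recipe.

```python
import autogen

# Load model entries from the OAI_CONFIG_LIST file created later in this
# workshop. With more than one entry, AutoGen can try the configs in order,
# which (as I understand it) is the basis of its multi-config fallback.
config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={"model": ["gpt-4", "gpt-3.5-turbo"]},
)

# The seed enables response caching, so re-running the same conversation can
# reuse earlier completions instead of calling the API again.
llm_config = {
    "config_list": config_list,
    "seed": 42,
    "temperature": 0,
}
```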

AutoGen is powered by collaborative research studies from Microsoft, Penn State University, and the University of Washington. For more details, see the AutoGen documentation and GitHub repository.

Diverse Applications Implemented with AutoGen

The figure below shows six examples of applications built using AutoGen.

Find a list of examples on this page: Automated Agent Chat Examples

Building AutoGen Collaborative Autonomous Agents

Author: Madhav Arora, https://medium.com/@madhavarora1988/autogen-driving-innovation-through-collaborative-autonomous-agents-28dfa9d9b0c5

Introduction

Microsoft released an open-source framework for creating configurable and customizable agents. These agents can engage with each other in meaningful conversations to collaborate on tasks or solve problems. In this article, we will explore what this framework provides and also walk through a custom use case to demonstrate the power of this framework.

What is an Agent?

An Agent is a programmatic construct, an entity powered by an underlying LLM. AutoGen enables the creation of such Agents and provides an environment in which they can converse and interact with each other. An Agent has access to tools that allow it to perform web searches or execute code, for example inside a Docker container. The basic idea is to have a team of Agents specialized in specific domains, each with access to special tools, collaborating with each other to complete a task. Imagine a team consisting of a designer, programmer, manager, and product owner, each powered by a model or a human, working together toward a single goal.

(Image: creative teams of the future)

What are different types of Agents?

There is an Agent type called ‘UserProxyAgent,’ which acts as a proxy for a user. It can be configured to be backed by a model or by a human (you) if you would like to provide constant direction to the team of Agents. Another type is the ‘AssistantAgent,’ which excels at writing code. You can create more agents by configuring them with specialized models, whether they run on the cloud or on your local machine (cost-saving, yay!).
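
As a hedged sketch of the local-model option mentioned above: agents take their model settings from a config list, so pointing an agent at a locally hosted, OpenAI-compatible endpoint is mostly a configuration change. The model name, URL, and key below are placeholders for illustration, and the exact config key depends on your pyautogen version (older 0.1.x releases used "api_base", newer ones use "base_url").

```python
import autogen

# Hypothetical config entry for a locally served, OpenAI-compatible model.
# The model name, URL, and key are placeholders; older pyautogen 0.1.x
# releases expect "api_base" instead of "base_url".
local_config_list = [
    {
        "model": "local-llama",
        "base_url": "http://localhost:8000/v1",
        "api_key": "not-needed-for-local",
    }
]

# An AssistantAgent backed by the local endpoint instead of the OpenAI cloud.
local_assistant = autogen.AssistantAgent(
    name="local_assistant",
    llm_config={"config_list": local_config_list, "temperature": 0},
)
```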

What does a simple multi-Agent conversation look like?

In the conversation below, a Human assigns a task, and then UserProxyAgent and AssistantAgent collaborate on it. There is a provision for the Human to become involved in the conversation to guide it in a specific direction.

(Image: conversation example, courtesy of Microsoft AutoGen)

Let’s solve a custom problem using this framework.

The Problem

Some time back, I wrote an article that explains how to obtain API keys for Google and OpenWeather APIs. I would like my Agents to ‘retrieve the year-to-date stock performance for a publicly traded company mentioned in the article.’ As you can see, the request is straightforward, but the conversation becomes quite interesting and intuitive.

Code Time

Step 0: Get the dependencies

pip install pyautogen~=0.1.0 docker

Step 1: Create a file named “OAI_CONFIG_LIST”. This file configures the API keys for your AI models.

[
    {
        "model": "gpt-3.5-turbo",
        "api_key": "***"
    },
    {
        "model": "gpt-4",
        "api_key": "***"
    }
]
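
If you would rather not keep keys in a file at all, config_list_from_json can, to the best of my understanding, read the same JSON payload from an environment variable of that name instead. A minimal sketch with placeholder keys:

```python
import json
import os

import autogen

# Put the same JSON payload into an environment variable named OAI_CONFIG_LIST
# (the api_key values are placeholders).
os.environ["OAI_CONFIG_LIST"] = json.dumps(
    [
        {"model": "gpt-4", "api_key": "***"},
        {"model": "gpt-3.5-turbo", "api_key": "***"},
    ]
)

# config_list_from_json checks for an environment variable with this name
# before falling back to a file on disk (behavior assumed, verify for your version).
config_list = autogen.config_list_from_json("OAI_CONFIG_LIST")
```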

Step 2: Let’s use this file. As you can see, I am filtering for GPT-4 in the code below. The reason is that GPT-3.5 gave a not-so-great response for this use case.

import autogen

config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": ["gpt-4"]
    },
)

llm_config = {
    "timeout": 600,              # seconds to wait for a model response
    "seed": 42,                  # seed used to cache LLM responses across runs
    "config_list": config_list,  # the model configs loaded above
    "temperature": 0,            # deterministic output
}

Step 3: Create your Agents

# create an AssistantAgent
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
)
# create a UserProxyAgent
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="TERMINATE",  # ask the human only when a termination condition is reached
    max_consecutive_auto_reply=10,  # cap on back-to-back automatic replies
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),  # stop when the assistant says TERMINATE
    code_execution_config={"work_dir": "web"},  # execute generated code inside the "web" directory
    llm_config=llm_config,
    system_message="""Reply TERMINATE if the task has been solved at full satisfaction.
Otherwise, reply CONTINUE, or the reason why the task is not solved yet."""
)
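
For reference, here is a variation on the same setup: setting human_input_mode to "NEVER" lets the pair run fully autonomously ("ALWAYS" would ask you before every reply), and adding "use_docker" to code_execution_config runs the generated code inside a Docker container, which is the safer option. Treat this as a sketch, not a drop-in replacement for the agent above.

```python
# A fully autonomous variant of the same UserProxyAgent: it never pauses for
# human input, and the generated code runs inside a Docker container.
autonomous_proxy = autogen.UserProxyAgent(
    name="autonomous_proxy",
    human_input_mode="NEVER",      # never ask the human; rely on termination checks
    max_consecutive_auto_reply=10,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={
        "work_dir": "web",
        "use_docker": True,        # execute generated code in a Docker container
    },
    llm_config=llm_config,
)
```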

Step 4: Initiate the conversation by providing your prompt. As you can see, I have just provided the URL for my article and explained what I want.

# the assistant receives a message from the user, which contains the task description
user_proxy.initiate_chat(
    assistant,
    message="""
Find a publicly traded company in the 
below article and 
get the YTD gain 
for it: https://medium.com/@madhavarora1988/getting-api-keys-6b3f486f5598
""",
)
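
After initiate_chat returns, the exchanged messages stay on the agents, so you can inspect them programmatically. A small sketch, assuming the chat_messages and last_message accessors behave as in the ConversableAgent API I am familiar with:

```python
# Inspect the conversation after it finishes. The accessors below are assumed
# from the ConversableAgent API; adjust if your pyautogen version differs.
for message in user_proxy.chat_messages[assistant]:
    print(message["role"], ":", str(message.get("content", ""))[:80])

# The last message the assistant sent (often the summary or TERMINATE reply).
print(user_proxy.last_message(assistant))
```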

Step 5: I’ll paste the shortened response to the request above.

user_proxy (to assistant):
Find a publicly traded company in the below article and get the YTD gain for it
-----------------------------------------------------------
assistant (to user_proxy):
To solve this task, we need to follow these steps:
1. Extract the content of the webpage.
2. Find the name of a publicly traded company in the content.
3. Use a financial data API to get the YTD (Year-to-Date) gain for the company.
Let's start with the first step, extracting the content of the webpage. We can use the `requests` and `beautifulsoup4` libraries in Python to do this. If these libraries are not installed, they can be installed using pip:

Now, let's extract the content of the webpage.

```python
import requests
from bs4 import BeautifulSoup
# truncated to make it short
```

Please execute the above Python code.

------------------------------------------------------
user_proxy (to assistant):
exitcode: 0 (execution succeeded)
Code output: //truncated to make it short
-------------------------------------------------------
assistant (to user_proxy):

From the content of the webpage, it appears that there are no publicly traded companies mentioned in the article. The article is about getting API keys for OpenWeatherMap and Google APIs. While Google is a publicly traded company, it's not directly mentioned in the context of being a publicly traded company.

Let's try to get the YTD gain for Google (Alphabet Inc.) and create a chart for the same. We can use the `yfinance` library in Python to get the financial data. If this library is not installed, it can be installed using pip:

//more code

Now, let's get the YTD gain for Google and create a chart for the same.
//more code

# Get the data for the stock
data = yf.download('GOOGL','2022-01-01','2022-12-31')

# Calculate the YTD gain
data['YTD Gain'] = data['Close'].pct_change().cumsum()

The YTD gain for Google is 46.9591530345034%
---------------------------------------------------------

Let me try to explain the interaction using a simple diagram.

As you can see, the conversation is akin to two individuals interacting with each other, each specialized in their own domain.
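
To make the assistant's final step concrete, here is a small standalone sketch of one way to compute a YTD gain with yfinance. The ticker and start date are placeholders, and it simply compares the most recent close with the first close of the period rather than reproducing the assistant's cumulative-return calculation.

```python
import yfinance as yf

# Download daily prices from the start of the year (placeholder ticker/date).
data = yf.download("GOOGL", start="2023-01-01")

# A simple definition of YTD gain: percentage change from the first close of
# the year to the most recent close.
closes = data["Close"].squeeze()  # handle single-ticker column shapes
ytd_gain = (closes.iloc[-1] / closes.iloc[0] - 1) * 100
print(f"YTD gain for GOOGL: {ytd_gain:.2f}%")
```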

Conclusion

This has great potential. Considering how we all currently work, bringing our strengths, our creativity, and our own personas to the table, and how we as teams accomplish great things, I think AutoGen takes that idea a step further and allows LLM agents to interact with each other and with us in real time. The future I envision is teams comprising Humans and Agents, and there is a name for it: ‘Hybrid Multi-Team Systems’ (HMTS).
