Multi-Agent Conversation With AutoGen AI
A simple experiment with multiple collaborative AI agents interacting via group chat to produce solution architectures based on business requirements.
In this article, we will run some basic Artificial Intelligence (AI) Python code using the Databricks Community Edition (CE), a free, cloud-based platform. Since we are dealing with open-source libraries only, this experiment can be easily reproduced in any Python/PySpark environment.
Preliminaries: AutoGen and Conversable Agents
AutoGen is an open-source platform that enables us to create collaborative AI agents. These agents work together to accomplish tasks, often by interacting through a group chat. A conversable agent can send, receive, and generate messages. It can also be customized using a combination of AI models, tools, and human input. A conversable agent can be any of the following:
- A user proxy agent, which, as its name implies, acts as a proxy for a human: it sits between user inputs and the agents’ replies, and it can execute code.
- One or more assistant agents that are — unsurprisingly — AI assistants that use large language models (LLMs), but do not require human input or code execution.
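To make the send/receive/generate cycle concrete, here is a toy sketch in plain Python. It does not use AutoGen: `ToyAgent` and its `reply_fn` callback are hypothetical stand-ins, with an ordinary function playing the role that an LLM, a tool, or a human would play in a real agent.

```python
class ToyAgent:
    """Hypothetical stand-in for a conversable agent (not the AutoGen class)."""

    def __init__(self, name, reply_fn):
        self.name = name
        self.reply_fn = reply_fn  # stands in for an LLM, a tool, or a human
        self.inbox = []           # (sender name, message) pairs received so far

    def send(self, message, recipient):
        # Deliver a message to another agent.
        recipient.receive(message, sender=self)

    def receive(self, message, sender):
        # Store the incoming message together with who sent it.
        self.inbox.append((sender.name, message))

    def generate_reply(self):
        # Produce a reply to the most recent message in the inbox.
        sender_name, message = self.inbox[-1]
        return self.reply_fn(sender_name, message)


human = ToyAgent("supervisor", lambda sender, msg: "Thanks.")
assistant = ToyAgent(
    "assistant", lambda sender, msg: f"Reply to {sender}: received '{msg}'"
)

human.send("Design a data platform.", assistant)
answer = assistant.generate_reply()
assistant.send(answer, human)
print(human.inbox)
```

In a real setup the reply function would be replaced by a model call or by human input, which is the customization the paragraph above refers to.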
We will create a group chat between a user proxy agent (e.g., a Head of Architecture) and three assistant agents: a Cloud Architect, an open-source (OSS) Architect, and a Lead Architect. The objective would be to provide a solution architecture based on a list of business requirements. We can represent this in the following diagram of interactions:
The conversational flow works as follows:
- Business requirements are provided to the proxy agent.
- The proxy agent initiates a chat between the architects.
- The Cloud Architect speaks first, providing a proposal for each major cloud provider: Azure, AWS, and GCP.
- Next speaker: the OSS Architect offers a solution using OSS frameworks outside the cloud realm.
- Next (and final) speaker: the Lead Architect reviews all solutions and provides a final proposal.
Prompt Engineering
Let’s create the prompts first, starting with the common piece (the task at hand) that contains some simple requirements:
task = '''
**Task**: As an architect, you are required to design a solution for the
following business requirements:
- Data storage for massive amounts of IoT data
- Real-time data analytics and machine learning pipeline
- Scalability
- Cost Optimization
- Region pairs in Europe, for disaster recovery
- Tools for monitoring and observability
- Timeline: 6 months
Break down the problem using a Chain-of-Thought approach. Ensure that your
solution architecture is following best practices.
'''
Prompt for the Cloud Architect:
cloud_prompt = '''
**Role**: You are an expert cloud architect. You need to develop architecture proposals
using either cloud-specific PaaS services, or cloud-agnostic ones.
The final proposal should consider all 3 main cloud providers: Azure, AWS and GCP, and provide
a data architecture for each. At the end, briefly state the advantages of cloud over on-premises
architectures, and summarize your solutions for each cloud provider using a table for clarity.
'''
cloud_prompt += task
For the OSS Architect:
oss_prompt = '''
**Role**: You are an expert on-premises, open-source software architect. You need
to develop architecture proposals without considering cloud solutions.
Only use open-source frameworks that are popular and have lots of active contributors.
At the end, briefly state the advantages of open-source adoption, and summarize your
solutions using a table for clarity.
'''
oss_prompt += task
And, the Lead Architect:
lead_prompt = '''
**Role**: You are a lead Architect tasked with managing a conversation between
the cloud and the open-source Architects.
Each Architect will perform a task and respond with their results. You will critically
review those and also ask for, or point out, the disadvantages of their solutions.
You will review each result, and choose the best solution in accordance with the business
requirements and architecture best practices. You will use any number of summary tables to
communicate your decision.
'''
lead_prompt += task
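All three prompts follow the same pattern: a role-specific preamble followed by the shared task. As a small sketch of that pattern, the helper below builds each prompt from a role text and a task text. The name `build_prompt` and the shortened `demo_*` texts are illustrative assumptions, not part of the original code.

```python
def build_prompt(role_text: str, task_text: str) -> str:
    """Join a role description and the shared task into one system prompt."""
    return f"{role_text.strip()}\n\n{task_text.strip()}\n"


# Shortened placeholder texts, for illustration only.
demo_task = "**Task**: Design a solution for the business requirements."
demo_roles = {
    "cloud": "**Role**: You are an expert cloud architect.",
    "oss": "**Role**: You are an expert open-source software architect.",
    "lead": "**Role**: You are a lead Architect.",
}

# Every agent gets its own role followed by the same task text.
demo_prompts = {
    name: build_prompt(role, demo_task) for name, role in demo_roles.items()
}

for name, prompt in demo_prompts.items():
    print(f"--- {name} ---")
    print(prompt)
```

Keeping the task in a single variable means a change to the requirements reaches every agent without editing each prompt separately.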
Once we have completed the prompts, we can start creating our conversable agents and having them interact in a chat setting.
Multi-Agent Chat Implementation
Since we are using Python, we need to install the corresponding pyautogen package:
pip install pyautogen
We start with the imports and some configurations. We could work with multiple models, but we are choosing to use only GPT-4o in this simple example.
import autogen
from autogen import UserProxyAgent
from autogen import AssistantAgent

config_list = [{'model': 'gpt-4o', 'api_key': '---------'}]
gpt4o_config = {
    "cache_seed": 42,  # change the cache_seed for different trials
    "temperature": 0,
    "config_list": config_list,
    "timeout": 120,
}
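To make the effect of the `temperature` setting concrete, here is a toy sketch of temperature-scaled softmax over a few made-up scores. This is the textbook picture of temperature, not a description of how any particular model provider implements it. The function name, the example logits, and the choice to treat a temperature of 0 as a pure argmax (to avoid dividing by zero) are assumptions made for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Assumed limiting case: all probability mass on the highest score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (1.0, 0.5, 0.1, 0.0):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature drops, the probability of the highest-scoring candidate grows at the expense of the others.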
A zero temperature minimizes the randomness of the model, concentrating on the most likely (i.e., highest-scoring) choices. We will now create our agents with their specific prompts:
user_proxy = autogen.UserProxyAgent(
    name="supervisor",
    system_message="A Human Head of Architecture",
    code_execution_config={
        "last_n_messages": 2,
        "work_dir": "groupchat",
        "use_docker": False,
    },
    human_input_mode="NEVER",
)
cloud_agent = AssistantAgent(
    name="cloud",
    system_message=cloud_prompt,
    llm_config={"config_list": config_list}
)
oss_agent = AssistantAgent(
    name="oss",
    system_message=oss_prompt,
    llm_config={"config_list": config_list}
)
lead_agent = AssistantAgent(
    name="lead",
    system_message=lead_prompt,
    llm_config={"config_list": config_list}
)
The code above should be self-explanatory. Before starting the chat, we need to specify an order of interaction, since we want the Lead Architect to go last, once the other two assistants have provided their solutions. The Lead will need to review those and decide. To make sure that this order is followed, we create a state transition function to be used in the chat for speaker selection:
def state_transition(last_speaker, groupchat):
    messages = groupchat.messages
    if last_speaker is user_proxy:
        return cloud_agent
    elif last_speaker is cloud_agent:
        return oss_agent
    elif last_speaker is oss_agent:
        return lead_agent
    elif last_speaker is lead_agent:
        # lead -> end
        return None
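The same speaking order can also be written as a lookup table, which makes the transitions easy to read and to test without calling any model. The sketch below is plain Python and does not use AutoGen: `StubAgent`, `TRANSITIONS`, `next_speaker`, and `speaking_order` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StubAgent:
    """Hypothetical stand-in for an agent; only the name matters here."""
    name: str


supervisor = StubAgent("supervisor")
cloud = StubAgent("cloud")
oss = StubAgent("oss")
lead = StubAgent("lead")

# Transition table: last speaker's name -> next speaker (None ends the chat).
TRANSITIONS = {
    "supervisor": cloud,
    "cloud": oss,
    "oss": lead,
    "lead": None,
}


def next_speaker(last_speaker):
    """Return the agent that speaks after last_speaker, or None to stop."""
    return TRANSITIONS[last_speaker.name]


def speaking_order(start):
    """Walk the table from a starting agent and collect the speaker names."""
    order = [start.name]
    current = next_speaker(start)
    while current is not None:
        order.append(current.name)
        current = next_speaker(current)
    return order


print(speaking_order(supervisor))
```

Adding a new architect to the conversation would then only require one more entry in the table.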
This is a simple example of a Finite State Machine (FSM). We are now ready to trigger the chat with this last bit of code:
groupchat = autogen.GroupChat(
    agents=[user_proxy, cloud_agent, oss_agent, lead_agent],
    messages=[],
    max_round=6,
    speaker_selection_method=state_transition,
)
# Use the gpt4o_config dictionary defined earlier (llm_config was undefined)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=gpt4o_config)
user_proxy.initiate_chat(
    manager, message="Provide your best architecture based on these business requirements."
)
Interpreting the Output
Due to space considerations, we will show only excerpts of the output. The code is straightforward to replicate, and those interested in the details can run it directly in their own environments.
The chat starts with the first speaker:
supervisor (to chat_manager):
Provide your best architecture based on these business requirements.
--------------------------------------------------------------------------------
Next speaker: cloud
cloud (to chat_manager):
### Chain-of-Thought Approach
#### 1. Understanding the Business Requirements
Based on the given requirements, we need:
- **High-capacity data storage** for IoT data.
...
### Advantages of Cloud over On-Premises Architectures
- **Scalability**: Easily scale up/down based on demand without large capital expenditure.
- **Cost Efficiency**: Pay-as-you-go pricing.
- **High Availability**: Built-in options for redundancy and failover.
- **Global Reach**: Deploy services in multiple regions across the world.
- **Ease of Management**: Managed services reduce operational burden.
- **Innovation Velocity**: Rapid deployment of new services and features.
...
### Summary Table
| Criterion | AWS | Azure | GCP |
|-------------------------------|-----------------------------------------|--------------------------------------------|-------------------------------------------------|
| **Primary Data Storage** | Amazon S3, DynamoDB | Azure Blob Storage, Cosmos DB | Google Cloud Storage, Bigtable |
| **Real-time Analytics** | Amazon Kinesis | Azure Stream Analytics | Google Cloud Dataflow |
| **Machine Learning** | Amazon SageMaker | Azure Machine Learning | Google AI Platform |
.....
As the excerpt shows, the cloud agent lists the relevant services for each cloud provider based on the requirements provided, explains the advantages of the cloud over on-premises architectures, and presents a summary table.
Next comes the second speaker. The OSS agent follows the same general approach, listing relevant services, discussing the advantages of open source, and outputting its own solution.
Next speaker: oss
oss (to chat_manager):
### Chain-of-Thought Approach
#### 1. Understanding the Business Requirements
....
### Advantages of Open-Source Adoption
- **Cost Savings**: No licensing fees.
- **Community Support**: Large active communities for troubleshooting and inquiries.
- **Flexibility and Customization**: Ability to modify source code to meet unique business needs.
- **Innovation**: Rapid adoption of cutting-edge technologies and features.
...
### Summary Table
| Requirement | Solution Components |
|------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Data Storage** | Apache Cassandra, HDFS |
| **Real-time Analytics** | Apache Kafka, Apache Flink, Apache Spark |
| **Machine Learning** | TensorFlow |
| **Scalability** | Kubernetes, Docker |
| **Disaster Recovery Regions** | Primary: Germany (Berlin), Secondary: Netherlands (Amsterdam) |
| **Monitoring** | Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana) |
...
Finally, the lead agent steps in, analyzes the advantages and disadvantages of each solution, uses a suitability matrix, etc., and communicates its final recommendations:
Next speaker: lead
lead (to chat_manager):
### Evaluating Solutions Based on the Business Requirements and Best Practices
#### Analysis and Comparison of Proposed Architectures
| **Criterion** | **AWS** | **Azure** | **GCP** | **Open-Source** |
|-----------------------------------|----------------------------------------------------------------|----------------------------------------------------------------|----------------------------------------------------------------|---------------------------------------------------------------|
| **Primary Data Storage** | Amazon S3, DynamoDB | Azure Blob Storage, Cosmos DB | Google Cloud Storage, Bigtable | Apache Cassandra, HDFS |
| **Real-time Analytics** | Amazon Kinesis | Azure Stream Analytics | Google Cloud Dataflow | Apache Kafka, Apache Flink |
| **Machine Learning** | Amazon SageMaker | Azure Machine Learning | Google AI Platform | TensorFlow |
...
### Advantages and Disadvantages
...
#### Recommendation:
Based on the evaluation, here is a recommendation considering the timeline of 6 months, the requirements for scalability, disaster recovery, machine learning, and cost optimization:
- **Primary Recommendation**: **AWS**
- **Reasoning**: AWS provides a highly-integrated suite of services that are easy to deploy rapidly. They have strong support for both real-time analytics and machine learning. AWS offers robust monitoring tools and several options for cost optimization, which are fully managed, reducing the operational overhead and the complexity of the setup. They have a well-established global presence with region pairs suitable for disaster recovery.
- **Secondary Recommendation**: **Azure**
- **Reasoning**: If there is a heavier reliance on Microsoft ecosystems within the current infrastructure, Azure would be a suitable alternative. Azure provides strong support for machine learning, analytics, and disaster recovery and offers robust monitoring tools similar to AWS.
- **Alternative Consideration**: **GCP** for organizations already invested in Google technologies or with a requirement for industry-leading machine learning and data analytics capabilities.
- **Open-Source Option**: While it might be cost-effective and avoid vendor lock-in, the complexity, high maintenance overhead, and potential risks in meeting the tight deadline could make it less suitable as the primary choice but could be considered for long-term strategic investments if flexibility and cost savings are a high priority.
Note that any opinion expressed by the lead agent remains its own and does not necessarily reflect the opinion of the author of this article.
Closing Thoughts
Using a fairly simple setup, we created a basic architecture team organization where different types of participants collaborate to provide solutions to the business.
Of course, architecture involves more than just providing technical recommendations. Architects often deal with complex, unique challenges that require a deep understanding of business needs and critical thinking about long-term implications. There are intangible factors that AI may not fully grasp, since it lacks the necessary interpersonal skills and emotional intelligence… at least for now.
Published at DZone with permission of Tony Siciliani, DZone MVB. See the original article here.