当我们面对负责任务时,人类更倾向于将其分解为更简单的字任务,并逐一解决。 智能体规划模块旨在使智能体具备类人的规划能力,从而使代理的行为更合理、强大和可靠。
简介
下图是智能体规划模块的简易架构图
在示意图中,规划模块接收到用户的任务,然后将任务分解为子任务并将他们分配给响应的代理(Agent1、Agent2、Agent3 …)
用智能体进行数据分析
在下述案例中,我们将向您展示如何使用智能体规划来进行数据分析。在使用本教程前,首先需要安装必要的依赖,安装依赖的方法可以参考先前的文档。
数据准备
方便起见,我们将使用SQLite临时数据库来存储数据。数据库将在临时目录中创建,并在程序退出后删除。
首先,创建一个临时SQLite数据库并创建一些表并插入一些测试数据。
from dbgpt.datasource.rdbms.conn_sqlite import SQLiteTempConnector
connector = SQLiteTempConnector.create_temporary_db()
connector.create_temp_tables(
{
"students": {
"columns": {
"student_id": "INTEGER PRIMARY KEY",
"student_name": "TEXT",
"major": "TEXT",
"year_of_enrollment": "INTEGER",
"student_age": "INTEGER",
},
"data": [
(1, "Zhang San", "Computer Science", 2020, 20),
(2, "Li Si", "Computer Science", 2021, 19),
(3, "Wang Wu", "Physics", 2020, 21),
(4, "Zhao Liu", "Mathematics", 2021, 19),
(5, "Zhou Qi", "Computer Science", 2022, 18),
(6, "Wu Ba", "Physics", 2020, 21),
(7, "Zheng Jiu", "Mathematics", 2021, 19),
(8, "Sun Shi", "Computer Science", 2022, 18),
(9, "Liu Shiyi", "Physics", 2020, 21),
(10, "Chen Shier", "Mathematics", 2021, 19),
],
},
"courses": {
"columns": {
"course_id": "INTEGER PRIMARY KEY",
"course_name": "TEXT",
"credit": "REAL",
},
"data": [
(1, "Introduction to Computer Science", 3),
(2, "Data Structures", 4),
(3, "Advanced Physics", 3),
(4, "Linear Algebra", 4),
(5, "Calculus", 5),
(6, "Programming Languages", 4),
(7, "Quantum Mechanics", 3),
(8, "Probability Theory", 4),
(9, "Database Systems", 4),
(10, "Computer Networks", 4),
],
},
"scores": {
"columns": {
"student_id": "INTEGER",
"course_id": "INTEGER",
"score": "INTEGER",
"semester": "TEXT",
},
"data": [
(1, 1, 90, "Fall 2020"),
(1, 2, 85, "Spring 2021"),
(2, 1, 88, "Fall 2021"),
(2, 2, 90, "Spring 2022"),
(3, 3, 92, "Fall 2020"),
(3, 4, 85, "Spring 2021"),
(4, 3, 88, "Fall 2021"),
(4, 4, 86, "Spring 2022"),
(5, 1, 90, "Fall 2022"),
(5, 2, 87, "Spring 2023"),
],
},
}
)
创建一个数据库资源
from dbgpt.agent.resource import RDBMSConnectorResource
db_resource = RDBMSConnectorResource("student_manager", connector=connector)
数据分析计划自动构建
为创建一个计划进行数据分析,这里我们介绍一个新的Agent,名为AutoPlanChatManager
,顾名思义,此Agent可以根据任务需求自动生成计划并将子任务分发给具体的协作Agent
AutoPlanChatManager
是一个特殊的Agent,创建方式与其他智能体类似,但它有一种特殊的雇佣方法来雇佣其他智能体。
import asyncio
import os
from dbgpt.agent import (
AgentContext,
AgentMemory,
AutoPlanChatManager,
LLMConfig,
UserProxyAgent,
)
from dbgpt.agent.expand.data_scientist_agent import DataScientistAgent
from dbgpt.model.proxy import OpenAILLMClient
async def main():
llm_client = OpenAILLMClient(
model_alias="gpt-3.5-turbo", # or other models, eg. "gpt-4o"
api_base=os.getenv("OPENAI_API_BASE"),
api_key=os.getenv("OPENAI_API_KEY"),
)
context: AgentContext = AgentContext(
conv_id="test123", language="en", temperature=0.5, max_new_tokens=2048
)
agent_memory = AgentMemory()
user_proxy = await UserProxyAgent().bind(agent_memory).bind(context).build()
sql_boy = (
await DataScientistAgent()
.bind(context)
.bind(LLMConfig(llm_client=llm_client))
.bind(db_resource)
.bind(agent_memory)
.build()
)
manager = (
await AutoPlanChatManager()
.bind(context)
.bind(agent_memory)
.bind(LLMConfig(llm_client=llm_client))
.build()
)
manager.hire([sql_boy])
await user_proxy.initiate_chat(
recipient=manager,
reviewer=user_proxy,
message="Analyze student scores from at least three dimensions",
)
# dbgpt-vis message infos
print(await agent_memory.gpts_memory.one_chat_completions("test123"))
if __name__ == "__main__":
asyncio.run(main())
输出如下
--------------------------------------------------------------------------------
User (to AutoPlanChatManager)-[]:
"Analyze student scores from at least three dimensions"
--------------------------------------------------------------------------------
un_stream ai response: [
{
"serial_number": "1",
"agent": "DataScientist",
"content": "Retrieve student scores data from the database including scores for each subject, overall performance, and attendance records.",
"rely": ""
},
{
"serial_number": "2",
"agent": "DataScientist",
"content": "Analyze student scores data to identify trends and patterns in academic performance, subject-wise scores, and attendance correlation.",
"rely": "1"
},
{
"serialjson_number": "3",
"agent": "DataScientist",
"content": "Visualize the analyzed data using appropriate graphs and charts to represent the student scores from different dimensions effectively.",
"rely": "2"
}
]
--------------------------------------------------------------------------------
Planner (to AutoPlanChatManager)-[gpt-3.5-turbo]:
"[\n {\n \"serial_number\": \"1\",\n \"agent\": \"DataScientist\",\n \"content\": \"Retrieve student scores data from the database including scores for each subject, overall performance, and attendance records.\",\n \"rely\": \"\"\n },\n {\n \"serial_number\": \"2\",\n \"agent\": \"DataScientist\",\n \"content\": \"Analyze student scores data to identify trends and patterns in academic performance, subject-wise scores, and attendance correlation.\",\n \"rely\": \"1\"\n },\n {\n \"serialjson_number\": \"3\",\n \"agent\": \"DataScientist\",\n \"content\": \"Visualize the analyzed data using appropriate graphs and charts to represent the student scores from different dimensions effectively.\",\n \"rely\": \"2\"\n }\n]"
>>>>>>>>Planner Review info:
Pass(None)
>>>>>>>>Planner Action report:
execution succeeded,
[
{
"serial_number": "1",
"agent": "DataScientist",
"content": "Retrieve student scores data from the database including scores for each subject, overall performance, and attendance records.",
"rely": ""
},
{
"serial_number": "2",
"agent": "DataScientist",
"content": "Analyze student scores data to identify trends and patterns in academic performance, subject-wise scores, and attendance correlation.",
"rely": "1"
},
{
"serialjson_number": "3",
"agent": "DataScientist",
"content": "Visualize the analyzed data using appropriate graphs and charts to represent the student scores from different dimensions effectively.",
"rely": "2"
}
]
--------------------------------------------------------------------------------
GroupChat select_speaker failed to resolve the next speaker's name. This is because the speaker selection OAI call returned:
DataScientist
--------------------------------------------------------------------------------
AutoPlanChatManager (to Edgar)-[]:
"Retrieve student scores data from the database including scores for each subject, overall performance, and attendance records."
--------------------------------------------------------------------------------
un_stream ai response: {
"display_type": "response_table",
"sql": "SELECT s.student_id, s.student_name, s.major, s.year_of_enrollment, s.student_age, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id",
"thought": "I have crafted a SQL query that retrieves student scores data including scores for each subject, overall performance, and attendance records by joining the 'students', 'scores', and 'courses' tables on their respective keys."
}
--------------------------------------------------------------------------------
Edgar (to AutoPlanChatManager)-[gpt-3.5-turbo]:
"{\n \"display_type\": \"response_table\",\n \"sql\": \"SELECT s.student_id, s.student_name, s.major, s.year_of_enrollment, s.student_age, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id\",\n \"thought\": \"I have crafted a SQL query that retrieves student scores data including scores for each subject, overall performance, and attendance records by joining the 'students', 'scores', and 'courses' tables on their respective keys.\"\n}"
>>>>>>>>Edgar Review info:
Pass(None)
>>>>>>>>Edgar Action report:
execution succeeded,
{"display_type":"response_table","sql":"SELECT s.student_id, s.student_name, s.major, s.year_of_enrollment, s.student_age, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id","thought":"I have crafted a SQL query that retrieves student scores data including scores for each subject, overall performance, and attendance records by joining the 'students', 'scores', and 'courses' tables on their respective keys."}
--------------------------------------------------------------------------------
GroupChat select_speaker failed to resolve the next speaker's name. This is because the speaker selection OAI call returned:
DataScientist
--------------------------------------------------------------------------------
AutoPlanChatManager (to Edgar)-[]:
"Read the result data of the dependent steps in the above historical message to complete the current goal:Analyze student scores data to identify trends and patterns in academic performance, subject-wise scores, and attendance correlation."
--------------------------------------------------------------------------------
un_stream ai response: {
"display_type": "response_table",
"sql": "SELECT student_id, student_name, course_name, credit, AVG(score) AS average_score, COUNT(score) AS total_scores, MAX(score) AS highest_score, MIN(score) AS lowest_score, SUM(score) AS total_score, AVG(semester) AS average_semester FROM (SELECT s.student_id, s.student_name, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id) GROUP BY student_id, course_name",
"thought": "I have created a SQL query that analyzes student scores data to identify trends and patterns in academic performance, subject-wise scores, and attendance correlation. The query calculates average score, total scores, highest score, lowest score, total score, and average semester for each student in each course."
}
--------------------------------------------------------------------------------
Edgar (to AutoPlanChatManager)-[gpt-3.5-turbo]:
"{\n \"display_type\": \"response_table\",\n \"sql\": \"SELECT student_id, student_name, course_name, credit, AVG(score) AS average_score, COUNT(score) AS total_scores, MAX(score) AS highest_score, MIN(score) AS lowest_score, SUM(score) AS total_score, AVG(semester) AS average_semester FROM (SELECT s.student_id, s.student_name, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id) GROUP BY student_id, course_name\",\n \"thought\": \"I have created a SQL query that analyzes student scores data to identify trends and patterns in academic performance, subject-wise scores, and attendance correlation. The query calculates average score, total scores, highest score, lowest score, total score, and average semester for each student in each course.\"\n}"
>>>>>>>>Edgar Review info:
Pass(None)
>>>>>>>>Edgar Action report:
execution succeeded,
{"display_type":"response_table","sql":"SELECT student_id, student_name, course_name, credit, AVG(score) AS average_score, COUNT(score) AS total_scores, MAX(score) AS highest_score, MIN(score) AS lowest_score, SUM(score) AS total_score, AVG(semester) AS average_semester FROM (SELECT s.student_id, s.student_name, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id) GROUP BY student_id, course_name","thought":"I have created a SQL query that analyzes student scores data to identify trends and patterns in academic performance, subject-wise scores, and attendance correlation. The query calculates average score, total scores, highest score, lowest score, total score, and average semester for each student in each course."}
--------------------------------------------------------------------------------
GroupChat select_speaker failed to resolve the next speaker's name. This is because the speaker selection OAI call returned:
DataScientist
--------------------------------------------------------------------------------
AutoPlanChatManager (to Edgar)-[]:
"Read the result data of the dependent steps in the above historical message to complete the current goal:Visualize the analyzed data using appropriate graphs and charts to represent the student scores from different dimensions effectively."
--------------------------------------------------------------------------------
un_stream ai response: {
"display_type": "response_table",
"sql": "SELECT student_id, student_name, course_name, credit, AVG(score) AS average_score, COUNT(score) AS total_scores, MAX(score) AS highest_score, MIN(score) AS lowest_score, SUM(score) AS total_score, AVG(semester) AS average_semester FROM (SELECT s.student_id, s.student_name, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id) GROUP BY student_id, course_name",
"thought": "The SQL query provided will generate a table with the analyzed data including average score, total scores, highest score, lowest score, total score, and average semester for each student in each course. This table can be further used for visualization purposes to represent student scores from different dimensions effectively."
}
--------------------------------------------------------------------------------
Edgar (to AutoPlanChatManager)-[gpt-3.5-turbo]:
"{\n \"display_type\": \"response_table\",\n \"sql\": \"SELECT student_id, student_name, course_name, credit, AVG(score) AS average_score, COUNT(score) AS total_scores, MAX(score) AS highest_score, MIN(score) AS lowest_score, SUM(score) AS total_score, AVG(semester) AS average_semester FROM (SELECT s.student_id, s.student_name, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id) GROUP BY student_id, course_name\",\n \"thought\": \"The SQL query provided will generate a table with the analyzed data including average score, total scores, highest score, lowest score, total score, and average semester for each student in each course. This table can be further used for visualization purposes to represent student scores from different dimensions effectively.\"\n}"
>>>>>>>>Edgar Review info:
Pass(None)
>>>>>>>>Edgar Action report:
execution succeeded,
{"display_type":"response_table","sql":"SELECT student_id, student_name, course_name, credit, AVG(score) AS average_score, COUNT(score) AS total_scores, MAX(score) AS highest_score, MIN(score) AS lowest_score, SUM(score) AS total_score, AVG(semester) AS average_semester FROM (SELECT s.student_id, s.student_name, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id) GROUP BY student_id, course_name","thought":"The SQL query provided will generate a table with the analyzed data including average score, total scores, highest score, lowest score, total score, and average semester for each student in each course. This table can be further used for visualization purposes to represent student scores from different dimensions effectively."}
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
AutoPlanChatManager (to User)-[]:
"Analyze student scores from at least three dimensions"
>>>>>>>>AutoPlanChatManager Review info:
Pass(None)
>>>>>>>>AutoPlanChatManager Action report:
execution succeeded,
```vis-chart
{"sql": "SELECT student_id, student_name, course_name, credit, AVG(score) AS average_score, COUNT(score) AS total_scores, MAX(score) AS highest_score, MIN(score) AS lowest_score, SUM(score) AS total_score, AVG(semester) AS average_semester FROM (SELECT s.student_id, s.student_name, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id) GROUP BY student_id, course_name", "type": "response_table", "title": "", "describe": "The SQL query provided will generate a table with the analyzed data including average score, total scores, highest score, lowest score, total score, and average semester for each student in each course. This table can be further used for visualization purposes to represent student scores from different dimensions effectively.", "data": [{"student_id": 1, "student_name": "Zhang San", "course_name": "Data Structures", "credit": 4.0, "average_score": 85.0, "total_scores": 1, "highest_score": 85, "lowest_score": 85, "total_score": 85, "average_semester": 0.0}, {"student_id": 1, "student_name": "Zhang San", "course_name": "Introduction to Computer Science", "credit": 3.0, "average_score": 90.0, "total_scores": 1, "highest_score": 90, "lowest_score": 90, "total_score": 90, "average_semester": 0.0}, {"student_id": 2, "student_name": "Li Si", "course_name": "Data Structures", "credit": 4.0, "average_score": 90.0, "total_scores": 1, "highest_score": 90, "lowest_score": 90, "total_score": 90, "average_semester": 0.0}, {"student_id": 2, "student_name": "Li Si", "course_name": "Introduction to Computer Science", "credit": 3.0, "average_score": 88.0, "total_scores": 1, "highest_score": 88, "lowest_score": 88, "total_score": 88, "average_semester": 0.0}, {"student_id": 3, "student_name": "Wang Wu", "course_name": "Advanced Physics", "credit": 3.0, "average_score": 92.0, "total_scores": 1, "highest_score": 92, "lowest_score": 92, "total_score": 92, "average_semester": 0.0}, {"student_id": 3, "student_name": "Wang Wu", "course_name": "Linear Algebra", "credit": 4.0, "average_score": 85.0, "total_scores": 1, "highest_score": 85, "lowest_score": 85, "total_score": 85, "average_semester": 0.0}, {"student_id": 4, "student_name": "Zhao Liu", "course_name": "Advanced Physics", "credit": 3.0, "average_score": 88.0, "total_scores": 1, "highest_score": 88, "lowest_score": 88, "total_score": 88, "average_semester": 0.0}, {"student_id": 4, "student_name": "Zhao Liu", "course_name": "Linear Algebra", "credit": 4.0, "average_score": 86.0, "total_scores": 1, "highest_score": 86, "lowest_score": 86, "total_score": 86, "average_semester": 0.0}, {"student_id": 5, "student_name": "Zhou Qi", "course_name": "Data Structures", "credit": 4.0, "average_score": 87.0, "total_scores": 1, "highest_score": 87, "lowest_score": 87, "total_score": 87, "average_semester": 0.0}, {"student_id": 5, "student_name": "Zhou Qi", "course_name": "Introduction to Computer Science", "credit": 3.0, "average_score": 90.0, "total_scores": 1, "highest_score": 90, "lowest_score": 90, "total_score": 90, "average_semester": 0.0}]}
输出是[GPT-Vis](https://github.com/eosphoros-ai/GPT-Vis)可视化协议,我们可以解析出一些有用的信息
### 计划
```python
[
{
"serial_number": "1",
"agent": "DataScientist",
"content": "Retrieve student scores data from the database including scores for each subject, overall performance, and attendance records.",
"rely": ""
},
{
"serial_number": "2",
"agent": "DataScientist",
"content": "Analyze student scores data to identify trends and patterns in academic performance, subject-wise scores, and attendance correlation.",
"rely": "1"
},
{
"serialjson_number": "3",
"agent": "DataScientist",
"content": "Visualize the analyzed data using appropriate graphs and charts to represent the student scores from different dimensions effectively.",
"rely": "2"
}
]
第一个任务的大模型输出
{
"display_type": "response_table",
"sql": "SELECT s.student_id, s.student_name, s.major, s.year_of_enrollment, s.student_age, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id",
"thought": "I have crafted a SQL query that retrieves student scores data including scores for each subject, overall performance, and attendance records by joining the 'students', 'scores', and 'courses' tables on their respective keys."
}
第二个任务的大模型输出
{
"display_type": "response_table",
"sql": "SELECT student_id, student_name, course_name, credit, AVG(score) AS average_score, COUNT(score) AS total_scores, MAX(score) AS highest_score, MIN(score) AS lowest_score, SUM(score) AS total_score, AVG(semester) AS average_semester FROM (SELECT s.student_id, s.student_name, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id) GROUP BY student_id, course_name",
"thought": "I have created a SQL query that analyzes student scores data to identify trends and patterns in academic performance, subject-wise scores, and attendance correlation. The query calculates average score, total scores, highest score, lowest score, total score, and average semester for each student in each course."
}
第三个任务的大模型输出
{
"display_type": "response_table",
"sql": "SELECT student_id, student_name, course_name, credit, AVG(score) AS average_score, COUNT(score) AS total_scores, MAX(score) AS highest_score, MIN(score) AS lowest_score, SUM(score) AS total_score, AVG(semester) AS average_semester FROM (SELECT s.student_id, s.student_name, c.course_name, c.credit, sc.score, sc.semester FROM students s JOIN scores sc ON s.student_id = sc.student_id JOIN courses c ON sc.course_id = c.course_id) GROUP BY student_id, course_name",
"thought": "The SQL query provided will generate a table with the analyzed data including average score, total scores, highest score, lowest score, total score, and average semester for each student in each course. This table can be further used for visualization purposes to represent student scores from different dimensions effectively."
}
当然,如果运行上述代码,它会包含执行SQL查询后的数据,这里由于输出结果太长,我们不做展示。