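DeepAgent Example: A Detailed Look at the Tool Definition System

📁 Project path: adk/multiagent/deep/
🎯 Core design: two-tier tool allocation + decorator-style wrapping + sub-agent delegation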

1. Overall Tool Architecture

This project uses a two-tier tool allocation scheme in which the main agent and each sub-agent own a different tool set:

┌─ DeepAgent (main agent) ──────────────────────────────┐
│  Tools: read_file, tree                ← read-only    │
│                                                       │
│  Sub-agents:                                          │
│  ┌─ CodeAgent ─────────────────────────────────────┐  │
│  │  Tools: bash, tree, edit_file, read_file,       │  │
│  │         python_runner             ← read+write  │  │
│  └─────────────────────────────────────────────────┘  │
│  ┌─ WebSearchAgent ────────────────────────────────┐  │
│  │  Tools: duckduckgo_search         ← search      │  │
│  └─────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────┘

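🎨 Design intent

Role	Responsibility	Tool permissions	Design purpose
Main agent	Planning, coordination, read-only inspection	`read_file`, `tree`	Focuses on "looking"; avoids accidental modification
CodeAgent	File operations, code execution	`bash`, `edit_file`, `python_runner`, etc.	Focuses on "doing"; carries the risk of write operations
WebSearchAgent	Information retrieval	`duckduckgo_search`	Focuses on "searching"; isolates the external dependency

💡 Core principle: the main agent only "looks" (plans), while the sub-agents "do" (execute). Separating responsibilities this way keeps coupling low.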
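
The allocation above can be written down as a plain mapping from agent to tool names, with a check that the main agent holds no write-capable tool. This is only a sketch: the tool names come from the diagram, while the `toolsets` map, the `writeCapable` set, and the `isReadOnly` helper are hypothetical and not part of the project.

```go
package main

import "fmt"

// toolsets mirrors the allocation shown in the diagram above.
var toolsets = map[string][]string{
	"DeepAgent":      {"read_file", "tree"},
	"CodeAgent":      {"bash", "tree", "edit_file", "read_file", "python_runner"},
	"WebSearchAgent": {"duckduckgo_search"},
}

// writeCapable lists the tools assumed to be able to modify the workspace.
// This classification is an assumption made for the sketch.
var writeCapable = map[string]bool{
	"bash":          true,
	"edit_file":     true,
	"python_runner": true,
}

// isReadOnly reports whether none of the agent's tools can write.
func isReadOnly(agent string) bool {
	for _, name := range toolsets[agent] {
		if writeCapable[name] {
			return false
		}
	}
	return true
}

func main() {
	for _, agent := range []string{"DeepAgent", "CodeAgent", "WebSearchAgent"} {
		fmt.Printf("%-15s read-only: %v\n", agent, isReadOnly(agent))
	}
}
```

A check like this could run in a unit test so that adding a write tool to the main agent is caught early.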

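2. The Tool Definition Pattern (Three Steps)

Every tool follows the same three-step definition pattern. Taking the bash tool as an example:

🔹 Step 1: Define the ToolInfo (metadata)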

// tools/bash.go
var bashToolInfo = &schema.ToolInfo{
    Name: "bash",
    Desc: `Run commands in a bash shell...`,           // 📝 description shown to the LLM
    ParamsOneOf: schema.NewParamsOneOfByParams(map[string]*schema.ParameterInfo{
        "command": {
            Type:     "string",
            Desc:     "The command to execute",
            Required: true,
        },
    }),
}

💡 ToolInfo is the LLM's "user manual": the LLM uses the Name, the Desc, and the parameter schema to decide whether to call the tool and how to call it.

🔹 Step 2: Implement the `InvokableTool` interface

type bashTool struct {
    op commandline.Operator       // 🔌 injected low-level execution engine
}

// Info returns the tool metadata.
func (b *bashTool) Info(_ context.Context) (*schema.ToolInfo, error) {
    return bashToolInfo, nil
}

// InvokableRun executes the command.
func (b *bashTool) InvokableRun(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) {
    // 1️⃣ Parse the JSON arguments; reject input that cannot be decoded
    input := &shellInput{}
    if err := json.Unmarshal([]byte(argumentsInJSON), input); err != nil {
        return "", err
    }

    // 2️⃣ Run the command through the Operator
    cmd, err := b.op.RunCommand(ctx, []string{input.Command})
    if err != nil {
        return "", err
    }

    // 3️⃣ Format the result
    return utils.FormatCommandOutput(cmd), nil
}

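Interface contract:

Method	Return values	Responsibility
`Info()`	`(*schema.ToolInfo, error)`	Returns the tool metadata the LLM uses to make its decision
`InvokableRun()`	`(string, error)`	Runs the actual logic and returns a text result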

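🔹 Step 3: Wrap it with `WrapTool`

// Add preprocessing/postprocessing when registering the tool
tools.NewWrapTool(
    tools.NewBashTool(operator),  // base tool
    preprocess,                    // preprocessing chain
    postprocess,                   // postprocessing chain
)

💡 The wrapper pattern gives tools composable enhancements without touching their core logic.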

3. All Tools at a Glance

Tool	File	Input parameters	Purpose	Owning agent
`bash`	`bash.go`	`command`	Run shell commands	CodeAgent
`read_file`	`read.go`	`path`, `start_row`, `n_rows`	Read a file by line range	Main agent + CodeAgent
`edit_file`	`edit.go`	`path`, `content`	Create or overwrite a file	CodeAgent
`tree`	`tree.go`	`path`	Show the directory structure	Main agent + CodeAgent
`python_runner`	`python_runner.go`	`code`	Write and run Python code	CodeAgent
`image_reader`	`read_image.go`	`query`, `image_path`	Analyze an image with a vision model	(reserved, not used)
`submit_result`	`submit_result.go`	`is_success`, `result`, `files`	Submit the final result	Main agent (built into the framework)
`create_plan`	`generic/plan.go`	`steps[]`	Generate an execution plan	Main agent (built into the framework)

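4. The WrapTool Wrapping Mechanism (Core Design)

🔹 Processing flow

JSON arguments generated by the LLM
    │
    ▼ ── Preprocess ──
    │   ToolRequestRepairJSON:
    │   • strips model formatting residue such as <|FunctionCallBegin|>
    │   • repairs malformed JSON automatically with jsonrepair
    │
    ▼
    Actual tool execution (InvokableRun)
    │
    ▼ ── Postprocess ──
    │   FilePostProcess:
    │   • parses the runResult structure
    │   • extracts stdout / stderr / fileChange
    │   • formats them as concise text returned to the LLM
    │
    ▼
Tool result received by the LLM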
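
The preprocessing stage can be approximated with the standard library alone. The sketch below strips wrapper tokens and then checks that what remains is valid JSON. It does not reproduce jsonrepair; it only rejects input that is still broken after cleanup. The `cleanArguments` helper is hypothetical, and only `<|FunctionCallBegin|>` appears in the text above; `<|FunctionCallEnd|>` is an assumed counterpart.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// markers lists wrapper tokens to strip. Only <|FunctionCallBegin|> is
// named in the text above; <|FunctionCallEnd|> is an assumption.
var markers = []string{"<|FunctionCallBegin|>", "<|FunctionCallEnd|>"}

// cleanArguments removes wrapper tokens and verifies the remainder is JSON.
func cleanArguments(raw string) (string, error) {
	s := raw
	for _, m := range markers {
		s = strings.ReplaceAll(s, m, "")
	}
	s = strings.TrimSpace(s)
	if !json.Valid([]byte(s)) {
		return "", errors.New("arguments are not valid JSON after cleanup")
	}
	return s, nil
}

func main() {
	raw := "<|FunctionCallBegin|>{\"command\":\"ls\"}<|FunctionCallEnd|>"
	cleaned, err := cleanArguments(raw)
	fmt.Println(cleaned, err)
}
```

A real repair step would go further and try to fix truncated or mis-quoted JSON instead of returning an error.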

🔹 Wrapper implementation

// tools/wrap.go
type wrapTool struct {
    baseTool    tool.InvokableTool
    preprocess  []ToolRequestPreprocess      // request preprocessing chain
    postprocess []ToolResponsePostprocess    // response postprocessing chain
}

func (w *wrapTool) InvokableRun(ctx context.Context, args string, opts ...tool.Option) (string, error) {
    var err error

    // 🔁 Run each preprocessor in order; stop on the first failure
    for _, pre := range w.preprocess {
        if args, err = pre(ctx, w.baseTool, args); err != nil {
            return "", err
        }
    }

    // ⚙️ Run the actual tool
    resp, err := w.baseTool.InvokableRun(ctx, args, opts...)
    if err != nil {
        return "", err
    }

    // 🔁 Run each postprocessor in order; stop on the first failure
    for _, post := range w.postprocess {
        if resp, err = post(ctx, w.baseTool, resp, args); err != nil {
            return "", err
        }
    }

    return resp, nil
}

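🔹 Wrapping configuration per tool

Tool	Preprocess	Postprocess	Notes
`bash`	`RepairJSON`	`FilePostProcess`	Formats command output
`tree`	`RepairJSON`	None	Directory structure returned as-is
`edit_file`	`RepairJSON`	`EditFilePostProcess`	Reduced to a "success" message
`read_file`	`RepairJSON`	None	File content returned as-is
`python_runner`	`RepairJSON`	`FilePostProcess`	Formats execution results
Main agent tools	None	None	Main agent model uses temperature=0, so output is stable

⚠️ Key design: the main agent's tools have no JSON repair because the main agent's model runs at temperature=0 and produces more stable output. CodeAgent runs at temperature=1, which is more creative but more error-prone, so its tools need the repair safeguard.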
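
The source does not show the body of the `FilePostProcess` step listed in the table above, so the following is only a guess at its general shape: decode a structured run result, keep the non-empty parts, and return compact text. The `runResult` struct, its JSON field names, and the `formatRunResult` helper are all assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// runResult is a hypothetical shape for the raw tool output.
// The field names and JSON tags are assumptions for this sketch.
type runResult struct {
	Stdout     string   `json:"stdout"`
	Stderr     string   `json:"stderr"`
	FileChange []string `json:"file_change"`
}

// formatRunResult turns the structured result into short text for the LLM.
func formatRunResult(raw string) (string, error) {
	var r runResult
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return "", fmt.Errorf("parse run result: %w", err)
	}
	var parts []string
	if r.Stdout != "" {
		parts = append(parts, "stdout:\n"+strings.TrimSpace(r.Stdout))
	}
	if r.Stderr != "" {
		parts = append(parts, "stderr:\n"+strings.TrimSpace(r.Stderr))
	}
	if len(r.FileChange) > 0 {
		parts = append(parts, "changed files: "+strings.Join(r.FileChange, ", "))
	}
	if len(parts) == 0 {
		return "(no output)", nil
	}
	return strings.Join(parts, "\n"), nil
}

func main() {
	raw := `{"stdout":"ok\n","stderr":"","file_change":["out.csv"]}`
	text, err := formatRunResult(raw)
	fmt.Println(text, err)
}
```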

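5. The Role of the agents/ Directory

The agents/ directory defines two sub-agents. They are not tools; they are independent agents to which DeepAgent can delegate tasks.

🔹 CodeAgent (agents/code_agent.go)

Role: the "engineer" dedicated to file operations and code execution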

func NewCodeAgent(ctx context.Context, operator commandline.Operator) (adk.Agent, error) {
    cm, err := utils.NewChatModel(ctx,
        utils.WithMaxTokens(14125),
        utils.WithTemperature(float32(1)),    // 🎲 high creativity
        utils.WithTopP(float32(1)),
    )
    if err != nil {
        return nil, err
    }

    return adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
        Name:        "CodeAgent",
        Description: "specialized in handling Excel files via Python code...",
        Instruction: `You are a code agent. Your workflow is:
            1. Analyze the task
            2. Use tools to assist with coding
            3. Write Python code to complete the task
            4. Write the results to a file`,
        Model:         cm,
        Tools:         []tool.BaseTool{bash, tree, edit_file, read_file, python_runner},  // 5 tools
        MaxIterations: 1000,     // 🔄 allows many iterations
        GenModelInput: func(...) {
            // 📥 Injects the working directory and current time into the prompt
        },
    })
}

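🔑 Key design points:

Design	Description	Benefit
`MaxIterations: 1000`	File processing may take many rounds of tool calls (read → write → run → check → modify → run again)	Supports autonomous iteration on complex tasks
`GenModelInput`	Injects runtime context (working directory, query content, time) into the LLM prompt through a Jinja2 template	Lets the agent perceive its environment and decide more accurately
`Description`	This text is written for the main agent	The main agent uses it to decide when to delegate a task to CodeAgent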

🔹 WebSearchAgent (agents/web_search.go)

Role: the "researcher" for information retrieval

func NewWebSearchAgent(ctx context.Context) (adk.Agent, error) {
    cm, err := utils.NewChatModel(ctx)    // default parameters
    if err != nil {
        return nil, err
    }
    searchTool, err := duckduckgo.NewTextSearchTool(ctx, &duckduckgo.Config{})
    if err != nil {
        return nil, err
    }

    return adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
        Name:          "WebSearchAgent",
        Description:   "utilizes ReAct model... using web search tools",
        Tools:         []tool.BaseTool{searchTool},          // only 1 tool
        MaxIterations: 10,                    // search does not need many rounds
    })
}

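🔹 How sub-agents are used

In main.go, the sub-agents are registered with DeepAgent through the SubAgents field:

deepAgent, err := deep.New(ctx, &deep.Config{
    SubAgents: []adk.Agent{codeAgent, webSearchAgent},  // ← register the sub-agents
    ToolsConfig: adk.ToolsConfig{
        Tools: []tool.BaseTool{read_file, tree},         // ← the main agent's own tools
    },
})
if err != nil {
    log.Fatalf("create DeepAgent: %v", err)
}

At run time: for each step of the plan, DeepAgent picks a suitable sub-agent to carry out that step based on the sub-agents' Description fields.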
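
Description-based delegation can be pictured as two pieces: a catalog of names and descriptions that is shown to the main model, and a lookup that routes the task to whichever name the model picks. The sketch below is a toy version under that assumption; the `subAgent` struct, `catalog`, `delegate`, and the example descriptions are hypothetical, and the actual choice is made by the LLM, which is not modeled here.

```go
package main

import (
	"fmt"
	"strings"
)

// subAgent is a minimal stand-in: a name, a description, and a run function.
type subAgent struct {
	Name        string
	Description string
	Run         func(task string) string
}

// catalog renders the names and descriptions the main model chooses from.
func catalog(agents []subAgent) string {
	var b strings.Builder
	for _, a := range agents {
		fmt.Fprintf(&b, "- %s: %s\n", a.Name, a.Description)
	}
	return b.String()
}

// delegate routes a task to the sub-agent the main model picked by name.
func delegate(agents []subAgent, name, task string) (string, error) {
	for _, a := range agents {
		if a.Name == name {
			return a.Run(task), nil
		}
	}
	return "", fmt.Errorf("unknown sub-agent %q", name)
}

// demoAgents returns two toy sub-agents with made-up descriptions.
func demoAgents() []subAgent {
	return []subAgent{
		{
			Name:        "CodeAgent",
			Description: "handles files via Python code",
			Run:         func(task string) string { return "CodeAgent finished: " + task },
		},
		{
			Name:        "WebSearchAgent",
			Description: "answers questions using web search",
			Run:         func(task string) string { return "WebSearchAgent finished: " + task },
		},
	}
}

func main() {
	agents := demoAgents()
	fmt.Print(catalog(agents))
	out, err := delegate(agents, "CodeAgent", "extract the first CSV column")
	fmt.Println(out, err)
}
```

This is why a precise Description matters: it is the only information the main model has when deciding where a step should go.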

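🔹 SubAgent ≠ Tool: the core differences

Dimension	Tool	SubAgent
Nature	A single function call	A complete agent (with its own model, tools, and multi-round loop)
Capability	Performs one atomic operation	Can plan and carry out multiple steps on its own
Complexity	`bash("ls")` → returns a result	Receives "extract the first column of the CSV" → decides on its own to read the file → write code → run it → verify
Iteration	None	CodeAgent runs up to 1000 internal iterations
Caller	Called directly by the LLM	Delegated a task by the main agent

💡 In one sentence: a tool is a "hand" that performs a specific action; a SubAgent is a "small brain" that can complete a subtask on its own.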
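
The difference in the table can be shown in a few lines: a tool is one function call, while an agent is a loop that keeps choosing tools until it decides it is done or hits its iteration cap. In the sketch below a scripted `policy` function stands in for the LLM; `runAgent`, `step`, `scripted`, and `demoTools` are hypothetical names, and the loop is a simplification of what a real agent runtime does.

```go
package main

import (
	"errors"
	"fmt"
)

// tool is a single function call: arguments in, text out.
type tool func(args string) string

// step is one model decision: call a tool, or finish with an answer.
type step struct {
	Tool   string // empty means "finished"
	Args   string
	Answer string
}

// policy stands in for the LLM: given tool results so far, pick a step.
type policy func(observations []string) step

// runAgent loops over tool calls until the policy finishes or the cap is hit.
func runAgent(decide policy, tools map[string]tool, maxIterations int) (string, int, error) {
	var observations []string
	for i := 1; i <= maxIterations; i++ {
		s := decide(observations)
		if s.Tool == "" {
			return s.Answer, i, nil
		}
		run, ok := tools[s.Tool]
		if !ok {
			return "", i, fmt.Errorf("unknown tool %q", s.Tool)
		}
		observations = append(observations, run(s.Args))
	}
	return "", maxIterations, errors.New("max iterations reached without an answer")
}

// scripted replays a fixed read → run → answer sequence.
func scripted(observations []string) step {
	switch len(observations) {
	case 0:
		return step{Tool: "read_file", Args: "data.csv"}
	case 1:
		return step{Tool: "python_runner", Args: "print(first_column)"}
	default:
		return step{Answer: "done after " + observations[1]}
	}
}

func demoTools() map[string]tool {
	return map[string]tool{
		"read_file":     func(args string) string { return "contents of " + args },
		"python_runner": func(args string) string { return "ran " + args },
	}
}

func main() {
	// A tool: one call, one result.
	fmt.Println(demoTools()["read_file"]("data.csv"))

	// An agent: several tool calls chained by its own decisions.
	answer, rounds, err := runAgent(scripted, demoTools(), 1000)
	fmt.Println(answer, rounds, err)
}
```

The iteration cap plays the same role as `MaxIterations` in the agent configuration: it bounds the loop when the model never produces a final answer.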

6. Summary: The Layered Design of Tool Definitions

                    DeepAgent
                   ┌──────────────────────────────────────┐
Tools provided     │  create_plan            ← built-in   │
automatically      │  submit_result          ← built-in   │
by deep.New        │  transfer_to_CodeAgent  ← automatic  │
                   │  transfer_to_WebSearch  ← automatic  │
                   ├──────────────────────────────────────┤
Tools registered   │  read_file        ← tools/read.go    │
manually           │  tree             ← tools/tree.go    │
                   └───────────┬──────────────────────────┘
                               │ delegates tasks
               ┌───────────────┴──────────────┐
               ▼                              ▼
     CodeAgent (agents/)             WebSearchAgent (agents/)
     ┌───────────────────┐           ┌────────────────────────┐
     │ bash              │           │ duckduckgo_search      │
     │ tree              │           │ (provided by eino-ext) │
     │ edit_file         │           └────────────────────────┘
     │ read_file         │
     │ python_runner     │
     │ (all in tools/)   │
     └───────────────────┘

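🔑 Key points of the layered design

Layer	Components	Responsibility	Key characteristics
Framework layer	Tools built into `deep.New()`	Plan generation, result submission, sub-agent delegation	Automated orchestration; nothing to implement by hand
Tool layer	`tools/*.go`	Wraps atomic operations (read/write/execute)	WrapTool wrapping + JSON repair + cross-platform support
Agent layer	`agents/*.go`	Autonomous execution of complex tasks	Independent model configuration + multi-round iteration + context injection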

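🎯 The core value of agents/

It raises "capability" from the level of a single tool to the level of an agent that carries out complex tasks autonomously.

CodeAgent is not a tool; it is an independent agent with 5 tools that can iterate autonomously for up to 1000 rounds
The main agent only needs to tell it what to do, not how to do it
This is the essence of the Plan-Execute pattern: planning is separated from execution, and each specialist does what it does best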

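🔑 Key points at a glance

┌───────────────────────┬─────────────────────────────────────────────────────────────┐
│ Key point             │ Description                                                 │
├───────────────────────┼─────────────────────────────────────────────────────────────┤
│ Two-tier allocation   │ Main agent read-only, sub-agents write; duties isolated    │
├───────────────────────┼─────────────────────────────────────────────────────────────┤
│ Three-step definition │ ToolInfo → InvokableTool → WrapTool                         │
├───────────────────────┼─────────────────────────────────────────────────────────────┤
│ WrapTool wrapping     │ Preprocess (JSON repair) + postprocess (result formatting) │
├───────────────────────┼─────────────────────────────────────────────────────────────┤
│ Differentiated config │ No repair on main agent tools; repair on CodeAgent tools   │
├───────────────────────┼─────────────────────────────────────────────────────────────┤
│ SubAgent ≠ Tool       │ Agents iterate over many rounds; tools run a single call   │
├───────────────────────┼─────────────────────────────────────────────────────────────┤
│ Delegation            │ Main agent picks a sub-agent based on its Description      │
├───────────────────────┼─────────────────────────────────────────────────────────────┤
│ Context injection     │ GenModelInput injects working directory, time, etc.         │
└───────────────────────┴─────────────────────────────────────────────────────────────┘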

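📌 One-sentence summary: DeepAgent's tool system combines layered design, decorator-style wrapping, and agent delegation to separate planning from execution reliably. Understanding the essential difference between a Tool and a SubAgent is the key to building extensible multi-agent systems.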