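GlobTool and GrepTool: The Search Tools - 和平哥的学习笔记

Among the 40-plus tools in Claude Code, GlobTool and GrepTool are the two pillars of codebase exploration. The former locates files quickly by file name pattern; the latter runs regular-expression searches over file contents on top of ripgrep. Working together, they give the Agent its core ability to pull precise information out of a large codebase. This article walks through the source implementation of both tools, their search strategies, and how they cooperate with neighboring tools.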

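1. Where the Search Tools Fit in Claude Code

Claude Code's tool system can be grouped by function into file operations, code search, system execution, network access, Agent collaboration, and other categories. Within code search, GlobTool and GrepTool are the most basic and most frequently invoked tools.

Architecturally, both tools follow Claude Code's uniform "self-describing tool" pattern: each tool is a plain object literal built through the buildTool factory function rather than an instance of a class. This pattern composes well. For example, GlobTool directly reuses GrepTool's result rendering logic (renderToolResultMessage) without needing an inheritance hierarchy.

// Source file: src/tools/GlobTool/GlobTool.ts (lines 46-90)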
export const GlobTool = buildTool({
  name: GLOB_TOOL_NAME,
  searchHint: 'find files by name pattern or wildcard',
  maxResultSizeChars: 100_000,
  async description() {
    return DESCRIPTION
  },
  isConcurrencySafe() {
    return true
  },
  isReadOnly() {
    return true
  },
  // ...
})

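Both tools return true from isReadOnly(), meaning they never modify the file system and can be auto-approved under the permission system's more permissive modes. isConcurrencySafe() also returns true, which indicates that they can be invoked concurrently without interfering with one another.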
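To make the object-literal pattern concrete, here is a minimal sketch of how a factory like buildTool could fill in conservative defaults and how a scheduler might consult the two flags. The ToolDef shape, the default values, and canRunInParallel are all hypothetical and much simpler than the real Claude Code types.

```typescript
// Hypothetical, simplified tool shape; the real types carry many more fields.
interface ToolDef {
  name: string
  isReadOnly?: () => boolean
  isConcurrencySafe?: () => boolean
}

type Tool = Required<ToolDef>

// Assumed defaults: a tool is treated as writing and as unsafe to run
// concurrently unless its definition says otherwise.
function buildTool(def: ToolDef): Tool {
  return {
    isReadOnly: () => false,
    isConcurrencySafe: () => false,
    ...def,
  }
}

const globLike = buildTool({
  name: 'Glob',
  isReadOnly: () => true,
  isConcurrencySafe: () => true,
})
const editLike = buildTool({ name: 'Edit' })

// A scheduler could run a batch in parallel only if every tool allows it.
function canRunInParallel(tools: Tool[]): boolean {
  return tools.every(t => t.isConcurrencySafe())
}

console.log(canRunInParallel([globLike, globLike]))
console.log(canRunInParallel([globLike, editLike]))
```

Because tools are plain objects, sharing behavior between them is just a matter of referencing the same function from two literals.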

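2. GlobTool: File Pattern Matching

2.1 Tool Definition and Input/Output Schemas

GlobTool's core job is simple: given a glob pattern, return the list of matching file paths. Its input schema is minimal, with only two fields, pattern (required) and path (optional):

// Source file: src/tools/GlobTool/GlobTool.ts (lines 22-35)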
const inputSchema = lazySchema(() =>
  z.strictObject({
    pattern: z.string().describe('The glob pattern to match files against'),
    path: z
      .string()
      .optional()
      .describe(
        'The directory to search in. If not specified, the current working directory will be used.',
      ),
  }),
)

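The output schema carries richer metadata: execution time, total number of matching files, the array of file paths, and whether the result was truncated:

// Source file: src/tools/GlobTool/GlobTool.ts (lines 37-52)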
const outputSchema = lazySchema(() =>
  z.object({
    durationMs: z.number().describe('Time taken to execute the search in milliseconds'),
    numFiles: z.number().describe('Total number of files found'),
    filenames: z.array(z.string()).describe('Array of file paths that match the pattern'),
    truncated: z.boolean().describe('Whether results were truncated (limited to 100 files)'),
  }),
)

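2.2 How glob Matching Is Implemented

GlobTool does not search with Node.js's native fs module or a third-party glob library. Instead it reuses ripgrep's --files and --glob options, a decision that is clearly visible in src/utils/glob.ts.

// Source file: src/utils/glob.ts (lines 65-120)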
export async function glob(
  filePattern: string,
  cwd: string,
  { limit, offset }: { limit: number; offset: number },
  abortSignal: AbortSignal,
  toolPermissionContext: ToolPermissionContext,
): Promise<{ files: string[]; truncated: boolean }> {
  let searchDir = cwd
  let searchPattern = filePattern

  // Absolute path: extract the base directory and convert to a relative pattern
  if (isAbsolute(filePattern)) {
    const { baseDir, relativePattern } = extractGlobBaseDirectory(filePattern)
    if (baseDir) {
      searchDir = baseDir
      searchPattern = relativePattern
    }
  }

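  // noIgnore and hidden are derived from environment variables
  // (their definitions are elided in this excerpt)
  const args = [
    '--files',
    '--glob',
    searchPattern,
    '--sort=modified',
    ...(noIgnore ? ['--no-ignore'] : []),
    ...(hidden ? ['--hidden'] : []),
  ]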

  const allPaths = await ripGrep(args, searchDir, abortSignal)
  // ...
}

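The key design points are:

Absolute path handling: when the caller passes an absolute pattern (such as /home/user/project/**/*.ts), extractGlobBaseDirectory splits it into a static prefix directory (/home/user/project) and a relative pattern (**/*.ts). This is needed because ripgrep's --glob option only accepts relative patterns.
ripgrep argument combination:
- --files: list file paths only, without searching contents
- --glob: filter files by glob pattern
- --sort=modified: sort by modification time in ascending order (oldest first, newest last; note that this differs slightly from the "most recently modified first" wording in the documentation)
- --no-ignore: passed by default, so .gitignore rules are not applied; set the environment variable CLAUDE_CODE_GLOB_NO_IGNORE=false to turn this off and respect .gitignore again
- --hidden: passed by default, so hidden files are included; set the environment variable CLAUDE_CODE_GLOB_HIDDEN=false to exclude them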
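The base-directory split described above can be sketched as follows. splitGlobBase and buildGlobArgs are hypothetical helpers written for this note (POSIX-style paths only); they illustrate the idea behind extractGlobBaseDirectory and are not its actual implementation.

```typescript
// Characters that start a glob construct. Everything before the first
// path segment containing one of them is treated as the static base.
const GLOB_CHARS = /[*?[\]{}]/

// Hypothetical helper: split an absolute glob into base directory + relative pattern.
function splitGlobBase(pattern: string): { baseDir: string; relativePattern: string } {
  const segments = pattern.split('/')
  const firstGlob = segments.findIndex(s => GLOB_CHARS.test(s))
  // No glob characters at all: treat the last segment as the pattern.
  const cut = firstGlob === -1 ? segments.length - 1 : firstGlob
  const baseDir = segments.slice(0, cut).join('/') || '/'
  const relativePattern = segments.slice(cut).join('/')
  return { baseDir, relativePattern }
}

// Assemble ripgrep arguments in the same order as the excerpt above.
function buildGlobArgs(relativePattern: string, noIgnore: boolean, hidden: boolean): string[] {
  return [
    '--files',
    '--glob', relativePattern,
    '--sort=modified',
    ...(noIgnore ? ['--no-ignore'] : []),
    ...(hidden ? ['--hidden'] : []),
  ]
}

const split = splitGlobBase('/home/user/project/**/*.ts')
const rgArgs = buildGlobArgs(split.relativePattern, true, true)
console.log(split.baseDir, rgArgs.join(' '))
```

ripgrep would then be started with split.baseDir as its working directory, so the relative pattern is matched against paths below that directory.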

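2.3 Result Ordering and Truncation

GlobTool's truncation logic is straightforward: by default it returns at most 100 files (configurable through globLimits.maxResults); anything beyond that is cut off and the result is marked truncated: true.

// Source file: src/tools/GlobTool/GlobTool.ts (lines 140-155)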
async call(input, { abortController, getAppState, globLimits }) {
  const appState = getAppState() // needed below for toolPermissionContext
  const limit = globLimits?.maxResults ?? 100
  const { files, truncated } = await glob(
    input.pattern,
    GlobTool.getPath(input),
    { limit, offset: 0 },
    abortController.signal,
    appState.toolPermissionContext,
  )
  const filenames = files.map(toRelativePath)
  // ...
}

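Paths are converted to be relative to the current working directory, which saves context tokens. This relativization is a convention shared by Claude Code's search tools.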
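The limit-then-relativize step can be sketched as below. toRelative and limitResults are hypothetical names, and the prefix-stripping logic assumes POSIX-style paths; the real toRelativePath helper may behave differently.

```typescript
// Hypothetical helper: strip the cwd prefix when the file lives under cwd,
// otherwise keep the absolute path unchanged.
function toRelative(cwd: string, absPath: string): string {
  const prefix = cwd.endsWith('/') ? cwd : cwd + '/'
  return absPath.startsWith(prefix) ? absPath.slice(prefix.length) : absPath
}

// Apply offset/limit and report whether anything was cut off.
function limitResults(
  allPaths: string[],
  cwd: string,
  limit: number,
  offset = 0,
): { filenames: string[]; truncated: boolean } {
  const page = allPaths.slice(offset, offset + limit)
  return {
    filenames: page.map(p => toRelative(cwd, p)),
    truncated: allPaths.length > offset + limit,
  }
}

const found = Array.from({ length: 150 }, (_, i) => `/repo/src/file${i}.ts`)
const limited = limitResults(found, '/repo', 100)
console.log(limited.filenames.length, limited.truncated, limited.filenames[0])
```

With 150 matches and a limit of 100, the caller receives 100 short relative paths plus a truncated flag it can use to narrow the pattern.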

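2.4 Ignore Files and Permission Control

GlobTool layers several sets of ignore rules during a search:

Environment variables: CLAUDE_CODE_GLOB_NO_IGNORE and CLAUDE_CODE_GLOB_HIDDEN control whether .gitignore is respected and whether hidden files are included.
Permission-system ignore patterns: getFileReadIgnorePatterns retrieves additional ignore patterns from the permission context, and normalizePatternsToPath converts them into absolute-path patterns.
Plugin cache exclusions: getGlobExclusionsForPluginCache excludes orphaned plugin version directories so that searches do not surface large numbers of useless files from the plugin cache.

// Source file: src/utils/glob.ts (lines 100-115)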
for (const pattern of ignorePatterns) {
  args.push('--glob', `!${pattern}`)
}

for (const exclusion of await getGlobExclusionsForPluginCache(searchDir)) {
  args.push('--glob', exclusion)
}

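3. GrepTool: Content Search Built on ripgrep

If GlobTool finds things by name, GrepTool finds them by content. GrepTool's architecture is considerably more complex than GlobTool's, because it has to wrap most of ripgrep's advanced features and expose them in an LLM-friendly way.

3.1 Input Schema: A Rich Set of Search Parameters

GrepTool's input schema is among the most complex of all Claude Code tools and maps onto nearly all of ripgrep's commonly used options:

// Source file: src/tools/GrepTool/GrepTool.ts (lines 25-95)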
const inputSchema = lazySchema(() =>
  z.strictObject({
    pattern: z.string().describe('The regular expression pattern to search for in file contents'),
    path: z.string().optional().describe('File or directory to search in (rg PATH). Defaults to cwd.'),
    glob: z.string().optional().describe('Glob pattern to filter files (e.g. "*.js", "*.{ts,tsx}")'),
    output_mode: z.enum(['content', 'files_with_matches', 'count']).optional(),
    '-B': semanticNumber(z.number().optional()).describe('Number of lines before each match'),
    '-A': semanticNumber(z.number().optional()).describe('Number of lines after each match'),
    '-C': semanticNumber(z.number().optional()).describe('Alias for context'),
    context: semanticNumber(z.number().optional()).describe('Lines before and after each match'),
    '-n': semanticBoolean(z.boolean().optional()).describe('Show line numbers'),
    '-i': semanticBoolean(z.boolean().optional()).describe('Case insensitive search'),
    type: z.string().optional().describe('File type to search (rg --type)'),
    head_limit: semanticNumber(z.number().optional()).describe('Limit output to first N lines'),
    offset: semanticNumber(z.number().optional()).describe('Skip first N lines before applying head_limit'),
    multiline: semanticBoolean(z.boolean().optional()).describe('Enable multiline mode (rg -U --multiline-dotall)'),
  }),
)

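A few design details are worth noting:

Semantic numbers and booleans: the semanticNumber and semanticBoolean wrappers appear to make the schema tolerant of loosely typed model output, accepting for example the string "3" or "true" where a number or boolean is expected, instead of rejecting the call outright.
Three output modes: content (matching lines with context), files_with_matches (file paths only, the default), and count (match counts).
Pagination: combining head_limit with offset lets the model page through large result sets.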
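One plausible reading of the semantic wrappers is plain type coercion ahead of validation. The sketch below shows that idea without any schema library; coerceNumber and coerceBoolean are hypothetical stand-ins, and the exact set of inputs the real wrappers accept is an assumption here.

```typescript
// Hypothetical stand-in for the idea behind semanticNumber:
// accept a real number or a numeric string, reject anything else.
function coerceNumber(value: unknown): number | undefined {
  if (typeof value === 'number' && Number.isFinite(value)) return value
  if (typeof value === 'string' && /^-?\d+(\.\d+)?$/.test(value.trim())) {
    return Number(value)
  }
  return undefined
}

// Hypothetical stand-in for the idea behind semanticBoolean.
function coerceBoolean(value: unknown): boolean | undefined {
  if (typeof value === 'boolean') return value
  if (value === 'true') return true
  if (value === 'false') return false
  return undefined
}

console.log(coerceNumber('3'), coerceNumber('three'), coerceBoolean('true'))
```

Coercing before validation means a call such as { "-C": "3" } still succeeds, which saves a round trip in which the model would otherwise have to correct its own argument types.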

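3.2 ripgrep Integration: utils/ripgrep.ts

The real core of GrepTool is not the tool definition itself but src/utils/ripgrep.ts, a 21KB module that handles ripgrep invocation configuration, error recovery, timeout management, streaming output, and the other low-level details.

3.2.1 Three ripgrep Execution Modes

// Source file: src/utils/ripgrep.ts (lines 25-65)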
type RipgrepConfig = {
  mode: 'system' | 'builtin' | 'embedded'
  command: string
  args: string[]
  argv0?: string
}

const getRipgrepConfig = memoize((): RipgrepConfig => {
  const userWantsSystemRipgrep = isEnvDefinedFalsy(process.env.USE_BUILTIN_RIPGREP)

  if (userWantsSystemRipgrep) {
    const { cmd: systemPath } = findExecutable('rg', [])
    if (systemPath !== 'rg') {
      // SECURITY: use the command name 'rg' rather than the full path to prevent PATH hijacking
      return { mode: 'system', command: 'rg', args: [] }
    }
  }

  // In bundled mode, ripgrep is statically compiled into bun-internal
  if (isInBundledMode()) {
    return {
      mode: 'embedded',
      command: process.execPath,
      args: ['--no-config'],
      argv0: 'rg',
    }
  }

  // Use the prebuilt ripgrep binary shipped with the package
  const rgRoot = path.resolve(__dirname, 'vendor', 'ripgrep')
  const command = process.platform === 'win32'
    ? path.resolve(rgRoot, `${process.arch}-win32`, 'rg.exe')
    : path.resolve(rgRoot, `${process.arch}-${process.platform}`, 'rg')

  return { mode: 'builtin', command, args: [] }
})

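Claude Code supports three ripgrep execution modes, in priority order:

System mode: uses the rg installed on the system. There is a security detail here: the source deliberately uses the command name 'rg' instead of the full path returned by findExecutable, so that the operating system resolves it safely through the NoDefaultCurrentDirectoryInExePath mechanism and a malicious ./rg.exe in the current directory is not executed by accident.
Embedded mode: in the Bun-bundled native build, ripgrep is statically compiled into the Bun binary and dispatched by setting argv0='rg'.
Builtin mode: uses the prebuilt ripgrep binary distributed with the Claude Code package, automatically selecting the right executable by platform (Windows/Linux/macOS) and architecture (x64/arm64).

3.2.2 Timeouts and Error Recovery

The ripgrep invocation wraps fairly involved timeout and error-recovery logic:

// Source file: src/utils/ripgrep.ts (lines 110-180)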
function ripGrepRaw(
  args: string[],
  target: string,
  abortSignal: AbortSignal,
  callback: (error, stdout, stderr) => void,
  singleThread = false,
): ChildProcess {
  const { rgPath, rgArgs, argv0 } = ripgrepCommand()
  const threadArgs = singleThread ? ['-j', '1'] : []
  const fullArgs = [...rgArgs, ...threadArgs, ...args, target]

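  // File reads are very slow under WSL (3-5x slower on WSL2), so the timeout is raised to 60 seconds
  const defaultTimeout = getPlatform() === 'wsl' ? 60_000 : 20_000
  // parsedSeconds is an optional override parsed earlier (elided in this excerpt)
  const timeout = parsedSeconds > 0 ? parsedSeconds * 1000 : defaultTimeout

  // SIGTERM may fail to stop a ripgrep that is stuck in uninterruptible I/O,
  // so escalate to SIGKILL automatically after 5 seconds
  // (child is the spawned ripgrep process; the spawn call is elided in this excerpt)
  let killTimeoutId: ReturnType<typeof setTimeout> | undefined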
  const timeoutId = setTimeout(() => {
    if (process.platform === 'win32') {
      child.kill()
    } else {
      child.kill('SIGTERM')
      killTimeoutId = setTimeout(c => c.kill('SIGKILL'), 5_000, child)
    }
  }, timeout)
}

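This code shows the kind of edge-case handling a production system needs:

WSL special case: when WSL is detected, the default timeout is raised from 20 seconds to 60 seconds.
Signal escalation: SIGTERM is sent first; if the process has not exited within 5 seconds, it is escalated to SIGKILL, which cannot be ignored. This is needed because ripgrep may be in an uninterruptible I/O state during deep file-system traversal.
EAGAIN retry: in resource-constrained environments (such as Docker or CI), ripgrep may hit EAGAIN (resource temporarily unavailable) because it spawns too many threads. The system then retries once automatically in single-threaded mode (-j 1).

// Source file: src/utils/ripgrep.ts (lines 280-310)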
if (!isRetry && isEagainError(stderr)) {
  logForDebugging(`rg EAGAIN error detected, retrying with single-threaded mode (-j 1)`)
  logEvent('tengu_ripgrep_eagain_retry', {})
  ripGrepRaw(args, target, abortSignal, (retryError, retryStdout, retryStderr) => {
    handleResult(retryError, retryStdout, retryStderr, true)
  }, true) // single-threaded for this retry only
  return
}

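Notably, the source comments state explicitly that only this one retry runs single-threaded. An earlier global single-thread fallback caused subsequent queries on large repositories to time out, because EAGAIN is often just a transient error at startup.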
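The retry-once policy can be isolated into a few lines. The sketch below is synchronous for brevity, whereas the real code is callback-based; Runner, RunResult, and the stderr text that looksLikeEagain matches are all assumptions made for illustration.

```typescript
interface RunResult {
  ok: boolean
  stdout: string
  stderr: string
}

// Hypothetical runner signature: the flag asks for `-j 1`.
type Runner = (singleThread: boolean) => RunResult

// Assumption: EAGAIN surfaces as one of these texts in stderr.
function looksLikeEagain(stderr: string): boolean {
  return stderr.includes('os error 11') || stderr.includes('Resource temporarily unavailable')
}

// Run once with default threading; on EAGAIN, retry exactly once
// single-threaded. The fallback is not remembered for later calls.
function runWithEagainRetry(run: Runner): RunResult {
  const first = run(false)
  if (first.ok || !looksLikeEagain(first.stderr)) return first
  return run(true)
}

// Fake runner that fails with EAGAIN unless it is single-threaded.
const attempts: boolean[] = []
const flaky: Runner = singleThread => {
  attempts.push(singleThread)
  return singleThread
    ? { ok: true, stdout: 'src/a.ts\n', stderr: '' }
    : { ok: false, stdout: '', stderr: 'rg: Resource temporarily unavailable (os error 11)' }
}

const outcome = runWithEagainRetry(flaky)
console.log(outcome.ok, attempts)
```

Keeping the single-thread flag as a per-call argument rather than module state is what makes the fallback transient: the next search starts multi-threaded again.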

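3.2.3 Streaming Search and Memory Optimization

For very large repositories (for example 250,000 files and 16MB of path text), ripgrep.ts offers three invocation interfaces at different granularities:

| Function | Purpose | Memory strategy |
| --- | --- | --- |
| `ripGrep()` | General search; returns an array of strings | Buffers the full stdout, capped at 20MB |
| `ripGrepStream()` | Interactive real-time search (e.g. fzf mode) | Processes output chunk by chunk; roughly 64KB peak memory |
| `ripGrepFileCount()` | Counts files only (for telemetry) | Counts newline bytes one at a time; stores no paths |

// Source file: src/utils/ripgrep.ts (lines 220-260)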
export async function ripGrepStream(
  args: string[],
  target: string,
  abortSignal: AbortSignal,
  onLines: (lines: string[]) => void,
): Promise<void> {
  // ...
  let remainder = ''
  child.stdout?.on('data', (chunk: Buffer) => {
    const data = remainder + chunk.toString()
    const lines = data.split('\n')
    remainder = lines.pop() ?? ''
    if (lines.length) onLines(lines.map(stripCR))
  })
}

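ripGrepStream implements the classic streaming line-parsing pattern: each stdout chunk is joined to the tail of the previous chunk to form complete lines, and the unfinished part is held in remainder until the next chunk arrives.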
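The same remainder technique can be tried without a child process. LineSplitter below is a hypothetical class written for this note; it works on strings rather than Buffers and adds a flush step for a final line that has no trailing newline.

```typescript
// Minimal stateful splitter: feed it chunks, get back only complete lines.
class LineSplitter {
  private remainder = ''

  push(chunk: string): string[] {
    const parts = (this.remainder + chunk).split('\n')
    // The last element is either '' (the chunk ended on a newline)
    // or an incomplete line that must wait for the next chunk.
    this.remainder = parts.pop() ?? ''
    // Drop a trailing carriage return so CRLF output is handled too.
    return parts.map(line => (line.endsWith('\r') ? line.slice(0, -1) : line))
  }

  // Call once the stream ends to collect a final line without a newline.
  flush(): string[] {
    const rest = this.remainder
    this.remainder = ''
    return rest ? [rest] : []
  }
}

const splitter = new LineSplitter()
const batch1 = splitter.push('src/a.ts\nsrc/b')
const batch2 = splitter.push('.ts\r\nsrc/c.ts')
const batch3 = splitter.flush()
console.log(batch1, batch2, batch3)
```

Only the unfinished tail is retained between chunks, so memory use is bounded by the chunk size plus the longest line rather than by the total output.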

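3.3 Search Strategy: Smart Exclusion and Filtering

When building ripgrep arguments, GrepTool automatically excludes version-control directories to avoid noise from .git, .svn, and similar directories:

// Source file: src/tools/GrepTool/GrepTool.ts (lines 98-105)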
const VCS_DIRECTORIES_TO_EXCLUDE = [
  '.git', '.svn', '.hg', '.bzr', '.jj', '.sl',
] as const

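Here .jj and .sl correspond to the newer Jujutsu (jj) and Sapling (sl) version control systems, which shows that Claude Code keeps track of recent developer tooling.

GrepTool also honors the permission system's ignore patterns and supports file-type filtering through the glob and type parameters. The type parameter maps directly to ripgrep's --type option and, for standard file types (such as js, py, rust), is more efficient than glob.

3.4 Result Handling: Formatting, Truncation, and Context

3.4.1 How the Three Output Modes Are Handled

GrepTool's core execution logic builds different ripgrep arguments depending on output_mode:

files_with_matches mode: uses ripgrep's -l option and returns only the paths of matching files. This is the default, because the LLM usually does not need to see every matching line; it only needs to know which files contain the pattern.
content mode: shows matching lines with their context (controlled by -B/-A/-C) and includes line numbers (-n). This is what the model chooses when it needs to analyze code closely.
count mode: uses the --count option and returns the number of matches per file.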
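The mode-to-flag mapping might look roughly like this. GrepOptions and buildGrepArgs are hypothetical; only the ripgrep flags themselves (-i, -l, -c, -n, -A, -B, -C, -e) are real, and the actual tool passes more arguments than shown here.

```typescript
type OutputMode = 'content' | 'files_with_matches' | 'count'

interface GrepOptions {
  pattern: string
  outputMode?: OutputMode
  before?: number
  after?: number
  context?: number
  lineNumbers?: boolean
  ignoreCase?: boolean
}

// Hypothetical mapping from tool input to ripgrep flags.
function buildGrepArgs(o: GrepOptions): string[] {
  const mode = o.outputMode ?? 'files_with_matches'
  const flags: string[] = []
  if (o.ignoreCase) flags.push('-i')
  if (mode === 'files_with_matches') flags.push('-l')
  if (mode === 'count') flags.push('-c')
  if (mode === 'content') {
    // Line numbers and context only make sense when matching lines are printed.
    if (o.lineNumbers) flags.push('-n')
    if (o.context !== undefined) {
      flags.push('-C', String(o.context))
    } else {
      if (o.before !== undefined) flags.push('-B', String(o.before))
      if (o.after !== undefined) flags.push('-A', String(o.after))
    }
  }
  // '-e' keeps a pattern that starts with '-' from being read as a flag.
  flags.push('-e', o.pattern)
  return flags
}

console.log(buildGrepArgs({ pattern: 'TODO' }).join(' '))
console.log(buildGrepArgs({ pattern: '-foo', outputMode: 'content', context: 2, lineNumbers: true }).join(' '))
```

Ignoring context flags outside content mode keeps the cheap default cheap even if the model supplies -A or -B by habit.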

3.4.2 Head Limit and Pagination

// Source file: src/tools/GrepTool/GrepTool.ts (lines 108-125)
const DEFAULT_HEAD_LIMIT = 250

function applyHeadLimit<T>(
  items: T[],
  limit: number | undefined,
  offset: number = 0,
): { items: T[]; appliedLimit: number | undefined } {
  if (limit === 0) {
    return { items: items.slice(offset), appliedLimit: undefined }
  }
  const effectiveLimit = limit ?? DEFAULT_HEAD_LIMIT
  const sliced = items.slice(offset, offset + effectiveLimit)
  const wasTruncated = items.length - offset > effectiveLimit
  return {
    items: sliced,
    appliedLimit: wasTruncated ? effectiveLimit : undefined,
  }
}

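Several choices in applyHeadLimit are worth pointing out:

An explicit 0 means no limit: head_limit: 0 is the escape hatch for cases that need the complete result set.
Default of 250 lines: an unlimited content-mode search can fill the 20KB persistence threshold (roughly 6-24K tokens); 250 lines strikes a balance between exploratory searching and context bloat.
appliedLimit is reported only on actual truncation: if the total does not exceed the limit, appliedLimit is undefined, which tells the model that all results were returned and no paging is needed.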
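A short paging walk-through makes the three rules easier to check. The function body repeats the logic of the excerpt above so that the example is self-contained; the 600-item input is made up for illustration.

```typescript
const DEFAULT_HEAD_LIMIT = 250

// Same logic as the excerpt above, repeated so the example runs on its own.
function applyHeadLimit<T>(
  items: T[],
  limit: number | undefined,
  offset = 0,
): { items: T[]; appliedLimit: number | undefined } {
  if (limit === 0) return { items: items.slice(offset), appliedLimit: undefined }
  const effectiveLimit = limit ?? DEFAULT_HEAD_LIMIT
  const wasTruncated = items.length - offset > effectiveLimit
  return {
    items: items.slice(offset, offset + effectiveLimit),
    appliedLimit: wasTruncated ? effectiveLimit : undefined,
  }
}

const matches = Array.from({ length: 600 }, (_, i) => `line ${i}`)

const page1 = applyHeadLimit(matches, undefined)      // default limit, more remains
const page3 = applyHeadLimit(matches, undefined, 500) // last, partial page
const everything = applyHeadLimit(matches, 0)         // explicit "no limit"

console.log(page1.items.length, page1.appliedLimit)
console.log(page3.items.length, page3.appliedLimit)
console.log(everything.items.length, everything.appliedLimit)
```

A caller can keep requesting the next offset for as long as appliedLimit is defined; once it comes back undefined, the final page has been reached.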

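3.5 Result Formatting and Integration into the Model Context

GrepTool's results are converted into the Anthropic API's tool_result format by the mapToolResultToToolResultBlockParam method. The excerpt below shows the same method in GlobTool, which is the simpler of the two: an empty result returns a terse "No files found", and a non-empty result gets a hint appended when it was truncated:

// Source file: src/tools/GlobTool/GlobTool.ts (lines 158-175)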
mapToolResultToToolResultBlockParam(output, toolUseID) {
  if (output.filenames.length === 0) {
    return {
      tool_use_id: toolUseID,
      type: 'tool_result',
      content: 'No files found',
    }
  }
  return {
    tool_use_id: toolUseID,
    type: 'tool_result',
    content: [
      ...output.filenames,
      ...(output.truncated
        ? ['(Results are truncated. Consider using a more specific path or pattern.)']
        : []),
    ].join('\n'),
  }
}

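This formatting keeps the result LLM-friendly: a compact path list, an explicit truncation hint, and no redundant metadata.

4. Search Tool Workflows and Relationships with Neighboring Tools

GlobTool and GrepTool do not work in isolation; together with FileReadTool, BashTool, and AgentTool they form a complete code exploration workflow.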

flowchart TD
    A[User request] --> B{Determine search type}
    B -->|Find by file name| C[GlobTool]
    B -->|Find by content| D[GrepTool]
    B -->|Complex command| E[BashTool]
    C --> F[FileReadTool reads]
    D --> F
    E --> G[Parse results]
    G --> F
    F --> H[FileEditTool edits]
    H --> I{Verify}
    I -->|Needs more context| D
    I -->|Done| J[End]

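4.1 The Standard Search → Read → Edit Workflow

The most typical code modification workflow follows three stages, "locate by search → read the content → perform the edit":

GlobTool locates files: when the user says "modify all test files", the model may first use GlobTool to search for **/*.test.ts and obtain the list of target files.
GrepTool pinpoints content: when the user says "find the function that handles timeout errors", the model uses GrepTool to search for "timeout.*error" or "handleTimeout" and gets the files that contain matches.
FileReadTool reads the full content: the model uses FileReadTool to read the files found in the previous step and obtain the full code context. This is a required step before editing: Claude Code's FileEditTool demands an exact old_string and new_string, so the model must know the current file content to produce an accurate replacement.
FileEditTool applies the change: after reading and understanding the code, the model calls FileEditTool to make a precise modification.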

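4.2 The Division of Labor with BashTool

GrepTool's prompt states a defensive constraint explicitly: "ALWAYS use Grep for search tasks. NEVER invoke grep or rg as a Bash command."

The constraint is there for a reason. If the model calls rg or grep directly through BashTool, it bypasses Claude Code's permission system, ignore patterns, timeout protection, and security policy. Every GrepTool call goes through the checkReadPermissionForTool check, which ensures that directories the user has explicitly placed off limits are not read.

BashTool is still useful in the following situations:

When certain advanced ripgrep options (such as the --pre preprocessor or --json structured output) are needed and GrepTool does not expose them
When a complex search pipeline combining several commands is needed (such as rg pattern | awk '{print $1}' | sort -u)
When binary files or other non-text content must be searched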

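4.3 Cooperation with AgentTool

When a search task needs several rounds of iteration (for example "find every use of a deprecated API and suggest changes"), a single GrepTool call may not be enough. In that case the prompt steers the model toward AgentTool:

"When you are doing an open ended search that may require multiple rounds of globbing and grepping, use the Agent tool instead."

AgentTool can spawn a sub-Agent that performs multiple rounds of searching and analysis in an isolated context and then returns a summary to the parent Agent. This keeps the main conversation context from being polluted by large volumes of intermediate search results.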

sequenceDiagram
    participant User
    participant MainAgent
    participant GlobTool
    participant GrepTool
    participant FileReadTool
    participant FileEditTool

    User->>MainAgent: Modify all files that use the deprecated API
    MainAgent->>GrepTool: Search for call sites of the deprecated API
    GrepTool-->>MainAgent: Return 12 matching files
    loop Each matching file
        MainAgent->>FileReadTool: Read file content
        FileReadTool-->>MainAgent: Return file text
        MainAgent->>FileEditTool: Replace the deprecated API with the new API
        FileEditTool-->>MainAgent: Return diff result
    end
    MainAgent-->>User: Changes complete, summarize the edits

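5. Deeper Architectural Considerations

5.1 Why ripgrep Rather Than a Native Implementation?

Claude Code uses ripgrep as its search backend instead of Node.js's native fs module or libraries such as fast-glob for the following reasons:

Performance: ripgrep is written in Rust and uses memory mapping, SIMD acceleration, and parallel traversal; on large codebases it is far faster than JavaScript implementations.
.gitignore support: ripgrep natively reads and applies .gitignore rules recursively, while few libraries in the Node.js ecosystem handle nested .gitignore files correctly.
Cross-platform consistency: ripgrep behaves very consistently across Windows, macOS, and Linux, whereas Node.js glob libraries often differ subtly between platforms in how they treat path separators, symbolic links, and similar details.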
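Regex engine: by default ripgrep uses Rust's regex engine, which guarantees linear-time matching but does not support look-around or backreferences. Full PCRE2 syntax is available through the -P option, provided the ripgrep binary was built with PCRE2 support. The default engine generally outperforms JavaScript's RegExp on large inputs.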

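5.2 Security Design

The security design of the search tools shows up at several levels:

UNC path protection: GlobTool's input validation skips UNC paths that begin with \\ or //, which prevents accidental NTLM credential leakage on Windows network paths.
Permission checks: both tools validate access through checkReadPermissionForTool, ensuring that the user's file access rules are respected.
Path relativization: search results are converted to relative paths by default, which avoids exposing sensitive absolute paths (such as /home/username/...) to the model context.
Sandbox mode: in embedded or sandboxed environments, ripgrep invocations are restricted further so that they cannot escape the working directory.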
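The UNC guard from the first item can be sketched in a few lines. isUncPath and shouldStatPath are hypothetical names; the point is only that the prefix check happens before any file-system call, since merely touching a UNC path on Windows can trigger a network authentication attempt.

```typescript
// Hypothetical check: treat anything starting with two backslashes or
// two forward slashes as a UNC path.
function isUncPath(p: string): boolean {
  return p.startsWith('\\\\') || p.startsWith('//')
}

// Sketch of a validator that decides whether it is safe to stat the path.
function shouldStatPath(p: string | undefined): boolean {
  if (p === undefined) return false // no path given: cwd is used, nothing to check
  return !isUncPath(p)
}

console.log(isUncPath('\\\\server\\share'), isUncPath('//server/share'), isUncPath('/home/user'))
console.log(shouldStatPath('/home/user/project'))
```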

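5.3 Context Budget Management

The design of the search tools revolves around one core constraint: the LLM's context window is finite. Claude Code applies several layers of defense:

maxResultSizeChars: GlobTool sets it to 100,000 characters (about 100KB) and GrepTool to 20KB (the latter being the tool-result persistence threshold).
Default truncation: GlobTool returns at most 100 files by default, and GrepTool at most 250 result lines.
Relative paths: all paths are relative to cwd, keeping strings as short as possible.
Truncation hints: the model is told explicitly that results were truncated, which steers it toward a more precise pattern or the paging parameters.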

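6. Summary

The design of GlobTool and GrepTool reflects Claude Code's engineering philosophy of "simple interface, complex implementation". Outwardly, they expose a minimal, semantic parameter interface to the LLM; inwardly, they wrap ripgrep's high-performance search, thorough error recovery, and fine-grained context budget management.

What makes the two tools work is that they do not simply wrap a shell command as an API; they start from the question of what kind of search capability an LLM actually needs. The tiered output modes (files_with_matches → content → count), the semantic parameter wrappers, the truncation and paging logic, and the explicit tool-routing hints (when to use GrepTool, when AgentTool, when BashTool) together form a search abstraction layer that is very friendly to AI Agents.

For developers who want to build similar Agent coding tools, the GlobTool and GrepTool source is an excellent reference for integrating a mature command-line tool (ripgrep) into an AI workflow, keeping all of its performance while giving it an interface that an LLM can understand and call safely.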