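Test Architecture Overview: The Three-Tier Test Pyramid and All 23 Test Files at a Glance

This is part 48 of the "GSD Panoramic Code Analysis" series.

1. GSD's Testing Philosophy

In an AI agent toolchain, tests are more than a net for catching bugs; they also enforce the architecture's contracts. GSD's test suite is built around one core belief: if code is hard to test, the architecture needs refactoring.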

graph TD
    A[Test pyramid] --> B["Unit tests<br/>70%"]
    A --> C["Integration tests<br/>20%"]
    A --> D["E2E tests<br/>10%"]
    B --> E[Fast feedback]
    C --> F[Component collaboration]
    D --> G[User journeys]

2. The Three-Tier Test Architecture

2.1 Unit Test Layer (tests/unit/)

Covers the pure-logic units of every core module:

tests/unit/
├── commands/
│   ├── new-project.test.js      # project initialization logic
│   ├── plan-phase.test.js       # planning-phase parsing
│   └── execute-phase.test.js    # execution-phase scheduling
├── parsers/
│   ├── plan-parser.test.js      # plan.json parsing
│   └── prompt-builder.test.js   # prompt construction
├── core/
│   ├── state-machine.test.js    # state machine transitions
│   ├── context-engine.test.js   # context management
│   └── agent-delegator.test.js  # agent delegation logic
└── utils/
    ├── git-utils.test.js        # Git operation utilities
    ├── file-utils.test.js       # file system utilities
    └── template-utils.test.js   # template processing utilities

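Design principles:

Each test file corresponds to one source file
Tests are grouped by feature with Jest's describe
External dependencies (file system, Git, LLM API) are mocked
Each individual test runs in under 100 ms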

2.2 Integration Test Layer (tests/integration/)

Verifies how components work together:

tests/integration/
├── workflow/
│   ├── execute-phase.integration.test.js
│   ├── plan-phase.integration.test.js
│   └── verify-work.integration.test.js
├── sdk/
│   ├── phase-runner.integration.test.js
│   ├── hook-system.integration.test.js
│   └── query-layer.integration.test.js
└── agent/
    ├── planner-executor.integration.test.js
    └── reviewer-verifier.integration.test.js

Areas of focus:

Data flow integrity (input → processing → output)
Error propagation paths
State persistence and recovery
Concurrency safety
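
To make "state persistence and recovery" concrete, the sketch below round-trips state through a real temporary directory instead of a mocked file system, which is the kind of check that belongs at this layer. The helper names (`saveState`, `loadState`, `roundTrip`), the `state.json` file name, and the state shape are all assumptions made for illustration.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical persistence helpers; file name and state shape are assumptions.
function saveState(dir, state) {
  const file = path.join(dir, 'state.json');
  const tmp = `${file}.tmp`;
  // Write to a temp file first, then rename, so an interrupted write
  // never leaves a half-written state file behind.
  fs.writeFileSync(tmp, JSON.stringify(state));
  fs.renameSync(tmp, file);
}

function loadState(dir) {
  const file = path.join(dir, 'state.json');
  // A missing file means a fresh project, so fall back to a default state.
  if (!fs.existsSync(file)) return { phase: 'idle', completed: [] };
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Save and reload against a real temporary directory, then clean up.
function roundTrip(state) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'gsd-state-'));
  try {
    saveState(dir, state);
    return loadState(dir);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

An integration test would assert that `roundTrip(state)` deep-equals the state it was given, and that `loadState` on an empty directory returns the default.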

2.3 E2E Test Layer (tests/e2e/)

Simulates complete user journeys:

tests/e2e/
├── project-lifecycle/
│   ├── new-project-to-deploy.test.js
│   ├── plan-to-execute.test.js
│   └── full-cycle-with-audit.test.js
├── cli/
│   ├── command-invocation.test.js
│   └── error-handling.test.js
└── regression/
    ├── issue-42-repro.test.js
    └── issue-78-repro.test.js
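
A CLI-level E2E test typically launches the tool as a separate process and checks what a user would actually see: the exit code, stdout, and stderr. The sketch below shows that shape. `runCli` is a hypothetical helper, and because the real CLI binary is not named here, a one-line Node script stands in for it.

```javascript
const { spawnSync } = require('node:child_process');

// Runs a command the way a user would and captures the observable result.
function runCli(command, args) {
  const result = spawnSync(command, args, { encoding: 'utf8', timeout: 10000 });
  return {
    code: result.status,
    stdout: result.stdout.trim(),
    stderr: result.stderr.trim(),
  };
}

// Stand-ins for real CLI invocations (assumption: the actual binary differs).
const ok = runCli(process.execPath, ['-e', 'console.log("phase planned")']);
const failed = runCli(process.execPath, [
  '-e',
  'console.error("no plan found"); process.exit(2)',
]);
```

A success-path test would then assert on `ok.code` and `ok.stdout`, while an error-handling test would check that `failed.code` is non-zero and that `failed.stderr` carries a useful message.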

3. The 23 Test Files at a Glance

#	File	Tier	Coverage	Priority
1	`plan-parser.test.js`	Unit	plan.json parsing	P0
2	`prompt-builder.test.js`	Unit	Dynamic prompt construction	P0
3	`state-machine.test.js`	Unit	State transition logic	P0
4	`context-engine.test.js`	Unit	Context truncation and injection	P0
5	`agent-delegator.test.js`	Unit	Agent selection and invocation	P0
6	`new-project.test.js`	Unit	Project initialization	P1
7	`plan-phase.test.js`	Unit	Planning-phase logic	P1
8	`execute-phase.test.js`	Unit	Execution-phase scheduling	P1
9	`git-utils.test.js`	Unit	Git operation wrappers	P1
10	`file-utils.test.js`	Unit	File read/write utilities	P1
11	`template-utils.test.js`	Unit	Template rendering	P2
12	`phase-runner.integration.test.js`	Integration	Phase execution engine	P0
13	`hook-system.integration.test.js`	Integration	Hook registration and triggering	P0
14	`query-layer.integration.test.js`	Integration	Query interface	P1
15	`planner-executor.integration.test.js`	Integration	Planner→Executor chain	P1
16	`reviewer-verifier.integration.test.js`	Integration	Reviewer→Verifier chain	P1
17	`new-project-to-deploy.test.js`	E2E	Full project lifecycle	P0
18	`plan-to-execute.test.js`	E2E	Plan-to-execute flow	P1
19	`full-cycle-with-audit.test.js`	E2E	Full cycle including audit	P1
20	`command-invocation.test.js`	E2E	CLI command invocation	P1
21	`error-handling.test.js`	E2E	Error recovery flow	P2
22	`drift-detection.test.js`	Special-purpose	Configuration drift detection	P1
23	`security-audit.test.js`	Special-purpose	Security rule validation	P1

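4. Test Strategy Selection Matrix

Scenario	Recommended tier	Reason
Adding a pure utility function	Unit	No side effects, easy to assert on
Changing an agent call chain	Integration	The collaboration logic must be verified
Adding a CLI command	E2E	Verifies the complete user interaction
Refactoring state management	Unit + integration	Both the logic and the persistence must be verified
Fixing a concurrency bug	Integration + stress	Concurrent scenarios must be simulated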

5. Mock Strategy

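// Shared fixtures for the mocks below. A jest.mock() factory may only
// reference outer variables whose names start with "mock".
const mockResponses = { plan: '{"tasks": []}' }; // placeholder data, keyed by prompt.type
const mockFs = {}; // in-memory stand-in for files on disk

// LLM API mock
jest.mock('../src/llm/client', () => ({
  invoke: jest.fn(async (prompt) => {
    // Return a deterministic response so tests are repeatable
    return { content: mockResponses[prompt.type] };
  })
}));

// File system mock
jest.mock('fs/promises', () => ({
  readFile: jest.fn(async (path) => mockFs[path]),
  writeFile: jest.fn(async (path, data) => { mockFs[path] = data; }),
  mkdir: jest.fn(async () => {})
}));

// Git mock
jest.mock('../src/utils/git', () => ({
  commit: jest.fn(async (msg) => ({ hash: 'abc123' })),
  getStatus: jest.fn(async () => ({ modified: [], untracked: [] }))
}));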

6. Coverage Targets

Unit tests:         statements 85% | branches 80% | functions 90% | lines 85%
Integration tests:  statements 60% | branches 50% | functions 70% | lines 60%
E2E tests:          100% coverage of critical user journeys
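
One way to enforce these numbers is Jest's `coverageThreshold` option, which fails the run when coverage drops below the configured percentages. The sketch below selects thresholds per tier. The `TEST_TIER` environment variable and the `testMatch` layout are assumptions made for illustration and are not taken from GSD's actual configuration.

```javascript
// jest.config.js (sketch). TEST_TIER is a hypothetical environment variable.
const thresholds = {
  unit: { statements: 85, branches: 80, functions: 90, lines: 85 },
  integration: { statements: 60, branches: 50, functions: 70, lines: 60 },
};

const tier = process.env.TEST_TIER || 'unit';

const config = {
  collectCoverage: true,
  // Only pick up test files belonging to the selected tier.
  testMatch: [`<rootDir>/tests/${tier}/**/*.test.js`],
};

// E2E has no percentage target, so no threshold is set for that tier.
if (thresholds[tier]) {
  config.coverageThreshold = { global: thresholds[tier] };
}

module.exports = config;
```

Keeping the thresholds in one table makes the targets easy to review alongside the code they gate.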

7. CI Integration

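# .github/workflows/test.yml
# Excerpt: the trigger configuration (on:) is omitted here
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        tier: [unit, integration, e2e]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npm test -- --testPathPattern=tests/${{ matrix.tier }}/
      - uses: codecov/codecov-action@v3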

Unit tests run on every commit, integration tests run hourly, and E2E tests run daily.
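
The cadence described above can be expressed as a small lookup from trigger to tiers, which a CI script could use to decide what to run. This is only a sketch: `tiersForTrigger` is a hypothetical helper, and the two cron strings are assumptions chosen to mean "hourly" and "daily". The event names mirror GitHub Actions' `push`, `pull_request`, and `schedule` events.

```javascript
// Cron strings are assumptions standing in for "hourly" and "daily".
const HOURLY = '0 * * * *';
const DAILY = '0 3 * * *';

// Maps a CI trigger to the test tiers that should run for it.
function tiersForTrigger(eventName, schedule) {
  if (eventName === 'push' || eventName === 'pull_request') return ['unit'];
  if (eventName === 'schedule' && schedule === HOURLY) return ['integration'];
  if (eventName === 'schedule' && schedule === DAILY) return ['e2e'];
  return []; // Unknown trigger: run nothing rather than guess.
}
```

Separating the decision from the workflow file keeps the cadence testable like any other pure function.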

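Coming up next: Part 49, "Key Tests Explained (Part 1): Plan Parser, State Machine, Agent Delegator"

We will take a close look at the 11 unit test files at the core of the test pyramid, covering their testing strategies, boundary-condition coverage, and mock design. Unit tests are the foundation of GSD's quality system, so stay tuned.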