AI Code Review Engine Design: Fusing AST Analysis with LLM Semantic Understanding

1. The "Manpower Ceiling" of Code Review: Rule-Engine Blind Spots and LLM Hallucinations

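Front-end teams hit an obvious Code Review bottleneck: rule engines (ESLint, SonarQube) catch only syntax and style problems and cannot understand business-logic defects, while human reviewers find the deeper problems but cannot keep pace with PR growth. More importantly, rule engines detect very few "code smells": ESLint may report zero warnings on a 500-line component, yet a human sees at a glance that its responsibilities are blurred and its state management is tangled.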

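In theory an LLM can understand code semantics, but asking an LLM to review code directly gives very unstable output: sometimes the advice is precise, and sometimes the model hallucinates and reports bugs that do not exist. Combining the determinism of AST static analysis with the semantic understanding of an LLM is what makes a code review solution usable in production.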

2. Hybrid Review Architecture: AST Determinism + LLM Semantics

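graph TB
    subgraph input["Input layer"]
        A["PR Diff"] --> B["AST parser"]
        A --> C["Change context<br/>files + dependencies"]
    end
    subgraph static["Static analysis layer"]
        B --> D["Structural rules<br/>complexity / dependencies / types"]
        D --> E["Deterministic alerts<br/>zero false positives"]
    end
    subgraph semantic["Semantic analysis layer"]
        C --> F["LLM review<br/>business logic / design patterns"]
        F --> G["Semantic suggestions<br/>need human confirmation"]
    end
    subgraph fusion["Fusion layer"]
        E --> H["Alert merging<br/>dedup + priority sorting"]
        G --> H
        H --> I["Review report<br/>deterministic alerts + semantic suggestions"]
    end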

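Core principle: AST analysis handles "deterministic" problems (complexity, dependency direction, type safety), and the LLM handles "semantic" problems (business-logic correctness, soundness of design patterns). The fusion layer deduplicates and sorts both result sets. Deterministic alerts are flagged directly, while semantic suggestions carry a confidence score for a human to judge.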

3. Implementing the Hybrid Review Engine

3.1 AST Static Analysis Module

import * as ts from 'typescript';

interface CodeIssue {
  type: 'structural' | 'semantic';
  severity: 'error' | 'warning' | 'info';
  message: string;
  location: { line: number; column: number };
  confidence: number; // 0-1; structural issues are always 1.0
}

class ASTAnalyzer {
  /** Analyze function complexity and useEffect usage in one pass. */
  analyzeComplexity(sourceFile: ts.SourceFile): CodeIssue[] {
    const issues: CodeIssue[] = [];

    const visit = (node: ts.Node) => {
      // Check the cyclomatic complexity of every function-like body
      if (
        ts.isFunctionDeclaration(node) ||
        ts.isFunctionExpression(node) ||
        ts.isArrowFunction(node) ||
        ts.isMethodDeclaration(node)
      ) {
        const complexity = this.calculateCyclomaticComplexity(node);
        if (complexity > 10) {
          issues.push({
            type: 'structural',
            severity: complexity > 20 ? 'error' : 'warning',
            message: `Cyclomatic complexity is ${complexity}; consider splitting the function`,
            location: this.getLocation(node, sourceFile),
            confidence: 1.0,
          });
        }
      }

      // Check the dependency argument of useEffect
      if (ts.isCallExpression(node)) {
        const expr = node.expression;
        if (ts.isIdentifier(expr) && expr.text === 'useEffect') {
          const deps = node.arguments[1];
          // Flag a missing second argument as well as one that is not
          // an inline array literal
          if (!deps || !ts.isArrayLiteralExpression(deps)) {
            issues.push({
              type: 'structural',
              severity: 'error',
              message: 'useEffect has no dependency array',
              location: this.getLocation(node, sourceFile),
              confidence: 1.0,
            });
          }
        }
      }

      ts.forEachChild(node, visit);
    };

    visit(sourceFile);
    return issues;
  }

  private calculateCyclomaticComplexity(node: ts.FunctionLike): number {
    let complexity = 1;
    const visit = (n: ts.Node) => {
      // Nested functions are measured on their own by analyzeComplexity,
      // so their branches are not counted against the enclosing function
      if (ts.isFunctionLike(n)) return;
      if (
        ts.isIfStatement(n) ||
        ts.isConditionalExpression(n) ||
        ts.isForStatement(n) ||
        ts.isForInStatement(n) ||
        ts.isForOfStatement(n) ||
        ts.isWhileStatement(n) ||
        ts.isDoStatement(n) ||
        ts.isCaseClause(n) ||
        ts.isCatchClause(n)
      ) {
        complexity++;
      }
      ts.forEachChild(n, visit);
    };
    ts.forEachChild(node, visit);
    return complexity;
  }

  private getLocation(node: ts.Node, sf: ts.SourceFile) {
    // Pass the source file explicitly so this also works when the tree
    // was created without parent pointers
    const { line, character } = sf.getLineAndCharacterOfPosition(
      node.getStart(sf)
    );
    return { line: line + 1, column: character + 1 };
  }
}

3.2 LLM Semantic Review Module

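interface LLMReviewConfig {
  model: string;
  maxTokens: number;
  temperature: number;
}

/** Sends a prompt to the model and resolves with the raw text reply. */
type LLMCaller = (prompt: string, config: LLMReviewConfig) => Promise<string>;

class LLMSemanticReviewer {
  // Assumption: the model call is injected, since no concrete API client
  // is specified here; any function returning the model's raw text works.
  constructor(
    private config: LLMReviewConfig,
    private callLLM: LLMCaller,
  ) {}

  async review(diff: string, context: string): Promise<CodeIssue[]> {
    const prompt = this.buildReviewPrompt(diff, context);
    const response = await this.callLLM(prompt, this.config);
    return this.parseResponse(response);
  }

  private buildReviewPrompt(diff: string, context: string): string {
    return `You are a front-end code review expert. Review the following code change, focusing on:
1. Business logic correctness: logic errors, missed edge cases
2. State management: inconsistent state, race conditions
3. Performance: unnecessary re-renders, memory leaks
4. Security: XSS, injection, etc.

Code change:
${diff}

Context:
${context}

Output a JSON array where each item contains:
- severity: error/warning/info
- message: a specific description of the problem
- line: line number
- confidence: confidence from 0 to 1

Report only problems you are sure exist. Do not guess.`;
  }

  private parseResponse(response: string): CodeIssue[] {
    try {
      const parsed: unknown = JSON.parse(response);
      if (!Array.isArray(parsed)) return [];
      return parsed.map((issue: any) => ({
        type: 'semantic' as const,
        severity: issue.severity,
        message: issue.message,
        location: { line: issue.line, column: 0 },
        // ?? rather than || so an explicit confidence of 0 is preserved
        confidence: issue.confidence ?? 0.7,
      }));
    } catch {
      // Unparseable output yields no suggestions instead of a failed review
      return [];
    }
  }
}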
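Models frequently surround the requested JSON with a sentence of prose or a Markdown fence, in which case a bare JSON.parse throws and the reviewer silently returns no suggestions. The following is a minimal sketch of a more tolerant parser, not part of the original design: `extractJsonArray` and `toIssues` are hypothetical helper names, and the 0.7 default mirrors the fallback confidence used in the reviewer.

```typescript
type Severity = 'error' | 'warning' | 'info';

interface SemanticIssue {
  type: 'semantic';
  severity: Severity;
  message: string;
  location: { line: number; column: number };
  confidence: number;
}

const SEVERITIES: readonly string[] = ['error', 'warning', 'info'];

/** Hypothetical helper: parse the outermost [...] span of a raw model reply. */
function extractJsonArray(raw: string): unknown[] {
  const start = raw.indexOf('[');
  const end = raw.lastIndexOf(']');
  if (start === -1 || end <= start) return [];
  try {
    const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

/** Hypothetical helper: keep well-formed items and drop everything else. */
function toIssues(raw: string): SemanticIssue[] {
  const issues: SemanticIssue[] = [];
  for (const item of extractJsonArray(raw)) {
    if (typeof item !== 'object' || item === null) continue;
    const { severity, message, line, confidence } = item as Record<string, unknown>;
    if (typeof severity !== 'string' || SEVERITIES.indexOf(severity) === -1) continue;
    if (typeof message !== 'string' || typeof line !== 'number') continue;
    issues.push({
      type: 'semantic',
      severity: severity as Severity,
      message,
      location: { line, column: 0 },
      // Clamp to [0, 1]; fall back to 0.7 when the model omits the field
      confidence:
        typeof confidence === 'number' ? Math.min(1, Math.max(0, confidence)) : 0.7,
    });
  }
  return issues;
}
```

Items with an unknown severity or a missing line number are dropped rather than repaired, which keeps malformed model output from reaching the fusion layer.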

3.3 Merging and Deduplication

const SEVERITY_ORDER = { error: 0, warning: 1, info: 2 } as const;

class ReviewMerger {
  merge(astIssues: CodeIssue[], llmIssues: CodeIssue[]): CodeIssue[] {
    const all = [...astIssues, ...llmIssues];

    // Deduplicate by location: keep only the highest-confidence issue per line
    const byLocation = new Map<string, CodeIssue>();
    for (const issue of all) {
      const key = `${issue.location.line}`;
      const existing = byLocation.get(key);
      if (!existing || issue.confidence > existing.confidence) {
        byLocation.set(key, issue);
      }
    }

    // Sort by severity first, then by confidence (highest first)
    return Array.from(byLocation.values()).sort((a, b) => {
      if (SEVERITY_ORDER[a.severity] !== SEVERITY_ORDER[b.severity]) {
        return SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity];
      }
      return b.confidence - a.confidence;
    });
  }
}
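One consequence of keying on the line number alone is worth seeing concretely: AST issues always carry confidence 1.0, so an LLM suggestion never survives on a line that also has a structural alert, even when the two describe unrelated problems. The sketch below is one possible variant rather than the article's method. It keys on line plus issue type so that one finding of each kind can coexist; `mergeKeepingBothKinds` and the `Issue` type are hypothetical names introduced for illustration.

```typescript
interface Issue {
  type: 'structural' | 'semantic';
  severity: 'error' | 'warning' | 'info';
  message: string;
  location: { line: number; column: number };
  confidence: number;
}

const SEVERITY_RANK = { error: 0, warning: 1, info: 2 } as const;

/** Hypothetical variant: dedupe per (line, type) instead of per line. */
function mergeKeepingBothKinds(astIssues: Issue[], llmIssues: Issue[]): Issue[] {
  const byKey = new Map<string, Issue>();
  for (const issue of [...astIssues, ...llmIssues]) {
    const key = `${issue.location.line}:${issue.type}`;
    const existing = byKey.get(key);
    if (!existing || issue.confidence > existing.confidence) {
      byKey.set(key, issue);
    }
  }
  // Same ordering as before: severity first, then confidence descending
  return Array.from(byKey.values()).sort(
    (a, b) =>
      SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
      b.confidence - a.confidence,
  );
}
```

The trade-off is a longer report: two semantic suggestions on the same line still collapse into one, but a structural alert no longer hides a semantic one.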

4. Trade-offs of Hybrid Review

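LLM review latency: an LLM call takes roughly 2-5 seconds, and a large PR may need more than 10 seconds, whereas AST analysis finishes within 100 ms. The solution is to run AST analysis synchronously and the LLM review asynchronously: return the deterministic alerts first, then append the semantic suggestions when they arrive.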
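A minimal sketch of that two-phase delivery, under stated assumptions: `runHybridReview`, `Finding`, and the `publish` callback are hypothetical names, and the callback stands in for whatever mechanism posts or updates the review comment. A failed LLM call is treated here as "no suggestions", which is a design choice of this sketch rather than something the article specifies.

```typescript
interface Finding {
  source: 'ast' | 'llm';
  message: string;
  line: number;
}

/** Called once with the AST results, then again with the complete set. */
type Publish = (findings: Finding[], complete: boolean) => void;

async function runHybridReview(
  analyzeAst: () => Finding[],
  reviewWithLlm: () => Promise<Finding[]>,
  publish: Publish,
): Promise<Finding[]> {
  // Start the slow LLM request first so it overlaps with AST analysis.
  // A failed request degrades to an empty list instead of failing the review.
  const pending = reviewWithLlm().catch((): Finding[] => []);

  // Phase 1: deterministic alerts are available almost immediately.
  const astFindings = analyzeAst();
  publish(astFindings, false);

  // Phase 2: append the semantic suggestions once the model has replied.
  const llmFindings = await pending;
  const all = [...astFindings, ...llmFindings];
  publish(all, true);
  return all;
}
```

Developers see the structural alerts within the AST time budget, and the report is updated in place when the semantic pass completes.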

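LLM hallucination control: an LLM may "find" bugs that do not exist. Prompt constraints ("report only problems you are sure exist") combined with a confidence threshold (suggestions below 0.6 are discarded automatically) can keep the hallucination rate under 10%. Hallucinations cannot be eliminated entirely, so a human confirmation step is still required.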
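The threshold filter can be sketched in a few lines. `dropLowConfidence` and `Suggestion` are hypothetical names, and the 0.6 default comes from the figure quoted above. Structural issues pass through untouched because their confidence is fixed at 1.0 by construction.

```typescript
interface Suggestion {
  type: 'structural' | 'semantic';
  message: string;
  confidence: number; // 0-1, self-reported by the model for semantic items
}

const MIN_SEMANTIC_CONFIDENCE = 0.6;

/** Hypothetical helper: discard semantic suggestions below the threshold. */
function dropLowConfidence<T extends Suggestion>(
  issues: T[],
  threshold: number = MIN_SEMANTIC_CONFIDENCE,
): T[] {
  return issues.filter(
    (issue) => issue.type === 'structural' || issue.confidence >= threshold,
  );
}
```

Because the score is reported by the model itself and may not be well calibrated, the threshold is best treated as a tunable setting checked against developer acceptance data rather than a fixed constant.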

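Cost control: an LLM review consumes roughly 2,000-5,000 tokens per PR. At GPT-4 pricing that is about $0.03-0.10 per review, and on a high-traffic project the monthly cost can reach several hundred dollars. Caching review results for similar code lowers the cost.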
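As a rough illustration of the caching idea, the sketch below reuses a previous result when two diffs are identical after whitespace normalization. That is a much narrower notion of "similar" than the text may intend, and it is an assumption of this sketch. `ReviewCache` and `normalizeDiff` are hypothetical names; a real deployment would likely hash the key, bound the cache size, and include the prompt version in the key, all of which are omitted here.

```typescript
/** Hypothetical helper: collapse whitespace so formatting-only changes share a key. */
function normalizeDiff(diff: string): string {
  return diff
    .split('\n')
    .map((line) => line.replace(/\s+/g, ' ').trim())
    .filter((line) => line.length > 0)
    .join('\n');
}

class ReviewCache<T> {
  private store = new Map<string, T>();
  hits = 0;
  misses = 0;

  /** Return the cached result for an equivalent diff, or run the review once. */
  async getOrReview(diff: string, review: (diff: string) => Promise<T>): Promise<T> {
    const key = normalizeDiff(diff);
    if (this.store.has(key)) {
      this.hits++;
      return this.store.get(key) as T;
    }
    this.misses++;
    const result = await review(diff);
    this.store.set(key, result);
    return result;
  }
}
```

Each cache hit avoids one LLM call, so the saving scales with how often the same change is re-reviewed, for example when a PR is rebased or re-pushed without touching the file.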

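Applicability: hybrid review suits medium and large front-end projects (more than 50 components), where PR review pressure is highest. For small or personal projects manual review is enough, and adding an LLM only increases complexity.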

5. Summary

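The core of AI code review is not "replacing humans with an LLM" but "AST for deterministic detection + LLM for semantic understanding + humans for the final call". AST analysis guarantees a baseline of quality with zero false positives, the LLM widens the semantic coverage of the review, and human confirmation filters out hallucinations and makes the business decisions.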

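Rollout steps: first build the AST analysis pipeline, covering deterministic rules such as complexity, dependency direction, and type safety; then introduce LLM review to handle semantic problems; finally build the fusion layer that merges both result sets and sorts them by priority. Monitor review accuracy and developer acceptance rate throughout, and keep refining the prompt and the rules.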
