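Chapter overview
The TOKEN_BUDGET feature lets users declare a target in the prompt, such as "+500k" or "use 2M tokens". utils/tokenBudget.ts parses the user's text. At the end of a turn, query/tokenBudget.ts uses the cumulative output tokens to decide whether to inject a nudge and continue automatically, or to stop and report tengu_token_budget_completed. This is a separate mechanism from the API task_budget (QueryParams.taskBudget). After this chapter you should be able to draw the checkTokenBudget decision tree and explain how it relates to compact.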
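After this chapter you should be able to
- Distinguish the TOKEN_BUDGET feature from the API task_budget
- Explain the three regexes in parseTokenBudget and the role of getBudgetContinuationMessage
- Apply checkTokenBudget's 90% threshold and its diminishing-returns rule
- Locate the budget-continuation continue site in query.ts
- Explain why a subagent (agentId) skips the budget check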
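Core concepts (read these first)
+500k is a turn-level output target, not a context window
getCurrentTurnTokenBudget() and getTurnOutputTokens() come from bootstrap/state and are maintained once the budget has been parsed from the user's input. checkTokenBudget compares the output tokens accumulated in the current agentic turn against the user's target. This is orthogonal to autocompact's context-window threshold; both can be in effect at the same time.
Diminishing returns prevent endless continuation
isDiminishing is true when continuationCount >= 3 and the two most recent deltas (the current one and tracker.lastDeltaTokens) are both below 500 tokens. In that case the check stops even below 90% and sets diminishingReturns: true, so the model does not spin near the target while consuming API calls.
The nudge is an isMeta user message
The continue branch does not end the turn. Instead, it appends createUserMessage({ content: nudgeMessage, isMeta: true }) to state.messages. The model sees "Stopped at N%… Keep working — do not summarize" and is pushed to keep producing output rather than wrap up.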
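Suggested study steps
- Read the parsing and continuation-message code in utils/tokenBudget (source block A)
- Read BudgetTracker and checkTokenBudget (source block B)
- Read the TOKEN_BUDGET gate and the continue site in query.ts (source block C)
- Cross-check with mod-services/compact to see that context compaction does not reset the turn output count
- grep tengu_token_budget_completed to see the analytics fields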
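Common pitfalls
Note
When agentId is present, checkTokenBudget stops immediately; a subagent should not inherit the main thread's +500k target.
Note
budget === null or budget <= 0 likewise disables the check (immediate stop).
Note
Do not confuse this with taskBudgetRemaining (API beta); the names are similar but the semantics differ.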
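The two "budget" mechanisms compared
Claude Code runs the following side by side (autocompact is listed for contrast):
| Mechanism | Entry point | Decision module | Purpose |
|---|---|---|---|
| User token target (+500k) | Parsed from the user prompt | query/tokenBudget.ts | Automatic continue nudge |
| API task_budget | QueryParams.taskBudget | claude.ts configureTaskBudgetParams | Server-side output budget |
| Context autocompact | Tokens approach the window | services/compact/autoCompact.ts | Compress history |
This chapter covers only the first row. The remaining task_budget is carried across compaction in query.ts through the local variable taskBudgetRemaining; see the config-deps chapter.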
Turn-end exit order (query.ts L1258+):
assistant finishes with no tool_use
  → [skip if api error] handleStopHooks
  → [if TOKEN_BUDGET] checkTokenBudget
      ├─ continue → meta nudge message → continue loop
      └─ stop → log tengu_token_budget_completed (only if completionEvent is set) → return Terminal
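Parsing user input: utils/tokenBudget.ts
parseTokenBudget supports three surface forms:
- Shorthand start: ^\s*\+(\d+(?:\.\d+)?)\s*(k|m|b)\b, for example +500k at the start of the message
- Shorthand end: +500k. at the end of the message (kept separate from the start form so the same +500k is not counted twice)
- Verbose: use 2M tokens / spend 1.5m tokens, matched anywhere in the text
MULTIPLIERS: k=1e3, m=1e6, b=1e9. findTokenBudgetPositions supplies the positions the UI uses to highlight budget fragments.
getBudgetContinuationMessage builds the continuation nudge: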
Stopped at {pct}% of token target ({turnTokens} / {budget}). Keep working — do not summarize.
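"do not summarize" deliberately keeps the model from writing a wrap-up summary and ending the turn early as it nears the target. Its meaning differs from compact summaries and agent summaries.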
Source reference: src/utils/tokenBudget.ts · lines 1–29 (of 74)
1| // Shorthand (+500k) anchored to start/end to avoid false positives in natural language.
2| // Verbose (use/spend 2M tokens) matches anywhere.
3| const SHORTHAND_START_RE = /^\s*\+(\d+(?:\.\d+)?)\s*(k|m|b)\b/i
4| // Lookbehind (?<=\s) is avoided — it defeats YARR JIT in JSC, and the
5| // interpreter scans O(n) even with the $ anchor. Capture the whitespace
6| // instead; callers offset match.index by 1 where position matters.
7| const SHORTHAND_END_RE = /\s\+(\d+(?:\.\d+)?)\s*(k|m|b)\s*[.!?]?\s*$/i
8| const VERBOSE_RE = /\b(?:use|spend)\s+(\d+(?:\.\d+)?)\s*(k|m|b)\s*tokens?\b/i
9| const VERBOSE_RE_G = new RegExp(VERBOSE_RE.source, 'gi')
10|
11| const MULTIPLIERS: Record<string, number> = {
12| k: 1_000,
13| m: 1_000_000,
14| b: 1_000_000_000,
15| }
16|
17| function parseBudgetMatch(value: string, suffix: string): number {
18| return parseFloat(value) * MULTIPLIERS[suffix.toLowerCase()]!
19| }
20|
21| export function parseTokenBudget(text: string): number | null {
22| const startMatch = text.match(SHORTHAND_START_RE)
23| if (startMatch) return parseBudgetMatch(startMatch[1]!, startMatch[2]!)
24| const endMatch = text.match(SHORTHAND_END_RE)
25| if (endMatch) return parseBudgetMatch(endMatch[1]!, endMatch[2]!)
26| const verboseMatch = text.match(VERBOSE_RE)
27| if (verboseMatch) return parseBudgetMatch(verboseMatch[1]!, verboseMatch[2]!)
28| return null
29| }
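To see the three surface forms in action, the sketch below copies the regexes and parser from the excerpt above into a standalone file and runs a few prompts through it. The sample prompts are made up for illustration and are not from the source.

```typescript
// Regexes and parser copied from the excerpt above so the file runs on its own.
const SHORTHAND_START_RE = /^\s*\+(\d+(?:\.\d+)?)\s*(k|m|b)\b/i
const SHORTHAND_END_RE = /\s\+(\d+(?:\.\d+)?)\s*(k|m|b)\s*[.!?]?\s*$/i
const VERBOSE_RE = /\b(?:use|spend)\s+(\d+(?:\.\d+)?)\s*(k|m|b)\s*tokens?\b/i

const MULTIPLIERS: Record<string, number> = {
  k: 1_000,
  m: 1_000_000,
  b: 1_000_000_000,
}

function parseBudgetMatch(value: string, suffix: string): number {
  return parseFloat(value) * MULTIPLIERS[suffix.toLowerCase()]!
}

// First match wins: start shorthand, then end shorthand, then verbose.
function parseTokenBudget(text: string): number | null {
  const startMatch = text.match(SHORTHAND_START_RE)
  if (startMatch) return parseBudgetMatch(startMatch[1]!, startMatch[2]!)
  const endMatch = text.match(SHORTHAND_END_RE)
  if (endMatch) return parseBudgetMatch(endMatch[1]!, endMatch[2]!)
  const verboseMatch = text.match(VERBOSE_RE)
  if (verboseMatch) return parseBudgetMatch(verboseMatch[1]!, verboseMatch[2]!)
  return null
}

// Made-up sample prompts, one per surface form plus two edge cases.
const samplePrompts: string[] = [
  '+500k refactor the parser',        // start shorthand → 500000
  'Refactor the parser +500k.',       // end shorthand   → 500000
  'Please use 2M tokens on this',     // verbose         → 2000000
  'spend 1.5m tokens exploring',      // verbose, decimal → 1500000
  '+1m do this, then use 2M tokens',  // start form takes precedence → 1000000
  'bump it +500k and then stop',      // shorthand mid-sentence → null
]

for (const prompt of samplePrompts) {
  console.log(JSON.stringify(prompt), '→', parseTokenBudget(prompt))
}
```

The last sample shows why the shorthand is anchored: a +500k in the middle of a sentence is not treated as a budget.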
Source reference: src/utils/tokenBudget.ts · lines 66–73 (of 74)
66| export function getBudgetContinuationMessage(
67| pct: number,
68| turnTokens: number,
69| budget: number,
70| ): string {
71| const fmt = (n: number): string => new Intl.NumberFormat('en-US').format(n)
72| return `Stopped at ${pct}% of token target (${fmt(turnTokens)} / ${fmt(budget)}). Keep working \u2014 do not summarize.`
73| }
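A small worked example may help connect the percentage to the message text. The function is copied from the excerpt above; the token counts are arbitrary sample values.

```typescript
// Copied from the excerpt above.
function getBudgetContinuationMessage(
  pct: number,
  turnTokens: number,
  budget: number,
): string {
  const fmt = (n: number): string => new Intl.NumberFormat('en-US').format(n)
  return `Stopped at ${pct}% of token target (${fmt(turnTokens)} / ${fmt(budget)}). Keep working \u2014 do not summarize.`
}

// Arbitrary sample values (not from the source).
const sampleBudget = 500_000
const sampleTurnTokens = 212_345

// Same rounding as checkTokenBudget: 212345 / 500000 * 100 = 42.469 → 42
const samplePct = Math.round((sampleTurnTokens / sampleBudget) * 100)

console.log(getBudgetContinuationMessage(samplePct, sampleTurnTokens, sampleBudget))
// → "Stopped at 42% of token target (212,345 / 500,000). Keep working \u2014 do not summarize."
```

Note that pct is rounded before it reaches the message, so the text can read 90% while turnTokens is still slightly below budget * 0.9.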
BudgetTracker and the checkTokenBudget decision tree
createBudgetTracker initializes:
- continuationCount: 0
- lastDeltaTokens / lastGlobalTurnTokens: 0, used for diminishing-returns detection
- startedAt: Date.now(), used later to compute completionEvent.durationMs
checkTokenBudget(tracker, agentId, budget, globalTurnTokens) flow:
if agentId || budget === null || budget <= 0 → stop (completionEvent: null)
turnTokens = globalTurnTokens
pct = round(turnTokens / budget * 100)
delta = globalTurnTokens - tracker.lastGlobalTurnTokens
isDiminishing = continuationCount >= 3
  && delta < 500 && tracker.lastDeltaTokens < 500
if !isDiminishing && turnTokens < budget * 0.9:
  → continue (increment tracker, return nudgeMessage)
if isDiminishing || continuationCount > 0:
  → stop with completionEvent (includes the diminishingReturns flag)
else → stop (completionEvent: null)
COMPLETION_THRESHOLD = 0.9: stopping is considered only once 90% of the target has been reached (unless diminishing).
DIMINISHING_THRESHOLD = 500: an output increase below this value between two checks counts as diminishing returns.
Source reference: src/query/tokenBudget.ts · lines 1–20 (of 94)
1| import { getBudgetContinuationMessage } from '../utils/tokenBudget.js'
2|
3| const COMPLETION_THRESHOLD = 0.9
4| const DIMINISHING_THRESHOLD = 500
5|
6| export type BudgetTracker = {
7| continuationCount: number
8| lastDeltaTokens: number
9| lastGlobalTurnTokens: number
10| startedAt: number
11| }
12|
13| export function createBudgetTracker(): BudgetTracker {
14| return {
15| continuationCount: 0,
16| lastDeltaTokens: 0,
17| lastGlobalTurnTokens: 0,
18| startedAt: Date.now(),
19| }
20| }
Source reference: src/query/tokenBudget.ts · lines 45–93 (of 94)
45| export function checkTokenBudget(
46| tracker: BudgetTracker,
47| agentId: string | undefined,
48| budget: number | null,
49| globalTurnTokens: number,
50| ): TokenBudgetDecision {
51| if (agentId || budget === null || budget <= 0) {
52| return { action: 'stop', completionEvent: null }
53| }
54|
55| const turnTokens = globalTurnTokens
56| const pct = Math.round((turnTokens / budget) * 100)
57| const deltaSinceLastCheck = globalTurnTokens - tracker.lastGlobalTurnTokens
58|
59| const isDiminishing =
60| tracker.continuationCount >= 3 &&
61| deltaSinceLastCheck < DIMINISHING_THRESHOLD &&
62| tracker.lastDeltaTokens < DIMINISHING_THRESHOLD
63|
64| if (!isDiminishing && turnTokens < budget * COMPLETION_THRESHOLD) {
65| tracker.continuationCount++
66| tracker.lastDeltaTokens = deltaSinceLastCheck
67| tracker.lastGlobalTurnTokens = globalTurnTokens
68| return {
69| action: 'continue',
70| nudgeMessage: getBudgetContinuationMessage(pct, turnTokens, budget),
71| continuationCount: tracker.continuationCount,
72| pct,
73| turnTokens,
74| budget,
75| }
76| }
77|
78| if (isDiminishing || tracker.continuationCount > 0) {
79| return {
80| action: 'stop',
81| completionEvent: {
82| continuationCount: tracker.continuationCount,
83| pct,
84| turnTokens,
85| budget,
86| diminishingReturns: isDiminishing,
87| durationMs: Date.now() - tracker.startedAt,
88| },
89| }
90| }
91|
92| return { action: 'stop', completionEvent: null }
93| }
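The decision tree lends itself to a table-driven check. The sketch below is a reduced restatement of checkTokenBudget that returns only a label for the branch taken; the Outcome type, classify, makeTracker and the case table are illustrative names, not part of the source. The thresholds and branch order follow the excerpt above.

```typescript
const COMPLETION_THRESHOLD = 0.9
const DIMINISHING_THRESHOLD = 500

type BudgetTracker = {
  continuationCount: number
  lastDeltaTokens: number
  lastGlobalTurnTokens: number
  startedAt: number
}

// Illustrative labels for the four possible results (not a type from the source).
type Outcome = 'continue' | 'stop, no event' | 'stop, event' | 'stop, event, diminishing'

// Same branch order as checkTokenBudget, returning a label instead of a decision object.
function classify(
  tracker: BudgetTracker,
  agentId: string | undefined,
  budget: number | null,
  globalTurnTokens: number,
): Outcome {
  if (agentId || budget === null || budget <= 0) return 'stop, no event'
  const delta = globalTurnTokens - tracker.lastGlobalTurnTokens
  const isDiminishing =
    tracker.continuationCount >= 3 &&
    delta < DIMINISHING_THRESHOLD &&
    tracker.lastDeltaTokens < DIMINISHING_THRESHOLD
  if (!isDiminishing && globalTurnTokens < budget * COMPLETION_THRESHOLD) {
    tracker.continuationCount++
    tracker.lastDeltaTokens = delta
    tracker.lastGlobalTurnTokens = globalTurnTokens
    return 'continue'
  }
  if (isDiminishing) return 'stop, event, diminishing'
  if (tracker.continuationCount > 0) return 'stop, event'
  return 'stop, no event'
}

function makeTracker(overrides: Partial<BudgetTracker>): BudgetTracker {
  return { continuationCount: 0, lastDeltaTokens: 0, lastGlobalTurnTokens: 0, startedAt: Date.now(), ...overrides }
}

type Case = {
  label: string
  tracker: Partial<BudgetTracker>
  agentId?: string
  budget: number | null
  tokens: number
  expected: Outcome
}

const slowTracker = { continuationCount: 3, lastDeltaTokens: 100, lastGlobalTurnTokens: 200_000 }

const cases: Case[] = [
  { label: 'below 90%', tracker: {}, budget: 500_000, tokens: 400_000, expected: 'continue' },
  // 450k is exactly 90%, and the comparison is strict, so this stops.
  { label: 'exactly 90%', tracker: {}, budget: 500_000, tokens: 450_000, expected: 'stop, no event' },
  { label: 'over target after continuations', tracker: { continuationCount: 2, lastGlobalTurnTokens: 300_000, lastDeltaTokens: 150_000 }, budget: 500_000, tokens: 550_000, expected: 'stop, event' },
  { label: 'two small deltas', tracker: slowTracker, budget: 500_000, tokens: 200_100, expected: 'stop, event, diminishing' },
  { label: 'large delta keeps going', tracker: slowTracker, budget: 500_000, tokens: 205_000, expected: 'continue' },
  { label: 'subagent', tracker: {}, agentId: 'agent-1', budget: 500_000, tokens: 10, expected: 'stop, no event' },
  { label: 'no budget', tracker: {}, budget: null, tokens: 10, expected: 'stop, no event' },
]

const failedLabels = cases
  .filter(c => classify(makeTracker(c.tracker), c.agentId, c.budget, c.tokens) !== c.expected)
  .map(c => c.label)

console.log(failedLabels.length === 0 ? 'all cases pass' : failedLabels)
```

Two details are easy to miss: a stop on the very first check (continuationCount 0) emits no completion event, and a single large delta is enough to escape the diminishing rule even after three continuations.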
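query.ts integration: feature gate and continue site
At the queryLoop entry:
const budgetTracker = feature('TOKEN_BUDGET') ? createBudgetTracker() : null
This is enabled only when the bundle includes TOKEN_BUDGET; in external builds without the feature the whole block is eliminated.
After the stop hooks pass (L1308-1355):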
if (feature('TOKEN_BUDGET')) {
const decision = checkTokenBudget(
budgetTracker!,
toolUseContext.agentId,
getCurrentTurnTokenBudget(),
getTurnOutputTokens(),
)
if (decision.action === 'continue') {
incrementBudgetContinuationCount()
state = { ..., transition: { reason: 'token_budget_continuation' } }
continue
}
if (decision.completionEvent) {
logEvent('tengu_token_budget_completed', { ...decision.completionEvent, queryChainId, queryDepth })
}
}
return { reason: 'completed' }
incrementBudgetContinuationCount lives in bootstrap/state and exposes the continuation count to the UI or analytics.
Source reference: src/query.ts · line 280 (of 1730)
280| const budgetTracker = feature('TOKEN_BUDGET') ? createBudgetTracker() : null
Source reference: src/query.ts · lines 1308–1357 (of 1730)
1308| if (feature('TOKEN_BUDGET')) {
1309| const decision = checkTokenBudget(
1310| budgetTracker!,
1311| toolUseContext.agentId,
1312| getCurrentTurnTokenBudget(),
1313| getTurnOutputTokens(),
1314| )
1315|
1316| if (decision.action === 'continue') {
1317| incrementBudgetContinuationCount()
1318| logForDebugging(
1319| `Token budget continuation #${decision.continuationCount}: ${decision.pct}% (${decision.turnTokens.toLocaleString()} / ${decision.budget.toLocaleString()})`,
1320| )
1321| state = {
1322| messages: [
1323| ...messagesForQuery,
1324| ...assistantMessages,
1325| createUserMessage({
1326| content: decision.nudgeMessage,
1327| isMeta: true,
1328| }),
1329| ],
1330| toolUseContext,
1331| autoCompactTracking: tracking,
1332| maxOutputTokensRecoveryCount: 0,
1333| hasAttemptedReactiveCompact: false,
1334| maxOutputTokensOverride: undefined,
1335| pendingToolUseSummary: undefined,
1336| stopHookActive: undefined,
1337| turnCount,
1338| transition: { reason: 'token_budget_continuation' },
1339| }
1340| continue
1341| }
1342|
1343| if (decision.completionEvent) {
1344| if (decision.completionEvent.diminishingReturns) {
1345| logForDebugging(
1346| `Token budget early stop: diminishing returns at ${decision.completionEvent.pct}%`,
1347| )
1348| }
1349| logEvent('tengu_token_budget_completed', {
1350| ...decision.completionEvent,
1351| queryChainId: queryChainIdForAnalytics,
1352| queryDepth: queryTracking.depth,
1353| })
1354| }
1355| }
1356|
1357| return { reason: 'completed' }
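The message list built at the continue site can be illustrated in isolation. In the sketch below, SketchMessage, createUserMessageSketch and buildContinuationMessages are simplified stand-ins invented for this example; the real Message type and createUserMessage in the codebase carry more fields. Only the ordering (prior messages, then this iteration's assistant messages, then the meta nudge) is taken from the excerpt.

```typescript
// Hypothetical minimal message shape for illustration only.
type SketchMessage = {
  role: 'user' | 'assistant'
  content: string
  isMeta: boolean
}

// Stand-in for createUserMessage; assumes only content and isMeta matter here.
function createUserMessageSketch(opts: { content: string; isMeta?: boolean }): SketchMessage {
  return { role: 'user', content: opts.content, isMeta: opts.isMeta ?? false }
}

// Mirrors the messages array assembled in the continue branch.
function buildContinuationMessages(
  messagesForQuery: SketchMessage[],
  assistantMessages: SketchMessage[],
  nudgeMessage: string,
): SketchMessage[] {
  return [
    ...messagesForQuery,
    ...assistantMessages,
    createUserMessageSketch({ content: nudgeMessage, isMeta: true }),
  ]
}

const priorMessages: SketchMessage[] = [
  createUserMessageSketch({ content: '+500k audit every module' }),
]
const assistantOutput: SketchMessage[] = [
  { role: 'assistant', content: 'Audited modules A and B so far.', isMeta: false },
]

const nextMessages = buildContinuationMessages(
  priorMessages,
  assistantOutput,
  'Stopped at 12% of token target (60,000 / 500,000). Keep working \u2014 do not summarize.',
)

// The nudge is the last message, has role "user", and is flagged isMeta.
console.log(nextMessages.map(m => `${m.role}${m.isMeta ? ' (meta)' : ''}`))
```

Because the loop continues with this new state rather than returning, the nudge reaches the model as the next user-role message within the same turn.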
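Collaboration boundary with the compact service
Token budget and autocompact do not share a threshold, but they interact in sequence within a turn:
- At the start of each iteration, deps.autocompact may compress messages (mod-services/compact)
- Within the same iteration, callModel produces the assistant message and tool results
- At the end of the iteration, if there is no tool_use, control goes to the stop hooks and then to the token budget check
Key points:
- getTurnOutputTokens counts cumulative turn-level output tokens; compact does not reset this count
- compact changes the length of messages but not the user's +500k target
- snip/microcompact run before autocompact; they affect the autocompact threshold check but not the budget percentage
If the turn continues via token_budget_continuation, the next iteration still runs autocompact first, so a long task may alternate between continuing and compacting.
The reactive compact (413) path is independent of budget continue; see reactive_compact_retry in the transitions chapter.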
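The independence of the two counters can be shown with a toy simulation. Everything in it is hypothetical: compactSketch is a crude stand-in for the compact service, and the token sizes and context limit are arbitrary. The point is only that compaction shrinks the context while the turn-level output counter keeps growing.

```typescript
type SimMessage = { text: string; tokens: number }

// Hypothetical compaction: replace everything except the last message with a short summary.
function compactSketch(messages: SimMessage[]): SimMessage[] {
  if (messages.length <= 1) return messages
  const last = messages[messages.length - 1]!
  return [{ text: '[summary]', tokens: 50 }, last]
}

function contextTokens(messages: SimMessage[]): number {
  return messages.reduce((sum, m) => sum + m.tokens, 0)
}

type IterationLog = { compacted: boolean; context: number; pct: number }

// Each iteration: compact if the context is over the limit, then add model output.
function simulateTurn(budget: number, outputs: number[], contextLimit: number): IterationLog[] {
  let messages: SimMessage[] = [{ text: 'user prompt', tokens: 100 }]
  let turnOutputTokens = 0 // plays the role of getTurnOutputTokens()
  const log: IterationLog[] = []
  for (const out of outputs) {
    let compacted = false
    if (contextTokens(messages) > contextLimit) {
      messages = compactSketch(messages)
      compacted = true
    }
    messages = [...messages, { text: 'assistant output', tokens: out }]
    turnOutputTokens += out // never reset by compaction
    log.push({
      compacted,
      context: contextTokens(messages),
      pct: Math.round((turnOutputTokens / budget) * 100),
    })
  }
  return log
}

const simTrace = simulateTurn(10_000, [3_000, 3_000, 3_000], 5_000)
console.log(simTrace)
// → [ { compacted: false, context: 3100, pct: 30 },
//     { compacted: false, context: 6100, pct: 60 },
//     { compacted: true,  context: 6050, pct: 90 } ]
```

In the third iteration the context drops from 6,100 to 3,050 tokens before the new output is added, yet the budget percentage still moves from 60 to 90.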
Source reference: src/query.ts · lines 449–468 (of 1730)
449| const fullSystemPrompt = asSystemPrompt(
450| appendSystemContext(systemPrompt, systemContext),
451| )
452|
453| queryCheckpoint('query_autocompact_start')
454| const { compactionResult, consecutiveFailures } = await deps.autocompact(
455| messagesForQuery,
456| toolUseContext,
457| {
458| systemPrompt,
459| userContext,
460| systemContext,
461| toolUseContext,
462| forkContextMessages: messagesForQuery,
463| },
464| querySource,
465| tracking,
466| snipTokensFreed,
467| )
468| queryCheckpoint('query_autocompact_end')
Source reference: src/services/compact/autoCompact.ts · lines 62–91 (of 352)
62| export const AUTOCOMPACT_BUFFER_TOKENS = 13_000
63| export const WARNING_THRESHOLD_BUFFER_TOKENS = 20_000
64| export const ERROR_THRESHOLD_BUFFER_TOKENS = 20_000
65| export const MANUAL_COMPACT_BUFFER_TOKENS = 3_000
66|
67| // Stop trying autocompact after this many consecutive failures.
68| // BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures (up to 3,272)
69| // in a single session, wasting ~250K API calls/day globally.
70| const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3
71|
72| export function getAutoCompactThreshold(model: string): number {
73| const effectiveContextWindow = getEffectiveContextWindowSize(model)
74|
75| const autocompactThreshold =
76| effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS
77|
78| // Override for easier testing of autocompact
79| const envPercent = process.env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE
80| if (envPercent) {
81| const parsed = parseFloat(envPercent)
82| if (!isNaN(parsed) && parsed > 0 && parsed <= 100) {
83| const percentageThreshold = Math.floor(
84| effectiveContextWindow * (parsed / 100),
85| )
86| return Math.min(percentageThreshold, autocompactThreshold)
87| }
88| }
89|
90| return autocompactThreshold
91| }
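For comparison with the 90% budget rule, here is the threshold arithmetic from the excerpt above as a pure function. Two things are assumptions made for the sketch: the window size and override are passed as parameters instead of being read from the model and process.env, and the 200,000-token window is just a sample value.

```typescript
const AUTOCOMPACT_BUFFER_TOKENS = 13_000

// Pure variant of getAutoCompactThreshold. The real function derives the window
// from the model and reads CLAUDE_AUTOCOMPACT_PCT_OVERRIDE from process.env.
function autoCompactThresholdSketch(effectiveContextWindow: number, envPercent?: string): number {
  const autocompactThreshold = effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS
  if (envPercent) {
    const parsed = parseFloat(envPercent)
    if (!isNaN(parsed) && parsed > 0 && parsed <= 100) {
      const percentageThreshold = Math.floor(effectiveContextWindow * (parsed / 100))
      // The override can only lower the threshold, never raise it past the buffer.
      return Math.min(percentageThreshold, autocompactThreshold)
    }
  }
  return autocompactThreshold
}

const sampleWindow = 200_000 // sample value, not a documented window size

console.log(autoCompactThresholdSketch(sampleWindow))        // 187000 (window minus fixed buffer)
console.log(autoCompactThresholdSketch(sampleWindow, '50'))  // 100000 (override lowers it)
console.log(autoCompactThresholdSketch(sampleWindow, '99'))  // 187000 (capped by the buffer rule)
console.log(autoCompactThresholdSketch(sampleWindow, 'abc')) // 187000 (invalid override ignored)

// Budget side, for contrast: a relative threshold on output tokens.
const budgetStopPoint = 500_000 * 0.9
console.log(budgetStopPoint) // 450000
```

The autocompact threshold is an absolute buffer subtracted from the context window, while the budget threshold is a fraction of a user-chosen output target; the two numbers measure different quantities.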
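Budget state in bootstrap/state
query/tokenBudget.ts reads neither env nor settings; the budget values come from bootstrap/state:
- getCurrentTurnTokenBudget(): the user's declared target, or null if none was declared
- getTurnOutputTokens(): cumulative output for the current turn
- incrementBudgetContinuationCount(): +1 on each continue
Parsing happens when the user message enters the REPL (the prompt pipeline calls parseTokenBudget from utils/tokenBudget). The query module only consumes the parsed result.
Subagent skip: when toolUseContext.agentId is passed to checkTokenBudget, the first branch returns stop. A subagent has its own turn semantics and should not keep continuing toward the main thread's +500k target.
Testing advice: test checkTokenBudget directly with table-driven cases. Integration tests need to mock getCurrentTurnTokenBudget and getTurnOutputTokens, or run end to end with a "+100k" prompt.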
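For the integration-test route, one option is a small in-memory fake of the three state accessors. The sketch below is an assumption about how such a fake might look, not the actual bootstrap/state implementation; addOutputTokens and getContinuationCount are hypothetical helpers added so a test can drive and inspect the state.

```typescript
// In-memory stand-in for the budget-related parts of bootstrap/state.
function createFakeBudgetState(budget: number | null) {
  let turnOutputTokens = 0
  let continuationCount = 0
  return {
    // Accessor names match the ones query.ts calls.
    getCurrentTurnTokenBudget: (): number | null => budget,
    getTurnOutputTokens: (): number => turnOutputTokens,
    incrementBudgetContinuationCount: (): void => {
      continuationCount++
    },
    // Hypothetical helpers for tests only.
    addOutputTokens: (n: number): void => {
      turnOutputTokens += n
    },
    getContinuationCount: (): number => continuationCount,
  }
}

// Drive the fake the way a turn would: output, continue, more output.
const fakeState = createFakeBudgetState(100_000)
fakeState.addOutputTokens(40_000)
fakeState.incrementBudgetContinuationCount()
fakeState.addOutputTokens(55_000)

const fakeBudget = fakeState.getCurrentTurnTokenBudget()
const fakePct =
  fakeBudget === null
    ? null
    : Math.round((fakeState.getTurnOutputTokens() / fakeBudget) * 100)

console.log(fakeState.getTurnOutputTokens(), fakePct, fakeState.getContinuationCount())
// → 95000 95 1
```

A fake like this keeps the test focused on the decision logic: the values fed to checkTokenBudget are fully under the test's control, and a null budget can be exercised by constructing the fake with null.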
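Analytics and debugging
continue path: logForDebugging prints Token budget continuation #N: pct% (turn / budget).
stop path (with completionEvent):
- when diminishingReturns === true, an extra debug line: Token budget early stop: diminishing returns at pct%
- logEvent('tengu_token_budget_completed', { continuationCount, pct, turnTokens, budget, diminishingReturns, durationMs, queryChainId, queryDepth })
Troubleshooting "why didn't it continue":
- Is feature('TOKEN_BUDGET') compiled into the bundle?
- Is getCurrentTurnTokenBudget() null?
- Is this a subagent (agentId)?
- Has the turn reached 90%, or is it diminishing?
- Did a stop hook set preventContinuation? (the budget check runs after the stop hooks)
Troubleshooting "why does it continue forever": inspect continuationCount and the deltas. The diminishing rule should stop the turn once continuationCount >= 3 and two consecutive deltas are below 500 tokens.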
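The checklist above can be turned into a small diagnostic helper. This is a hypothetical debugging aid, not something in the codebase: BudgetSnapshot and explainNoContinuation are invented names, and the snapshot fields are values you would collect by hand from logs or a debugger. The thresholds (0.9 and 500) are the constants from query/tokenBudget.ts.

```typescript
// Hypothetical snapshot of the values the checklist asks about.
type BudgetSnapshot = {
  featureEnabled: boolean
  stopHookPreventedContinuation: boolean
  agentId: string | undefined
  budget: number | null
  turnTokens: number
  continuationCount: number
  lastDeltaTokens: number
  lastGlobalTurnTokens: number
}

// Returns the first reason, in evaluation order, that the turn did not continue.
function explainNoContinuation(s: BudgetSnapshot): string {
  if (!s.featureEnabled) return 'TOKEN_BUDGET is not compiled into this bundle'
  if (s.stopHookPreventedContinuation) return 'a stop hook prevented continuation before the budget check'
  if (s.agentId) return 'subagent: the budget check is skipped'
  if (s.budget === null || s.budget <= 0) return 'no budget was declared for this turn'
  const delta = s.turnTokens - s.lastGlobalTurnTokens
  const isDiminishing = s.continuationCount >= 3 && delta < 500 && s.lastDeltaTokens < 500
  if (isDiminishing) return 'diminishing returns: two consecutive deltas below 500 tokens'
  if (s.turnTokens >= s.budget * 0.9) return 'the turn reached 90% of the target'
  return 'no blocking condition found: checkTokenBudget would continue'
}

const baseSnapshot: BudgetSnapshot = {
  featureEnabled: true,
  stopHookPreventedContinuation: false,
  agentId: undefined,
  budget: 500_000,
  turnTokens: 120_000,
  continuationCount: 1,
  lastDeltaTokens: 60_000,
  lastGlobalTurnTokens: 60_000,
}

console.log(explainNoContinuation(baseSnapshot))
console.log(explainNoContinuation({ ...baseSnapshot, agentId: 'agent-1' }))
console.log(explainNoContinuation({ ...baseSnapshot, turnTokens: 470_000 }))
console.log(
  explainNoContinuation({
    ...baseSnapshot,
    continuationCount: 4,
    lastDeltaTokens: 120,
    lastGlobalTurnTokens: 119_900,
  }),
)
```

The same snapshot also answers the opposite question: if the helper reports no blocking condition on every check, either the deltas stay above 500 tokens or continuationCount has not yet reached 3.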
Source reference: src/query.ts · lines 1316–1354 (of 1730); these lines are already listed in full in the lines 1308–1357 excerpt under the query.ts integration section.
Source directory (this topic)
The parsing logic is in utils/tokenBudget.ts (utils module); the integration point is in query.ts.
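Hands-on exercises
- Write a table-driven test: turnTokens=400k, budget=500k → continue; 450k (exactly 90%, and the comparison is strict) and 550k → stop
- Simulate continuationCount=3 with delta=100 and lastDeltaTokens < 500 → diminishing stop
- Read AUTOCOMPACT_BUFFER_TOKENS in mod-services/compact and compare its semantics with the 90% budget threshold
- Draw the turn-end DAG in query.ts with its four exits: api error / stop hook / token budget / completed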
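Chapter summary and further reading
tokenBudget is the automatic-continuation controller for a user-declared target. Next, read the stop-hooks chapter for the complete turn-end exit order.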