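Chapter overview
services/compact/ handles conversations that approach the context window: it forks a sub-agent to generate a summary, inserts a SystemCompactBoundaryMessage, and re-injects file, skills, and MCP delta attachments. autoCompact.ts decides when to trigger; compact.ts runs the main compactConversation flow. By the end of this chapter you should be able to trace the REPL's "Compacting…" state back to autoCompactIfNeeded and the PreCompact hook.
After this chapter you should be able to
- Explain what getAutoCompactThreshold and AUTOCOMPACT_BUFFER_TOKENS mean
- Describe the recursion guard in shouldAutoCompact (the compact / session_memory querySource values)
- Describe compactConversation's forked summary and the post-compact attachment rebuild
- Understand the circuit breaker (MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES)
- Locate truncateHeadForPTLRetry on the prompt-too-long path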
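Core concepts (read these first)
The effective window reserves room for summary output
getEffectiveContextWindowSize = contextWindow - min(maxOutput, 20000). Compaction is itself an API call, so tokens must be reserved for the summary output; the p99.99 summary is about 17387 tokens. The CLAUDE_CODE_AUTO_COMPACT_WINDOW env variable can artificially shrink the window to make testing easier.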
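To make the formula concrete, here is a minimal sketch of the arithmetic. The function name, the constant name, and the sample window sizes are illustrative assumptions; only the `contextWindow - min(maxOutput, 20000)` shape comes from the text above.

```typescript
// Hypothetical constant name; the 20_000 cap is the value quoted in the formula.
const MAX_SUMMARY_OUTPUT_RESERVE = 20_000

// Sketch of the formula: contextWindow - min(maxOutput, 20000).
function effectiveContextWindow(contextWindow: number, maxOutput: number): number {
  return contextWindow - Math.min(maxOutput, MAX_SUMMARY_OUTPUT_RESERVE)
}

// Illustrative numbers only, not real model limits.
// Large max output: the reserve is capped at 20_000, leaving 180_000.
const cappedReserve = effectiveContextWindow(200_000, 32_000)
// Small max output: the reserve equals maxOutput, leaving 192_000.
const uncappedReserve = effectiveContextWindow(200_000, 8_000)
```

The cap means a model with a very large output limit does not give up more than 20000 tokens of input headroom for the summary.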
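compact and context collapse are mutually exclusive
When the CONTEXT_COLLAPSE feature is on and isContextCollapseEnabled() returns true, shouldAutoCompact returns false, so autocompact does not jump in between collapse's 90% and 95% thresholds. reactiveCompact can still act as the fallback for API 413 responses, because it reads isAutoCompactEnabled directly rather than shouldAutoCompact.
Post-compact is not a simple replacement of messages
buildPostCompactMessages emits, in order: boundary → summaryMessages → messagesToKeep → attachments → hookResults. annotateBoundaryWithPreservedSegment writes preservedSegment metadata that the session JSONL loader uses to repair the parentUuid chain. After compaction, readFileState is cleared and up to 5 files are re-attached within the token budget.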
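Suggested study steps
- Read source block A: threshold constants and getAutoCompactThreshold
- Read source block B: the shouldAutoCompact guards
- Read source block C: the autoCompactIfNeeded main flow
- Read source block D: the compactConversation entry point and hooks
- Read source block E: buildPostCompactMessages and boundary metadata
- Read source block F: stripImagesFromMessages
- Open services/compact/ in the source tree and cross-check the line numbers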
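Common pitfalls
Note: manual /compact and autocompact share compactConversation, but they pass different values for suppressFollowUpQuestions.
Note: a successful session memory compact must also call notifyCompaction; otherwise prompt cache break detection reports false positives.
Note: DISABLE_COMPACT turns off all compaction; DISABLE_AUTO_COMPACT turns off only the automatic path and keeps manual /compact working.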
Where it sits in the architecture
Where compact hooks into the query loop:
End of each assistant turn → tokenCountWithEstimation
→ autoCompactIfNeeded (autoCompact.ts)
→ [optional] trySessionMemoryCompaction
→ compactConversation (compact.ts)
→ buildPostCompactMessages → setMessages replaces the transcript
→ runPostCompactCleanup / notifyCompaction / markPostCompaction
The compact/ subdirectory also contains variants such as reactiveCompact.ts (truncation after a 413 response), microCompact.ts, snipCompact.ts, and sessionMemoryCompact.ts. This chapter focuses on the main path through autoCompact.ts and compact.ts.
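Threshold system: buffer and warning
autoCompact.ts exports several groups of token constants:
| Constant | Value | Purpose |
|---|---|---|
| AUTOCOMPACT_BUFFER_TOKENS | 13000 | Autocompact trigger buffer below the top of the effective window |
| WARNING_THRESHOLD_BUFFER_TOKENS | 20000 | UI yellow warning |
| ERROR_THRESHOLD_BUFFER_TOKENS | 20000 | UI red warning |
| MANUAL_COMPACT_BUFFER_TOKENS | 3000 | Blocking limit below which manual compact is still available |
getAutoCompactThreshold = effectiveWindow - AUTOCOMPACT_BUFFER_TOKENS. CLAUDE_AUTOCOMPACT_PCT_OVERRIDE can lower the threshold to a percentage of the effective window (for testing); it never raises it, because the result is capped at the default threshold with Math.min.
calculateTokenWarningState returns percentLeft, the isAbove* flag for each level, and isAtBlockingLimit (effectiveWindow - 3000 by default; CLAUDE_CODE_BLOCKING_LIMIT_OVERRIDE can override it). The REPL progress bar reads this structure.
Source: src/services/compact/autoCompact.ts · lines 62–91 (of 352)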
62| export const AUTOCOMPACT_BUFFER_TOKENS = 13_000
63| export const WARNING_THRESHOLD_BUFFER_TOKENS = 20_000
64| export const ERROR_THRESHOLD_BUFFER_TOKENS = 20_000
65| export const MANUAL_COMPACT_BUFFER_TOKENS = 3_000
66|
67| // Stop trying autocompact after this many consecutive failures.
68| // BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures (up to 3,272)
69| // in a single session, wasting ~250K API calls/day globally.
70| const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3
71|
72| export function getAutoCompactThreshold(model: string): number {
73| const effectiveContextWindow = getEffectiveContextWindowSize(model)
74|
75| const autocompactThreshold =
76| effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS
77|
78| // Override for easier testing of autocompact
79| const envPercent = process.env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE
80| if (envPercent) {
81| const parsed = parseFloat(envPercent)
82| if (!isNaN(parsed) && parsed > 0 && parsed <= 100) {
83| const percentageThreshold = Math.floor(
84| effectiveContextWindow * (parsed / 100),
85| )
86| return Math.min(percentageThreshold, autocompactThreshold)
87| }
88| }
89|
90| return autocompactThreshold
91| }
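The override logic above is easy to misread as "set the threshold to N%". The sketch below restates it as a pure function so the Math.min cap is visible. The function name and the 180000 effective window are assumptions for illustration; the real function reads the model and process.env itself.

```typescript
const AUTOCOMPACT_BUFFER_TOKENS = 13_000

// Pure restatement of the threshold logic: the env value is passed in as a
// string (as it would arrive from the environment) instead of being read here.
function autoCompactThreshold(effectiveWindow: number, pctOverride?: string): number {
  const base = effectiveWindow - AUTOCOMPACT_BUFFER_TOKENS
  if (pctOverride) {
    const parsed = parseFloat(pctOverride)
    if (!isNaN(parsed) && parsed > 0 && parsed <= 100) {
      // The override can only lower the threshold, never raise it.
      return Math.min(Math.floor(effectiveWindow * (parsed / 100)), base)
    }
  }
  return base
}

const effectiveWindow = 180_000 // illustrative value
const defaultThreshold = autoCompactThreshold(effectiveWindow) // 167_000
const halfWindow = autoCompactThreshold(effectiveWindow, '50') // 90_000
const tooHigh = autoCompactThreshold(effectiveWindow, '99') // capped at 167_000
const invalid = autoCompactThreshold(effectiveWindow, 'abc') // falls back to 167_000
```

Setting the override to 50 therefore triggers autocompact at half the effective window, while values near 100 have no effect.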
Source: src/services/compact/autoCompact.ts · lines 93–145 (of 352)
93| export function calculateTokenWarningState(
94| tokenUsage: number,
95| model: string,
96| ): {
97| percentLeft: number
98| isAboveWarningThreshold: boolean
99| isAboveErrorThreshold: boolean
100| isAboveAutoCompactThreshold: boolean
101| isAtBlockingLimit: boolean
102| } {
103| const autoCompactThreshold = getAutoCompactThreshold(model)
104| const threshold = isAutoCompactEnabled()
105| ? autoCompactThreshold
106| : getEffectiveContextWindowSize(model)
107|
108| const percentLeft = Math.max(
109| 0,
110| Math.round(((threshold - tokenUsage) / threshold) * 100),
111| )
112|
113| const warningThreshold = threshold - WARNING_THRESHOLD_BUFFER_TOKENS
114| const errorThreshold = threshold - ERROR_THRESHOLD_BUFFER_TOKENS
115|
116| const isAboveWarningThreshold = tokenUsage >= warningThreshold
117| const isAboveErrorThreshold = tokenUsage >= errorThreshold
118|
119| const isAboveAutoCompactThreshold =
120| isAutoCompactEnabled() && tokenUsage >= autoCompactThreshold
121|
122| const actualContextWindow = getEffectiveContextWindowSize(model)
123| const defaultBlockingLimit =
124| actualContextWindow - MANUAL_COMPACT_BUFFER_TOKENS
125|
126| // Allow override for testing
127| const blockingLimitOverride = process.env.CLAUDE_CODE_BLOCKING_LIMIT_OVERRIDE
128| const parsedOverride = blockingLimitOverride
129| ? parseInt(blockingLimitOverride, 10)
130| : NaN
131| const blockingLimit =
132| !isNaN(parsedOverride) && parsedOverride > 0
133| ? parsedOverride
134| : defaultBlockingLimit
135|
136| const isAtBlockingLimit = tokenUsage >= blockingLimit
137|
138| return {
139| percentLeft,
140| isAboveWarningThreshold,
141| isAboveErrorThreshold,
142| isAboveAutoCompactThreshold,
143| isAtBlockingLimit,
144| }
145| }
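A worked example helps when reading the flags. The sketch below copies the arithmetic of calculateTokenWarningState into a pure function; the function name, the parameters, and the 180000 effective window are assumptions, and the blocking-limit env override is left out.

```typescript
const AUTOCOMPACT_BUFFER_TOKENS = 13_000
const WARNING_THRESHOLD_BUFFER_TOKENS = 20_000
const ERROR_THRESHOLD_BUFFER_TOKENS = 20_000
const MANUAL_COMPACT_BUFFER_TOKENS = 3_000

interface WarningState {
  percentLeft: number
  isAboveWarningThreshold: boolean
  isAboveErrorThreshold: boolean
  isAboveAutoCompactThreshold: boolean
  isAtBlockingLimit: boolean
}

// Same arithmetic as the source, with the inputs made explicit.
function warningState(
  tokenUsage: number,
  effectiveWindow: number,
  autoCompactEnabled: boolean,
): WarningState {
  const autoCompactThreshold = effectiveWindow - AUTOCOMPACT_BUFFER_TOKENS
  const threshold = autoCompactEnabled ? autoCompactThreshold : effectiveWindow
  return {
    percentLeft: Math.max(0, Math.round(((threshold - tokenUsage) / threshold) * 100)),
    isAboveWarningThreshold: tokenUsage >= threshold - WARNING_THRESHOLD_BUFFER_TOKENS,
    isAboveErrorThreshold: tokenUsage >= threshold - ERROR_THRESHOLD_BUFFER_TOKENS,
    isAboveAutoCompactThreshold: autoCompactEnabled && tokenUsage >= autoCompactThreshold,
    isAtBlockingLimit: tokenUsage >= effectiveWindow - MANUAL_COMPACT_BUFFER_TOKENS,
  }
}

// 150_000 tokens used in a 180_000 effective window, autocompact on:
// threshold 167_000, warning and error both at 147_000, blocking at 177_000.
const enabledState = warningState(150_000, 180_000, true)
// Same usage with autocompact off: the percentage is measured against 180_000.
const disabledState = warningState(150_000, 180_000, false)
```

Because the warning and error buffers have the same value in this excerpt, the two flags flip at the same token count; percentLeft is measured against the autocompact threshold when autocompact is enabled, not against the full window.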
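shouldAutoCompact: when not to compact
shouldAutoCompact runs through several guards before it counts tokens:
- querySource === 'session_memory' | 'compact': recursion guard, because these forked sub-agents would deadlock
- marble_origami (the ctx-agent) is excluded under CONTEXT_COLLAPSE, so that resetContextCollapse does not destroy the main thread's log
- isAutoCompactEnabled(): global config plus the DISABLE_COMPACT / DISABLE_AUTO_COMPACT env variables
- REACTIVE_COMPACT + GrowthBook tengu_cobalt_raccoon: suppresses proactive autocompact and leaves the 413 to reactive compact
- When CONTEXT_COLLAPSE is enabled, autocompact is skipped entirely
The token count is tokenCountWithEstimation(messages) - snipTokensFreed: snip has already removed messages, but the surviving assistant usage still reflects the old context, so the rough delta must be subtracted.
Debug log format: autocompact: tokens=… threshold=… effectiveWindow=… (plus snipFreed=… when snipTokensFreed > 0).
Source: src/services/compact/autoCompact.ts · lines 147–239 (of 352)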
147| export function isAutoCompactEnabled(): boolean {
148| if (isEnvTruthy(process.env.DISABLE_COMPACT)) {
149| return false
150| }
151| // Allow disabling just auto-compact (keeps manual /compact working)
152| if (isEnvTruthy(process.env.DISABLE_AUTO_COMPACT)) {
153| return false
154| }
155| // Check if user has disabled auto-compact in their settings
156| const userConfig = getGlobalConfig()
157| return userConfig.autoCompactEnabled
158| }
159|
160| export async function shouldAutoCompact(
161| messages: Message[],
162| model: string,
163| querySource?: QuerySource,
164| // Snip removes messages but the surviving assistant's usage still reflects
165| // pre-snip context, so tokenCountWithEstimation can't see the savings.
166| // Subtract the rough-delta that snip already computed.
167| snipTokensFreed = 0,
168| ): Promise<boolean> {
169| // Recursion guards. session_memory and compact are forked agents that
170| // would deadlock.
171| if (querySource === 'session_memory' || querySource === 'compact') {
172| return false
173| }
174| // marble_origami is the ctx-agent — if ITS context blows up and
175| // autocompact fires, runPostCompactCleanup calls resetContextCollapse()
176| // which destroys the MAIN thread's committed log (module-level state
177| // shared across forks). Inside feature() so the string DCEs from
178| // external builds (it's in excluded-strings.txt).
179| if (feature('CONTEXT_COLLAPSE')) {
180| if (querySource === 'marble_origami') {
181| return false
182| }
183| }
184|
185| if (!isAutoCompactEnabled()) {
186| return false
187| }
188|
189| // Reactive-only mode: suppress proactive autocompact, let reactive compact
190| // catch the API's prompt-too-long. feature() wrapper keeps the flag string
191| // out of external builds (REACTIVE_COMPACT is ant-only).
192| // Note: returning false here also means autoCompactIfNeeded never reaches
193| // trySessionMemoryCompaction in the query loop — the /compact call site
194| // still tries session memory first. Revisit if reactive-only graduates.
195| if (feature('REACTIVE_COMPACT')) {
196| if (getFeatureValue_CACHED_MAY_BE_STALE('tengu_cobalt_raccoon', false)) {
197| return false
198| }
199| }
200|
201| // Context-collapse mode: same suppression. Collapse IS the context
202| // management system when it's on — the 90% commit / 95% blocking-spawn
203| // flow owns the headroom problem. Autocompact firing at effective-13k
204| // (~93% of effective) sits right between collapse's commit-start (90%)
205| // and blocking (95%), so it would race collapse and usually win, nuking
206| // granular context that collapse was about to save. Gating here rather
207| // than in isAutoCompactEnabled() keeps reactiveCompact alive as the 413
208| // fallback (it consults isAutoCompactEnabled directly) and leaves
209| // sessionMemory + manual /compact working.
210| //
211| // Consult isContextCollapseEnabled (not the raw gate) so the
212| // CLAUDE_CONTEXT_COLLAPSE env override is honored here too. require()
213| // inside the block breaks the init-time cycle (this file exports
214| // getEffectiveContextWindowSize which collapse's index imports).
215| if (feature('CONTEXT_COLLAPSE')) {
216| /* eslint-disable @typescript-eslint/no-require-imports */
217| const { isContextCollapseEnabled } =
218| require('../contextCollapse/index.js') as typeof import('../contextCollapse/index.js')
219| /* eslint-enable @typescript-eslint/no-require-imports */
220| if (isContextCollapseEnabled()) {
221| return false
222| }
223| }
224|
225| const tokenCount = tokenCountWithEstimation(messages) - snipTokensFreed
226| const threshold = getAutoCompactThreshold(model)
227| const effectiveWindow = getEffectiveContextWindowSize(model)
228|
229| logForDebugging(
230| `autocompact: tokens=${tokenCount} threshold=${threshold} effectiveWindow=${effectiveWindow}${snipTokensFreed > 0 ? ` snipFreed=${snipTokensFreed}` : ''}`,
231| )
232|
233| const { isAboveAutoCompactThreshold } = calculateTokenWarningState(
234| tokenCount,
235| model,
236| )
237|
238| return isAboveAutoCompactThreshold
239| }
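When debugging why autocompact did not fire, it can help to think of the guards as an ordered decision that names the first guard that matched. The sketch below is not part of the codebase: the input field names and the returned labels are hypothetical, and each boolean stands in for the real call noted in its comment.

```typescript
// Hypothetical inputs; each field replaces a real call in shouldAutoCompact.
interface GuardInputs {
  querySource?: string
  contextCollapseFeature: boolean // feature('CONTEXT_COLLAPSE')
  autoCompactEnabled: boolean // isAutoCompactEnabled()
  reactiveOnly: boolean // feature('REACTIVE_COMPACT') && tengu_cobalt_raccoon gate
  contextCollapseEnabled: boolean // isContextCollapseEnabled()
  tokenCount: number // tokenCountWithEstimation(messages)
  snipTokensFreed: number
  threshold: number // getAutoCompactThreshold(model)
}

type Decision =
  | 'compact'
  | 'skip:recursion_guard'
  | 'skip:ctx_agent'
  | 'skip:disabled'
  | 'skip:reactive_only'
  | 'skip:context_collapse'
  | 'skip:below_threshold'

// Guards are checked in the same order as in the source excerpt.
function explainAutoCompactDecision(input: GuardInputs): Decision {
  if (input.querySource === 'session_memory' || input.querySource === 'compact') {
    return 'skip:recursion_guard'
  }
  if (input.contextCollapseFeature && input.querySource === 'marble_origami') {
    return 'skip:ctx_agent'
  }
  if (!input.autoCompactEnabled) return 'skip:disabled'
  if (input.reactiveOnly) return 'skip:reactive_only'
  if (input.contextCollapseFeature && input.contextCollapseEnabled) {
    return 'skip:context_collapse'
  }
  // Subtract what snip already freed before comparing with the threshold.
  const effectiveTokens = input.tokenCount - input.snipTokensFreed
  return effectiveTokens >= input.threshold ? 'compact' : 'skip:below_threshold'
}

const baseInputs: GuardInputs = {
  querySource: undefined,
  contextCollapseFeature: false,
  autoCompactEnabled: true,
  reactiveOnly: false,
  contextCollapseEnabled: false,
  tokenCount: 170_000,
  snipTokensFreed: 0,
  threshold: 167_000,
}
const fires = explainAutoCompactDecision(baseInputs)
// 170_000 - 5_000 = 165_000, which is below the 167_000 threshold.
const afterSnip = explainAutoCompactDecision({ ...baseInputs, snipTokensFreed: 5_000 })
```

The second call shows why snipTokensFreed matters: without the subtraction, a conversation that snip has already shrunk would still look like it is over the threshold.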
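autoCompactIfNeeded: session memory and the circuit breaker
autoCompactIfNeeded is the entry point the query loop calls:
Circuit breaker: when tracking.consecutiveFailures >= 3 it returns immediately. (BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures, up to 3,272 in a single session, wasting ~250K API calls/day globally.)
Main flow:
- DISABLE_COMPACT set, or shouldAutoCompact returns false → exit
- Build RecompactionInfo (whether this is a chained recompaction, turnCounter, turnId)
- Try trySessionMemoryCompaction first (experimental path): on success, call setLastSummarizedMessageId(undefined), runPostCompactCleanup, notifyCompaction, and markPostCompaction
- Otherwise call compactConversation(..., isAutoCompact=true)
- On failure, increment consecutiveFailures; a user abort is not passed to logError
On success it returns compactionResult so the caller can replace messages; failures are not surfaced to the UI (manual compact is the exception).
Source: src/services/compact/autoCompact.ts · lines 241–351 (of 352)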
241| export async function autoCompactIfNeeded(
242| messages: Message[],
243| toolUseContext: ToolUseContext,
244| cacheSafeParams: CacheSafeParams,
245| querySource?: QuerySource,
246| tracking?: AutoCompactTrackingState,
247| snipTokensFreed?: number,
248| ): Promise<{
249| wasCompacted: boolean
250| compactionResult?: CompactionResult
251| consecutiveFailures?: number
252| }> {
253| if (isEnvTruthy(process.env.DISABLE_COMPACT)) {
254| return { wasCompacted: false }
255| }
256|
257| // Circuit breaker: stop retrying after N consecutive failures.
258| // Without this, sessions where context is irrecoverably over the limit
259| // hammer the API with doomed compaction attempts on every turn.
260| if (
261| tracking?.consecutiveFailures !== undefined &&
262| tracking.consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES
263| ) {
264| return { wasCompacted: false }
265| }
266|
267| const model = toolUseContext.options.mainLoopModel
268| const shouldCompact = await shouldAutoCompact(
269| messages,
270| model,
271| querySource,
272| snipTokensFreed,
273| )
274|
275| if (!shouldCompact) {
276| return { wasCompacted: false }
277| }
278|
279| const recompactionInfo: RecompactionInfo = {
280| isRecompactionInChain: tracking?.compacted === true,
281| turnsSincePreviousCompact: tracking?.turnCounter ?? -1,
282| previousCompactTurnId: tracking?.turnId,
283| autoCompactThreshold: getAutoCompactThreshold(model),
284| querySource,
285| }
286|
287| // EXPERIMENT: Try session memory compaction first
288| const sessionMemoryResult = await trySessionMemoryCompaction(
289| messages,
290| toolUseContext.agentId,
291| recompactionInfo.autoCompactThreshold,
292| )
293| if (sessionMemoryResult) {
294| // Reset lastSummarizedMessageId since session memory compaction prunes messages
295| // and the old message UUID will no longer exist after the REPL replaces messages
296| setLastSummarizedMessageId(undefined)
297| runPostCompactCleanup(querySource)
298| // Reset cache read baseline so the post-compact drop isn't flagged as a
299| // break. compactConversation does this internally; SM-compact doesn't.
300| // BQ 2026-03-01: missing this made 20% of tengu_prompt_cache_break events
301| // false positives (systemPromptChanged=true, timeSinceLastAssistantMsg=-1).
302| if (feature('PROMPT_CACHE_BREAK_DETECTION')) {
303| notifyCompaction(querySource ?? 'compact', toolUseContext.agentId)
304| }
305| markPostCompaction()
306| return {
307| wasCompacted: true,
308| compactionResult: sessionMemoryResult,
309| }
310| }
311|
312| try {
313| const compactionResult = await compactConversation(
314| messages,
315| toolUseContext,
316| cacheSafeParams,
317| true, // Suppress user questions for autocompact
318| undefined, // No custom instructions for autocompact
319| true, // isAutoCompact
320| recompactionInfo,
321| )
322|
323| // Reset lastSummarizedMessageId since legacy compaction replaces all messages
324| // and the old message UUID will no longer exist in the new messages array
325| setLastSummarizedMessageId(undefined)
326| runPostCompactCleanup(querySource)
327|
328| return {
329| wasCompacted: true,
330| compactionResult,
331| // Reset failure count on success
332| consecutiveFailures: 0,
333| }
334| } catch (error) {
335| if (!hasExactErrorMessage(error, ERROR_MESSAGE_USER_ABORT)) {
336| logError(error)
337| }
338| // Increment consecutive failure count for circuit breaker.
339| // The caller threads this through autoCompactTracking so the
340| // next query loop iteration can skip futile retry attempts.
341| const prevFailures = tracking?.consecutiveFailures ?? 0
342| const nextFailures = prevFailures + 1
343| if (nextFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
344| logForDebugging(
345| `autocompact: circuit breaker tripped after ${nextFailures} consecutive failures — skipping future attempts this session`,
346| { level: 'warn' },
347| )
348| }
349| return { wasCompacted: false, consecutiveFailures: nextFailures }
350| }
351| }
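The circuit breaker only works if the caller threads consecutiveFailures back into tracking on the next turn. The sketch below simulates that loop synchronously. The attemptAutoCompact and simulateTurns names are hypothetical, and the way the caller merges the result into tracking is an assumption about the query loop, which is not shown in this chapter.

```typescript
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3

interface Tracking {
  consecutiveFailures?: number
}

interface AttemptResult {
  wasCompacted: boolean
  consecutiveFailures?: number
}

// Simplified, synchronous stand-in for the try/catch part of autoCompactIfNeeded.
function attemptAutoCompact(tracking: Tracking, compact: () => void): AttemptResult {
  if (
    tracking.consecutiveFailures !== undefined &&
    tracking.consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES
  ) {
    return { wasCompacted: false } // breaker open: no attempt is made
  }
  try {
    compact()
    return { wasCompacted: true, consecutiveFailures: 0 }
  } catch (_error) {
    return { wasCompacted: false, consecutiveFailures: (tracking.consecutiveFailures ?? 0) + 1 }
  }
}

// Assumed caller behavior: keep the old count when the result carries none.
function simulateTurns(turns: number, compact: () => void): number {
  let tracking: Tracking = {}
  let attempts = 0
  for (let turn = 0; turn < turns; turn++) {
    const result = attemptAutoCompact(tracking, () => {
      attempts++
      compact()
    })
    if (result.consecutiveFailures !== undefined) {
      tracking = { ...tracking, consecutiveFailures: result.consecutiveFailures }
    }
  }
  return attempts
}

const alwaysFails = (): void => {
  throw new Error('compaction failed')
}
// Ten turns with a doomed compaction: only the first three actually attempt it.
const attemptsMade = simulateTurns(10, alwaysFails)
```

Note the detail in simulateTurns: once the breaker is open, the result has no consecutiveFailures field, so a caller that overwrote tracking with undefined would silently reset the breaker.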
compactConversation: hooks and the forked summary
compactConversation is the core shared by manual /compact and autocompact:
- executePreCompactHooks (trigger: auto|manual) runs, and its output is merged into customInstructions
- getCompactPrompt + createUserMessage build the summary request
- streamCompactSummary forks a sub-agent (runForkedAgent), which can share the main conversation's prompt cache (GrowthBook tengu_compact_cache_prefix, default true)
- If the summary starts with PROMPT_TOO_LONG_ERROR_MESSAGE → truncateHeadForPTLRetry drops the oldest API rounds, up to MAX_PTL_RETRIES times
- On success, readFileState is cleared, and createPostCompactFileAttachments runs in parallel with the async agent attachments
- Plan, skill, deferred-tools, and MCP instructions delta attachments are re-injected
- executePostCompactHooks runs, the boundary is written, and logEvent('tengu_compact_*') fires
Before the summary API call, stripImagesFromMessages replaces image/document blocks with [image] / [document] text markers, so the compact request itself does not hit PTL.
Failure reasons: prompt_too_long, no_summary, api_error; each is logged as tengu_compact_failed.
Source: src/services/compact/compact.ts · lines 387–515 (of 1706)
387| export async function compactConversation(
388| messages: Message[],
389| context: ToolUseContext,
390| cacheSafeParams: CacheSafeParams,
391| suppressFollowUpQuestions: boolean,
392| customInstructions?: string,
393| isAutoCompact: boolean = false,
394| recompactionInfo?: RecompactionInfo,
395| ): Promise<CompactionResult> {
396| try {
397| if (messages.length === 0) {
398| throw new Error(ERROR_MESSAGE_NOT_ENOUGH_MESSAGES)
399| }
400|
401| const preCompactTokenCount = tokenCountWithEstimation(messages)
402|
403| const appState = context.getAppState()
404| void logPermissionContextForAnts(appState.toolPermissionContext, 'summary')
405|
406| context.onCompactProgress?.({
407| type: 'hooks_start',
408| hookType: 'pre_compact',
409| })
410|
411| // Execute PreCompact hooks
412| context.setSDKStatus?.('compacting')
413| const hookResult = await executePreCompactHooks(
414| {
415| trigger: isAutoCompact ? 'auto' : 'manual',
416| customInstructions: customInstructions ?? null,
417| },
418| context.abortController.signal,
419| )
420| customInstructions = mergeHookInstructions(
421| customInstructions,
422| hookResult.newCustomInstructions,
423| )
424| const userDisplayMessage = hookResult.userDisplayMessage
425|
426| // Show requesting mode with up arrow and custom message
427| context.setStreamMode?.('requesting')
428| context.setResponseLength?.(() => 0)
429| context.onCompactProgress?.({ type: 'compact_start' })
430|
431| // 3P default: true — forked-agent path reuses main conversation's prompt cache.
432| // Experiment (Jan 2026) confirmed: false path is 98% cache miss, costs ~0.76% of
433| // fleet cache_creation (~38B tok/day), concentrated in ephemeral envs (CCR/GHA/SDK)
434| // with cold GB cache and 3P providers where GB is disabled. GB gate kept as kill-switch.
435| const promptCacheSharingEnabled = getFeatureValue_CACHED_MAY_BE_STALE(
436| 'tengu_compact_cache_prefix',
437| true,
438| )
439|
440| const compactPrompt = getCompactPrompt(customInstructions)
441| const summaryRequest = createUserMessage({
442| content: compactPrompt,
443| })
444|
445| let messagesToSummarize = messages
446| let retryCacheSafeParams = cacheSafeParams
447| let summaryResponse: AssistantMessage
448| let summary: string | null
449| let ptlAttempts = 0
450| for (;;) {
451| summaryResponse = await streamCompactSummary({
452| messages: messagesToSummarize,
453| summaryRequest,
454| appState,
455| context,
456| preCompactTokenCount,
457| cacheSafeParams: retryCacheSafeParams,
458| })
459| summary = getAssistantMessageText(summaryResponse)
460| if (!summary?.startsWith(PROMPT_TOO_LONG_ERROR_MESSAGE)) break
461|
462| // CC-1180: compact request itself hit prompt-too-long. Truncate the
463| // oldest API-round groups and retry rather than leaving the user stuck.
464| ptlAttempts++
465| const truncated =
466| ptlAttempts <= MAX_PTL_RETRIES
467| ? truncateHeadForPTLRetry(messagesToSummarize, summaryResponse)
468| : null
469| if (!truncated) {
470| logEvent('tengu_compact_failed', {
471| reason:
472| 'prompt_too_long' as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
473| preCompactTokenCount,
474| promptCacheSharingEnabled,
475| ptlAttempts,
476| })
477| throw new Error(ERROR_MESSAGE_PROMPT_TOO_LONG)
478| }
479| logEvent('tengu_compact_ptl_retry', {
480| attempt: ptlAttempts,
481| droppedMessages: messagesToSummarize.length - truncated.length,
482| remainingMessages: truncated.length,
483| })
484| messagesToSummarize = truncated
485| // The forked-agent path reads from cacheSafeParams.forkContextMessages,
486| // not the messages param — thread the truncated set through both paths.
487| retryCacheSafeParams = {
488| ...retryCacheSafeParams,
489| forkContextMessages: truncated,
490| }
491| }
492|
493| if (!summary) {
494| logForDebugging(
495| `Compact failed: no summary text in response. Response: ${jsonStringify(summaryResponse)}`,
496| { level: 'error' },
497| )
498| logEvent('tengu_compact_failed', {
499| reason:
500| 'no_summary' as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
501| preCompactTokenCount,
502| promptCacheSharingEnabled,
503| })
504| throw new Error(
505| `Failed to generate conversation summary - response did not contain valid text content`,
506| )
507| } else if (startsWithApiErrorPrefix(summary)) {
508| logEvent('tengu_compact_failed', {
509| reason:
510| 'api_error' as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
511| preCompactTokenCount,
512| promptCacheSharingEnabled,
513| })
514| throw new Error(summary)
515| }
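The prompt-too-long retry loop is the densest part of this excerpt, so here is a reduced sketch of its shape. Everything in it is a simplification: messages are plain strings already grouped into API rounds, the summarizer is synchronous and fake, PTL_PREFIX is a placeholder for PROMPT_TOO_LONG_ERROR_MESSAGE, the retry limit is a parameter because the value of MAX_PTL_RETRIES is not shown here, and dropping exactly one round per retry is an assumption about truncateHeadForPTLRetry.

```typescript
// Placeholder for PROMPT_TOO_LONG_ERROR_MESSAGE; the real string is not shown in this chapter.
const PTL_PREFIX = 'PROMPT_TOO_LONG'

interface RetryOutcome {
  summary: string
  ptlAttempts: number
  droppedRounds: number
}

function summarizeWithPtlRetry(
  rounds: string[][],
  summarize: (messages: string[]) => string,
  maxPtlRetries: number,
): RetryOutcome {
  let remaining = rounds
  let ptlAttempts = 0
  for (;;) {
    const flat = remaining.reduce<string[]>((acc, round) => acc.concat(round), [])
    const summary = summarize(flat)
    if (!summary.startsWith(PTL_PREFIX)) {
      return { summary, ptlAttempts, droppedRounds: rounds.length - remaining.length }
    }
    ptlAttempts++
    // Give up when the retry budget is spent or nothing is left to drop.
    const canRetry = ptlAttempts <= maxPtlRetries && remaining.length > 1
    if (!canRetry) throw new Error('prompt_too_long')
    remaining = remaining.slice(1) // drop the oldest round and try again
  }
}

// Fake summarizer: anything over 4 messages is "too long".
const fakeSummarize = (messages: string[]): string =>
  messages.length > 4 ? `${PTL_PREFIX}: too many messages` : `summary of ${messages.length} messages`

const outcome = summarizeWithPtlRetry([['a', 'b', 'c'], ['d', 'e'], ['f']], fakeSummarize, 2)
```

In the real code the truncated set must also be written into retryCacheSafeParams.forkContextMessages, because the forked-agent path reads its messages from there rather than from the messages parameter.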
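Post-compact message assembly
buildPostCompactMessages guarantees the same output order on every compaction path, so the REPL and the sessionStorage loader do not diverge.
When a compaction path keeps messages (a prefix for partial compact, a suffix for reactive / session-memory compact), annotateBoundaryWithPreservedSegment writes headUuid / anchorUuid / tailUuid into compactMetadata.preservedSegment. On disk the preserved messages keep their original parentUuid (dedup skips them), and the loader relies on this metadata to patch the chain.
mergeHookInstructions concatenates the user's custom instructions with the PreCompact hook output (user instructions first).
Post-compact budget constants (top of compact.ts):
- Restore at most 5 files, with a total budget of 50000 tokens and 5000 tokens per file
- Skills budget of 25000 tokens, 5000 per skill (truncation keeps the instructions at the top of the file)
sentSkillNames is deliberately not reset, so that each compact does not re-inject the ~4K skill_listing as pure cache_creation.
Source: src/services/compact/compact.ts · lines 122–131 (of 1706)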
122| export const POST_COMPACT_MAX_FILES_TO_RESTORE = 5
123| export const POST_COMPACT_TOKEN_BUDGET = 50_000
124| export const POST_COMPACT_MAX_TOKENS_PER_FILE = 5_000
125| // Skills can be large (verify=18.7KB, claude-api=20.1KB). Previously re-injected
126| // unbounded on every compact → 5-10K tok/compact. Per-skill truncation beats
127| // dropping — instructions at the top of a skill file are usually the critical
128| // part. Budget sized to hold ~5 skills at the per-skill cap.
129| export const POST_COMPACT_MAX_TOKENS_PER_SKILL = 5_000
130| export const POST_COMPACT_SKILLS_TOKEN_BUDGET = 25_000
131| const MAX_COMPACT_STREAMING_RETRIES = 2
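To see how the per-skill cap and the overall skills budget interact, consider the sketch below. It is an assumption about how such a budget could be applied, not the code in compact.ts: the planSkillReinjection name, the SkillEntry shape, the precomputed token counts, and the choice to stop at the first skill that does not fit are all hypothetical.

```typescript
const POST_COMPACT_MAX_TOKENS_PER_SKILL = 5_000
const POST_COMPACT_SKILLS_TOKEN_BUDGET = 25_000

interface SkillEntry {
  name: string
  tokens: number // assumed to be estimated elsewhere
}

interface PlannedSkill {
  name: string
  tokens: number
  truncated: boolean
}

// Truncate each skill to the per-skill cap, then admit skills in order
// until the next one would exceed the overall budget.
function planSkillReinjection(skills: SkillEntry[]): PlannedSkill[] {
  const planned: PlannedSkill[] = []
  let used = 0
  for (const skill of skills) {
    const tokens = Math.min(skill.tokens, POST_COMPACT_MAX_TOKENS_PER_SKILL)
    if (used + tokens > POST_COMPACT_SKILLS_TOKEN_BUDGET) break
    planned.push({ name: skill.name, tokens, truncated: tokens < skill.tokens })
    used += tokens
  }
  return planned
}

const plan = planSkillReinjection([
  { name: 'a', tokens: 6_000 }, // truncated to 5_000
  { name: 'b', tokens: 1_000 },
  { name: 'c', tokens: 9_000 }, // truncated to 5_000
  { name: 'd', tokens: 5_000 },
  { name: 'e', tokens: 4_000 },
  { name: 'f', tokens: 5_000 }, // running total reaches exactly 25_000
  { name: 'g', tokens: 5_000 }, // does not fit
])
```

With five skills at the cap the budget is exactly full, which matches the source comment that the budget is sized to hold about five skills.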
Source: src/services/compact/compact.ts · lines 330–381 (of 1706)
330| export function buildPostCompactMessages(result: CompactionResult): Message[] {
331| return [
332| result.boundaryMarker,
333| ...result.summaryMessages,
334| ...(result.messagesToKeep ?? []),
335| ...result.attachments,
336| ...result.hookResults,
337| ]
338| }
339|
340| /**
341| * Annotate a compact boundary with relink metadata for messagesToKeep.
342| * Preserved messages keep their original parentUuids on disk (dedup-skipped);
343| * the loader uses this to patch head→anchor and anchor's-other-children→tail.
344| *
345| * `anchorUuid` = what sits immediately before keep[0] in the desired chain:
346| * - suffix-preserving (reactive/session-memory): last summary message
347| * - prefix-preserving (partial compact): the boundary itself
348| */
349| export function annotateBoundaryWithPreservedSegment(
350| boundary: SystemCompactBoundaryMessage,
351| anchorUuid: UUID,
352| messagesToKeep: readonly Message[] | undefined,
353| ): SystemCompactBoundaryMessage {
354| const keep = messagesToKeep ?? []
355| if (keep.length === 0) return boundary
356| return {
357| ...boundary,
358| compactMetadata: {
359| ...boundary.compactMetadata,
360| preservedSegment: {
361| headUuid: keep[0]!.uuid,
362| anchorUuid,
363| tailUuid: keep.at(-1)!.uuid,
364| },
365| },
366| }
367| }
368|
369| /**
370| * Merges user-supplied custom instructions with hook-provided instructions.
371| * User instructions come first; hook instructions are appended.
372| * Empty strings normalize to undefined.
373| */
374| export function mergeHookInstructions(
375| userInstructions: string | undefined,
376| hookInstructions: string | undefined,
377| ): string | undefined {
378| if (!hookInstructions) return userInstructions || undefined
379| if (!userInstructions) return hookInstructions
380| return `${userInstructions}\n\n${hookInstructions}`
381| }
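The edge cases of mergeHookInstructions are easiest to check by calling it. The function body below is copied from the excerpt above (without the export); the sample instruction strings are made up for illustration.

```typescript
function mergeHookInstructions(
  userInstructions: string | undefined,
  hookInstructions: string | undefined,
): string | undefined {
  if (!hookInstructions) return userInstructions || undefined
  if (!userInstructions) return hookInstructions
  return `${userInstructions}\n\n${hookInstructions}`
}

// Both present: user text first, hook text appended after a blank line.
const both = mergeHookInstructions('focus on tests', 'keep TODOs')
// Only the hook supplies instructions.
const hookOnly = mergeHookInstructions(undefined, 'keep TODOs')
// Empty strings on both sides normalize to undefined.
const neither = mergeHookInstructions('', '')
// An empty hook result leaves the user instructions untouched.
const userOnly = mergeHookInstructions('focus on tests', '')
```

This is the order to look for when verifying a manual /compact with custom instructions: the text typed by the user should appear before anything a PreCompact hook contributed.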
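stripImagesFromMessages and PTL defense
In scenarios such as CCD, users paste images often; if complete image blocks were sent to the summary model, the compact request itself could hit prompt-too-long.
stripImagesFromMessages only processes messages with type === 'user' (assistant messages do not contain images). It replaces top-level image/document blocks and media nested inside tool_result with text markers, so the summary can still note that an image was shared.
It works together with truncateHeadForPTLRetry: the former shrinks a single request, and the latter, if the request still hits PTL, drops whole rounds from the head using groupMessagesByApiRound.
When reading CC-1180-style issues, first confirm that images were stripped before compaction, then check that the fork context messages are in sync with the truncated set (retryCacheSafeParams.forkContextMessages).
Source: src/services/compact/compact.ts · lines 133–199 (of 1706)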
133| /**
134| * Strip image blocks from user messages before sending for compaction.
135| * Images are not needed for generating a conversation summary and can
136| * cause the compaction API call itself to hit the prompt-too-long limit,
137| * especially in CCD sessions where users frequently attach images.
138| * Replaces image blocks with a text marker so the summary still notes
139| * that an image was shared.
140| *
141| * Note: Only user messages contain images (either directly attached or within
142| * tool_result content from tools). Assistant messages contain text, tool_use,
143| * and thinking blocks but not images.
144| */
145| export function stripImagesFromMessages(messages: Message[]): Message[] {
146| return messages.map(message => {
147| if (message.type !== 'user') {
148| return message
149| }
150|
151| const content = message.message.content
152| if (!Array.isArray(content)) {
153| return message
154| }
155|
156| let hasMediaBlock = false
157| const newContent = content.flatMap(block => {
158| if (block.type === 'image') {
159| hasMediaBlock = true
160| return [{ type: 'text' as const, text: '[image]' }]
161| }
162| if (block.type === 'document') {
163| hasMediaBlock = true
164| return [{ type: 'text' as const, text: '[document]' }]
165| }
166| // Also strip images/documents nested inside tool_result content arrays
167| if (block.type === 'tool_result' && Array.isArray(block.content)) {
168| let toolHasMedia = false
169| const newToolContent = block.content.map(item => {
170| if (item.type === 'image') {
171| toolHasMedia = true
172| return { type: 'text' as const, text: '[image]' }
173| }
174| if (item.type === 'document') {
175| toolHasMedia = true
176| return { type: 'text' as const, text: '[document]' }
177| }
178| return item
179| })
180| if (toolHasMedia) {
181| hasMediaBlock = true
182| return [{ ...block, content: newToolContent }]
183| }
184| }
185| return [block]
186| })
187|
188| if (!hasMediaBlock) {
189| return message
190| }
191|
192| return {
193| ...message,
194| message: {
195| ...message.message,
196| content: newContent,
197| },
198| } as typeof message
199| })
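The behavior above can be tried out on a reduced message shape. The sketch below is a simplified re-implementation, not the real function: the Block and Msg types are hypothetical stand-ins for the SDK message types, and content sits directly on the message instead of under message.message.

```typescript
// Hypothetical, reduced types for illustration only.
type Block =
  | { type: 'text'; text: string }
  | { type: 'image'; source: string }
  | { type: 'document'; source: string }
  | { type: 'tool_result'; content: Block[] | string }

interface Msg {
  type: 'user' | 'assistant'
  content: Block[] | string
}

// Replace a media block with its text marker; leave other blocks untouched.
function toMarker(block: Block): Block {
  if (block.type === 'image') return { type: 'text', text: '[image]' }
  if (block.type === 'document') return { type: 'text', text: '[document]' }
  return block
}

function stripMedia(messages: Msg[]): Msg[] {
  return messages.map(message => {
    if (message.type !== 'user' || !Array.isArray(message.content)) return message
    let changed = false
    const content = message.content.map((block): Block => {
      if (block.type === 'tool_result' && Array.isArray(block.content)) {
        const original: Block[] = block.content
        const inner = original.map(toMarker)
        if (inner.some((item, index) => item !== original[index])) {
          changed = true
          return { type: 'tool_result', content: inner }
        }
        return block
      }
      const replaced = toMarker(block)
      if (replaced !== block) changed = true
      return replaced
    })
    // Return the original object when nothing changed, as the source does.
    return changed ? { ...message, content } : message
  })
}

const input: Msg[] = [
  {
    type: 'user',
    content: [
      { type: 'text', text: 'see screenshot' },
      { type: 'image', source: 'base64...' },
      { type: 'tool_result', content: [{ type: 'document', source: 'base64...' }] },
    ],
  },
  { type: 'assistant', content: [{ type: 'text', text: 'ok' }] },
]
const stripped = stripMedia(input)
```

Messages without media come back as the same object, which keeps reference equality intact for anything that compares messages by identity.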
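Source directories and related files
Closely related: services/compact/prompt.ts, utils/forkedAgent.ts, utils/hooks.ts (Pre/PostCompact), services/api/promptCacheBreakDetection.ts, services/SessionMemory/.
Hands-on exercises
- Set CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=50 and watch autocompact trigger earlier
- Run manual /compact with custom instructions and verify the mergeHookInstructions order
- In the ant environment, read the reason distribution of tengu_compact_failed in BQ
- After a compact, check whether the MCP instructions delta attachment re-announces the full tool set
Chapter summary and next steps
compact = context lifecycle management. The next chapter, analytics, covers how events such as tengu_compact_failed are reported.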