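compact · Context Compaction and Automatic Triggering

Chapter overview

services/compact/ handles conversations that approach the context window: it forks a sub-agent to generate a summary, inserts a SystemCompactBoundaryMessage, and re-injects file, skill, and MCP delta attachments. autoCompact.ts decides when to trigger; compact.ts runs the main compactConversation flow. By the end of this chapter you should be able to trace the REPL's "Compacting…" status back to autoCompactIfNeeded and the PreCompact hook.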

After this chapter you should be able to

Explain what getAutoCompactThreshold and AUTOCOMPACT_BUFFER_TOKENS mean
Describe the recursion guard in shouldAutoCompact (the compact and session_memory querySource values)
Describe how compactConversation forks a summary and rebuilds the post-compact attachments
Understand the circuit breaker (MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES)
Locate truncateHeadForPTLRetry on the prompt-too-long path

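Core concepts (read these first)

The effective window reserves room for summary output

getEffectiveContextWindowSize = contextWindow - min(maxOutput, 20000). Compaction is itself an API call, so tokens must be reserved for the summary output; the p99.99 summary is about 17387 tokens. The CLAUDE_CODE_AUTO_COMPACT_WINDOW env variable can artificially shrink the window to make testing easier.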
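The formula above can be checked with a few lines of arithmetic. This is a minimal sketch, not the real getEffectiveContextWindowSize: the function name, the constant name, and the 200_000 / 32_000 / 8_192 inputs are hypothetical values chosen for illustration.

```typescript
// Cap taken from the formula above: at most 20_000 tokens are reserved for summary output.
const MAX_SUMMARY_OUTPUT_RESERVE = 20_000

// Hypothetical helper: both arguments are illustrative inputs, not values
// read from a real model table.
function effectiveContextWindow(contextWindow: number, maxOutputTokens: number): number {
  const reservedForSummary = Math.min(maxOutputTokens, MAX_SUMMARY_OUTPUT_RESERVE)
  return contextWindow - reservedForSummary
}

// A model whose max output exceeds the cap loses exactly 20_000 tokens.
const largeOutputModel = effectiveContextWindow(200_000, 32_000) // 180_000
// A model with a smaller max output only reserves that smaller amount.
const smallOutputModel = effectiveContextWindow(200_000, 8_192) // 191_808
```

The cap matters because the p99.99 summary size quoted above (about 17387 tokens) fits under 20000, so reserving more than that would shrink the usable window with little benefit.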

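compact and context collapse are mutually exclusive

When the CONTEXT_COLLAPSE feature is on and isContextCollapseEnabled() returns true, shouldAutoCompact returns false, so that autocompact does not race collapse between its 90% commit threshold and its 95% blocking threshold. reactiveCompact can still serve as the API 413 fallback because it consults isAutoCompactEnabled directly rather than shouldAutoCompact.

Post-compact is not a simple replacement of messages

buildPostCompactMessages emits, in order: boundary → summaryMessages → messagesToKeep → attachments → hookResults. annotateBoundaryWithPreservedSegment writes preservedSegment metadata that the session JSONL loader uses to repair the parentUuid chain. After compaction, readFileState is cleared and up to 5 files are re-attached within the token budget.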

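Suggested study steps

Read source block A: threshold constants and getAutoCompactThreshold
Read source block B: the shouldAutoCompact guards
Read source block C: the autoCompactIfNeeded main flow
Read source block D: the compactConversation entry point and hooks
Read source block E: buildPostCompactMessages and boundary metadata
Read source block F: stripImagesFromMessages
Open services/compact/ in the source tree and cross-check the line numbers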

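Common pitfalls

Note

Manual /compact and autocompact share compactConversation, but they differ in suppressFollowUpQuestions (autocompact passes true).

Note

A successful session memory compact must also call notifyCompaction; otherwise prompt cache break detection reports false positives.

Note

DISABLE_COMPACT turns off all compaction; DISABLE_AUTO_COMPACT turns off only the automatic path (manual /compact keeps working).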

Position in the architecture

Where compact hooks into the query loop:

End of each assistant turn → tokenCountWithEstimation
  → autoCompactIfNeeded (autoCompact.ts)
  → [optional] trySessionMemoryCompaction
  → compactConversation (compact.ts)
  → buildPostCompactMessages → setMessages replaces the transcript
  → runPostCompactCleanup / notifyCompaction / markPostCompaction

The compact/ subdirectory also contains variants such as reactiveCompact.ts (truncation after a 413 response), microCompact.ts, snipCompact.ts, and sessionMemoryCompact.ts. This chapter focuses on the main path through autoCompact.ts and compact.ts.

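Threshold system: buffer and warning

autoCompact.ts exports several token constants:

Constant	Value	Purpose
AUTOCOMPACT_BUFFER_TOKENS	13000	Autocompact trigger buffer below the top of the effective window
WARNING_THRESHOLD_BUFFER_TOKENS	20000	UI yellow warning
ERROR_THRESHOLD_BUFFER_TOKENS	20000	UI red warning
MANUAL_COMPACT_BUFFER_TOKENS	3000	Blocking limit below which manual compact is still available

getAutoCompactThreshold = effectiveWindow - AUTOCOMPACT_BUFFER_TOKENS. CLAUDE_AUTOCOMPACT_PCT_OVERRIDE can lower the threshold to a percentage of the effective window (for testing); because the result is clamped with Math.min, the override can never raise it.

calculateTokenWarningState returns percentLeft, the isAbove* flag for each level, and isAtBlockingLimit (default effectiveWindow - 3000, overridable via CLAUDE_CODE_BLOCKING_LIMIT_OVERRIDE). The REPL progress bar reads this structure.

Source reference: src/services/compact/autoCompact.ts · lines 62–91 (of 352)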

  62| export const AUTOCOMPACT_BUFFER_TOKENS = 13_000
  63| export const WARNING_THRESHOLD_BUFFER_TOKENS = 20_000
  64| export const ERROR_THRESHOLD_BUFFER_TOKENS = 20_000
  65| export const MANUAL_COMPACT_BUFFER_TOKENS = 3_000
  66| 
  67| // Stop trying autocompact after this many consecutive failures.
  68| // BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures (up to 3,272)
  69| // in a single session, wasting ~250K API calls/day globally.
  70| const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3
  71| 
  72| export function getAutoCompactThreshold(model: string): number {
  73|   const effectiveContextWindow = getEffectiveContextWindowSize(model)
  74| 
  75|   const autocompactThreshold =
  76|     effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS
  77| 
  78|   // Override for easier testing of autocompact
  79|   const envPercent = process.env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE
  80|   if (envPercent) {
  81|     const parsed = parseFloat(envPercent)
  82|     if (!isNaN(parsed) && parsed > 0 && parsed <= 100) {
  83|       const percentageThreshold = Math.floor(
  84|         effectiveContextWindow * (parsed / 100),
  85|       )
  86|       return Math.min(percentageThreshold, autocompactThreshold)
  87|     }
  88|   }
  89| 
  90|   return autocompactThreshold
  91| }
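
To see how the percentage override behaves, the function above can be restated without its dependencies. This is a sketch under stated assumptions: thresholdWithOverride is a hypothetical name, the effective window and the raw env string are passed in as arguments instead of being read from the model and process.env, and 180_000 is an illustrative window size.

```typescript
const AUTOCOMPACT_BUFFER_TOKENS = 13_000

// Same branching as getAutoCompactThreshold, but dependency-free:
// the caller supplies the effective window and the raw override string.
function thresholdWithOverride(effectiveWindow: number, envPercent?: string): number {
  const defaultThreshold = effectiveWindow - AUTOCOMPACT_BUFFER_TOKENS
  if (!envPercent) return defaultThreshold
  const parsed = parseFloat(envPercent)
  // Values that are not numbers, or fall outside (0, 100], are ignored.
  if (isNaN(parsed) || parsed <= 0 || parsed > 100) return defaultThreshold
  const percentageThreshold = Math.floor(effectiveWindow * (parsed / 100))
  // Math.min means the override can only lower the threshold, never raise it.
  return Math.min(percentageThreshold, defaultThreshold)
}

const window180k = 180_000 // hypothetical effective window
const noOverride = thresholdWithOverride(window180k) // 167_000
const halfWindow = thresholdWithOverride(window180k, '50') // 90_000
const cappedHigh = thresholdWithOverride(window180k, '99') // 167_000 (178_200 is capped)
const invalidValue = thresholdWithOverride(window180k, 'abc') // 167_000
```

Setting the override to 50 therefore triggers autocompact at half the effective window, while any percentage that would land above the default threshold has no effect.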

Source reference: src/services/compact/autoCompact.ts · lines 93–145 (of 352)

  93| export function calculateTokenWarningState(
  94|   tokenUsage: number,
  95|   model: string,
  96| ): {
  97|   percentLeft: number
  98|   isAboveWarningThreshold: boolean
  99|   isAboveErrorThreshold: boolean
 100|   isAboveAutoCompactThreshold: boolean
 101|   isAtBlockingLimit: boolean
 102| } {
 103|   const autoCompactThreshold = getAutoCompactThreshold(model)
 104|   const threshold = isAutoCompactEnabled()
 105|     ? autoCompactThreshold
 106|     : getEffectiveContextWindowSize(model)
 107| 
 108|   const percentLeft = Math.max(
 109|     0,
 110|     Math.round(((threshold - tokenUsage) / threshold) * 100),
 111|   )
 112| 
 113|   const warningThreshold = threshold - WARNING_THRESHOLD_BUFFER_TOKENS
 114|   const errorThreshold = threshold - ERROR_THRESHOLD_BUFFER_TOKENS
 115| 
 116|   const isAboveWarningThreshold = tokenUsage >= warningThreshold
 117|   const isAboveErrorThreshold = tokenUsage >= errorThreshold
 118| 
 119|   const isAboveAutoCompactThreshold =
 120|     isAutoCompactEnabled() && tokenUsage >= autoCompactThreshold
 121| 
 122|   const actualContextWindow = getEffectiveContextWindowSize(model)
 123|   const defaultBlockingLimit =
 124|     actualContextWindow - MANUAL_COMPACT_BUFFER_TOKENS
 125| 
 126|   // Allow override for testing
 127|   const blockingLimitOverride = process.env.CLAUDE_CODE_BLOCKING_LIMIT_OVERRIDE
 128|   const parsedOverride = blockingLimitOverride
 129|     ? parseInt(blockingLimitOverride, 10)
 130|     : NaN
 131|   const blockingLimit =
 132|     !isNaN(parsedOverride) && parsedOverride > 0
 133|       ? parsedOverride
 134|       : defaultBlockingLimit
 135| 
 136|   const isAtBlockingLimit = tokenUsage >= blockingLimit
 137| 
 138|   return {
 139|     percentLeft,
 140|     isAboveWarningThreshold,
 141|     isAboveErrorThreshold,
 142|     isAboveAutoCompactThreshold,
 143|     isAtBlockingLimit,
 144|   }
 145| }
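
A worked example makes the flags easier to read. The sketch below is a condensed restatement of calculateTokenWarningState under stated assumptions: warningState is a hypothetical name, the effective window and the autocompact setting are passed in directly, the CLAUDE_CODE_BLOCKING_LIMIT_OVERRIDE branch is omitted, and the 180_000 window and 150_000 usage are illustrative numbers.

```typescript
const AUTOCOMPACT_BUFFER_TOKENS = 13_000
const WARNING_THRESHOLD_BUFFER_TOKENS = 20_000
const ERROR_THRESHOLD_BUFFER_TOKENS = 20_000
const MANUAL_COMPACT_BUFFER_TOKENS = 3_000

// Dependency-free restatement; the env override for the blocking limit is left out.
function warningState(tokenUsage: number, effectiveWindow: number, autoCompactEnabled: boolean) {
  const autoCompactThreshold = effectiveWindow - AUTOCOMPACT_BUFFER_TOKENS
  // With autocompact off, percentages are measured against the full effective window.
  const threshold = autoCompactEnabled ? autoCompactThreshold : effectiveWindow
  return {
    percentLeft: Math.max(0, Math.round(((threshold - tokenUsage) / threshold) * 100)),
    isAboveWarningThreshold: tokenUsage >= threshold - WARNING_THRESHOLD_BUFFER_TOKENS,
    isAboveErrorThreshold: tokenUsage >= threshold - ERROR_THRESHOLD_BUFFER_TOKENS,
    isAboveAutoCompactThreshold: autoCompactEnabled && tokenUsage >= autoCompactThreshold,
    isAtBlockingLimit: tokenUsage >= effectiveWindow - MANUAL_COMPACT_BUFFER_TOKENS,
  }
}

// Autocompact on: threshold = 167_000, warning and error both start at 147_000.
const withAutoCompact = warningState(150_000, 180_000, true)
// percentLeft 10, warning true, error true, autocompact false, blocking false

// Autocompact off: threshold = 180_000, warning and error both start at 160_000.
const withoutAutoCompact = warningState(150_000, 180_000, false)
// percentLeft 17, warning false, error false, autocompact false, blocking false
```

Because both buffers are 20_000 in the excerpt above, the warning and error flags always flip at the same token count; they would only diverge if the two constants were given different values.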

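shouldAutoCompact: when not to compact

shouldAutoCompact runs several layers of guards before counting tokens:

querySource is 'session_memory' or 'compact': these are forked sub-agents, and the guard prevents a deadlock
marble_origami (the ctx-agent) is excluded when CONTEXT_COLLAPSE is on, so that resetContextCollapse does not destroy the main thread's committed log
isAutoCompactEnabled(): global config plus the DISABLE_COMPACT / DISABLE_AUTO_COMPACT env variables
REACTIVE_COMPACT + GrowthBook tengu_cobalt_raccoon: suppresses proactive compaction and lets reactive compact catch the 413
When context collapse is enabled, autocompact is skipped entirely

The token count is tokenCountWithEstimation(messages) - snipTokensFreed: when snip has already removed messages but the surviving assistant usage still reflects the old context, the rough delta that snip computed has to be subtracted.

Debug log format: autocompact: tokens=… threshold=… effectiveWindow=…, with snipFreed=… appended when snipTokensFreed > 0.

Source reference: src/services/compact/autoCompact.ts · lines 147–239 (of 352)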

 147| export function isAutoCompactEnabled(): boolean {
 148|   if (isEnvTruthy(process.env.DISABLE_COMPACT)) {
 149|     return false
 150|   }
 151|   // Allow disabling just auto-compact (keeps manual /compact working)
 152|   if (isEnvTruthy(process.env.DISABLE_AUTO_COMPACT)) {
 153|     return false
 154|   }
 155|   // Check if user has disabled auto-compact in their settings
 156|   const userConfig = getGlobalConfig()
 157|   return userConfig.autoCompactEnabled
 158| }
 159| 
 160| export async function shouldAutoCompact(
 161|   messages: Message[],
 162|   model: string,
 163|   querySource?: QuerySource,
 164|   // Snip removes messages but the surviving assistant's usage still reflects
 165|   // pre-snip context, so tokenCountWithEstimation can't see the savings.
 166|   // Subtract the rough-delta that snip already computed.
 167|   snipTokensFreed = 0,
 168| ): Promise<boolean> {
 169|   // Recursion guards. session_memory and compact are forked agents that
 170|   // would deadlock.
 171|   if (querySource === 'session_memory' || querySource === 'compact') {
 172|     return false
 173|   }
 174|   // marble_origami is the ctx-agent — if ITS context blows up and
 175|   // autocompact fires, runPostCompactCleanup calls resetContextCollapse()
 176|   // which destroys the MAIN thread's committed log (module-level state
 177|   // shared across forks). Inside feature() so the string DCEs from
 178|   // external builds (it's in excluded-strings.txt).
 179|   if (feature('CONTEXT_COLLAPSE')) {
 180|     if (querySource === 'marble_origami') {
 181|       return false
 182|     }
 183|   }
 184| 
 185|   if (!isAutoCompactEnabled()) {
 186|     return false
 187|   }
 188| 
 189|   // Reactive-only mode: suppress proactive autocompact, let reactive compact
 190|   // catch the API's prompt-too-long. feature() wrapper keeps the flag string
 191|   // out of external builds (REACTIVE_COMPACT is ant-only).
 192|   // Note: returning false here also means autoCompactIfNeeded never reaches
 193|   // trySessionMemoryCompaction in the query loop — the /compact call site
 194|   // still tries session memory first. Revisit if reactive-only graduates.
 195|   if (feature('REACTIVE_COMPACT')) {
 196|     if (getFeatureValue_CACHED_MAY_BE_STALE('tengu_cobalt_raccoon', false)) {
 197|       return false
 198|     }
 199|   }
 200| 
 201|   // Context-collapse mode: same suppression. Collapse IS the context
 202|   // management system when it's on — the 90% commit / 95% blocking-spawn
 203|   // flow owns the headroom problem. Autocompact firing at effective-13k
 204|   // (~93% of effective) sits right between collapse's commit-start (90%)
 205|   // and blocking (95%), so it would race collapse and usually win, nuking
 206|   // granular context that collapse was about to save. Gating here rather
 207|   // than in isAutoCompactEnabled() keeps reactiveCompact alive as the 413
 208|   // fallback (it consults isAutoCompactEnabled directly) and leaves
 209|   // sessionMemory + manual /compact working.
 210|   //
 211|   // Consult isContextCollapseEnabled (not the raw gate) so the
 212|   // CLAUDE_CONTEXT_COLLAPSE env override is honored here too. require()
 213|   // inside the block breaks the init-time cycle (this file exports
 214|   // getEffectiveContextWindowSize which collapse's index imports).
 215|   if (feature('CONTEXT_COLLAPSE')) {
 216|     /* eslint-disable @typescript-eslint/no-require-imports */
 217|     const { isContextCollapseEnabled } =
 218|       require('../contextCollapse/index.js') as typeof import('../contextCollapse/index.js')
 219|     /* eslint-enable @typescript-eslint/no-require-imports */
 220|     if (isContextCollapseEnabled()) {
 221|       return false
 222|     }
 223|   }
 224| 
 225|   const tokenCount = tokenCountWithEstimation(messages) - snipTokensFreed
 226|   const threshold = getAutoCompactThreshold(model)
 227|   const effectiveWindow = getEffectiveContextWindowSize(model)
 228| 
 229|   logForDebugging(
 230|     `autocompact: tokens=${tokenCount} threshold=${threshold} effectiveWindow=${effectiveWindow}${snipTokensFreed > 0 ? ` snipFreed=${snipTokensFreed}` : ''}`,
 231|   )
 232| 
 233|   const { isAboveAutoCompactThreshold } = calculateTokenWarningState(
 234|     tokenCount,
 235|     model,
 236|   )
 237| 
 238|   return isAboveAutoCompactThreshold
 239| }
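
When debugging why autocompact did not fire, it helps to know which guard returned first. The sketch below mirrors the guard order of shouldAutoCompact as a pure function. Everything here is hypothetical scaffolding: the GuardInputs fields stand in for the feature() and config lookups, reactiveOnly collapses the REACTIVE_COMPACT feature check and the tengu_cobalt_raccoon flag into one boolean, and the reason strings are invented labels, not values the real code emits.

```typescript
type GuardInputs = {
  querySource?: string
  contextCollapseFeature: boolean // stands in for feature('CONTEXT_COLLAPSE')
  contextCollapseEnabled: boolean // stands in for isContextCollapseEnabled()
  autoCompactEnabled: boolean // stands in for isAutoCompactEnabled()
  reactiveOnly: boolean // stands in for REACTIVE_COMPACT + tengu_cobalt_raccoon
}

// Invented labels for illustration only.
type SkipReason = 'recursion_guard' | 'ctx_agent' | 'disabled' | 'reactive_only' | 'context_collapse'

// Returns the first guard that would make shouldAutoCompact return false,
// or null when the function would go on to count tokens.
function firstBlockingGuard(g: GuardInputs): SkipReason | null {
  if (g.querySource === 'session_memory' || g.querySource === 'compact') return 'recursion_guard'
  if (g.contextCollapseFeature && g.querySource === 'marble_origami') return 'ctx_agent'
  if (!g.autoCompactEnabled) return 'disabled'
  if (g.reactiveOnly) return 'reactive_only'
  if (g.contextCollapseFeature && g.contextCollapseEnabled) return 'context_collapse'
  return null
}

const allClear: GuardInputs = {
  contextCollapseFeature: false,
  contextCollapseEnabled: false,
  autoCompactEnabled: true,
  reactiveOnly: false,
}

const forkedCompact = firstBlockingGuard({ ...allClear, querySource: 'compact' }) // 'recursion_guard'
const collapseOn = firstBlockingGuard({ ...allClear, contextCollapseFeature: true, contextCollapseEnabled: true }) // 'context_collapse'
// Order matters: a disabled config wins over context collapse.
const disabledFirst = firstBlockingGuard({ ...allClear, autoCompactEnabled: false, contextCollapseFeature: true, contextCollapseEnabled: true }) // 'disabled'
const proceeds = firstBlockingGuard(allClear) // null
```

Only when every guard passes does the real function reach the token count and the debug log line, so a missing autocompact: log entry is itself a hint that one of these guards returned early.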

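autoCompactIfNeeded: session memory and the circuit breaker

autoCompactIfNeeded is the entry point called by the query loop:

Circuit breaker: returns immediately when tracking.consecutiveFailures >= 3 (BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures, up to 3,272 in a single session, wasting ~250K API calls/day globally).

Main flow:

shouldAutoCompact returns false → exit
Build RecompactionInfo (whether this is a chained recompaction, turnCounter, turnId)
Try trySessionMemoryCompaction first (experimental path): on success, call setLastSummarizedMessageId(undefined), runPostCompactCleanup, notifyCompaction, and markPostCompaction
Otherwise call compactConversation(..., isAutoCompact=true)
On failure, increment consecutiveFailures; a user abort is not passed to logError

On success it returns compactionResult so the caller can replace messages; autocompact failures are swallowed rather than surfaced to the UI (manual compact failures are surfaced).

Source reference: src/services/compact/autoCompact.ts · lines 241–351 (of 352)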

 241| export async function autoCompactIfNeeded(
 242|   messages: Message[],
 243|   toolUseContext: ToolUseContext,
 244|   cacheSafeParams: CacheSafeParams,
 245|   querySource?: QuerySource,
 246|   tracking?: AutoCompactTrackingState,
 247|   snipTokensFreed?: number,
 248| ): Promise<{
 249|   wasCompacted: boolean
 250|   compactionResult?: CompactionResult
 251|   consecutiveFailures?: number
 252| }> {
 253|   if (isEnvTruthy(process.env.DISABLE_COMPACT)) {
 254|     return { wasCompacted: false }
 255|   }
 256| 
 257|   // Circuit breaker: stop retrying after N consecutive failures.
 258|   // Without this, sessions where context is irrecoverably over the limit
 259|   // hammer the API with doomed compaction attempts on every turn.
 260|   if (
 261|     tracking?.consecutiveFailures !== undefined &&
 262|     tracking.consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES
 263|   ) {
 264|     return { wasCompacted: false }
 265|   }
 266| 
 267|   const model = toolUseContext.options.mainLoopModel
 268|   const shouldCompact = await shouldAutoCompact(
 269|     messages,
 270|     model,
 271|     querySource,
 272|     snipTokensFreed,
 273|   )
 274| 
 275|   if (!shouldCompact) {
 276|     return { wasCompacted: false }
 277|   }
 278| 
 279|   const recompactionInfo: RecompactionInfo = {
 280|     isRecompactionInChain: tracking?.compacted === true,
 281|     turnsSincePreviousCompact: tracking?.turnCounter ?? -1,
 282|     previousCompactTurnId: tracking?.turnId,
 283|     autoCompactThreshold: getAutoCompactThreshold(model),
 284|     querySource,
 285|   }
 286| 
 287|   // EXPERIMENT: Try session memory compaction first
 288|   const sessionMemoryResult = await trySessionMemoryCompaction(
 289|     messages,
 290|     toolUseContext.agentId,
 291|     recompactionInfo.autoCompactThreshold,
 292|   )
 293|   if (sessionMemoryResult) {
 294|     // Reset lastSummarizedMessageId since session memory compaction prunes messages
 295|     // and the old message UUID will no longer exist after the REPL replaces messages
 296|     setLastSummarizedMessageId(undefined)
 297|     runPostCompactCleanup(querySource)
 298|     // Reset cache read baseline so the post-compact drop isn't flagged as a
 299|     // break. compactConversation does this internally; SM-compact doesn't.
 300|     // BQ 2026-03-01: missing this made 20% of tengu_prompt_cache_break events
 301|     // false positives (systemPromptChanged=true, timeSinceLastAssistantMsg=-1).
 302|     if (feature('PROMPT_CACHE_BREAK_DETECTION')) {
 303|       notifyCompaction(querySource ?? 'compact', toolUseContext.agentId)
 304|     }
 305|     markPostCompaction()
 306|     return {
 307|       wasCompacted: true,
 308|       compactionResult: sessionMemoryResult,
 309|     }
 310|   }
 311| 
 312|   try {
 313|     const compactionResult = await compactConversation(
 314|       messages,
 315|       toolUseContext,
 316|       cacheSafeParams,
 317|       true, // Suppress user questions for autocompact
 318|       undefined, // No custom instructions for autocompact
 319|       true, // isAutoCompact
 320|       recompactionInfo,
 321|     )
 322| 
 323|     // Reset lastSummarizedMessageId since legacy compaction replaces all messages
 324|     // and the old message UUID will no longer exist in the new messages array
 325|     setLastSummarizedMessageId(undefined)
 326|     runPostCompactCleanup(querySource)
 327| 
 328|     return {
 329|       wasCompacted: true,
 330|       compactionResult,
 331|       // Reset failure count on success
 332|       consecutiveFailures: 0,
 333|     }
 334|   } catch (error) {
 335|     if (!hasExactErrorMessage(error, ERROR_MESSAGE_USER_ABORT)) {
 336|       logError(error)
 337|     }
 338|     // Increment consecutive failure count for circuit breaker.
 339|     // The caller threads this through autoCompactTracking so the
 340|     // next query loop iteration can skip futile retry attempts.
 341|     const prevFailures = tracking?.consecutiveFailures ?? 0
 342|     const nextFailures = prevFailures + 1
 343|     if (nextFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
 344|       logForDebugging(
 345|         `autocompact: circuit breaker tripped after ${nextFailures} consecutive failures — skipping future attempts this session`,
 346|         { level: 'warn' },
 347|       )
 348|     }
 349|     return { wasCompacted: false, consecutiveFailures: nextFailures }
 350|   }
 351| }
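
The circuit breaker only works if the caller carries consecutiveFailures from one turn to the next. The simulation below is a sketch of that threading, not the real query loop: attemptAutoCompact is a hypothetical stand-in in which a boolean replaces the API call, the attempted field is added purely for counting, and keeping the previous count when the result omits consecutiveFailures is an assumption about the caller.

```typescript
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3

type Tracking = { consecutiveFailures?: number }
type Outcome = { wasCompacted: boolean; consecutiveFailures?: number; attempted: boolean }

// Hypothetical stand-in for autoCompactIfNeeded: `compactSucceeds` replaces the real API call.
function attemptAutoCompact(tracking: Tracking, compactSucceeds: boolean): Outcome {
  if (
    tracking.consecutiveFailures !== undefined &&
    tracking.consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES
  ) {
    // Breaker is open: return without making an API call.
    return { wasCompacted: false, attempted: false }
  }
  if (compactSucceeds) {
    // Success resets the failure count.
    return { wasCompacted: true, consecutiveFailures: 0, attempted: true }
  }
  return {
    wasCompacted: false,
    consecutiveFailures: (tracking.consecutiveFailures ?? 0) + 1,
    attempted: true,
  }
}

// Simulate five turns in which compaction always fails.
let trackingState: Tracking = {}
let apiCalls = 0
for (let turn = 0; turn < 5; turn++) {
  const outcome = attemptAutoCompact(trackingState, false)
  if (outcome.attempted) apiCalls++
  // Assumption: the caller keeps the old count when the result does not carry one.
  if (outcome.consecutiveFailures !== undefined) {
    trackingState = { ...trackingState, consecutiveFailures: outcome.consecutiveFailures }
  }
}
// apiCalls is 3: turns four and five are skipped by the breaker.

const afterSuccess = attemptAutoCompact({ consecutiveFailures: 2 }, true)
// afterSuccess.consecutiveFailures is 0
```

In this simulation the doomed retries stop after the third failure instead of repeating on every turn, which is the behavior the source comment cites as the reason for the breaker.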

compactConversation: hooks and the forked summary

compactConversation is the core shared by manual /compact and autocompact:

executePreCompactHooks (trigger: auto|manual), whose output is merged into customInstructions
getCompactPrompt + createUserMessage build the summary request
streamCompactSummary forks a sub-agent (runForkedAgent) that can share the main conversation's prompt cache (GrowthBook tengu_compact_cache_prefix, default true)
If the summary starts with PROMPT_TOO_LONG_ERROR_MESSAGE → truncateHeadForPTLRetry drops the oldest API rounds, for at most MAX_PTL_RETRIES attempts
On success, clear readFileState, then run createPostCompactFileAttachments and the async agent attachments in parallel
Re-inject the plan / skill / deferred-tools / MCP instructions delta attachments
executePostCompactHooks, write the boundary, logEvent('tengu_compact_*')

stripImagesFromMessages replaces image/document blocks with [image] / [document] text markers before the summary API call, so that the compact request itself does not hit PTL (prompt too long).

Failure reasons: prompt_too_long, no_summary, api_error; each is logged as tengu_compact_failed.

Source reference: src/services/compact/compact.ts · lines 387–515 (of 1706)

 387| export async function compactConversation(
 388|   messages: Message[],
 389|   context: ToolUseContext,
 390|   cacheSafeParams: CacheSafeParams,
 391|   suppressFollowUpQuestions: boolean,
 392|   customInstructions?: string,
 393|   isAutoCompact: boolean = false,
 394|   recompactionInfo?: RecompactionInfo,
 395| ): Promise<CompactionResult> {
 396|   try {
 397|     if (messages.length === 0) {
 398|       throw new Error(ERROR_MESSAGE_NOT_ENOUGH_MESSAGES)
 399|     }
 400| 
 401|     const preCompactTokenCount = tokenCountWithEstimation(messages)
 402| 
 403|     const appState = context.getAppState()
 404|     void logPermissionContextForAnts(appState.toolPermissionContext, 'summary')
 405| 
 406|     context.onCompactProgress?.({
 407|       type: 'hooks_start',
 408|       hookType: 'pre_compact',
 409|     })
 410| 
 411|     // Execute PreCompact hooks
 412|     context.setSDKStatus?.('compacting')
 413|     const hookResult = await executePreCompactHooks(
 414|       {
 415|         trigger: isAutoCompact ? 'auto' : 'manual',
 416|         customInstructions: customInstructions ?? null,
 417|       },
 418|       context.abortController.signal,
 419|     )
 420|     customInstructions = mergeHookInstructions(
 421|       customInstructions,
 422|       hookResult.newCustomInstructions,
 423|     )
 424|     const userDisplayMessage = hookResult.userDisplayMessage
 425| 
 426|     // Show requesting mode with up arrow and custom message
 427|     context.setStreamMode?.('requesting')
 428|     context.setResponseLength?.(() => 0)
 429|     context.onCompactProgress?.({ type: 'compact_start' })
 430| 
 431|     // 3P default: true — forked-agent path reuses main conversation's prompt cache.
 432|     // Experiment (Jan 2026) confirmed: false path is 98% cache miss, costs ~0.76% of
 433|     // fleet cache_creation (~38B tok/day), concentrated in ephemeral envs (CCR/GHA/SDK)
 434|     // with cold GB cache and 3P providers where GB is disabled. GB gate kept as kill-switch.
 435|     const promptCacheSharingEnabled = getFeatureValue_CACHED_MAY_BE_STALE(
 436|       'tengu_compact_cache_prefix',
 437|       true,
 438|     )
 439| 
 440|     const compactPrompt = getCompactPrompt(customInstructions)
 441|     const summaryRequest = createUserMessage({
 442|       content: compactPrompt,
 443|     })
 444| 
 445|     let messagesToSummarize = messages
 446|     let retryCacheSafeParams = cacheSafeParams
 447|     let summaryResponse: AssistantMessage
 448|     let summary: string | null
 449|     let ptlAttempts = 0
 450|     for (;;) {
 451|       summaryResponse = await streamCompactSummary({
 452|         messages: messagesToSummarize,
 453|         summaryRequest,
 454|         appState,
 455|         context,
 456|         preCompactTokenCount,
 457|         cacheSafeParams: retryCacheSafeParams,
 458|       })
 459|       summary = getAssistantMessageText(summaryResponse)
 460|       if (!summary?.startsWith(PROMPT_TOO_LONG_ERROR_MESSAGE)) break
 461| 
 462|       // CC-1180: compact request itself hit prompt-too-long. Truncate the
 463|       // oldest API-round groups and retry rather than leaving the user stuck.
 464|       ptlAttempts++
 465|       const truncated =
 466|         ptlAttempts <= MAX_PTL_RETRIES
 467|           ? truncateHeadForPTLRetry(messagesToSummarize, summaryResponse)
 468|           : null
 469|       if (!truncated) {
 470|         logEvent('tengu_compact_failed', {
 471|           reason:
 472|             'prompt_too_long' as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
 473|           preCompactTokenCount,
 474|           promptCacheSharingEnabled,
 475|           ptlAttempts,
 476|         })
 477|         throw new Error(ERROR_MESSAGE_PROMPT_TOO_LONG)
 478|       }
 479|       logEvent('tengu_compact_ptl_retry', {
 480|         attempt: ptlAttempts,
 481|         droppedMessages: messagesToSummarize.length - truncated.length,
 482|         remainingMessages: truncated.length,
 483|       })
 484|       messagesToSummarize = truncated
 485|       // The forked-agent path reads from cacheSafeParams.forkContextMessages,
 486|       // not the messages param — thread the truncated set through both paths.
 487|       retryCacheSafeParams = {
 488|         ...retryCacheSafeParams,
 489|         forkContextMessages: truncated,
 490|       }
 491|     }
 492| 
 493|     if (!summary) {
 494|       logForDebugging(
 495|         `Compact failed: no summary text in response. Response: ${jsonStringify(summaryResponse)}`,
 496|         { level: 'error' },
 497|       )
 498|       logEvent('tengu_compact_failed', {
 499|         reason:
 500|           'no_summary' as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
 501|         preCompactTokenCount,
 502|         promptCacheSharingEnabled,
 503|       })
 504|       throw new Error(
 505|         `Failed to generate conversation summary - response did not contain valid text content`,
 506|       )
 507|     } else if (startsWithApiErrorPrefix(summary)) {
 508|       logEvent('tengu_compact_failed', {
 509|         reason:
 510|           'api_error' as AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
 511|         preCompactTokenCount,
 512|         promptCacheSharingEnabled,
 513|       })
 514|       throw new Error(summary)
 515|     }
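
The prompt-too-long retry loop in the excerpt above can be exercised in isolation. This is a sketch with loudly stated assumptions: PTL_PREFIX is a placeholder for PROMPT_TOO_LONG_ERROR_MESSAGE, MAX_PTL_RETRIES_ASSUMED is a guessed value because the real MAX_PTL_RETRIES is not shown, fakeSummarize replaces the forked-agent API call, and dropOldestRound is a simplified stand-in for truncateHeadForPTLRetry that drops exactly one round per retry.

```typescript
const PTL_PREFIX = 'Prompt is too long' // placeholder for PROMPT_TOO_LONG_ERROR_MESSAGE
const MAX_PTL_RETRIES_ASSUMED = 3 // assumed; the real value is not shown in the excerpt

type Round = string[] // one API round, simplified to a list of message labels

// Fake summarizer: answers with the PTL prefix while the input exceeds `limit` messages.
function fakeSummarize(rounds: Round[], limit: number): string {
  const count = rounds.reduce((n, r) => n + r.length, 0)
  return count > limit ? `${PTL_PREFIX}: ${count} > ${limit}` : `summary of ${count} messages`
}

// Simplified stand-in for truncateHeadForPTLRetry: drop the oldest round,
// or return null when nothing more can be dropped.
function dropOldestRound(rounds: Round[]): Round[] | null {
  return rounds.length > 1 ? rounds.slice(1) : null
}

function summarizeWithRetry(rounds: Round[], limit: number): { summary: string; attempts: number } {
  let current = rounds
  let attempts = 0
  for (;;) {
    const summary = fakeSummarize(current, limit)
    if (!summary.startsWith(PTL_PREFIX)) return { summary, attempts }
    attempts++
    // Same shape as the original: stop when retries run out or truncation gives up.
    const truncated = attempts <= MAX_PTL_RETRIES_ASSUMED ? dropOldestRound(current) : null
    if (!truncated) throw new Error('prompt_too_long')
    current = truncated
  }
}

const exampleRounds: Round[] = [['u1', 'a1'], ['u2', 'a2', 't2'], ['u3', 'a3']]
// 7 messages exceed the limit of 4; two head rounds are dropped before it fits.
const retried = summarizeWithRetry(exampleRounds, 4)
// retried.summary is 'summary of 2 messages', retried.attempts is 2
```

In the real code the truncated set must also be written into retryCacheSafeParams.forkContextMessages, because the forked-agent path reads its messages from there rather than from the messages parameter.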

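Post-compact message assembly

buildPostCompactMessages guarantees that every compaction path emits messages in the same order, so the REPL and the sessionStorage loader do not diverge.

When a compaction keeps messages (suffix-preserving for reactive and session-memory compact, prefix-preserving for partial compact), annotateBoundaryWithPreservedSegment writes headUuid / anchorUuid / tailUuid into compactMetadata.preservedSegment. On disk, preserved messages keep their original parentUuid (dedup skips them), and the loader relies on this metadata to patch the chain.

mergeHookInstructions concatenates the user's custom instructions with the PreCompact hook output (user first).

Post-compact budget constants (top of compact.ts):

Restore at most 5 files, with a total token budget of 50000 and 5000 per file
Skills budget of 25000, 5000 per skill (truncation keeps the instructions at the top of the file)

sentSkillNames is deliberately not reset, so that each compact does not re-inject the ~4K skill_listing, which would be pure cache_creation.

Source reference: src/services/compact/compact.ts · lines 122–131 (of 1706)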

 122| export const POST_COMPACT_MAX_FILES_TO_RESTORE = 5
 123| export const POST_COMPACT_TOKEN_BUDGET = 50_000
 124| export const POST_COMPACT_MAX_TOKENS_PER_FILE = 5_000
 125| // Skills can be large (verify=18.7KB, claude-api=20.1KB). Previously re-injected
 126| // unbounded on every compact → 5-10K tok/compact. Per-skill truncation beats
 127| // dropping — instructions at the top of a skill file are usually the critical
 128| // part. Budget sized to hold ~5 skills at the per-skill cap.
 129| export const POST_COMPACT_MAX_TOKENS_PER_SKILL = 5_000
 130| export const POST_COMPACT_SKILLS_TOKEN_BUDGET = 25_000
 131| const MAX_COMPACT_STREAMING_RETRIES = 2
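
The two skill constants relate through simple arithmetic: 25_000 / 5_000 = 5 skills at the per-skill cap, which matches the source comment. The sketch below shows one plausible way the caps could be applied. It is an assumption, not the actual re-injection code: planSkillReinjection is a hypothetical function, the rule that skills beyond the budget are dropped is a guess, and the skill names and sizes are made up.

```typescript
const POST_COMPACT_MAX_TOKENS_PER_SKILL = 5_000
const POST_COMPACT_SKILLS_TOKEN_BUDGET = 25_000

type SkillSize = { skillName: string; tokens: number }

// Assumption: each skill is first capped at the per-skill limit (keeping its head),
// then skills are admitted in order until the total budget would be exceeded.
function planSkillReinjection(skills: SkillSize[]): SkillSize[] {
  const admitted: SkillSize[] = []
  let used = 0
  for (const skill of skills) {
    const capped = Math.min(skill.tokens, POST_COMPACT_MAX_TOKENS_PER_SKILL)
    if (used + capped > POST_COMPACT_SKILLS_TOKEN_BUDGET) break
    admitted.push({ skillName: skill.skillName, tokens: capped })
    used += capped
  }
  return admitted
}

// Six made-up skills of 6_000 tokens each: every one is truncated to 5_000,
// and only the first five fit in the 25_000 budget.
const sixLargeSkills: SkillSize[] = Array.from({ length: 6 }, (_, i) => ({
  skillName: `skill-${i}`,
  tokens: 6_000,
}))
const plannedSkills = planSkillReinjection(sixLargeSkills)
const plannedTotal = plannedSkills.reduce((sum, s) => sum + s.tokens, 0) // 25_000
```

The file constants behave differently: 5 files at 5_000 tokens each add up to 25_000, which is below the 50_000 total, so under this reading the file count and per-file cap bind before the total budget does.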

Source reference: src/services/compact/compact.ts · lines 330–381 (of 1706)

 330| export function buildPostCompactMessages(result: CompactionResult): Message[] {
 331|   return [
 332|     result.boundaryMarker,
 333|     ...result.summaryMessages,
 334|     ...(result.messagesToKeep ?? []),
 335|     ...result.attachments,
 336|     ...result.hookResults,
 337|   ]
 338| }
 339| 
 340| /**
 341|  * Annotate a compact boundary with relink metadata for messagesToKeep.
 342|  * Preserved messages keep their original parentUuids on disk (dedup-skipped);
 343|  * the loader uses this to patch head→anchor and anchor's-other-children→tail.
 344|  *
 345|  * `anchorUuid` = what sits immediately before keep[0] in the desired chain:
 346|  *   - suffix-preserving (reactive/session-memory): last summary message
 347|  *   - prefix-preserving (partial compact): the boundary itself
 348|  */
 349| export function annotateBoundaryWithPreservedSegment(
 350|   boundary: SystemCompactBoundaryMessage,
 351|   anchorUuid: UUID,
 352|   messagesToKeep: readonly Message[] | undefined,
 353| ): SystemCompactBoundaryMessage {
 354|   const keep = messagesToKeep ?? []
 355|   if (keep.length === 0) return boundary
 356|   return {
 357|     ...boundary,
 358|     compactMetadata: {
 359|       ...boundary.compactMetadata,
 360|       preservedSegment: {
 361|         headUuid: keep[0]!.uuid,
 362|         anchorUuid,
 363|         tailUuid: keep.at(-1)!.uuid,
 364|       },
 365|     },
 366|   }
 367| }
 368| 
 369| /**
 370|  * Merges user-supplied custom instructions with hook-provided instructions.
 371|  * User instructions come first; hook instructions are appended.
 372|  * Empty strings normalize to undefined.
 373|  */
 374| export function mergeHookInstructions(
 375|   userInstructions: string | undefined,
 376|   hookInstructions: string | undefined,
 377| ): string | undefined {
 378|   if (!hookInstructions) return userInstructions || undefined
 379|   if (!userInstructions) return hookInstructions
 380|   return `${userInstructions}\n\n${hookInstructions}`
 381| }
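
Two of the helpers above are pure functions, so their edge cases can be checked directly. In this sketch mergeHookInstructions is copied from the excerpt, while annotateBoundary is a simplified restatement of annotateBoundaryWithPreservedSegment: SketchMessage and SketchBoundary are hypothetical reduced types, and the UUID strings are made up.

```typescript
// Copied from the excerpt above (line numbers removed).
function mergeHookInstructions(
  userInstructions: string | undefined,
  hookInstructions: string | undefined,
): string | undefined {
  if (!hookInstructions) return userInstructions || undefined
  if (!userInstructions) return hookInstructions
  return `${userInstructions}\n\n${hookInstructions}`
}

const bothPresent = mergeHookInstructions('focus on tests', 'keep TODOs') // 'focus on tests\n\nkeep TODOs'
const hookOnly = mergeHookInstructions(undefined, 'keep TODOs') // 'keep TODOs'
const emptyUser = mergeHookInstructions('', undefined) // undefined (empty string is normalized)
const emptyHook = mergeHookInstructions('focus on tests', '') // 'focus on tests'

// Hypothetical reduced types: the real messages carry many more fields.
type SketchMessage = { uuid: string }
type PreservedSegment = { headUuid: string; anchorUuid: string; tailUuid: string }
type SketchBoundary = SketchMessage & { compactMetadata: { preservedSegment?: PreservedSegment } }

// Simplified restatement: records the first and last kept message plus the anchor.
function annotateBoundary(
  boundary: SketchBoundary,
  anchorUuid: string,
  messagesToKeep?: readonly SketchMessage[],
): SketchBoundary {
  const keep = messagesToKeep ?? []
  const head = keep[0]
  const tail = keep[keep.length - 1]
  if (!head || !tail) return boundary // nothing preserved: boundary is returned unchanged
  return {
    ...boundary,
    compactMetadata: {
      ...boundary.compactMetadata,
      preservedSegment: { headUuid: head.uuid, anchorUuid, tailUuid: tail.uuid },
    },
  }
}

const plainBoundary: SketchBoundary = { uuid: 'b0', compactMetadata: {} }
// Suffix-preserving case: the anchor is the last summary message (made-up id 's1').
const annotated = annotateBoundary(plainBoundary, 's1', [{ uuid: 'm7' }, { uuid: 'm8' }, { uuid: 'm9' }])
// annotated.compactMetadata.preservedSegment is { headUuid: 'm7', anchorUuid: 's1', tailUuid: 'm9' }
const untouched = annotateBoundary(plainBoundary, 's1', [])
// untouched is the same object as plainBoundary
```

For exercise purposes, the bothPresent case is the one to verify with a manual /compact: the user's text should appear before the hook's text, separated by a blank line.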

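stripImagesFromMessages and PTL defense

In scenarios such as CCD, users attach images frequently; if full image blocks were sent to the summary model, the compact request itself could hit prompt-too-long.

stripImagesFromMessages only processes messages with type === 'user' (assistant messages do not contain images). It replaces top-level image/document blocks and media nested in tool_result with text markers, preserving the fact that an image was shared so the summary can mention it.

It works together with truncateHeadForPTLRetry: the former shrinks the size of a single request; the latter, if the request still hits PTL, drops whole rounds from the head as grouped by groupMessagesByApiRound.

When reading CC-1180-style issues, first confirm that images were stripped before compaction, then check that the fork context messages are in sync with the truncated set (retryCacheSafeParams.forkContextMessages).

Source reference: src/services/compact/compact.ts · lines 133–199 (of 1706)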

 133| /**
 134|  * Strip image blocks from user messages before sending for compaction.
 135|  * Images are not needed for generating a conversation summary and can
 136|  * cause the compaction API call itself to hit the prompt-too-long limit,
 137|  * especially in CCD sessions where users frequently attach images.
 138|  * Replaces image blocks with a text marker so the summary still notes
 139|  * that an image was shared.
 140|  *
 141|  * Note: Only user messages contain images (either directly attached or within
 142|  * tool_result content from tools). Assistant messages contain text, tool_use,
 143|  * and thinking blocks but not images.
 144|  */
 145| export function stripImagesFromMessages(messages: Message[]): Message[] {
 146|   return messages.map(message => {
 147|     if (message.type !== 'user') {
 148|       return message
 149|     }
 150| 
 151|     const content = message.message.content
 152|     if (!Array.isArray(content)) {
 153|       return message
 154|     }
 155| 
 156|     let hasMediaBlock = false
 157|     const newContent = content.flatMap(block => {
 158|       if (block.type === 'image') {
 159|         hasMediaBlock = true
 160|         return [{ type: 'text' as const, text: '[image]' }]
 161|       }
 162|       if (block.type === 'document') {
 163|         hasMediaBlock = true
 164|         return [{ type: 'text' as const, text: '[document]' }]
 165|       }
 166|       // Also strip images/documents nested inside tool_result content arrays
 167|       if (block.type === 'tool_result' && Array.isArray(block.content)) {
 168|         let toolHasMedia = false
 169|         const newToolContent = block.content.map(item => {
 170|           if (item.type === 'image') {
 171|             toolHasMedia = true
 172|             return { type: 'text' as const, text: '[image]' }
 173|           }
 174|           if (item.type === 'document') {
 175|             toolHasMedia = true
 176|             return { type: 'text' as const, text: '[document]' }
 177|           }
 178|           return item
 179|         })
 180|         if (toolHasMedia) {
 181|           hasMediaBlock = true
 182|           return [{ ...block, content: newToolContent }]
 183|         }
 184|       }
 185|       return [block]
 186|     })
 187| 
 188|     if (!hasMediaBlock) {
 189|       return message
 190|     }
 191| 
 192|     return {
 193|       ...message,
 194|       message: {
 195|         ...message.message,
 196|         content: newContent,
 197|       },
 198|     } as typeof message
 199|   })

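Source directory and related files

Strongly related: services/compact/prompt.ts, utils/forkedAgent.ts, utils/hooks.ts (Pre/PostCompact), services/api/promptCacheBreakDetection.ts, services/SessionMemory/.

Hands-on exercises

Set CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=50 and observe the earlier trigger
Run a manual /compact with custom instructions and verify the mergeHookInstructions order
In an ant environment, read the reason distribution of tengu_compact_failed in BQ
After a compact, check whether the MCP instructions delta attachment re-announces the full tool set

Chapter summary and further reading

compact = context lifecycle management. The next chapter, analytics, covers how events such as tengu_compact_failed are reported.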