What do the full distributions of cached prefix, adjusted append, and output tokens look like side by side, drawn as transparent “spindle” (violin-like) shapes on one shared token axis?
Each row in the trace is one agent step. This experiment collapses three per-step token quantities into three stacked spindles on a single compressed token axis, so their typical sizes and tails can be compared at a glance.
Method and assumptions:
- prefix is the step’s `prefix_tokens` (the cached / cache-read portion of the input).
- adjusted append = `newly_append_tokens − prior-step output`, applied only when the previous step is Claude or Codex `gpt-5.5` (the providers whose previous output is echoed back into the next input). The Codex output proxy used for the subtraction is visible output (`output_tokens − reasoning_output_tokens`); the result is clamped at 0. Pairs are formed only between adjacent steps in the same session (`round_index` differs by exactly 1).
- output is the true `output_tokens` of every parsed invocation row. It is not paired or adjusted, so its spindle covers all steps, not just the adjacency-paired ones.
- Ordering. Rows are consumed in file order (the DB’s ingestion ordinal), grouped by `session_id`, and ordered within a session by `round_index` with the ingestion ordinal as the tie-break (`ORDER BY round_index, ingest_seq`), reproducing the old stable sort over the line-ordered JSONL. The spindle statistics (histogram density, quantiles, min/max) are order-independent, so the per-CSV output is byte-identical regardless of session iteration order.
- Axis is a compressed binary token scale, `log2(tokens + 32) − log2(32)`, so the dense 0–32 token region is not visually over-expanded. Percentiles use linear interpolation over the full data (no sampling).
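The pairing and subtraction rules for adjusted append can be sketched as follows. This is a minimal illustration, not the experiment's actual code: the row layout (plain dicts), the `provider` and `model` fields, and their string labels are hypothetical, and two behaviours are assumptions the description leaves open: Claude's echoed output is taken to be its full `output_tokens`, and a pair whose previous step is from any other provider keeps the raw `newly_append_tokens`.

```python
from itertools import groupby
from operator import itemgetter


def echoed_output(prev):
    """Tokens of the previous step's output assumed to reappear in the next input.

    Returns None when the previous step's provider does not echo its output.
    The "claude" / "codex" labels and the provider/model fields are hypothetical.
    """
    if prev["provider"] == "claude":
        # Assumption: Claude's full output is echoed back.
        return prev["output_tokens"]
    if prev["provider"] == "codex" and prev["model"] == "gpt-5.5":
        # Visible output proxy: total output minus reasoning output.
        return prev["output_tokens"] - prev["reasoning_output_tokens"]
    return None


def adjusted_appends(rows):
    """Return one adjusted-append value per adjacent (prev, cur) step pair."""
    # Order by session, then round_index, with the ingestion ordinal as tie-break.
    ordered = sorted(rows, key=itemgetter("session_id", "round_index", "ingest_seq"))
    result = []
    for _, steps in groupby(ordered, key=itemgetter("session_id")):
        steps = list(steps)
        for prev, cur in zip(steps, steps[1:]):
            # Only pair steps whose round_index differs by exactly 1.
            if cur["round_index"] - prev["round_index"] != 1:
                continue
            echoed = echoed_output(prev)
            if echoed is None:
                # Assumption: no subtraction, keep the raw append size.
                result.append(cur["newly_append_tokens"])
            else:
                # Subtract the echoed output and clamp at 0.
                result.append(max(0, cur["newly_append_tokens"] - echoed))
    return result
```

Because the output spindle is built from every row's `output_tokens` directly, it needs no equivalent pairing step; only the adjusted-append spindle is restricted to adjacent pairs.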
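For the axis and percentile conventions, a small sketch may help. The function names are hypothetical, and the percentile uses the common "linear interpolation between closest ranks" convention (position `(n − 1) · q / 100` in the sorted data, which is also NumPy's default); the description does not say which interpolation variant the experiment uses, so treat that choice as an assumption.

```python
import math

AXIS_OFFSET = 32  # the constant from log2(tokens + 32) - log2(32)


def compress(tokens):
    """Map a token count onto the compressed binary axis."""
    return math.log2(tokens + AXIS_OFFSET) - math.log2(AXIS_OFFSET)


def expand(axis_value):
    """Inverse of compress: recover the token count from an axis position."""
    return AXIS_OFFSET * 2 ** axis_value - AXIS_OFFSET


def percentile(values, q):
    """q-th percentile (0..100) with linear interpolation over all values."""
    if not values:
        raise ValueError("percentile of empty data is undefined")
    data = sorted(values)
    pos = (len(data) - 1) * q / 100  # fractional rank in the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)
```

On this scale 0 tokens maps to 0, 32 tokens to 1, and 96 tokens to 2: each axis unit doubles `tokens + 32`, so the 0–32 range occupies a single unit instead of the unbounded stretch a plain `log2(tokens)` axis would give it.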