Real Claude Code and Codex sessions, organized by requests, agent steps, tokens, tool calls, cache behavior, and where time is spent.
Agent-step counts across 357,161 model calls.
First and latest observed rows in the public pool.
How agents choose tools, how often they call them, and how long those calls take.
Cumulative distribution of per-call tool latency.
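
For readers who want to reproduce this kind of curve from their own trace, the following is a minimal sketch of an empirical CDF over per-call latencies. The function name `empirical_cdf` and the sample values are illustrative assumptions, not part of the dashboard's code or data.

```python
from bisect import bisect_right


def empirical_cdf(latencies_ms):
    """Return F, where F(x) is the fraction of calls with latency <= x."""
    ordered = sorted(latencies_ms)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one latency sample")

    def cdf(x):
        # bisect_right counts how many sorted samples are <= x.
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical per-call latencies in milliseconds (not taken from the trace).
sample_ms = [40, 90, 125, 300, 14500]
cdf = empirical_cdf(sample_ms)
share_within_125ms = cdf(125)
print(share_within_125ms)
```

Sorting once and answering each query with a binary search keeps lookups cheap when the same distribution is probed at many thresholds.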
Which tools the agents use the most.
Fast-call counts versus where aggregate tool time accumulates.
Which tool kinds account for the most attributed work.
Token composition, output length, and end-to-end generation timing.
Cached prefix against freshly appended input.
How long the agents' completions run.
Wall-clock time to produce a full response.
Median and tail length for cached prefix and fresh append.
Short agent steps by count, large agent steps by appended-token mass.
Prefix, adjusted append, and output distributions on one axis.
Fresh context after subtracting replayed prior output.
Whether a long response returns as fresh append or cached-prefix growth.
Where summed model-generation time accumulates.
Cache reuse and the share of context kept active across agent steps.
How much input is served from the prefix cache.
Share of context kept warm in the KV cache.
Prefix-cache hit rate against the preceding human idle gap.
Prefix-cache hit rate after tool-triggered waits.
Where human waiting time appears in the workload timeline.
How long the agent waits on a human.
Time the agent sits idle waiting for the next human message.
How quickly human-response waits resolve by provider.
Where the summed human idle time accumulates.
How context evolves across a full agent session.
Provider-level sessions, requests, agent steps, tool use, cache reuse, context growth, and human waits.
| Metric | Claude | Codex | DeepSeek / Moonshot / GLM / Qwen |
|---|---|---|---|
| Trace facts: sessions, requests & agent-step coverage | |||
| Coverage | |||
| Agent steps | 140,338 steps | 216,823 steps | — |
| Sessions | 2,676 | 1,589 | — |
| Distinct users | 37 | 22 | — |
| Collection window | Oct 3 2025 — Jun 4 2026 | Sep 23 2025 — Jun 4 2026 | — |
| Requests | 21,407 | 20,040 | — |
| Tool-triggered steps | 120,760 (86.0%) | 195,268 (90.1%) | — |
| Models | |||
| Models represented | 9 | 14 | — |
| Top model | Opus 4.7 (63.1%) | gpt-5.5 (47.5%) | — |
| LLM generation: tokens and timing per agent step | |||
| Token distributions | |||
| Total input tokens | 28.5B tok | 26.4B tok | — |
| Cached-read input tokens | 27.3B tok | 25.3B tok | — |
| Append input tokens | 1.19B tok | 1.15B tok | — |
| Avg total input / agent step | 202,840 tok | 121,907 tok | — |
| Avg cached-read input / agent step | 194,361 tok | 116,623 tok | — |
| Avg append input / agent step | 8,479 tok | 5,283 tok | — |
| Input by step trigger | |||
| User-initiated avg total input | 275,716 tok | 114,510 tok | — |
| User-initiated avg append input | 36,212 tok | 24,945 tok | — |
| Tool-triggered avg total input | 191,083 tok | 122,143 tok | — |
| Tool-triggered avg append input | 3,998 tok | 3,422 tok | — |
| Output tokens | |||
| Total output tokens | 96.9M tok | 90.1M tok | — |
| Avg output / agent step | 690 tok | 415 tok | — |
| Reasoning tokens | — | 36.8M tok | — |
| Avg reasoning / reasoning step | — | 239 tok | — |
| Timing | |||
| Generation time p50↓ | 5.7s | 5.8s | — |
| Generation time p90↓ | 25.8s | 19.9s | — |
| Total generation time | 574 h | 567 h | — |
| Output decode throughput↑ | 46.8 tok/s | 33.9 tok/s | — |
| Post-reasoning decode throughput↑ | — | 72.0 tok/s | — |
| Estimated TTFT from reasoning tokens↓ | — | 4.6s | — |
| Tool calls: tool volume and latency across agent steps | |||
| Activity | |||
| Tool calls | 142,388 | 290,122 | — |
| Agent steps with tool calls | 121,145 (86.3%) | 198,650 (91.6%) | — |
| Tool calls / request | 6.7 | 14.5 | — |
| Timing | |||
| Tool latency p50↓ | 125ms | 626ms | — |
| Tool latency p90↓ | 14.5s | 12.3s | — |
| Total attributed tool time | 1.3K h | 413 h | — |
| Prefix cache: cache reuse by agent-step trigger | |||
| Cache rates | |||
| Overall prefix hit rate↑ | 95.8% | 95.7% | — |
| User-initiated step hit rate↑ | 86.9% | 78.2% | — |
| Tool-triggered step hit rate↑ | 97.9% | 97.2% | — |
| Append vs context growth | |||
| User-initiated append tokens | 707.0M tok | 464.5M tok | — |
| User-initiated context increase | 25.4M tok | 32.1M tok | — |
| User-initiated context / append↑ | 3.6% | 6.9% | — |
| Tool-triggered append tokens | 482.8M tok | 661.2M tok | — |
| Tool-triggered context increase | 210.3M tok | 341.1M tok | — |
| Tool-triggered context / append↑ | 43.6% | 51.6% | — |
| All classified append tokens | 1.19B tok | 1.13B tok | — |
| All classified context increase | 235.7M tok | 373.1M tok | — |
| All classified context / append↑ | 19.8% | 33.1% | — |
| Session context: context growth across sessions and agent steps | |||
| Step-level context growth | |||
| Total context increase | 235.7M tok | 373.1M tok | — |
| User-initiated context increase avg / p50 / p90 | 1,499 / 703 / 3,121 tok | 1,882 / 492 / 5,680 tok | — |
| Tool-triggered context increase avg / p50 / p90 | 1,742 / 801 / 3,879 tok | 1,766 / 556 / 4,454 tok | — |
| Growth / reductions | |||
| User-initiated growth share | 98.2% | 68.6% | — |
| User-initiated reduction share | 1.7% | 31.3% | — |
| User-initiated major compaction share | 0.8% | 0.8% | — |
| Tool-triggered growth share | 99.8% | 99.0% | — |
| Tool-triggered reduction share | 0.2% | 1.0% | — |
| Tool-triggered major compaction share | 0.2% | 0.6% | — |
| Human in the loop: human waits before the next model response | |||
| Timing | |||
| Total human wait time | 15K h | 15K h | — |
| Human wait avg / p50 / p90 | 2815.9 / 116.0 / 1365.6 s | 3368.4 / 103.0 / 1400.7 s | — |
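
Several rates in the table can be cross-checked from the other rows. The sketch below assumes that the prefix hit rate is cached-read input divided by total input, and that "context / append" is context increase divided by append tokens. Both definitions are inferred from the numbers rather than stated by the source, and the variable names are illustrative.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as the table displays it."""
    return round(100.0 * numerator / denominator, 1)


# Average input tokens per agent step, copied from the table.
avg_input_per_step = {
    "Claude": {"total": 202_840, "cached": 194_361},
    "Codex": {"total": 121_907, "cached": 116_623},
}

# (append tokens, context increase) in millions of tokens, by step trigger.
growth_mtok = {
    "Claude": {"user": (707.0, 25.4), "tool": (482.8, 210.3)},
    "Codex": {"user": (464.5, 32.1), "tool": (661.2, 341.1)},
}

# Assumed definition: hit rate = cached-read input / total input.
hit_rate = {p: pct(v["cached"], v["total"]) for p, v in avg_input_per_step.items()}

# Assumed definition: context / append = context increase / append tokens.
context_per_append = {
    provider: {trigger: pct(ctx, app) for trigger, (app, ctx) in rows.items()}
    for provider, rows in growth_mtok.items()
}

print(hit_rate)
print(context_per_append)
```

Under these assumptions the results match the reported overall hit rates (95.8% and 95.7%) and the per-trigger context / append shares.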
Drop Claude/Codex session files or a sanitized export. The browser normalizes, sanitizes, and computes the analysis locally — then renders it as an interactive dashboard.
Claude Code keeps sessions in ~/.claude/projects; Codex keeps them in ~/.codex/sessions. This command compresses whichever you have into a single trace.tar.gz in your home folder. Then drop that file above.

    cd ~ && tar -czf trace.tar.gz $([ -d .claude/projects ] && echo .claude/projects) $([ -d .codex/sessions ] && echo .codex/sessions)

Skip the download. Clone the toolkit onto that machine and launch it there. It detects this machine's ~/.claude and ~/.codex and analyzes them in place; nothing is uploaded. Open the URL it prints (forward the port over SSH if the box is remote).

    git clone https://github.com/uw-syfi/TraceLab.git && cd TraceLab && ./launch.sh

Steps per day across the trace.
Hour of day × weekday — darker means more agent steps in that slot.
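
A heatmap like this reduces to counting steps per (weekday, hour) slot. The sketch below assumes each agent step carries an ISO 8601 timestamp string; the field format, the helper name `hour_weekday_counts`, and the sample values are assumptions for illustration. It also ignores time zones, which the source does not specify.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def hour_weekday_counts(timestamps):
    """Count agent steps per (weekday, hour) slot from ISO 8601 strings."""
    counts = Counter()
    for ts in timestamps:
        moment = datetime.fromisoformat(ts)
        # datetime.weekday() returns 0 for Monday through 6 for Sunday.
        counts[(WEEKDAYS[moment.weekday()], moment.hour)] += 1
    return counts


# Hypothetical step timestamps (not taken from the trace).
steps = ["2025-10-03T09:15:00", "2025-10-03T09:48:10", "2025-10-04T22:05:30"]
grid = hour_weekday_counts(steps)
print(grid)
```

Each count then maps to a cell shade, with higher counts drawn darker.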
Share sanitized, pseudonymous rows with the community pool. Here's exactly what each shared row contains — and what it never does.
_path
Re-validated on upload; rejected if anything sensitive slipped through.
Each contribution adds coverage to the public workload map. Uploaded rows are validated, deduplicated, and credited pseudonymously.
Claude and Codex agent steps in contributed traces.
Placeholder until contribution history is available.
Drop raw Claude/Codex sessions or a sanitized .gz. Raw files are normalized and sanitized locally before upload.
| Contributor | When | Agent steps | Providers | Status |
|---|---|---|---|---|
| No contributions yet — be the first from the Analyze tab. | ||||