SyFI TraceLab
Trace assistant
Reading the public SYFI pool
357,161 agent steps across Claude & Codex — public, shareable.
Answers run real DuckDB/Python in a sandbox · code is shown
All figures
Session
one continuous trace of work, often spanning multiple requests or problems.
Request
one user input through the agent's final response.
Agent step
one model call inside a request.
User-initiated step
an agent step started by user input.
Tool-triggered step
an agent step started by a tool result.
Problem

How much work does one coding session, and one request, contain?

Computes the count distributions behind tab:session_internal_counts (src/04_SessionContext.tex): requests, user-/tool-initiated steps, and tool calls per session; tool-initiated steps and tool calls per request; and tool calls per step — each as avg / p25 / p50 / p90 / p99.

Tables

Per session

MetricAvgP25P50P90P99
Requests9.21118137
User-initiated steps†8.91117129
Tool-initiated steps73.64151351,107
Tool calls101.48251761,438

Per request

MetricAvgP25P50P90P99
User-initiated steps0.91111
Tool-initiated steps7.8012086
Tool calls10.80230113

Per step

MetricAvgP25P50P90P99
Tool calls1.21124
Table 1Per-session, per-request, and per-step counts (avg / p25 / p50 / p90 / p99).

Coding sessions are persistent and overwhelmingly autonomous (the paper’s tab:session_internal_counts). A session averages 9.2 requests but with a long tail (p99 = 137), so the human keeps coming back to the same session over and over. Per session there are far more tool-initiated steps (avg 73.6) than user-initiated ones (avg 8.9), so once a request lands the loop runs itself: resolving one request takes ~8 tool-initiated steps and ~11 tool calls on average. At the step level each round issues just over one tool call (avg 1.2, p50 1, p90 2), so parallel tool calling does happen but is the exception, not the norm.

Reference
Running it
# default merged trace (materialized to a temp DuckDB cache on first use)
uv run python artifacts/session/session_internal_counts/analyze.py

# the pinned public trace
uv run python artifacts/session/session_internal_counts/analyze.py -i trace/syfi_coding_trace.jsonl

# a prebuilt DB, into a chosen dir
uv run python artifacts/session/session_internal_counts/analyze.py --db /tmp/trace.duckdb -o /tmp/out
Outputs
  • session_internal_counts.tex — the merged three-line (booktabs) table, ready to \input or paste into src/04_SessionContext.tex.
  • session_internal_counts.md — GFM Markdown mirror of the table, rendered on the web detail page.
  • headline.json — the few headline numbers for the Overview gallery card.
  • stdout — the full merged + per-provider (Claude / Codex) breakdown.

No figures.

SyFI TraceLab · experiment detail