Session internal counts · SyFI TraceLab

Reading the public SYFI pool

357,161 agent steps across Claude & Codex — public, shareable.

Session

one continuous trace of work, often spanning multiple requests or problems.

Request

one user input through the agent's final response.

Agent step

one model call inside a request.

User-initiated step

an agent step started by user input.

Tool-triggered step

an agent step started by a tool result (reported as a tool-initiated step in the tables).
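The definitions above nest: steps belong to requests, and requests belong to sessions. A minimal sketch of how the counts could be rolled up is given below. The `Step` record, its field names, and the `"user"` / `"tool"` trigger labels are hypothetical and chosen for illustration; the actual trace schema is not shown on this page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One agent step (one model call). All field names are hypothetical."""
    session_id: str
    request_id: str
    trigger: str       # "user" = user-initiated, "tool" = tool-initiated
    n_tool_calls: int  # tool calls issued by this step


def count_per_request(steps):
    """Roll steps up into per-request counts, keyed by (session_id, request_id)."""
    counts = defaultdict(lambda: {"user_steps": 0, "tool_steps": 0, "tool_calls": 0})
    for s in steps:
        c = counts[(s.session_id, s.request_id)]
        c["user_steps" if s.trigger == "user" else "tool_steps"] += 1
        c["tool_calls"] += s.n_tool_calls
    return dict(counts)


def count_per_session(steps):
    """Roll per-request counts up into per-session counts, keyed by session_id."""
    sessions = defaultdict(
        lambda: {"requests": 0, "user_steps": 0, "tool_steps": 0, "tool_calls": 0}
    )
    for (session_id, _request_id), c in count_per_request(steps).items():
        s = sessions[session_id]
        s["requests"] += 1
        for key in ("user_steps", "tool_steps", "tool_calls"):
            s[key] += c[key]
    return dict(sessions)


# Toy data for illustration only, not taken from the trace.
steps = [
    Step("s1", "r1", "user", 1),
    Step("s1", "r1", "tool", 2),
    Step("s1", "r1", "tool", 0),
    Step("s1", "r2", "user", 1),
    Step("s1", "r2", "tool", 0),
    Step("s2", "r3", "user", 0),
]

per_request = count_per_request(steps)
per_session = count_per_session(steps)
print(per_session["s1"])
```

In this toy input, session `s1` holds two requests and session `s2` holds one; a request with no tool-initiated steps still appears with a zero count, which matters when percentiles are taken over all requests.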

Problem

How much work does one coding session, and one request, contain?

Computes the count distributions behind tab:session_internal_counts (src/04_SessionContext.tex): requests, user-/tool-initiated steps, and tool calls per session; tool-initiated steps and tool calls per request; and tool calls per step — each as avg / p25 / p50 / p90 / p99.
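As a rough illustration of the avg / p25 / p50 / p90 / p99 summary, the sketch below reduces a list of counts to those five numbers. It assumes a nearest-rank percentile and uses only the standard library; the percentile definition used by analyze.py is not stated here and may differ (for example, an interpolating quantile), so the helper names and the method are assumptions.

```python
import math


def percentile_nearest_rank(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


def summarize(values):
    """Return the five summary statistics reported for each metric."""
    return {
        "avg": round(sum(values) / len(values), 1),
        "p25": percentile_nearest_rank(values, 25),
        "p50": percentile_nearest_rank(values, 50),
        "p90": percentile_nearest_rank(values, 90),
        "p99": percentile_nearest_rank(values, 99),
    }


# Toy counts for illustration only, not taken from the trace.
requests_per_session = [1, 1, 1, 1, 2, 3, 5, 8, 18, 60]
summary = summarize(requests_per_session)
print(summary)  # {'avg': 10.0, 'p25': 1, 'p50': 2, 'p90': 18, 'p99': 60}
```

Even in this toy input the average (10.0) sits well above the median (2), which is the same skew pattern that a long right tail produces.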

Tables

Per session

Metric	Avg	P25	P50	P90	P99
Requests	9.2	1	1	18	137
User-initiated steps†	8.9	1	1	17	129
Tool-initiated steps	73.6	4	15	135	1,107
Tool calls	101.4	8	25	176	1,438

Per request

Metric	Avg	P25	P50	P90	P99
User-initiated steps	0.9	1	1	1	1
Tool-initiated steps	7.8	0	1	20	86
Tool calls	10.8	0	2	30	113

Per step

Metric	Avg	P25	P50	P90	P99
Tool calls	1.2	1	1	2	4

Table 1. Per-session, per-request, and per-step counts (avg / p25 / p50 / p90 / p99).

Coding sessions are persistent and overwhelmingly autonomous (the paper’s tab:session_internal_counts). A session averages 9.2 requests, but the distribution is heavily skewed: the median session contains a single request (p50 = 1), while the tail is long (p90 = 18, p99 = 137), so a minority of sessions are revisited again and again. Per session there are far more tool-initiated steps (avg 73.6) than user-initiated ones (avg 8.9), roughly eight to one, so once a request lands the loop largely runs itself: resolving one request takes about 7.8 tool-initiated steps and 10.8 tool calls on average, though the median request needs only 1 tool-initiated step and 2 tool calls. At the step level each step issues just over one tool call (avg 1.2, p50 1, p90 2, p99 4), so parallel tool calling does happen but is the exception, not the norm.
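The per-session averages can be cross-checked against the per-request and per-step rows with a little arithmetic. The sketch below uses only the averages from Table 1 and assumes that every agent step is either user-initiated or tool-initiated, so the two step counts sum to the total. Only approximate agreement should be expected, because a ratio of averages is not the same as an average of per-request ratios.

```python
# Per-session averages copied from Table 1.
requests = 9.2
user_steps = 8.9
tool_steps = 73.6
tool_calls = 101.4

# Tool-initiated steps per user-initiated step.
tool_to_user = tool_steps / user_steps

# Tool calls per step, assuming total steps = user-initiated + tool-initiated.
calls_per_step = tool_calls / (user_steps + tool_steps)

# Ratios of per-session averages, to compare with the per-request averages.
tool_steps_per_request = tool_steps / requests
tool_calls_per_request = tool_calls / requests

print(f"tool : user steps       = {tool_to_user:.2f}")            # 8.27
print(f"tool calls per step     = {calls_per_step:.2f}")          # 1.23 (table: 1.2)
print(f"tool steps per request  = {tool_steps_per_request:.2f}")  # 8.00 (table: 7.8)
print(f"tool calls per request  = {tool_calls_per_request:.2f}")  # 11.02 (table: 10.8)
```

The derived values land close to the reported per-request and per-step averages, which suggests the three sub-tables describe the same underlying counts at different granularities.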

Reference

Running it

# default merged trace (materialized to a temp DuckDB cache on first use)
uv run python artifacts/session/session_internal_counts/analyze.py

# the pinned public trace
uv run python artifacts/session/session_internal_counts/analyze.py -i trace/syfi_coding_trace.jsonl

# a prebuilt DB, into a chosen dir
uv run python artifacts/session/session_internal_counts/analyze.py --db /tmp/trace.duckdb -o /tmp/out

Outputs

session_internal_counts.tex — the merged three-line (booktabs) table, ready to \input or paste into src/04_SessionContext.tex.
session_internal_counts.md — GFM Markdown mirror of the table, rendered on the web detail page.
headline.json — the few headline numbers for the Overview gallery card.
stdout — the full merged + per-provider (Claude / Codex) breakdown.

No figures.