Import
The import command converts agent session transcripts and selected external datasets into AgentV formats. Transcript imports let you grade past runs offline without re-running the agent. Dataset imports help seed AgentV YAML from portable case sources.
AgentV no longer maintains `agentv import promptfoo` as a first-class core import path. Migrate Promptfoo configs by rewriting the relevant prompts, tests, and assertions as native AgentV eval YAML, or keep any one-off conversion logic outside the AgentV CLI.
Supported Sources
| Source | Command | Input |
|---|---|---|
| Claude Code | agentv import claude | ~/.claude/projects/<path>/<uuid>.jsonl |
| Codex CLI | agentv import codex | ~/.codex/sessions/<YYYY>/<MM>/<DD>/rollout-*.jsonl |
| Copilot CLI | agentv import copilot | ~/.copilot/session-state/<uuid>/events.jsonl |
| HuggingFace datasets | agentv import huggingface | Dataset repository and split |
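The input locations in the table can be read as glob patterns. The following is a minimal sketch of how session files could be discovered from them, not AgentV's actual implementation; the `SESSION_GLOBS` mapping and the `find_session_files` helper are hypothetical names introduced for illustration, and sorting by modification time is an assumption.

```python
from pathlib import Path
from typing import List

# Glob patterns derived from the Input column above, relative to the home directory.
SESSION_GLOBS = {
    "claude": ".claude/projects/*/*.jsonl",
    "codex": ".codex/sessions/*/*/*/rollout-*.jsonl",
    "copilot": ".copilot/session-state/*/events.jsonl",
}


def find_session_files(provider: str, home: Path) -> List[Path]:
    """Return candidate transcript files for a provider, newest first."""
    pattern = SESSION_GLOBS[provider]
    files = [p for p in home.glob(pattern) if p.is_file()]
    # Assumption: "most recent" is approximated by file modification time.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```

Calling `find_session_files("claude", Path.home())` would list the files that a Claude import could pick up on the current machine.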
import claude
Import a Claude Code session transcript.
List available sessions

    agentv import claude --list

Output:

    Found 5 session(s):
    4c4f9e4e-e6f1-490b-a1b1-9aef543ebf22  2m ago  -home-user-myproject
    087b801a-7a63-48ff-b348-62563a290b23  1h ago  -home-user-myproject
    ed8b8c62-4414-49fb-8739-006d809c8588  3h ago  -home-user-other-project

Import a specific session

    agentv import claude --session-id 4c4f9e4e-e6f1-490b-a1b1-9aef543ebf22

Filter by project path

    agentv import claude --list --project-path /home/user/myproject

Custom output path

    agentv import claude --session-id <uuid> -o transcripts/my-session.jsonl

Default output: .agentv/transcripts/claude-<session-id-short>.jsonl
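As an illustration of the default output naming, here is a small sketch that derives the path from a session ID. It assumes the short ID is the first hyphen-delimited segment of the UUID, which matches the `claude-4c4f9e4e.jsonl` path used in the workflow example on this page; `default_claude_output_path` is a hypothetical helper, not part of the AgentV CLI.

```python
from pathlib import Path


def default_claude_output_path(session_id: str) -> Path:
    """Build .agentv/transcripts/claude-<session-id-short>.jsonl."""
    # Assumption: the short ID is the first hyphen-delimited UUID segment.
    short_id = session_id.split("-", 1)[0]
    return Path(".agentv") / "transcripts" / f"claude-{short_id}.jsonl"
```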
import codex
Import a Codex CLI session transcript.
List available sessions

    agentv import codex --list

Import a specific session

    agentv import codex --session-id 019d5cff-9f02-7bc3-8f98-2071ba17ef0e

import copilot
Import a Copilot CLI session transcript.
List available sessions

    agentv import copilot --list

Import a specific session

    agentv import copilot --session-id 9ca6d90c-1d80-40d1-b805-c59ee31fc007

import huggingface
Import a HuggingFace dataset into AgentV eval YAML files.

    agentv import huggingface --repo SWE-bench/SWE-bench_Verified --split test --limit 10 --output evals/swebench/

Options
The transcript providers share the same core flags:
| Flag | Description |
|---|---|
| `--session-id <uuid>` | Import a specific session by UUID |
| `--list` | List available sessions instead of importing |
| `--output, -o <path>` | Custom output file path |
Provider-specific flags:
| Flag | Provider | Description |
|---|---|---|
| `--project-path <path>` | Claude | Filter sessions by project path |
| `--projects-dir <dir>` | Claude | Override ~/.claude/projects directory |
| `--date <YYYY-MM-DD>` | Codex | Filter sessions by date |
| `--sessions-dir <dir>` | Codex | Override ~/.codex/sessions directory |
| `--session-state-dir <dir>` | Copilot | Override ~/.copilot/session-state directory |
HuggingFace dataset import uses dataset-specific flags:
| Flag | Description |
|---|---|
| `--repo <name>` | HuggingFace dataset repository |
| `--split <name>` | Dataset split to load |
| `--limit <number>` | Maximum number of instances to import |
| `--output, -o <dir>` | Output directory for generated eval YAML files |
Output Format
Imported transcripts are written as AgentV transcript JSONL. Each row is a
provider-neutral agentv.transcript.v1 message row grouped by test_id and
ordered by message_index:

    {"schema_version":"agentv.transcript.v1","test_id":"claude-session-1","target":"claude","message_index":0,"role":"user","content":"Fix the bug in auth.ts","capture":{"content":"full","redaction_level":"none"},"source":{"kind":"imported_transcript","provider":"claude","session_id":"claude-session-1"}}
    {"schema_version":"agentv.transcript.v1","test_id":"claude-session-1","target":"claude","message_index":1,"role":"assistant","content":"I'll fix the authentication bug.","tool_calls":[{"tool":"Read","id":"toolu_01...","input":{"file_path":"src/auth.ts"},"output":"...file contents..."}],"capture":{"content":"full","redaction_level":"none"},"source":{"kind":"imported_transcript","provider":"claude","session_id":"claude-session-1"}}

Stable top-level fields are schema_version, test_id, target,
message_index, role, optional name, content, tool_calls,
start_time, end_time, duration_ms, metadata, token_usage,
transcript-level transcript_token_usage, transcript_duration_ms,
transcript_cost_usd, capture, optional trace, and source.
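To make the grouping and ordering concrete, the sketch below reads transcript rows and rebuilds one ordered message list per test. It relies only on the `test_id` and `message_index` fields described above; `load_transcripts` is a hypothetical helper name and not an AgentV API.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List


def load_transcripts(lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Group transcript rows by test_id and order each group by message_index."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between rows
        row = json.loads(line)
        grouped[row["test_id"]].append(row)
    for rows in grouped.values():
        rows.sort(key=lambda r: r["message_index"])
    return dict(grouped)
```

Because an open file iterates line by line, `load_transcripts(open(path))` is enough to load a transcript file written in this format.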
Provider-native details stay inside opaque nested fields such as metadata,
source.metadata, tool input, or tool output; they are not custom top-level
row keys.
Rows without schema_version, capture, or trace from older AgentV transcript
exports remain replayable. New eval run artifacts write the v1 shape.
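A consumer that needs to handle both shapes might tell them apart by checking for `schema_version`. This is only a sketch under that assumption; the `row_flavor` helper and its return labels are hypothetical, and AgentV's own replay logic may distinguish rows differently.

```python
V1_SCHEMA = "agentv.transcript.v1"


def row_flavor(row: dict) -> str:
    """Classify a transcript row as 'v1', 'legacy', or 'unknown'."""
    version = row.get("schema_version")
    if version == V1_SCHEMA:
        return "v1"
    if version is None:
        # Older exports omit schema_version (and possibly capture and trace).
        return "legacy"
    return "unknown"
```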
For eval run artifacts, transcript.jsonl is the portable message/event
projection. AgentV does not persist a public trace.json run sidecar, and the
transcript is not a provider-native session dump. Provider-native session or
stream logs, when captured during a new eval run, are preserved in
transcript-raw.jsonl and referenced by transcript_raw_path;
raw_provider_log_path is a legacy/imported pointer when older bundles or
external sources already provide one. Agent Skills import, convert, transpile,
and run paths do not require those legacy log pointers.
What Gets Parsed
| Claude Event | AgentV Message |
|---|---|
| `user` | `{ role: 'user', content }` |
| `assistant` | `{ role: 'assistant', content, toolCalls }` |
| `tool_use` blocks | `ToolCall { tool, input, id }` |
| `tool_result` blocks | Paired with matching `tool_use` by ID |
| `progress`, `system` | Skipped |
| Subagent events | Filtered out (v1) |
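The pairing of `tool_result` blocks with `tool_use` blocks can be sketched as follows. The block field names (`type`, `id`, `name`, `input`, `tool_use_id`, `content`) are assumptions based on the common Claude message block layout, and `pair_tool_calls` is a hypothetical helper rather than AgentV's parser.

```python
from typing import Dict, Iterable, List


def pair_tool_calls(blocks: Iterable[dict]) -> List[dict]:
    """Pair tool_use blocks with their tool_result blocks by ID."""
    calls: Dict[str, dict] = {}
    for block in blocks:
        kind = block.get("type")
        if kind == "tool_use":
            calls[block["id"]] = {
                "tool": block["name"],
                "id": block["id"],
                "input": block.get("input", {}),
            }
        elif kind == "tool_result":
            # Results without a matching tool_use are ignored in this sketch.
            call = calls.get(block.get("tool_use_id"))
            if call is not None:
                call["output"] = block.get("content")
    # Dicts preserve insertion order, so calls come back in the order issued.
    return list(calls.values())
```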
Token usage is aggregated by taking the final cumulative value reported for each LLM request. Duration is the time elapsed between the first and last event timestamps.
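The two calculations can be sketched like this. The event fields `request_id` and `output_tokens` are hypothetical stand-ins for whatever the provider log actually records, and the timestamps are assumed to be ISO 8601 strings in event order.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Optional


def total_output_tokens(events: Iterable[dict]) -> int:
    """Sum the last cumulative output-token count seen for each request."""
    last_seen: Dict[str, int] = {}
    for event in events:
        # Later events for the same request overwrite earlier, smaller totals.
        last_seen[event["request_id"]] = event["output_tokens"]
    return sum(last_seen.values())


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def duration_ms(timestamps: List[str]) -> Optional[int]:
    """Milliseconds between the first and last timestamps, or None if empty."""
    if not timestamps:
        return None
    elapsed = parse_timestamp(timestamps[-1]) - parse_timestamp(timestamps[0])
    return round(elapsed.total_seconds() * 1000)
```

Taking the last value per request, rather than summing every event, avoids double counting when a provider reports running totals while a response streams.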
Workflow
Import a session, then run graders against it:

    # 1. List sessions and pick one
    agentv import claude --list

    # 2. Import a session by ID
    agentv import claude --session-id 4c4f9e4e-e6f1-490b-a1b1-9aef543ebf22

    # 3. Run graders against the imported transcript
    agentv eval evals/my-eval.yaml --transcript .agentv/transcripts/claude-4c4f9e4e.jsonl

See examples/features/import-claude/ for a complete working example.
HuggingFace Datasets (SWE-bench)
Use scripts/import-huggingface.py to convert HuggingFace benchmark datasets into AgentV eval files. Currently supports SWE-bench-style datasets.

    uv run scripts/import-huggingface.py \
      --repo SWE-bench/SWE-bench_Verified \
      --split test \
      --limit 10 \
      --output evals/swebench/

Each instance becomes an EVAL.yaml with:

- input: the problem statement
- workspace.docker.image: the pre-built SWE-bench Docker image (ghcr.io/epoch-research/swe-bench.eval.x86_64.<instance_id>:latest)
- workspace.repos[].base_commit: the commit to reset to before the agent runs
- assertions: code-grader tasks that run FAIL_TO_PASS and PASS_TO_PASS pytest suites inside the container
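The field mapping listed above can be illustrated with a rough sketch. The instance keys (`instance_id`, `problem_statement`, `base_commit`, `FAIL_TO_PASS`, `PASS_TO_PASS`) follow the SWE-bench dataset columns; the shape of each assertion entry is a hypothetical placeholder, since the real code-grader schema is not shown on this page, and `instance_to_eval` is not the actual import script.

```python
from typing import Any, Dict

IMAGE_TEMPLATE = "ghcr.io/epoch-research/swe-bench.eval.x86_64.{instance_id}:latest"


def instance_to_eval(instance: Dict[str, Any]) -> Dict[str, Any]:
    """Map one SWE-bench-style instance onto the eval fields listed above."""
    image = IMAGE_TEMPLATE.format(instance_id=instance["instance_id"])
    return {
        "input": instance["problem_statement"],
        "workspace": {
            "docker": {"image": image},
            "repos": [{"base_commit": instance["base_commit"]}],
        },
        # Hypothetical assertion shape: one code-grader entry per test list.
        "assertions": [
            {"type": "code-grader", "tests": instance["FAIL_TO_PASS"]},
            {"type": "code-grader", "tests": instance["PASS_TO_PASS"]},
        ],
    }
```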
Run an imported SWE-bench eval against any coding agent target:

    # Import one instance
    uv run scripts/import-huggingface.py \
      --repo SWE-bench/SWE-bench_Verified \
      --limit 1 \
      --output /tmp/swebench-eval/

    # Run with a coding agent target
    agentv eval /tmp/swebench-eval/*.EVAL.yaml --target codex

The Docker workspace spins up the pre-built SWE-bench image, checks out base_commit, runs the agent to apply a patch, then grades by running the test suite inside the container.