Run History

Spec Training Results

Manifest-driven visibility for the training lines. This page is generated from local run ledgers and probe reports, while spec-training-method.html stays focused on the stable method and promotion rules.

Page split: stable method lives on Spec Training Method, generated run evidence lives here, and bulky raw artifacts are intended to move under training-archive/ as a future submodule mount instead of growing the main docs tree.

Specs With Probe Data: 19
Generated from 2 source roots and refreshed at 2026-04-13 12:46 UTC.

Run Directories: 112
Stage records: 214. Probe reports: 128.

Current Champion: spec12
r17 at 100.0% visible exact and 100.0% hidden exact.

History Split

Method Page

Keep stable policy on the method page: spec/rung definitions, reset-line rules, promotion gates, and decode-repair versus retrain decisions.

Open spec-training-method.html

Generated History

This page is generated from version/v7/reports/spec_training_manifest.json, not hand-maintained prose.

This page should be the first stop for questions such as “which rung is best?” and “what regressed?”
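
As a minimal sketch of how a “which rung is best?” lookup could work against a compact manifest, consider the following. The record layout (`runs`, `spec`, `rung`, `visible_exact`, `hidden_exact`) and the ranking rule (visible exact first, hidden exact as tie-breaker) are assumptions made for illustration; the real schema of spec_training_manifest.json and the actual promotion rules are not shown on this page. The sample values are the spec17 and spec18 rows from the run ledger.

```python
import json

# Hypothetical manifest excerpt. Field names are illustrative assumptions,
# not the real spec_training_manifest.json schema.
MANIFEST_JSON = """
{"runs": [
  {"spec": "spec17", "rung": "r2", "visible_exact": 13.3, "hidden_exact": 0.0},
  {"spec": "spec17", "rung": "r3", "visible_exact": 23.3, "hidden_exact": 0.0},
  {"spec": "spec17", "rung": "r4", "visible_exact": 13.3, "hidden_exact": 0.0},
  {"spec": "spec18", "rung": "r1", "visible_exact": 6.7, "hidden_exact": 0.0}
]}
"""


def score(run):
    """Assumed ranking key: visible exact, then hidden exact as tie-breaker."""
    hidden = run.get("hidden_exact")
    # A run without a hidden probe sorts below any run that has one.
    return (run["visible_exact"], -1.0 if hidden is None else hidden)


def best_rungs(manifest):
    """Return {spec: run record}, keeping the highest-scoring run per spec."""
    best = {}
    for run in manifest["runs"]:
        current = best.get(run["spec"])
        if current is None or score(run) > score(current):
            best[run["spec"]] = run
    return best


manifest = json.loads(MANIFEST_JSON)
for spec, run in sorted(best_rungs(manifest).items()):
    print(spec, run["rung"], run["visible_exact"])
```

Under these assumptions the sketch selects r3 for spec17 and r1 for spec18, which agrees with the best-rung table on this page.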

Archive Mount

Raw ledgers, probe reports, tested-prompt reports, and large run folders can move under training-archive/ as a submodule without breaking the docs surface.

The docs should depend on the compact manifest, not on browsing the archive directly.

Best Rungs By Spec

| Spec | Best Rung | Visible Exact | Hidden Exact | Renderable | Run Dirs | Status / Lesson |
|---|---|---|---|---|---|---|
| spec04: Structured Atoms | spec04_structured_scenes_ctx512_d64_h128_v224 | 0.0% | | 100.0% | 2 | regressed (failed – no correct outputs) |
| spec05: Structured Scenes | r2 | 80.8% | | 92.3% | 1 | iterating (strong – minor gaps remain) |
| spec06: Structured Infographics | r6 | 91.7% | | 100.0% | 7 | iterating (strong – minor gaps remain) |
| spec07: Scene DSL v1 | r2 | 38.9% | | 75.0% | 2 | regressed (weak – fundamental issues likely) |
| spec08: Rich Scene DSL | r2 | 80.6% | | 91.7% | 2 | iterating (strong – minor gaps remain) |
| spec10: Asset Scene DSL | r4 | 94.3% | | 100.0% | 5 | iterating (strong – minor gaps remain) |
| spec11: Keyed Scene DSL | r2 | 100.0% | | 100.0% | 3 | gold (converged – perfect exact match) |
| spec12: Scene DSL (gold) | r17 | 100.0% | 100.0% | 100.0% | 20 | gold (converged – perfect exact match) |
| spec13a: Intent-Prompt Bridge | r2 | 75.0% | 41.7% | 100.0% | 6 | iterating (partial – further iteration needed) |
| spec13b: Decision-Tree Scene IR | r4 | 50.0% | 66.7% | 100.0% | 8 | iterating (partial – further iteration needed) |
| spec14a: Comparison Board Family | r10 | 100.0% | 90.0% | 100.0% | 11 | gold (converged – perfect exact match) |
| spec14b: Timeline Family | r3 | 100.0% | 100.0% | 100.0% | 3 | gold (converged – perfect exact match) |
| spec15a: Memory Map Family | r9 | 100.0% | 100.0% | 100.0% | 11 | gold (converged – perfect exact match) |
| spec15b: System Diagram Family | r2 | 100.0% | 100.0% | 100.0% | 2 | gold (converged – perfect exact match) |
| spec16: Generalized Visual Bundle | r9 | 77.8% | 100.0% | 86.1% | 12 | iterating (partial – further iteration needed) |
| spec17 | r3 | 23.3% | 0.0% | 53.3% | 4 | regressed (weak – fundamental issues likely) |
| spec18 | r1 | 6.7% | 0.0% | 53.3% | 1 | regressed (weak – fundamental issues likely) |
| spec19 | spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_b_instruction | 87.5% | 91.7% | 93.8% | 11 | iterating (strong – minor gaps remain) |
| spec_broader_1_scene_dsl_l3_d192_h384_ctx512_r1 | r1 | 2.4% | | 54.8% | 1 | regressed (weak – fundamental issues likely) |

Recent Run Ledger

| Run | Latest Stage | Finished | Visible Exact | Hidden Exact | Renderable | Artifact Dir |
|---|---|---|---|---|---|---|
| spec_broader_1_scene_dsl_l3_d192_h384_ctx512_r1 / r1 | midtrain | 2026-04-04 14:44 UTC | 2.4% | | 54.8% | spec_broader_1_scene_dsl_l3_d192_h384_ctx512_r1 |
| spec19 / spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_b_instruction | sft | 2026-04-04 07:35 UTC | 87.5% | 91.7% | 93.8% | spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_b_instruction |
| spec19 / spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_instruction | sft | 2026-04-03 22:43 UTC | 81.2% | 91.7% | 87.5% | spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_instruction |
| spec19 / spec19_scene_bundle_l3_d192_h384_ctx768_r4_unified_curriculum | midtrain | 2026-04-03 20:44 UTC | 59.4% | 66.7% | 84.4% | spec19_scene_bundle_l3_d192_h384_ctx768_r4_unified_curriculum |
| spec19 / spec19_scene_bundle_l3_d192_h384_ctx768_r3f_cumulative_balanced | midtrain | 2026-04-03 03:18 UTC | 81.2% | 75.0% | 87.5% | spec19_scene_bundle_l3_d192_h384_ctx768_r3f_cumulative_balanced |
| spec19 / spec19_scene_bundle_l3_d192_h384_ctx768_r3e_route_recovery | midtrain | 2026-04-02 22:25 UTC | 78.1% | 66.7% | 90.6% | spec19_scene_bundle_l3_d192_h384_ctx768_r3e_route_recovery |
| spec19 / spec19_scene_bundle_l3_d192_h384_ctx768_r3d_balanced_coverage | midtrain | 2026-04-02 16:11 UTC | 84.4% | 91.7% | 93.8% | spec19_scene_bundle_l3_d192_h384_ctx768_r3d_balanced_coverage |
| spec19 / spec19_scene_bundle_l3_d192_h384_ctx768_r3c_cumulative_neighbors | midtrain | 2026-04-02 13:43 UTC | 81.2% | 83.3% | 90.6% | spec19_scene_bundle_l3_d192_h384_ctx768_r3c_cumulative_neighbors |
| spec19 / spec19_scene_bundle_l3_d192_h384_ctx768_r3b_coherent_replay | midtrain | 2026-04-02 06:08 UTC | 81.2% | 83.3% | 93.8% | spec19_scene_bundle_l3_d192_h384_ctx768_r3b_coherent_replay |
| spec19 / spec19_scene_bundle_l3_d192_h384_ctx768_r3a_delta_replay | midtrain | 2026-04-02 05:50 UTC | 86.1% | 83.3% | 91.7% | spec19_scene_bundle_l3_d192_h384_ctx768_r3a_delta_replay |
| spec19 / r2 | midtrain | 2026-04-02 05:34 UTC | 71.9% | 83.3% | 84.4% | spec19_scene_bundle_l3_d192_h384_ctx768_r2 |
| spec19 / r1 | midtrain | 2026-04-02 04:37 UTC | 6.7% | 0.0% | 56.7% | spec19_scene_bundle_l3_d192_h384_ctx768_r1 |
| spec18 / r1 | midtrain | 2026-04-01 23:09 UTC | 6.7% | 0.0% | 53.3% | spec18_scene_bundle_l3_d192_h384_ctx768_r1 |
| spec17 / r4 | midtrain | 2026-04-01 21:11 UTC | 13.3% | 0.0% | 66.7% | spec17_scene_bundle_l3_d192_h384_ctx768_r4 |
| spec17 / r3 | midtrain | 2026-04-01 20:22 UTC | 23.3% | 0.0% | 53.3% | spec17_scene_bundle_l3_d192_h384_ctx768_r3 |
| spec17 / r2 | midtrain | 2026-04-01 19:43 UTC | 13.3% | 0.0% | 26.7% | spec17_scene_bundle_l3_d192_h384_ctx768_r2 |
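
One way to answer “what regressed?” from ledger rows like these is to walk each spec's runs in finish order and flag any run whose visible exact score falls below the best earlier run. The sketch below does that for a subset of the spec19 rows. The tuple layout, the shortened rung labels, and this particular definition of a regression are assumptions for illustration; they are not the rule behind the status column on this page.

```python
from datetime import datetime

# (spec, rung label, finished UTC, visible exact %) copied from spec19 ledger
# rows; rung labels are shortened here for readability.
LEDGER = [
    ("spec19", "r3d_sft_b_instruction", "2026-04-04 07:35", 87.5),
    ("spec19", "r3d_balanced_coverage", "2026-04-02 16:11", 84.4),
    ("spec19", "r3b_coherent_replay", "2026-04-02 06:08", 81.2),
    ("spec19", "r3a_delta_replay", "2026-04-02 05:50", 86.1),
    ("spec19", "r2", "2026-04-02 05:34", 71.9),
    ("spec19", "r1", "2026-04-02 04:37", 6.7),
]


def flag_regressions(rows):
    """Return (rung, score, earlier_best) for runs below the earlier best."""
    # Process oldest first so "earlier best" is well defined.
    ordered = sorted(
        rows, key=lambda r: datetime.strptime(r[2], "%Y-%m-%d %H:%M")
    )
    best = {}      # spec -> best visible exact seen so far
    flagged = []
    for spec, rung, _finished, visible in ordered:
        prev = best.get(spec)
        if prev is not None and visible < prev:
            flagged.append((rung, visible, prev))
        best[spec] = visible if prev is None else max(prev, visible)
    return flagged


for rung, visible, prev in flag_regressions(LEDGER):
    print(f"{rung}: {visible}% visible exact, below earlier best {prev}%")
```

On this subset the sketch flags r3b_coherent_replay and r3d_balanced_coverage, both of which finished after r3a_delta_replay reached 86.1% but scored lower.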

Archive Policy

Submodule-ready split: keep the compact manifest and generated docs page in the main repo, then mount a future raw artifact archive at training-archive/.

Current source roots:

Refresh command:

bash docs/site/build.sh