Spec Training Results
Manifest-driven visibility for the training lines. This page is generated from local run ledgers and probe reports, while spec-training-method.html stays focused on the stable method and promotion rules.
Raw artifacts are expected to move under training-archive/ as a future submodule mount instead of growing the main docs tree.
Generated from 2 source roots and refreshed at 2026-04-13 12:46 UTC.
Stage records: 214. Probe reports: 128.
spec12 r17 reached 100.0% visible exact and 100.0% hidden exact.
History Split
Method Page
Keep stable policy on spec-training-method.html: spec/rung definitions, reset-line rules, promotion gates, and decode-repair versus retrain decisions.
Generated History
This page is generated from version/v7/reports/spec_training_manifest.json, not hand-maintained prose.
It should be the first stop for “which rung is best?” and “what regressed?”
Archive Mount
Raw ledgers, probe reports, tested-prompt reports, and large run folders can move under training-archive/ as a submodule without breaking the docs surface.
The docs should depend on the compact manifest, not on browsing the archive directly.
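A docs build that depends only on the compact manifest can be sketched as a loader that reads the JSON file and never touches training-archive/. The manifest schema is not shown on this page, so the `specs`, `spec`, and `best_rung` keys below are purely hypothetical, as is the `load_best_rungs` helper; adapt them to the real file.

```python
import json
from pathlib import Path


def load_best_rungs(manifest_path: Path) -> dict[str, str]:
    """Map spec id -> best rung name, reading only the compact manifest.

    ASSUMPTION: the manifest has a top-level "specs" list whose entries carry
    "spec" and "best_rung" keys. The real schema may differ.
    """
    if not manifest_path.is_file():
        # Fail loudly instead of falling back to browsing the raw archive.
        raise FileNotFoundError(f"manifest not found: {manifest_path}")
    data = json.loads(manifest_path.read_text(encoding="utf-8"))
    return {entry["spec"]: entry["best_rung"] for entry in data.get("specs", [])}


# Example call (path taken from this page):
# load_best_rungs(Path("version/v7/reports/spec_training_manifest.json"))
```

Failing when the manifest is missing, rather than scanning run folders, keeps the docs surface independent of where the archive is mounted.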
Best Rungs By Spec
| Spec | Best Rung | Visible Exact | Hidden Exact | Renderable | Run Dirs | Status / Lesson |
|---|---|---|---|---|---|---|
| spec04 Structured Atoms | spec04_structured_scenes_ctx512_d64_h128_v224 | 0.0% | — | 100.0% | 2 | regressed failed – no correct outputs |
| spec05 Structured Scenes | r2 | 80.8% | — | 92.3% | 1 | iterating strong – minor gaps remain |
| spec06 Structured Infographics | r6 | 91.7% | — | 100.0% | 7 | iterating strong – minor gaps remain |
| spec07 Scene DSL v1 | r2 | 38.9% | — | 75.0% | 2 | regressed weak – fundamental issues likely |
| spec08 Rich Scene DSL | r2 | 80.6% | — | 91.7% | 2 | iterating strong – minor gaps remain |
| spec10 Asset Scene DSL | r4 | 94.3% | — | 100.0% | 5 | iterating strong – minor gaps remain |
| spec11 Keyed Scene DSL | r2 | 100.0% | — | 100.0% | 3 | gold converged – perfect exact match |
| spec12 Scene DSL (gold) | r17 | 100.0% | 100.0% | 100.0% | 20 | gold converged – perfect exact match |
| spec13a Intent-Prompt Bridge | r2 | 75.0% | 41.7% | 100.0% | 6 | iterating partial – further iteration needed |
| spec13b Decision-Tree Scene IR | r4 | 50.0% | 66.7% | 100.0% | 8 | iterating partial – further iteration needed |
| spec14a Comparison Board Family | r10 | 100.0% | 90.0% | 100.0% | 11 | gold converged – perfect exact match |
| spec14b Timeline Family | r3 | 100.0% | 100.0% | 100.0% | 3 | gold converged – perfect exact match |
| spec15a Memory Map Family | r9 | 100.0% | 100.0% | 100.0% | 11 | gold converged – perfect exact match |
| spec15b System Diagram Family | r2 | 100.0% | 100.0% | 100.0% | 2 | gold converged – perfect exact match |
| spec16 Generalized Visual Bundle | r9 | 77.8% | 100.0% | 86.1% | 12 | iterating partial – further iteration needed |
| spec17 | r3 | 23.3% | 0.0% | 53.3% | 4 | regressed weak – fundamental issues likely |
| spec18 | r1 | 6.7% | 0.0% | 53.3% | 1 | regressed weak – fundamental issues likely |
| spec19 spec19 | spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_b_instruction | 87.5% | 91.7% | 93.8% | 11 | iterating strong – minor gaps remain |
| spec_broader_1_scene_dsl_l3_d192_h384_ctx512_r1 | r1 | 2.4% | — | 54.8% | 1 | regressed weak – fundamental issues likely |
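One way to reproduce a "best rung" pick from per-rung metrics is to rank rungs by visible exact first, then hidden exact, then renderable. This is a minimal sketch under that assumption; the selection rule the generator actually uses is not stated on this page, and the `Rung` type and `best_rung` helper are hypothetical names. The sample values are the spec17 rows from the Recent Run Ledger.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rung:
    name: str
    visible_exact: float            # percent, 0-100
    hidden_exact: Optional[float]   # None when no hidden probe was reported
    renderable: float               # percent, 0-100


def best_rung(rungs: list[Rung]) -> Rung:
    """Pick the top rung under an ASSUMED ordering:
    visible exact, then hidden exact (missing counts as lowest), then renderable."""
    def key(r: Rung) -> tuple[float, float, float]:
        hidden = r.hidden_exact if r.hidden_exact is not None else -1.0
        return (r.visible_exact, hidden, r.renderable)
    return max(rungs, key=key)


spec17 = [
    Rung("r2", 13.3, 0.0, 26.7),
    Rung("r3", 23.3, 0.0, 53.3),
    Rung("r4", 13.3, 0.0, 66.7),
]
best = best_rung(spec17)
print(best.name)
```

With these rows the sketch selects r3, which agrees with the spec17 entry in the table. Note that r4 has the higher renderable rate, so a rule that weighted renderability first would pick differently.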
Recent Run Ledger
| Run | Latest Stage | Finished | Visible Exact | Hidden Exact | Renderable | Artifact Dir |
|---|---|---|---|---|---|---|
| spec_broader_1_scene_dsl_l3_d192_h384_ctx512_r1 r1 | midtrain | 2026-04-04 14:44 UTC | 2.4% | — | 54.8% | spec_broader_1_scene_dsl_l3_d192_h384_ctx512_r1 |
| spec19 spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_b_instruction | sft | 2026-04-04 07:35 UTC | 87.5% | 91.7% | 93.8% | spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_b_instruction |
| spec19 spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_instruction | sft | 2026-04-03 22:43 UTC | 81.2% | 91.7% | 87.5% | spec19_scene_bundle_l3_d192_h384_ctx768_r3d_sft_instruction |
| spec19 spec19_scene_bundle_l3_d192_h384_ctx768_r4_unified_curriculum | midtrain | 2026-04-03 20:44 UTC | 59.4% | 66.7% | 84.4% | spec19_scene_bundle_l3_d192_h384_ctx768_r4_unified_curriculum |
| spec19 spec19_scene_bundle_l3_d192_h384_ctx768_r3f_cumulative_balanced | midtrain | 2026-04-03 03:18 UTC | 81.2% | 75.0% | 87.5% | spec19_scene_bundle_l3_d192_h384_ctx768_r3f_cumulative_balanced |
| spec19 spec19_scene_bundle_l3_d192_h384_ctx768_r3e_route_recovery | midtrain | 2026-04-02 22:25 UTC | 78.1% | 66.7% | 90.6% | spec19_scene_bundle_l3_d192_h384_ctx768_r3e_route_recovery |
| spec19 spec19_scene_bundle_l3_d192_h384_ctx768_r3d_balanced_coverage | midtrain | 2026-04-02 16:11 UTC | 84.4% | 91.7% | 93.8% | spec19_scene_bundle_l3_d192_h384_ctx768_r3d_balanced_coverage |
| spec19 spec19_scene_bundle_l3_d192_h384_ctx768_r3c_cumulative_neighbors | midtrain | 2026-04-02 13:43 UTC | 81.2% | 83.3% | 90.6% | spec19_scene_bundle_l3_d192_h384_ctx768_r3c_cumulative_neighbors |
| spec19 spec19_scene_bundle_l3_d192_h384_ctx768_r3b_coherent_replay | midtrain | 2026-04-02 06:08 UTC | 81.2% | 83.3% | 93.8% | spec19_scene_bundle_l3_d192_h384_ctx768_r3b_coherent_replay |
| spec19 spec19_scene_bundle_l3_d192_h384_ctx768_r3a_delta_replay | midtrain | 2026-04-02 05:50 UTC | 86.1% | 83.3% | 91.7% | spec19_scene_bundle_l3_d192_h384_ctx768_r3a_delta_replay |
| spec19 r2 | midtrain | 2026-04-02 05:34 UTC | 71.9% | 83.3% | 84.4% | spec19_scene_bundle_l3_d192_h384_ctx768_r2 |
| spec19 r1 | midtrain | 2026-04-02 04:37 UTC | 6.7% | 0.0% | 56.7% | spec19_scene_bundle_l3_d192_h384_ctx768_r1 |
| spec18 r1 | midtrain | 2026-04-01 23:09 UTC | 6.7% | 0.0% | 53.3% | spec18_scene_bundle_l3_d192_h384_ctx768_r1 |
| spec17 r4 | midtrain | 2026-04-01 21:11 UTC | 13.3% | 0.0% | 66.7% | spec17_scene_bundle_l3_d192_h384_ctx768_r4 |
| spec17 r3 | midtrain | 2026-04-01 20:22 UTC | 23.3% | 0.0% | 53.3% | spec17_scene_bundle_l3_d192_h384_ctx768_r3 |
| spec17 r2 | midtrain | 2026-04-01 19:43 UTC | 13.3% | 0.0% | 26.7% | spec17_scene_bundle_l3_d192_h384_ctx768_r2 |
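To answer "what regressed?" from a ledger like this one, a simple approach is to walk a spec's runs in finish order and flag any run whose visible exact falls below the best value seen so far. The sketch below assumes that definition of a regression and a hypothetical `flag_regressions` helper; the "regressed" status shown on this page may be computed differently. The sample values are a subset of the spec19 rows above.

```python
def flag_regressions(runs: list[tuple[str, float]], tolerance: float = 0.0) -> list[str]:
    """Return names of runs scoring below the running best by more than `tolerance`.

    `runs` must be ordered oldest to newest; scores are percentages.
    """
    flagged: list[str] = []
    running_best = float("-inf")
    for name, score in runs:
        if score < running_best - tolerance:
            flagged.append(name)
        running_best = max(running_best, score)
    return flagged


# Visible exact for some spec19 runs, oldest first (values from the ledger).
spec19_visible = [
    ("r2", 71.9),
    ("r3a_delta_replay", 86.1),
    ("r3b_coherent_replay", 81.2),
    ("r4_unified_curriculum", 59.4),
    ("r3d_sft_b_instruction", 87.5),
]
regressed = flag_regressions(spec19_visible)
print(regressed)
```

Raising `tolerance` filters out small dips, which may be useful when a few points of movement is within run-to-run noise.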
Archive Policy
Submodule-ready split: keep the compact manifest and generated docs page in the main repo, then mount a future raw artifact archive at training-archive/.
- Stable docs stay in docs/site/_pages/.
- Run summaries stay in version/v7/reports/spec_training_manifest.json.
- Heavy artifacts move to training-archive/ once you are ready to create the separate repo.
Current source roots:
- /home/antshiv/.cache/ck-engine-v7/models/train (cache)
- /home/antshiv/Workspace/C-Kernel-Engine/version/v7/runs (repo)
Refresh command:
bash docs/site/build.sh