v7 Training Parity Checklist
Use this checklist as the operator gate for runbook execution readiness. It answers one operational question: can the runbook proceed on this run directory right now?
Scope
This checklist is for runbook readiness, not final numerical signoff.
Dataset Gate
dataset_qc.json must be present and pass.
Tokenizer Gate
tokenizer_roundtrip.json must report exact_match == true.
Parity Regimen Gate
D1, E1, and F1 must pass in the latest regimen summary.
Canary Gate
The row1/row2 parity canary must pass.
A1/A2 may still fail today. Treat them as an active kernel-harness bug track (currently suspected in the SwiGLU harness path), separate from runbook execution readiness.
0) Set Run Path
```bash
export RUN="$HOME/.cache/ck-engine-v7/models/train/v7_svg_assets_bpe_l24_full_e1_seq128"
cd /home/antshiv/Workspace/C-Kernel-Engine
```
1) Dataset + Tokenizer Gates
```bash
jq '{status, checks, non_empty_lines, path}' "$RUN/dataset_qc.json"
jq '{status, exact_match, line_eval, tokenizer_json_path}' "$RUN/tokenizer_roundtrip.json"
```
Pass criteria:
- `dataset_qc.status == "pass"`
- `tokenizer_roundtrip.status == "pass"`
- `tokenizer_roundtrip.exact_match == true`
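These two gates can also be checked programmatically instead of eyeballing jq output. A minimal sketch, assuming the `dataset_qc.json` / `tokenizer_roundtrip.json` layout shown above (`gates_pass` is a hypothetical helper name, not part of the runbook tooling):

```python
import json
from pathlib import Path


def gates_pass(run_dir: str) -> bool:
    """Return True when the dataset and tokenizer gates both pass.

    Assumes the dataset_qc.json / tokenizer_roundtrip.json schemas
    shown above; adjust field names if your artifacts differ.
    """
    run = Path(run_dir)
    ds = json.loads((run / "dataset_qc.json").read_text())
    rt = json.loads((run / "tokenizer_roundtrip.json").read_text())
    return bool(
        ds.get("status") == "pass"
        and rt.get("status") == "pass"
        and rt.get("exact_match") is True
    )
```

This mirrors the same criteria the one-shot GO evaluation in section 4 applies.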
2) Canary Parity Gate (row1/row2)
Run Step 3.1 from the main runbook, then verify pass lines:
```bash
python3 - <<'PY'
import json
import os
from pathlib import Path
from statistics import mean

TH_MAX = 1e-4
TH_MEAN = 5e-5
TH_PARAM = 1e-4

run_env = os.environ.get("RUN", "").strip()
if not run_env:
    print("[FAIL] RUN env var is empty")
    raise SystemExit(1)
root = Path(run_env)

ok = True
for idx in (1, 2):
    run_dir = root / f"parity_svg_row{idx}" / ".ck_pipeline"
    work_dirs = sorted([p for p in run_dir.glob("ascii_bpe_*") if p.is_dir()])
    if not work_dirs:
        print(f"[FAIL] row{idx}: missing {run_dir}/ascii_bpe_*")
        ok = False
        continue
    w = work_dirs[-1]
    ck = json.loads((w / "train_ck.json").read_text())
    pt = json.loads((w / "train_torch_ref.json").read_text())
    c = [float(x["loss_ck"]) for x in ck.get("loss_curve", [])]
    t = [float(x["loss"]) for x in pt.get("loss_curve", [])]
    n = min(len(c), len(t))
    if n == 0:
        print(f"[FAIL] row{idx}: empty loss curves")
        ok = False
        continue
    diffs = [abs(c[i] - t[i]) for i in range(n)]
    max_abs = max(diffs)
    mean_abs = mean(diffs)
    final_param = float(ck.get("final_param_max_abs_diff", 1.0))
    passed = max_abs <= TH_MAX and mean_abs <= TH_MEAN and final_param <= TH_PARAM
    print(f"[row{idx}] max_abs={max_abs:.6e} mean_abs={mean_abs:.6e} final_param={final_param:.6e} pass={passed}")
    ok = ok and passed

print("CANARY_PARITY_GATE=PASS" if ok else "CANARY_PARITY_GATE=FAIL")
PY
```
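The thresholds compare per-step loss differences between the CK and PyTorch curves. A tiny worked example of the same arithmetic, using synthetic numbers rather than real run data:

```python
from statistics import mean

TH_MAX = 1e-4   # max allowed per-step |loss_ck - loss_torch|
TH_MEAN = 5e-5  # max allowed mean per-step difference

# Synthetic loss curves, illustrative only (not real run output).
ck_curve = [2.30000, 1.90000, 1.60003]
pt_curve = [2.30004, 1.89998, 1.60000]

diffs = [abs(a - b) for a, b in zip(ck_curve, pt_curve)]
max_abs = max(diffs)    # ~4e-5, within TH_MAX
mean_abs = mean(diffs)  # ~3e-5, within TH_MEAN
print(max_abs <= TH_MAX and mean_abs <= TH_MEAN)  # True
```

The real gate additionally checks `final_param_max_abs_diff` against `TH_PARAM`, which this toy example omits.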
3) Run Full Parity Regimen
```bash
python3 version/v7/scripts/run_training_parity_regimen_v7.py \
  --run-dir "$RUN" \
  --force
```
Inspect summary:
```bash
jq '.summary' "$RUN/training_parity_regimen_latest.json"
```
Inspect stage table quickly:
```bash
jq '.stages[] | {id,name,status,metrics,artifact_json,artifact_log}' \
  "$RUN/training_parity_regimen_latest.json"
```
Check generated-runtime stages:
```bash
jq '.stages[] | select(.id=="D1" or .id=="E1" or .id=="F1") | {id,status,metrics}' \
  "$RUN/training_parity_regimen_latest.json"
```
4) One-Shot GO Evaluation
```bash
python3 - <<'PY'
import json
import os
from pathlib import Path

run_env = os.environ.get("RUN", "").strip()
if not run_env:
    print("[FAIL] RUN env var is empty")
    raise SystemExit(1)
run = Path(run_env)

def load_json(path: Path):
    return json.loads(path.read_text()) if path.exists() else None

ds = load_json(run / "dataset_qc.json")
rt = load_json(run / "tokenizer_roundtrip.json")
reg = load_json(run / "training_parity_regimen_latest.json")

checks = {}
checks["dataset_qc_pass"] = bool(ds and ds.get("status") == "pass")
checks["tokenizer_exact_match"] = bool(rt and rt.get("status") == "pass" and rt.get("exact_match") is True)

d1e1f1_ok = False
if reg and isinstance(reg.get("stages"), list):
    st = {s.get("id"): s.get("status") for s in reg["stages"] if isinstance(s, dict)}
    d1e1f1_ok = st.get("D1") == "PASS" and st.get("E1") == "PASS" and st.get("F1") == "PASS"
checks["D1_E1_F1_pass"] = d1e1f1_ok

def row_pass(idx: int) -> bool:
    p = run / f"parity_svg_row{idx}" / "parity_pipeline.json"
    if not p.exists():
        return False
    j = json.loads(p.read_text())
    return bool(j.get("status") == "pass")

checks["canary_row1_row2_pass"] = row_pass(1) and row_pass(2)

go = all(checks.values())
print(json.dumps({"GO": go, "checks": checks}, indent=2))
print("GO_EVIDENCE=PASS" if go else "GO_EVIDENCE=FAIL")
PY
```
5) A1/A2 Caveat and Bug Track
Backend xray is produced by the same regimen run:
```bash
jq '.summary, .improvement' "$RUN/regimen_backend_xray.json"
```
Read suspected source:
```bash
jq '.summary.suspected_source, .summary.rationale' "$RUN/regimen_backend_xray.json"
```
Inspect first-step gradient drift:
```bash
jq '{step, global_max_abs_diff, global_mean_abs_diff, worst_tensor, top5: (.per_tensor|sort_by(-.max_abs_diff)|.[0:5])}' \
  "$RUN/regimen_debug_step_grads/step_00000001_grad_diff_summary.json"
```
Interpretation:
- A1/A2 failing does not block runbook execution readiness under this checklist.
- A1/A2 still blocks strict kernel-harness parity signoff.
- Track and fix A1/A2 in parallel while continuing operator runbook validation.
6) Operator Code Touchpoints
- Regimen orchestration: `version/v7/scripts/run_training_parity_regimen_v7.py`
- CK/PyTorch parity harness: `version/v7/scripts/train_parity_epochs_v7.py`
- RMSNorm kernels: `src/kernels/rmsnorm_kernels.c`
- SwiGLU kernels: `src/kernels/swiglu_kernels.c`
7) Go / No-Go
- dataset QC pass
- tokenizer exact roundtrip pass
- D1/E1/F1 pass
- canary row1/row2 pass

GO under this checklist does not require A1/A2 closure.
8) Exploratory Training
Exploratory means:
- training is functional and loss can decrease
- outputs can improve
- CK-vs-PyTorch numerical equivalence is not guaranteed over horizon
- behavior and regression conclusions are therefore provisional
Use exploratory mode for idea testing, not for final parity claims.