v7 Python Authoring Guide

Thin Python Front Door

The Python authoring layer is a guided front door into the existing v7 training pipeline. Python owns project specification, model/template/tokenizer planning, notebook UX, and authoring-side sidecars; the existing v7 scripts still own manifest creation, IR lowering, code generation, compiled C runtime execution, and viewer refresh.

Use this page when you want three things in one place: the notebook launch order, the exact Python syntax for both authoring surfaces, and the concrete handoff boundary into ck_run_v7.py.

Current scope: this is not a separate eager runtime and it is not a general Python autograd surface. Today it is a thin authoring and orchestration layer over the working v7 train/runtime stack.

Choose Your Route

Demo Route

Use this for onboarding, screenshots, or a live walkthrough of the current authoring story.

01 story -> 02 quickstart -> 04 artifact walkthrough

This route shows the smallest successful Python-authored run and then opens the run-dir artifact surface.

Module API Route

Use this when you want the thin ck.nn graph adapter and the exported graph/config/pass-trace sidecars.

01 story -> 02 quickstart -> 05 module API

This route is the best introduction to ck.models.qwen3_tiny(...) plus ck.v7.compile(...).

Dataset Route

Use this when the question is about SVG/DSL workspace staging, manifests, and run-local dataset artifacts.

01 story -> 03 dataset prep -> v7 SVG handoff

This route leads into the SVG dataset runbook when you need the full curriculum/data path.

Notebook Lane

Canonical folder: notebooks/python_authoring/v7_training/

Compatibility alias: notebooks/v7_training/ still points at the same lane for older docs and demo commands.

Open the full lane in JupyterLab:

.venv/bin/jupyter lab notebooks/python_authoring/v7_training/

Launch from the repo root so the notebooks can resolve ckernel_engine/, version/v7/, and the site/run artifact paths correctly.
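
Because the notebooks resolve paths relative to the launch directory, a quick pre-flight check can catch a wrong working directory before Jupyter starts. The following is a minimal sketch, not part of the repo: the helper names are hypothetical, and it only checks for the two directories named above (ckernel_engine/ and version/v7/).

```python
from pathlib import Path

# Directories the guide says the notebooks resolve relative to the repo root.
REQUIRED_DIRS = ("ckernel_engine", "version/v7")


def missing_repo_dirs(root: Path) -> list[str]:
    """Return the required directories that do not exist under `root`."""
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


def looks_like_repo_root(root: Path) -> bool:
    """True when every required directory is present under `root`."""
    return not missing_repo_dirs(root)


if __name__ == "__main__":
    cwd = Path.cwd()
    missing = missing_repo_dirs(cwd)
    if missing:
        print(f"{cwd} does not look like the repo root; missing: {missing}")
    else:
        print(f"{cwd} looks like the repo root")
```

Running this from the directory where you intend to launch JupyterLab should report no missing directories.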

01. Experiment Story Walkthrough

Presenter-first overview of the spec02 -> spec19 arc, the current training surface, and the live-demo handoff into the rest of the lane.

Tags: story, history, demo opener

.venv/bin/jupyter lab notebooks/python_authoring/v7_training/01_v7_experiment_story_walkthrough.ipynb

02. Python Authoring Quickstart

Smallest end-to-end Python-authored run: materialize -> train -> prepare_viewers() with a run artifact dashboard.

Tags: quickstart, TrainingProject, artifact dashboard

.venv/bin/jupyter lab notebooks/python_authoring/v7_training/02_v7_python_authoring_quickstart.ipynb

03. DSL Dataset Preparation

Workspace inspection, artifact materialization, staging into $RUN/dataset/, and dataset_viewer.html refresh for the SVG/DSL flow.

Tags: dataset, workspace, staging

.venv/bin/jupyter lab notebooks/python_authoring/v7_training/03_v7_dsl_dataset_preparation.ipynb

04. Artifact Walkthrough

Run-dir inspection notebook for python_authoring_plan.json, manifests, IR, layout, codegen, and generated reports.

Tags: IR, layout, codegen

.venv/bin/jupyter lab notebooks/python_authoring/v7_training/04_v7_python_authoring_artifact_walkthrough.ipynb

05. Module API Quickstart

Thin ck.nn graph capture plus ck.v7.compile(...), then the same current v7 materialize/train/viewer flow with graph/config/pass-trace sidecars.

Tags: ck.nn, compile(), sidecars

.venv/bin/jupyter lab notebooks/python_authoring/v7_training/05_v7_python_module_api_quickstart.ipynb

Authoring Syntax

1. Template-First Python UI

This surface is the most explicit. You author the tiny run contract directly, then call the existing v7 actions from Python.

from ckernel_engine.v7 import (
    DataSource,
    MaterializeOptions,
    TemplateSpec,
    TinyModelSpec,
    TokenizerPlan,
    TrainConfig,
    TrainingProject,
)

project = TrainingProject(
    run_name="python-ui-demo",
    model=TinyModelSpec(
        init="xavier_uniform",
        layers=2,
        vocab_size=256,
        embed_dim=128,
        hidden_dim=256,
        num_heads=8,
        num_kv_heads=4,
        context_len=128,
    ),
    template=TemplateSpec.builtin_template("qwen3"),
    tokenizer=TokenizerPlan(
        family="runtime_default",
        notes="Keep tokenizer ownership in the existing v7 runtime.",
    ),
)

project.materialize(MaterializeOptions(generate_ir=True, generate_runtime=True, strict=True))
result = project.train(
    DataSource.inline_text("C-Kernel-Engine from Python."),
    TrainConfig(backend="ck", strict=True, epochs=1, seq_len=8, total_tokens=64),
)
viewer = project.prepare_viewers()

Use this when you want direct control over the tiny model/template/tokenizer spec rather than a symbolic module graph.
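
The head counts in the spec above imply a grouped-query attention layout, and it can help to work out the per-head sizes before materializing. The sketch below is standalone arithmetic, not a ckernel_engine API. It assumes the common convention head_dim = embed_dim // num_heads; the v7 runtime or the qwen3 template may define head size differently, so treat the numbers as a sanity check only.

```python
def derive_attention_shapes(embed_dim: int, num_heads: int, num_kv_heads: int) -> dict:
    """Derive per-head sizes for a grouped-query attention layout.

    Assumption: head_dim = embed_dim // num_heads (a common convention,
    not confirmed for the v7 runtime).
    """
    if embed_dim % num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be divisible by num_kv_heads")
    head_dim = embed_dim // num_heads
    return {
        "head_dim": head_dim,
        # How many query heads share each key/value head.
        "queries_per_kv_head": num_heads // num_kv_heads,
        # Total width of the key (or value) projection across all KV heads.
        "kv_dim": head_dim * num_kv_heads,
    }


# Values from the TinyModelSpec example: embed_dim=128, num_heads=8, num_kv_heads=4.
shapes = derive_attention_shapes(embed_dim=128, num_heads=8, num_kv_heads=4)
print(shapes)  # {'head_dim': 16, 'queries_per_kv_head': 2, 'kv_dim': 64}
```

Under that assumption, the example spec gives 16-wide heads with two query heads per KV head.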

2. Module-First ck.nn Adapter

This surface captures a supported tiny qwen-style graph, records authoring-side metadata, and compiles it into the same existing v7 pipeline.

import ckernel_engine as ck

model = ck.models.qwen3_tiny(
    vocab=256,
    dim=128,
    layers=2,
    hidden=256,
    heads=8,
    kv_heads=4,
    context_len=128,
    init="xavier_uniform",
)

run = ck.v7.compile(
    model,
    run_name="py-module-demo",
    family="qwen3",
    config=ck.CompileConfig(
        target=ck.TargetConfig(name="cpu", isa="auto"),
        vectorize=True,
        pack_weights=True,
        dump_pass_trace=True,
        kernel_policy="fp32_reference_first",
    ),
)

run.materialize()
report = run.train(
    "C-Kernel-Engine module API example.",
    ck.v7.TrainConfig(backend="ck", strict=True, epochs=1, seq_len=8, total_tokens=64),
)
viewer = run.prepare_viewers()

Use this when you want python_authoring_graph.json, python_authoring_compile_config.json, and python_authoring_pass_trace.json beside the normal run outputs.

Important: CompileConfig is mostly recorded as authoring/lowering intent today. The concrete lowering, kernel binding, generated C runtime codegen, compilation, and execution still belong to the existing v7 scripts.

What Happens Under the Hood

Python Call	What It Triggers	Primary Outputs
`materialize()`	Calls `ck_run_v7.py init` with the authored tiny-model/template contract.	`weights_manifest.json`, `weights.bump`, `ir1_train_forward.json`, `ir2_train_backward.json`, `generated_train_runtime_v7.c`
`train()`, `sanity()`, `parity()`	Calls the corresponding `ck_run_v7.py` action with Python-authored train/data settings.	`train_e2e_latest.json` or peer report, training curves, runtime artifacts, checkpoints when enabled
`prepare_viewers()`	Runs `open_ir_visualizer.py`, `prepare_run_viewer.py`, and `open_ir_hub.py`.	`ir_report.html`, `embeddings.json`, optional `dataset_viewer.html`, shared `ir_hub.html`
`ck.v7.compile(...)`	Captures the supported `ck.nn` graph and materializes authoring sidecars before the normal `v7` handoff.	`python_authoring_graph.json`, `python_authoring_graph.md`, `python_authoring_compile_config.json`, `python_authoring_pass_trace.json`
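
One way to use the table above is as a checklist: after each call, confirm that its primary outputs exist. The sketch below is a hypothetical helper, not part of ckernel_engine. The file names are copied from the table; the run-directory layout is not specified here, so the helper searches recursively, and it leaves out optional or variable outputs (peer reports, checkpoints, dataset_viewer.html) as well as the shared ir_hub.html, which lives outside the run directory.

```python
from pathlib import Path

# File names copied from the table above, grouped by the call that produces them.
EXPECTED_OUTPUTS = {
    "compile": [
        "python_authoring_graph.json",
        "python_authoring_graph.md",
        "python_authoring_compile_config.json",
        "python_authoring_pass_trace.json",
    ],
    "materialize": [
        "weights_manifest.json",
        "weights.bump",
        "ir1_train_forward.json",
        "ir2_train_backward.json",
        "generated_train_runtime_v7.c",
    ],
    "train": ["train_e2e_latest.json"],
    "prepare_viewers": ["ir_report.html", "embeddings.json"],
}


def missing_outputs(run_dir: Path) -> dict[str, list[str]]:
    """Map each stage to the expected files not found anywhere under run_dir."""
    present = {p.name for p in run_dir.rglob("*") if p.is_file()}
    return {
        stage: [name for name in names if name not in present]
        for stage, names in EXPECTED_OUTPUTS.items()
    }
```

An empty list for a stage suggests that call completed and wrote its primary outputs. The "compile" entries are only expected on runs created through the module-first surface.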

Artifacts to Inspect First

Artifact	Why It Matters
`python_authoring_plan.json`	Best first file for command history, authored config, and resolved artifact locations.
`python_authoring_graph.json`	Module graph exported by the `ck.nn` adapter before the normal `v7` handoff.
`python_authoring_compile_config.json`	Recorded target/pass intent from the authoring side.
`weights_manifest.json`	Source of truth for run dimensions, optimizer settings, and train/runtime shape contracts.
`ir1_train_forward.json` / `ir2_train_backward.json`	The lowered train graphs the current `v7` scripts actually execute.
`generated_train_runtime_v7.c`	Concrete generated C runtime emitted from the lowered training plan.
`ir_report.html`	Main visualizer for the current run.
`ir_hub.html`	Cross-run dashboard under the parent models root.
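
Since python_authoring_plan.json is the suggested first stop but its schema is not described on this page, a schema-agnostic peek is a reasonable way to start. The helper below is a hypothetical sketch that assumes nothing about the file beyond it being valid JSON, and it works equally well on the other JSON artifacts listed above.

```python
import json
from pathlib import Path


def summarize_json(path: Path, max_keys: int = 20) -> list[str]:
    """Return one 'key: type' line per top-level entry of a JSON file.

    No schema is assumed; this only reports what the file contains.
    """
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        return [f"<top level is {type(data).__name__}>"]
    items = list(data.items())[:max_keys]
    return [f"{key}: {type(value).__name__}" for key, value in items]
```

For example, printing each line of summarize_json(run_dir / "python_authoring_plan.json"), where run_dir is a Path to your own run directory, shows which top-level sections the plan contains before you open it in full.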

Current Boundary

What Works Now

Tiny-model authoring from Python through the existing v7 runtime and training path.
Notebook-driven materialize, train, and viewer refresh workflows.
Thin ck.nn graph capture for supported qwen-style tiny LM topologies.
Authoring-side sidecars for graph, compile intent, and pass trace.

What Is Still Out of Scope

Arbitrary Python-defined autograd graphs lowered directly into v7 IR.
A general eager tensor runtime or PyTorch-compatible authoring surface.
Public custom op/backprop registration for arbitrary module graphs.
Guaranteed dataset_viewer.html or attention.json on runs that do not include dataset/tokenizer/probe artifacts.

Current Module Adapter Contract

ck.v7.compile(...) currently targets supported tiny qwen-style LM graphs.
The working preset families are qwen3 and qwen35.
The body is expected to be Embedding -> TransformerBlock* -> RMSNorm -> Linear.
For the current qwen-style path, blocks must use activation="swiglu".
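
The contract above can be expressed as a small structural check. The sketch below is illustrative only: it does not use or mirror the real ck.nn classes, and it represents each layer as a plain dict with hypothetical "type" and "activation" keys. It reads TransformerBlock* as zero or more blocks.

```python
import re

# One letter per layer kind so the contract can be written as a regular expression.
_CODES = {"Embedding": "E", "TransformerBlock": "T", "RMSNorm": "N", "Linear": "L"}
_BODY = re.compile(r"ET*NL")


def check_body(layers: list[dict]) -> list[str]:
    """Return contract violations for a list of layer dicts.

    Each dict is a hypothetical stand-in such as
    {"type": "TransformerBlock", "activation": "swiglu"}.
    """
    problems = []
    # Unknown layer types encode as "?", which can never match the pattern.
    encoded = "".join(_CODES.get(layer.get("type"), "?") for layer in layers)
    if not _BODY.fullmatch(encoded):
        problems.append("body is not Embedding -> TransformerBlock* -> RMSNorm -> Linear")
    for index, layer in enumerate(layers):
        if layer.get("type") == "TransformerBlock" and layer.get("activation") != "swiglu":
            problems.append(f"block at index {index} does not use activation='swiglu'")
    return problems
```

An empty result means the layer list satisfies the shape and activation rules stated above; it says nothing about the other checks ck.v7.compile(...) may perform.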

CLI Entrypoints

Example Scripts

python3 version/v7/examples/python_authoring_tiny_lm_v7.py --run-name py-ui-demo
python3 version/v7/examples/python_module_api_tiny_lm_v7.py --run-name py-module-demo

These are the fastest non-notebook entrypoints for smoke-testing the two Python authoring surfaces.
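
If you want to run both example scripts in one step, for instance from a local check script, they can be wrapped with subprocess. This is a hedged sketch rather than a repo utility: the script paths and the --run-name flag come from the commands above, the helper names are hypothetical, and it assumes the scripts signal failure through a non-zero exit code.

```python
import subprocess
import sys
from pathlib import Path

# Run name -> script path, both taken from the commands above.
EXAMPLES = {
    "py-ui-demo": "version/v7/examples/python_authoring_tiny_lm_v7.py",
    "py-module-demo": "version/v7/examples/python_module_api_tiny_lm_v7.py",
}


def build_commands(python: str = sys.executable) -> list[list[str]]:
    """Return one argv list per example script."""
    return [[python, script, "--run-name", run_name] for run_name, script in EXAMPLES.items()]


def run_smoke_tests(repo_root: Path) -> dict[str, int]:
    """Run each example from repo_root and return exit codes keyed by run name.

    Scripts that do not exist under repo_root are skipped.
    """
    results = {}
    for argv in build_commands():
        if not (repo_root / argv[1]).is_file():
            continue
        results[argv[-1]] = subprocess.run(argv, cwd=repo_root).returncode
    return results
```

Calling run_smoke_tests(Path.cwd()) from the repo root returns a dict of exit codes; an empty dict means neither script was found at the expected path.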

v7 Inference + Training Runbook: operator workflow for the main v7 path.
v7 SVG Dataset Runbook: dataset generation and staging for the SVG/DSL path.
Architecture Links: index of related runbooks and architecture pages.