v7 Python Authoring Guide

Thin Python Front Door

The Python authoring layer is a guided front door into the existing v7 training pipeline. Python owns project specification, model/template/tokenizer planning, notebook UX, and authoring-side sidecars; the existing v7 scripts still own manifest creation, IR lowering, code generation, compiled C runtime execution, and viewer refresh.

Use this page when you want three things in one place: the notebook launch order, the exact Python syntax for both authoring surfaces, and the concrete handoff boundary into ck_run_v7.py.

Current scope: this is neither a separate eager runtime nor a general Python autograd surface. Today it is a thin authoring and orchestration layer over the working v7 train/runtime stack.

Choose Your Route

Demo Route

Use this for onboarding, screenshots, or a live walkthrough of the current authoring story.

01 story -> 02 quickstart -> 04 artifact walkthrough

This route shows the smallest successful Python-authored run and then opens the run-dir artifact surface.

Module API Route

Use this when you want the thin ck.nn graph adapter and the exported graph/config/pass-trace sidecars.

01 story -> 02 quickstart -> 05 module API

This route is the best introduction to ck.models.qwen3_tiny(...) plus ck.v7.compile(...).

Dataset Route

Use this when the question is about SVG/DSL workspace staging, manifests, and run-local dataset artifacts.

01 story -> 03 dataset prep -> v7 SVG handoff

This route leads into the SVG dataset runbook when you need the full curriculum/data path.

Notebook Lane

Canonical folder: notebooks/python_authoring/v7_training/

Compatibility alias: notebooks/v7_training/ still points at the same lane for older docs and demo commands.

Open the full lane in JupyterLab:

.venv/bin/jupyter lab notebooks/python_authoring/v7_training/

Launch from the repo root so the notebooks can resolve ckernel_engine/, version/v7/, and the site/run artifact paths correctly.
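A quick pre-flight check can confirm that the current directory is the repo root before JupyterLab is launched. The sketch below is a hypothetical helper, not part of ckernel_engine; it only assumes that the two directories named above exist at the repo root.

```python
from pathlib import Path

# Directories this guide says the notebooks resolve relative to the repo root.
# The helper itself is hypothetical and not shipped with ckernel_engine.
EXPECTED_DIRS = ("ckernel_engine", "version/v7")


def missing_repo_dirs(root="."):
    """Return the expected directories that are absent under `root`."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]


if __name__ == "__main__":
    missing = missing_repo_dirs()
    if missing:
        print("Not at the repo root; missing:", ", ".join(missing))
    else:
        print("Repo root looks right; launch JupyterLab from here.")
```

An empty list means both directories were found, so the relative paths used by the notebooks should resolve.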

01. Experiment Story Walkthrough

Presenter-first overview of the spec02 -> spec19 arc, the current training surface, and the live-demo handoff into the rest of the lane.

Tags: story, history, demo opener
.venv/bin/jupyter lab notebooks/python_authoring/v7_training/01_v7_experiment_story_walkthrough.ipynb

02. Python Authoring Quickstart

Smallest end-to-end Python-authored run: materialize -> train -> prepare_viewers() with a run artifact dashboard.

Tags: quickstart, TrainingProject, artifact dashboard
.venv/bin/jupyter lab notebooks/python_authoring/v7_training/02_v7_python_authoring_quickstart.ipynb

03. DSL Dataset Preparation

Workspace inspection, artifact materialization, staging into $RUN/dataset/, and dataset_viewer.html refresh for the SVG/DSL flow.

Tags: dataset, workspace, staging
.venv/bin/jupyter lab notebooks/python_authoring/v7_training/03_v7_dsl_dataset_preparation.ipynb

04. Artifact Walkthrough

Run-dir inspection notebook for python_authoring_plan.json, manifests, IR, layout, codegen, and generated reports.

Tags: IR, layout, codegen
.venv/bin/jupyter lab notebooks/python_authoring/v7_training/04_v7_python_authoring_artifact_walkthrough.ipynb

05. Module API Quickstart

Thin ck.nn graph capture plus ck.v7.compile(...), then the same current v7 materialize/train/viewer flow with graph/config/pass-trace sidecars.

Tags: ck.nn, compile(), sidecars
.venv/bin/jupyter lab notebooks/python_authoring/v7_training/05_v7_python_module_api_quickstart.ipynb

Authoring Syntax

1. Template-First Python UI

This surface is the most explicit. You author the tiny run contract directly, then call the existing v7 actions from Python.

from ckernel_engine.v7 import (
    DataSource,
    MaterializeOptions,
    TemplateSpec,
    TinyModelSpec,
    TokenizerPlan,
    TrainConfig,
    TrainingProject,
)

project = TrainingProject(
    run_name="python-ui-demo",
    model=TinyModelSpec(
        init="xavier_uniform",
        layers=2,
        vocab_size=256,
        embed_dim=128,
        hidden_dim=256,
        num_heads=8,
        num_kv_heads=4,
        context_len=128,
    ),
    template=TemplateSpec.builtin_template("qwen3"),
    tokenizer=TokenizerPlan(
        family="runtime_default",
        notes="Keep tokenizer ownership in the existing v7 runtime.",
    ),
)

project.materialize(MaterializeOptions(generate_ir=True, generate_runtime=True, strict=True))
result = project.train(
    DataSource.inline_text("C-Kernel-Engine from Python."),
    TrainConfig(backend="ck", strict=True, epochs=1, seq_len=8, total_tokens=64),
)
viewer = project.prepare_viewers()

Use this when you want direct control over the tiny model/template/tokenizer spec rather than a symbolic module graph.
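Before materializing, it can help to sanity-check the numbers in the spec by hand. The sketch below uses a hypothetical stand-in dataclass (TinyShape) that mirrors a few TinyModelSpec and TrainConfig fields; the divisibility rules are the usual conventions for grouped-query attention and are an assumption here, not a confirmed list of checks that v7 performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TinyShape:
    """Hypothetical stand-in for the spec fields used in the example above."""
    embed_dim: int
    num_heads: int
    num_kv_heads: int
    context_len: int
    seq_len: int
    total_tokens: int


def shape_problems(s):
    """Return human-readable problems; an empty list means the shape is consistent."""
    problems = []
    if s.embed_dim % s.num_heads:
        problems.append("embed_dim is not divisible by num_heads")
    if s.num_heads % s.num_kv_heads:
        problems.append("num_heads is not divisible by num_kv_heads")
    if s.seq_len > s.context_len:
        problems.append("seq_len exceeds context_len")
    return problems


demo = TinyShape(embed_dim=128, num_heads=8, num_kv_heads=4,
                 context_len=128, seq_len=8, total_tokens=64)

print(shape_problems(demo))               # []
print(demo.embed_dim // demo.num_heads)   # 16: width of each attention head
print(demo.num_heads // demo.num_kv_heads)  # 2: query heads per KV head
print(demo.total_tokens // demo.seq_len)  # 8: full-length sequences in the token budget
```

With the values from the example, each head is 16 wide, two query heads share each KV head, and the 64-token budget covers eight sequences of length 8.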

2. Module-First ck.nn Adapter

This surface captures a supported tiny qwen-style graph, records authoring-side metadata, and compiles it into the same existing v7 pipeline.

import ckernel_engine as ck

model = ck.models.qwen3_tiny(
    vocab=256,
    dim=128,
    layers=2,
    hidden=256,
    heads=8,
    kv_heads=4,
    context_len=128,
    init="xavier_uniform",
)

run = ck.v7.compile(
    model,
    run_name="py-module-demo",
    family="qwen3",
    config=ck.CompileConfig(
        target=ck.TargetConfig(name="cpu", isa="auto"),
        vectorize=True,
        pack_weights=True,
        dump_pass_trace=True,
        kernel_policy="fp32_reference_first",
    ),
)

run.materialize()
report = run.train(
    "C-Kernel-Engine module API example.",
    ck.v7.TrainConfig(backend="ck", strict=True, epochs=1, seq_len=8, total_tokens=64),
)
viewer = run.prepare_viewers()

Use this when you want python_authoring_graph.json, python_authoring_compile_config.json, and python_authoring_pass_trace.json beside the normal run outputs.

Important: CompileConfig is mostly recorded as authoring/lowering intent today. The concrete lowering, kernel binding, generated C runtime codegen, compilation, and execution still belong to the existing v7 scripts.

What Happens Under the Hood

Python Call | What It Triggers | Primary Outputs
materialize() | Calls ck_run_v7.py init with the authored tiny-model/template contract. | weights_manifest.json, weights.bump, ir1_train_forward.json, ir2_train_backward.json, generated_train_runtime_v7.c
train(), sanity(), parity() | Calls the corresponding ck_run_v7.py action with Python-authored train/data settings. | train_e2e_latest.json or peer report, training curves, runtime artifacts, checkpoints when enabled
prepare_viewers() | Runs open_ir_visualizer.py, prepare_run_viewer.py, and open_ir_hub.py. | ir_report.html, embeddings.json, optional dataset_viewer.html, shared ir_hub.html
ck.v7.compile(...) | Captures the supported ck.nn graph and materializes authoring sidecars before the normal v7 handoff. | python_authoring_graph.json, python_authoring_graph.md, python_authoring_compile_config.json, python_authoring_pass_trace.json
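The mapping above can be turned into a small post-run check that reports which expected outputs are missing for each stage. This is a minimal sketch: the helper name is hypothetical, and because this page does not state where inside the run directory each file lands, it searches the whole tree by filename. It leaves out the optional dataset_viewer.html, the shared ir_hub.html, and the train-stage reports, whose names vary.

```python
from pathlib import Path

# Filenames copied from the table above, grouped by the Python call that
# produces them. Optional, shared, and variable-name outputs are omitted.
EXPECTED_OUTPUTS = {
    "materialize()": [
        "weights_manifest.json",
        "weights.bump",
        "ir1_train_forward.json",
        "ir2_train_backward.json",
        "generated_train_runtime_v7.c",
    ],
    "prepare_viewers()": ["ir_report.html", "embeddings.json"],
    "ck.v7.compile(...)": [
        "python_authoring_graph.json",
        "python_authoring_graph.md",
        "python_authoring_compile_config.json",
        "python_authoring_pass_trace.json",
    ],
}


def missing_outputs(run_dir):
    """Map each stage to the expected filenames not found anywhere under run_dir."""
    present = {p.name for p in Path(run_dir).rglob("*") if p.is_file()}
    return {
        stage: [name for name in names if name not in present]
        for stage, names in EXPECTED_OUTPUTS.items()
    }
```

A stage whose list comes back empty has produced every file the table attributes to it; the ck.v7.compile(...) entries only apply to runs authored through the module API.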

Artifacts to Inspect First

Artifact | Why It Matters
python_authoring_plan.json | Best first file for command history, authored config, and resolved artifact locations.
python_authoring_graph.json | Module graph exported by the ck.nn adapter before the normal v7 handoff.
python_authoring_compile_config.json | Recorded target/pass intent from the authoring side.
weights_manifest.json | Source of truth for run dimensions, optimizer settings, and train/runtime shape contracts.
ir1_train_forward.json / ir2_train_backward.json | The lowered train graphs the current v7 scripts actually execute.
generated_train_runtime_v7.c | Concrete generated C runtime emitted from the lowered training plan.
ir_report.html | Main visualizer for the current run.
ir_hub.html | Cross-run dashboard under the parent models root.
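For a first look at the JSON artifacts, listing their top-level keys is often enough to decide which file to open in full. The sketch below makes no assumption about the schema of any of these files (this page does not document it); the helper names are hypothetical, and files are located by name anywhere under the run directory.

```python
import json
from pathlib import Path

# JSON artifacts from the list above, in the suggested reading order.
FIRST_LOOK = [
    "python_authoring_plan.json",
    "python_authoring_graph.json",
    "python_authoring_compile_config.json",
    "weights_manifest.json",
]


def summarize_json(path):
    """Return sorted top-level keys for a JSON object, else the value's type name."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return sorted(data)
    return [type(data).__name__]


def first_look(run_dir):
    """Summarize each first-look artifact that exists under run_dir."""
    found = {}
    for name in FIRST_LOOK:
        matches = sorted(Path(run_dir).rglob(name))
        if matches:
            found[name] = summarize_json(matches[0])
    return found
```

Artifacts that are absent, such as the graph and compile-config sidecars on a template-first run, are simply skipped.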

Current Boundary

What Works Now

  • Tiny-model authoring from Python through the existing v7 runtime and training path.
  • Notebook-driven materialize, train, and viewer refresh workflows.
  • Thin ck.nn graph capture for supported qwen-style tiny LM topologies.
  • Authoring-side sidecars for graph, compile intent, and pass trace.

What Is Still Out of Scope

  • Arbitrary Python-defined autograd graphs lowered directly into v7 IR.
  • A general eager tensor runtime or PyTorch-compatible authoring surface.
  • Public custom op/backprop registration for arbitrary module graphs.
  • Guaranteed dataset_viewer.html or attention.json on runs that do not include dataset/tokenizer/probe artifacts.

Current Module Adapter Contract

  • ck.v7.compile(...) currently targets supported tiny qwen-style LM graphs.
  • The working preset families are qwen3 and qwen35.
  • The body is expected to be Embedding -> TransformerBlock* -> RMSNorm -> Linear.
  • For the current qwen-style path, blocks must use activation="swiglu".
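The contract above can be read as a simple pattern over layer types. The sketch below is an illustration only, not the validation that ck.v7.compile(...) performs: the function and its inputs are hypothetical, and it reads TransformerBlock* as "zero or more blocks", which is an assumption about the notation.

```python
import re

# Preset families named in the contract above.
SUPPORTED_FAMILIES = {"qwen3", "qwen35"}

# Embedding -> TransformerBlock* -> RMSNorm -> Linear, over a comma-joined
# list of layer type names. "*" is read as zero or more (an assumption).
_BODY = re.compile(r"Embedding(,TransformerBlock)*,RMSNorm,Linear")


def contract_problems(layer_types, activations, family):
    """Return a list of contract violations; an empty list means the graph fits."""
    problems = []
    if family not in SUPPORTED_FAMILIES:
        problems.append(f"unsupported family: {family}")
    if not _BODY.fullmatch(",".join(layer_types)):
        problems.append("body is not Embedding -> TransformerBlock* -> RMSNorm -> Linear")
    bad = [a for a in activations if a != "swiglu"]
    if bad:
        problems.append(f"non-swiglu block activations: {bad}")
    return problems


layers = ["Embedding", "TransformerBlock", "TransformerBlock", "RMSNorm", "Linear"]
print(contract_problems(layers, ["swiglu", "swiglu"], "qwen3"))  # []
```

A two-block graph with swiglu activations in the qwen3 family passes; swapping in a different activation, family, or layer order produces one message per violated rule.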

CLI Entrypoints

Example Scripts

python3 version/v7/examples/python_authoring_tiny_lm_v7.py --run-name py-ui-demo
python3 version/v7/examples/python_module_api_tiny_lm_v7.py --run-name py-module-demo

These are the fastest non-notebook entrypoints for smoke-testing the two Python authoring surfaces.
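To run both smoke tests from one place, the two commands can be wrapped in a small driver. This is a sketch with hypothetical helper names; the script paths and the --run-name flag are copied from the commands above, and the default interpreter is whichever Python runs the driver (pass "python3" to match the commands exactly). It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script path -> run name, copied from the two commands above.
EXAMPLES = {
    "version/v7/examples/python_authoring_tiny_lm_v7.py": "py-ui-demo",
    "version/v7/examples/python_module_api_tiny_lm_v7.py": "py-module-demo",
}


def smoke_commands(python=sys.executable or "python3"):
    """Build the argv list for each example script."""
    return [
        [python, script, "--run-name", run_name]
        for script, run_name in EXAMPLES.items()
    ]


def run_smoke(dry_run=True):
    """Print each command; when dry_run is False, also run it and stop on failure."""
    for cmd in smoke_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


run_smoke()  # dry run: prints the two commands without executing them
```

Call run_smoke(dry_run=False) from the repo root to execute both scripts in order; check=True raises on the first non-zero exit code.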

