v7 Visualizer Test Runbook | C-Kernel-Engine

Visualizer Test Runbook

Four-level verification pyramid for the IR visualizer, dataset viewer, and IR hub. Catches broken tabs, missing functions, numeric regressions, and rendering failures — all without npm or Playwright.

No package dependencies: only Python 3.8+ and Node.js are required. L1 and L2 run from make v7-visualizer-health; L3 runs from make visualizer-full or the nightly target.

Level	Name	Scope	Runtime
L1	Static Health	Tabs, panels, render functions, required functions, DOM targets, JS syntax (node --check)	< 1 s
L2	JS Unit Tests	Pure functions extracted and run through Node.js with test vectors: formatBytes, attnColor, cosineSim, embNormalise…	< 2 s
L3	Generated E2E	Generate tiny model → build IR visualizer and dataset viewer → validate generated HTML	~ 60 s
L4	Browser Runtime	Open generated HTML in headless browser; verify tabs render, click interactions, canvas output	future

Test Flow

L1 Static Health → L2 JS Units → L3 Generate Model → L3 Build Visualizers → L3 Validate Output → L4 Browser

L1+L2 run in pre-push (< 3 s). L3 runs in make visualizer-full or nightly. L4 is future work.

L1 Static Health Gate (pre-push)

Python static analysis of HTML source templates. Zero runtime — no browser, no model needed. Catches missing tabs, undefined functions, broken DOM targets.

python3 version/v7/scripts/test_visualizer_health_v7.py --source

Or via Makefile (runs L1 + L2 together):

make v7-visualizer-health

What It Checks

Category	Visualizer	Checks	Catches
Tab Existence	IR (11) · Dataset (12)	23	Tab button removed or renamed without updating HTML
Panel Existence	IR (11) · Dataset (12)	23	Panel container missing for a tab
Render Functions	IR (10) · Dataset (12)	22	Tab has no matching render function
Required Functions	IR (19) · Dataset (30)	49	Missing attnColor, setElText, embNormalise, etc.
Undefined Call Detection	IR · Dataset · Hub	3	Calling a function that doesn’t exist (the attnColor bug)
DOM Target Coverage	IR	7+	getElementById() with no matching element
JS Syntax	IR · Dataset	1	Syntax errors via `node --check`
Hub Structure	Hub	5	Missing run-card, link templates

Total: ~151 checks across 3 visualizers. All must pass for pre-push.
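The tab, panel, and required-function rows above can be sketched as simple string and regex checks. This is a minimal illustration, not the contents of test_visualizer_health_v7.py: the markup conventions (data-tab="..." buttons, id="panel-..." containers), the panel_exists check name, and the helper names function_defined and check_contract are assumptions.

```python
import re

def function_defined(source: str, name: str) -> bool:
    """True if `name` is declared as `function name(` or bound via const/let/var."""
    n = re.escape(name)
    pattern = rf"\bfunction\s+{n}\s*\(|\b(?:const|let|var)\s+{n}\s*="
    return re.search(pattern, source) is not None

def check_contract(html: str, tabs, required_fns):
    """Return {check_name: passed} for one HTML template."""
    results = {}
    for tab in tabs:
        # Assumed markup: <button data-tab="NAME"> and <div id="panel-NAME">
        results[f"tab_exists:{tab}"] = f'data-tab="{tab}"' in html
        results[f"panel_exists:{tab}"] = f'id="panel-{tab}"' in html
    for fn in required_fns:
        results[f"required_fn:{fn}"] = function_defined(html, fn)
    return results

# Tiny stand-in template: one tab present, one helper missing.
SAMPLE = """
<button data-tab="attention">Attention</button>
<div id="panel-attention"></div>
<script>
function attnColor(v) { return v; }
const setElText = (id, t) => {};
</script>
"""

results = check_contract(
    SAMPLE,
    tabs=["attention", "memory"],
    required_fns=["attnColor", "setElText", "embNormalise"],
)
failed = sorted(name for name, ok in results.items() if not ok)
print(failed)
# ['panel_exists:memory', 'required_fn:embNormalise', 'tab_exists:memory']
```

Keeping each check as a named boolean makes it straightforward to report failures in the tab_exists:attention style used in the Failure Playbook.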

Produces → visualizer_health_latest.json
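Undefined-call detection (the attnColor bug class in the table above) can be approximated by comparing the set of bare call sites against the set of definitions. The sketch below is a rough heuristic under stated assumptions: the keyword and global allow-lists are illustrative, the helper names are hypothetical, and it will misreport calls to function parameters or text inside strings and comments.

```python
import re

# Illustrative allow-lists; a real check would need a fuller set.
JS_KEYWORDS = {"if", "for", "while", "switch", "catch", "function", "return", "typeof"}
KNOWN_GLOBALS = {"parseInt", "parseFloat", "isNaN", "Number", "String", "Array", "setTimeout"}

IDENT = r"[A-Za-z_$][\w$]*"

def defined_functions(js: str) -> set:
    """Names introduced by `function name(` or `const/let/var name =`."""
    names = set(re.findall(rf"\bfunction\s+({IDENT})\s*\(", js))
    names |= set(re.findall(rf"\b(?:const|let|var)\s+({IDENT})\s*=", js))
    return names

def called_functions(js: str) -> set:
    """Bare identifiers followed by `(`; the lookbehind skips obj.method( calls."""
    return set(re.findall(rf"(?<![\w$.])({IDENT})\s*\(", js))

def undefined_calls(js: str) -> set:
    return called_functions(js) - defined_functions(js) - JS_KEYWORDS - KNOWN_GLOBALS

SAMPLE_JS = """
function renderAttention(m) {
  const cell = document.getElementById("attn");
  cell.style.background = attnColor(m[0][0]);
  setElText("attn-title", "Attention");
}
function setElText(id, text) { document.getElementById(id).textContent = text; }
"""

print(undefined_calls(SAMPLE_JS))  # {'attnColor'}
```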

L2 JS Unit Tests (pre-push)

Extracts pure JavaScript functions from the IR visualizer and dataset viewer generator source, writes them to a temp .js file with test vectors, and runs via node. No npm, no bundler, no browser.

python3 version/v7/scripts/test_visualizer_js_units_v7.py
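The extraction step can be sketched with plain brace matching. This is an assumption about the general approach, not the script's actual code: extract_function is a hypothetical helper, the formatBytes body is a stand-in, and the matcher does not handle braces inside strings, comments, or destructured parameters.

```python
import re

def extract_function(source: str, name: str) -> str:
    """Return the text of `function name(...) { ... }` using brace matching."""
    match = re.search(rf"\bfunction\s+{re.escape(name)}\s*\(", source)
    if match is None:
        raise KeyError(f"function {name} not found")
    body_start = source.index("{", match.end())  # first brace after the parameter list
    depth = 0
    for pos in range(body_start, len(source)):
        ch = source[pos]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return source[match.start():pos + 1]
    raise ValueError(f"unbalanced braces in {name}")

# Stand-in source; the real visualizer template is much larger.
SOURCE = """
<script>
function formatBytes(n) {
  if (n < 1024) { return n + " B"; }
  return (n / 1024).toFixed(1) + " KB";
}
function other() { return 1; }
</script>
"""

snippet = extract_function(SOURCE, "formatBytes")
print(snippet)
```

A failure at this stage corresponds to the ir:extract:formatBytes row in the Failure Playbook: the function could not be isolated from source, so no test vector ever runs.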

IR Visualizer Functions

Function	Tests	What It Validates
`formatBytes`	4	B/KB/MB/GB formatting
`normalizeShapeInput`	7	Array/string/object → normalised shape array
`formatShapeDisplay`	3	Shape → “2 × 3 × 4” display string
`normalizeMode`	3	Mode canonicalization (prefill/decode)
`escapeHtml`	3	XSS-safe HTML entity escaping
`quoteShell`	3	Shell-safe quoting for command generation
`normalizePathString`	2	Backslash → forward slash, trailing strip
`pathDirname`	3	POSIX parent-directory extraction
`extractGgufStem`	3	Model filename stem from path/URL
`relativePathFromTo`	4	Relative path computation between absolute paths
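When checking test vectors by hand, Python's standard library offers close analogues for several of these helpers. This is only a comparison aid: the JavaScript implementations are not shown here and may differ in detail (for example in which characters they escape or how they quote). The format_shape_display and normalize_path_string helpers are hypothetical re-statements of the table descriptions.

```python
import html
import posixpath
import shlex

# escapeHtml analogue: html.escape replaces &, <, > and, by default, quotes.
print(html.escape('<img src=x onerror="alert(1)">'))
# &lt;img src=x onerror=&quot;alert(1)&quot;&gt;

# quoteShell analogue: shlex.quote wraps unsafe strings in single quotes.
print(shlex.quote("my model.gguf"))
# 'my model.gguf'

# pathDirname analogue: POSIX parent-directory extraction.
print(posixpath.dirname("/models/train/run1/ir_report.html"))
# /models/train/run1

def format_shape_display(shape):
    """Shape -> '2 × 3 × 4' display string, as described in the table."""
    return " × ".join(str(dim) for dim in shape)

def normalize_path_string(path: str) -> str:
    """Backslash -> forward slash, then strip trailing slashes (assumed semantics)."""
    return path.replace("\\", "/").rstrip("/")

print(format_shape_display([2, 3, 4]))            # 2 × 3 × 4
print(normalize_path_string("C:\\models\\run1\\"))  # C:/models/run1
```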

Dataset Viewer Functions

Function	Tests	What It Validates
`attnColor`	9	Attention heatmap colormaps (orange/blue/green/heatmap)
`embColor`	3	Embedding heatmap blue→mid→orange interpolation
`cosineSim`	4	Cosine similarity: identical, orthogonal, opposite, scaled
`attnEntropy`	4	Attention entropy: uniform, peaked, with zeros
`avgMatrices`	3	Matrix averaging, single, null safety
`embNormalise`	4	Global/col/row normalisation modes + null guard

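Total: 78 unit tests across 16 pure functions. The tables above list 62 test vectors; the remaining 16 are consistent with one extraction check per function (e.g., ir:extract:formatBytes). Functions are extracted from source and run via Node.js.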
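The cosineSim and attnEntropy cases named above (identical, orthogonal, opposite, scaled; uniform, peaked, with zeros) have well-known expected values, which a small Python reference can reproduce. This is a sketch for sanity-checking vectors, not a port of the JavaScript: the zero-vector convention and the use of the natural logarithm are assumptions.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_a * norm_b)

def attn_entropy(probs):
    """Shannon entropy in nats; zero entries are skipped (0 * log 0 treated as 0)."""
    return sum(-p * math.log(p) for p in probs if p > 0)

print(cosine_sim([3, 4], [3, 4]))    # identical  -> 1.0
print(cosine_sim([1, 0], [0, 1]))    # orthogonal -> 0.0
print(cosine_sim([3, 4], [-3, -4]))  # opposite   -> -1.0
print(cosine_sim([3, 4], [6, 8]))    # scaled     -> 1.0

uniform = attn_entropy([0.25, 0.25, 0.25, 0.25])  # equals ln(4)
peaked = attn_entropy([1.0, 0.0, 0.0, 0.0])       # equals 0
with_zeros = attn_entropy([0.5, 0.5, 0.0])        # equals ln(2)
```

If the JavaScript version uses log base 2, the entropy values differ only by a constant factor of 1/ln(2).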

Produces → visualizer_js_units_latest.json

L3 Generated-File E2E (nightly)

Generates a tiny model, runs inference and training, builds both the IR visualizer and dataset viewer, then runs L1 health checks on the generated HTML output. This catches template regressions where source looks fine but generated output is stale.

Step 3a — Generate & Train Tiny Model

# Initialize tiny model (vocab=256, d=64, layers=2)
python3 version/v7/scripts/ck_run_v7.py init \
  --run-name test_viz_e2e \
  --generate-ir --generate-runtime

# Quick sanity train (1 epoch, 1024 tokens, ~30 seconds)
python3 version/v7/scripts/ck_run_v7.py sanity \
  --run ~/.cache/ck-engine-v7/models/train/test_viz_e2e \
  --train-epochs 1 --train-total-tokens 1024

Step 3b — Generate Visualizers

# IR Visualizer (inference mode)
make visualizer
# IR Visualizer (inference + training)
make visualizer-full

Step 3c — Validate Generated Output

# Run health checks on source + all generated files
python3 version/v7/scripts/test_visualizer_health_v7.py --all

--all scans ~/.cache/ck-engine-v7/models/ for generated ir_report.html and dataset_viewer.html files and validates each against the same L1 contract used for the source templates (~151 checks in total). Stale files that are missing new required functions (e.g., setElText) will fail.
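The scan described above can be sketched with pathlib. The file names come from this runbook; everything else (the helper names, and treating a missing function name as the staleness signal) is an assumption for illustration. The demo uses a temporary directory rather than the real ~/.cache/ck-engine-v7/models/ tree.

```python
import tempfile
from pathlib import Path

GENERATED_NAMES = ("ir_report.html", "dataset_viewer.html")

def find_generated_reports(models_root: Path):
    """Yield generated HTML reports anywhere under models_root."""
    for name in GENERATED_NAMES:
        yield from sorted(models_root.rglob(name))

def stale_reports(models_root: Path, required_fn: str):
    """Return reports whose text never mentions required_fn (a crude staleness signal)."""
    return [
        path
        for path in find_generated_reports(models_root)
        if required_fn not in path.read_text(encoding="utf-8", errors="replace")
    ]

# Demo: one fresh run and one run built before setElText existed.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for run, body in [("fresh", "function setElText(){}"), ("old", "function attnColor(){}")]:
        run_dir = root / "train" / run
        run_dir.mkdir(parents=True)
        (run_dir / "ir_report.html").write_text(body, encoding="utf-8")
    stale = [path.parent.name for path in stale_reports(root, "setElText")]

print(stale)  # ['old']
```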

Or use the existing E2E harness

# Runs test_ir_visualizer_e2e_v7.py with training runtime
make visualizer-full

Produces → ir_report.html dataset_viewer.html ir_visualizer_e2e_latest.json

L4 Browser Runtime Smoke (future)

Headless browser validation using Playwright or Puppeteer. Not yet implemented: L1+L2+L3 currently cover contract and numeric correctness without a browser dependency, but not live-DOM behaviour.

When to add L4

Canvas rendering bugs that L1-L3 cannot catch (pixel-level attention heatmaps)
Tab-switch JavaScript errors that only manifest with a live DOM
CSS layout regressions (e.g., panels overlapping, scrollbars missing)
Interactive features: modal open/close, search/filter, copy buttons

Candidate Implementation

# Future: headless Playwright smoke test
# npx playwright test tests/visualizer-smoke.spec.ts
#   ✓ IR visualizer: all 11 tabs render non-empty content
#   ✓ Dataset viewer: all 12 tabs render non-empty content
#   ✓ Attention heatmap canvas has non-zero pixel data
#   ✓ Tab switch does not throw JS errors

Until L4 is needed, static analysis (L1), numeric unit tests (L2), and generated-file E2E (L3) together cover structural contracts, numeric correctness, and stale generated output.

Integration Map

Hook / Target	Levels	Command
`.githooks/pre-push` [0.5/6]	L1 + L2	`test_visualizer_health_v7.py --source --quiet && test_visualizer_js_units_v7.py --quiet`
`make v7-visualizer-health`	L1 + L2	Static health + JS unit tests with JSON reports
`make visualizer`	L3 (inference)	Generate + validate IR visualizer (inference mode)
`make visualizer-full`	L3 (train)	Generate + validate IR visualizer (inference + training)
`make v7-visualizer-e2e-nightly`	L3 (full)	Nightly: full E2E with training, skip inference parity
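The pre-push row chains L1 and L2 with &&, so L2 only runs if L1 passes. The same fail-fast ordering can be sketched in Python. The run_gate helper is hypothetical, and invoking the scripts with python3 from the repository root is an assumption based on the commands shown earlier in this runbook.

```python
import subprocess

# Commands mirror the pre-push row; script paths are taken from the L1 and L2 sections.
PRE_PUSH = [
    ["python3", "version/v7/scripts/test_visualizer_health_v7.py", "--source", "--quiet"],
    ["python3", "version/v7/scripts/test_visualizer_js_units_v7.py", "--quiet"],
]

def run_gate(commands, run=subprocess.run):
    """Run commands in order; stop at the first non-zero exit, like `a && b`."""
    for cmd in commands:
        result = run(cmd, check=False)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling run_gate(PRE_PUSH) from the repository root returns 0 only if both levels pass; the run parameter exists so the ordering can be exercised with a fake runner.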

Failure Playbook

Failure	Level	Meaning	Fix
`tab_exists:attention FAIL`	L1	Tab button removed from HTML source	Restore tab in generator or update test contract
`required_fn:attnColor FAIL`	L1	Function definition deleted or renamed	Restore function in source; if renamed, update callers + contract
`ir:extract:formatBytes FAIL`	L2	Function body cannot be parsed from source	Check for syntax changes in function; may need extraction update
`ds:cosineSim:orthogonal FAIL`	L2	Numeric regression in cosineSim implementation	Review recent changes to cosineSim function
`no_undefined_fn_calls FAIL`	L1	Code calls a function that isn’t defined	Add the missing function or fix the caller
`L3 generated ir_report.html missing setElText`	L3	Generated file is stale (built before latest source change)	Regenerate: `make visualizer`

Adding New Tests

Add a new L1 contract

Edit test_visualizer_health_v7.py. Add the function name to the required_fns list for the appropriate visualizer (IR or dataset viewer). The test will verify the function exists in source.

Add a new L2 unit test

Edit test_visualizer_js_units_v7.py. Add the function name to fns_needed and add test cases to the test_cases list. Each test case has a name and a body (JS code that returns true on pass).

# Example: adding a test for a new function myHelper()
{"name": "ds:myHelper:basic", "body": "return assertDeepEq(myHelper('input'), 'expected');"},
{"name": "ds:myHelper:null",  "body": "return assertDeepEq(myHelper(null), '');"},
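To make the name/body shape concrete, here is a minimal sketch of how such test cases could be turned into a Node.js harness. The build_harness helper, the assertDeepEq definition (a JSON.stringify comparison), and the myHelper body are all assumptions for illustration; the real script may assemble its harness differently.

```python
import json

# Assumed helper: deep equality via JSON serialisation.
ASSERT_HELPER = """
function assertDeepEq(actual, expected) {
  return JSON.stringify(actual) === JSON.stringify(expected);
}
"""

def build_harness(fn_sources, test_cases):
    """Return JS text that defines the functions, runs each case, and exits 1 on failure."""
    lines = [ASSERT_HELPER, *fn_sources, "let failed = 0;"]
    for case in test_cases:
        name = json.dumps(case["name"])  # JSON string literal, also a valid JS literal
        lines.append(
            f"try {{ if (!(function() {{ {case['body']} }})()) throw new Error('returned false');"
            f" console.log('PASS', {name}); }}"
            f" catch (e) {{ failed++; console.log('FAIL', {name}, e.message); }}"
        )
    lines.append("process.exit(failed ? 1 : 0);")
    return "\n".join(lines)

cases = [
    {"name": "ds:myHelper:basic", "body": "return assertDeepEq(myHelper('input'), 'expected');"},
    {"name": "ds:myHelper:null", "body": "return assertDeepEq(myHelper(null), '');"},
]
# Stand-in for a function extracted from the viewer source.
my_helper_js = "function myHelper(x) { return x === null ? '' : 'expected'; }"

harness = build_harness([my_helper_js], cases)
print(harness)
```

Saving the printed text to a temporary .js file and running it with node gives one PASS or FAIL line per case and a non-zero exit code if any case fails.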

Add a new tab to a visualizer

When adding a new tab to the IR visualizer or dataset viewer:

Add the tab name to the TABS list in test_visualizer_health_v7.py
Add the render function to RENDER_FNS and required_fns
Run make v7-visualizer-health — new checks are automatically included
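The reason new checks are picked up automatically is that checks are derived from data lists rather than written one by one. A minimal sketch of that pattern, assuming a per-visualizer dict layout: the expand_checks helper, the panel_exists and render_fn check names, and the gradients tab are hypothetical.

```python
def expand_checks(tabs, render_fns):
    """Turn per-visualizer lists into flat (visualizer, check_name) pairs."""
    checks = []
    for viz, names in tabs.items():
        for tab in names:
            checks.append((viz, f"tab_exists:{tab}"))
            checks.append((viz, f"panel_exists:{tab}"))
    for viz, fns in render_fns.items():
        for fn in fns:
            checks.append((viz, f"render_fn:{fn}"))
    return checks

TABS = {"dataset": ["attention"]}
RENDER_FNS = {"dataset": ["renderAttention"]}
before = len(expand_checks(TABS, RENDER_FNS))

# Registering a new tab is a pure data change...
TABS["dataset"].append("gradients")
RENDER_FNS["dataset"].append("renderGradients")
after = len(expand_checks(TABS, RENDER_FNS))

# ...and it yields three additional checks without touching the check logic.
print(after - before)  # 3
```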