# PyTorch Parity Tests
C-Kernel-Engine validates every kernel against PyTorch's autograd implementation. Each test loads the C library via ctypes, runs identical operations in both C and PyTorch, and compares the results.
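Passing a tensor to the C library means handing over a raw pointer to its storage. A minimal helper for that (a sketch: the name `tensor_ptr` and the assumption that kernels take contiguous `float32` buffers are illustrative, not the repo's API):

```python
import ctypes

import torch


def tensor_ptr(t: torch.Tensor):
    """Return a float* aliasing the tensor's storage, for ctypes calls.

    Assumes the C kernels operate on contiguous float32 buffers.
    """
    assert t.is_contiguous() and t.dtype == torch.float32
    return ctypes.cast(t.data_ptr(), ctypes.POINTER(ctypes.c_float))


# The pointer aliases the tensor's memory: writes through it are
# immediately visible on the Python side.
t = torch.zeros(4)
p = tensor_ptr(t)
p[0] = 2.5
```

Because the pointer aliases rather than copies, the C kernel's in-place results can be read back directly from the original tensor.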
Target: max diff < 1e-5
All kernels must match PyTorch output within this tolerance for both forward and backward passes.
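The tolerance check can be factored into a small helper so every test reports the observed difference on failure (a sketch; `assert_parity` is an illustrative name, not necessarily what the repo uses):

```python
import torch

TOLERANCE = 1e-5


def assert_parity(reference: torch.Tensor, candidate: torch.Tensor,
                  name: str = "kernel"):
    """Fail with the observed max diff if the C output drifts from PyTorch."""
    max_diff = (reference - candidate).abs().max().item()
    assert max_diff < TOLERANCE, (
        f"{name}: max diff {max_diff:.2e} exceeds {TOLERANCE}"
    )


# Identical tensors trivially pass
assert_parity(torch.ones(8, 64), torch.ones(8, 64), "identity")
```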
## How Tests Work

### Forward Pass Validation
```python
# Python test (simplified)
import ctypes
import torch
import torch.nn.functional as F

# Load C library
lib = ctypes.CDLL("libckernel_engine.so")

# Create random input; the C kernel runs in place, so work on a copy
x = torch.randn(8, 64)
y_c = x.clone()

# Run PyTorch reference
y_torch = F.gelu(x)

# Run C kernel
x_ptr = ctypes.cast(y_c.data_ptr(), ctypes.POINTER(ctypes.c_float))
n = y_c.numel()
lib.gelu_fast_inplace(x_ptr, n)

# Compare
max_diff = (y_torch - y_c).abs().max()
assert max_diff < 1e-5
```
### Backward Pass Validation
```python
# Backward test (simplified)
x = torch.randn(8, 64, requires_grad=True)
y = F.gelu(x)
y.backward(torch.ones_like(y))
grad_torch = x.grad

# C backward: input, upstream gradient, and output gradient buffers
d_out = torch.ones(8, 64)
grad_c = torch.empty(8, 64)
as_ptr = lambda t: ctypes.cast(t.data_ptr(), ctypes.POINTER(ctypes.c_float))
x_ptr, d_out_ptr, d_in_ptr = as_ptr(x.detach()), as_ptr(d_out), as_ptr(grad_c)
lib.gelu_backward_fast(x_ptr, d_out_ptr, d_in_ptr, n)

# Compare gradients
max_diff = (grad_torch - grad_c).abs().max()
assert max_diff < 1e-5
```
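Independently of the C comparison, the PyTorch side of the gradient reference can itself be sanity-checked with `torch.autograd.gradcheck`, which compares analytic gradients against finite differences (a supplementary check, not one of the repo's stated tests):

```python
import torch
import torch.nn.functional as F

# gradcheck needs double precision for stable finite-difference estimates
x = torch.randn(4, 8, dtype=torch.float64, requires_grad=True)

# Raises on mismatch, returns True on success
ok = torch.autograd.gradcheck(F.gelu, (x,), eps=1e-6, atol=1e-4)
```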
## Running Tests

### Run All Tests
```bash
# Build the library first
make

# Run all tests with pytest
make test

# Or run individual test files
python3 -m pytest unittest/test_attention.py -v
python3 -m pytest unittest/test_rope.py -v
```
### Litmus Tests (Quick Smoke Test)
```bash
# Run minimal sanity checks
make litmus
```
Litmus tests run a subset of tests to quickly verify the build is working.
## Test Coverage
| Kernel | Forward | Backward | Test File |
|---|---|---|---|
| attention | Yes | Yes | test_attention.py, test_attention_backward.py |
| rope | Yes | Yes | test_rope.py |
| rmsnorm | Yes | Yes | test_rmsnorm.py |
| layernorm | Yes | Yes | test_layernorm.py |
| gelu | Yes | Yes | test_gelu.py |
| softmax | Yes | Yes | test_softmax.py, test_softmax_backward.py |
| swiglu | Yes | Yes | test_swiglu.py |
| sigmoid | Yes | Yes | test_sigmoid.py |
| mlp | Yes | Yes | test_mlp.py |
| gemm | Yes | N/A | test_gemm.py |
## Integration Tests

### Orchestration Layer
Tests the full forward pass pipeline: RMSNorm → QKV → RoPE → Attention → Output projection.
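A pure-PyTorch reference for that pipeline might look like the sketch below (a single pre-norm attention block; the shapes, weight layout, RoPE convention, and function names are illustrative assumptions, not the repo's API):

```python
import torch
import torch.nn.functional as F


def apply_rope(x: torch.Tensor) -> torch.Tensor:
    """Rotate even/odd feature pairs by position-dependent angles.

    x: (heads, T, head_dim)
    """
    T, d = x.shape[-2], x.shape[-1]
    pos = torch.arange(T, dtype=x.dtype)
    freqs = 1.0 / (10000.0 ** (torch.arange(0, d, 2, dtype=x.dtype) / d))
    ang = pos[:, None] * freqs[None, :]          # (T, d/2)
    cos, sin = ang.cos(), ang.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out


def reference_block(x, rms_w, w_qkv, w_out, n_heads, eps=1e-6):
    """RMSNorm -> QKV -> RoPE -> causal attention -> output projection.

    x: (T, C); w_qkv: (C, 3C); w_out: (C, C)
    """
    T, C = x.shape
    hd = C // n_heads
    h = x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + eps) * rms_w
    q, k, v = (h @ w_qkv).chunk(3, dim=-1)
    # split heads: (n_heads, T, hd)
    q, k, v = (t.view(T, n_heads, hd).transpose(0, 1) for t in (q, k, v))
    q, k = apply_rope(q), apply_rope(k)
    attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    # merge heads and project
    return attn.transpose(0, 1).reshape(T, C) @ w_out


torch.manual_seed(0)
x = torch.randn(16, 64)
out = reference_block(x, torch.ones(64), torch.randn(64, 192) * 0.02,
                      torch.randn(64, 64) * 0.02, n_heads=4)
```

Running both this reference and the orchestration layer on the same weights and comparing outputs follows the same max-diff pattern as the per-kernel tests.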
```bash
python3 unittest/test_orchestration_layer.py
```
### LM Head Litmus
End-to-end test loading a model config and running inference.
```bash
python3 unittest/test_lm_head_litmus.py
```