Files

| File | Description |
| --- | --- |
| attention_mlp_fused.c | Mega-Fused Attention + MLP Block. |
| fused_rmsnorm_linear.c | Fused RMSNorm + Linear (GEMV) kernel. |
| gemv_fused_quant_bias.c | Fused GEMV kernels with online quantization and bias. |
| mega_fused_attention_avx.c | Mega-Fused Attention for AVX (256-bit) and AVX-512 (512-bit). |
| mega_fused_attention_decode_q5_0.c | Mega-fused attention decode with Q5_0 weights. |
| mega_fused_attention_decode_q5_0.h | Mega-fused attention decode with Q5_0 weights - Header. |
| mega_fused_attention_prefill.c | Mega-fused prefill attention kernel. |
| mega_fused_attention_prefill_q8_0.c | Mega-fused prefill attention kernel with Q8_0 out-proj. |
| mega_fused_outproj_mlp_prefill.c | Mega-fused post-attention block for prefill. |
| prefill_fused_gemm.c | Fused kernels for prefill phase with proper 2D tiling. |
| rmsnorm_q8_k_fused.c | Fused RMSNorm + Q8_K Quantization kernel. |
| rmsnorm_qkv.c | Fused RMSNorm + QKV Projection. |