ck_features.h File Reference

CPU feature detection and dispatch macros.

#include <stdint.h>

Data Structures

struct  ck_capability_t
 CPU capability information structure.
 

Macros

#define CK_GEMM_DISPATCH(...)   gemm_ref(__VA_ARGS__)
 Dispatch to best available GEMM kernel.
 
#define CK_GEMV_DISPATCH(...)   gemv_ref(__VA_ARGS__)
 Dispatch to best available GEMV kernel.
 
#define CK_HAS_AI_ACCEL   0
 
#define CK_HAS_BEST_VECTOR   0
 
#define CK_QGEMM_DISPATCH(...)   qgemm_ref(__VA_ARGS__)
 Dispatch to best available quantized GEMM kernel, for INT8/INT4 quantization with VNNI/AMX acceleration.
 
#define CK_VECTOR_WIDTH   32 /* Scalar fallback */
 

Functions

static ck_capability_t ck_get_capabilities (void)
 Get current platform capabilities.
 

Detailed Description

CPU feature detection and dispatch macros.

Defines standardized macros for SIMD instruction set detection and kernel dispatch. Use these instead of CPU model checks.

Feature priority (best available first):
1. AMX (512-bit tile ops, Intel Sapphire Rapids+)
2. AVX-512 (512-bit vector, Intel Skylake-X+)
3. AVX2 (256-bit with FMA, Intel Haswell+)
4. AVX (256-bit, Intel Sandy Bridge+)
5. NEON/SVE2 (ARM)
6. AltiVec (PowerPC)
7. Reference (fallback)
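The priority order can be pictured as a single preprocessor chain that stops at the first instruction set the compiler reports. The following is a minimal sketch, not the header's actual detection logic (which this page does not show). It assumes GCC or Clang predefined macros such as `__AVX512F__` and `__ARM_NEON`, and every `DEMO_*` and `demo_*` name is hypothetical.

```c
/* Sketch only: the DEMO_* names are hypothetical and do not come from
 * ck_features.h. The tests rely on GCC/Clang predefined macros. */
#if defined(__AMX_TILE__)
#  define DEMO_ISA_NAME  "AMX"
#  define DEMO_ISA_WIDTH 512
#elif defined(__AVX512F__)
#  define DEMO_ISA_NAME  "AVX-512"
#  define DEMO_ISA_WIDTH 512
#elif defined(__AVX2__) && defined(__FMA__)
#  define DEMO_ISA_NAME  "AVX2+FMA"
#  define DEMO_ISA_WIDTH 256
#elif defined(__AVX__)
#  define DEMO_ISA_NAME  "AVX"
#  define DEMO_ISA_WIDTH 256
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
#  define DEMO_ISA_NAME  "NEON"
#  define DEMO_ISA_WIDTH 128
#elif defined(__ALTIVEC__)
#  define DEMO_ISA_NAME  "AltiVec"
#  define DEMO_ISA_WIDTH 128
#else
#  define DEMO_ISA_NAME  "reference"
#  define DEMO_ISA_WIDTH 32
#endif

/* Name of the first instruction set matched by the chain above. */
static const char *demo_isa_name(void) { return DEMO_ISA_NAME; }

/* Vector width in bits for that instruction set (32 means scalar). */
static int demo_isa_width(void) { return DEMO_ISA_WIDTH; }
```

Because the chain is resolved at compile time, the result depends on the target flags (for example `-mavx2 -mfma`), not on the CPU the binary later runs on.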

Definition in file ck_features.h.

Macro Definition Documentation

◆ CK_GEMM_DISPATCH

#define CK_GEMM_DISPATCH (   ...)    gemm_ref(__VA_ARGS__)

Dispatch to best available GEMM kernel.

Usage: CK_GEMM_DISPATCH(y, W, x, M, K); expands to the appropriate kernel call.
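To illustrate how a variadic dispatch macro forwards its arguments, here is a small sketch modeled on the fallback definition shown above. The signature of `gemm_ref` is not documented on this page, so `demo_gemm_ref` and its argument meanings (output `y`, row-major `M x K` matrix `W`, input `x`) are assumptions made only for this example.

```c
#include <stddef.h>

/* Hypothetical reference kernel: y = W * x, with W stored row-major as
 * M rows of K floats. The real gemm_ref signature may differ. */
static void demo_gemm_ref(float *y, const float *W, const float *x,
                          size_t M, size_t K)
{
    for (size_t i = 0; i < M; ++i) {
        float acc = 0.0f;
        for (size_t k = 0; k < K; ++k)
            acc += W[i * K + k] * x[k];
        y[i] = acc;
    }
}

/* Same shape as the documented fallback: forward every argument unchanged. */
#define DEMO_GEMM_DISPATCH(...) demo_gemm_ref(__VA_ARGS__)

/* Fills y with the product of a fixed 2x3 matrix and a 3-element vector. */
static void demo_run(float y[2])
{
    const float W[2 * 3] = { 1, 2, 3,
                             4, 5, 6 };
    const float x[3] = { 1, 0, -1 };
    DEMO_GEMM_DISPATCH(y, W, x, 2, 3);
}
```

The call site never names a specific kernel, so swapping the macro body for a SIMD variant would not require touching callers, provided all kernels share one signature.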

Definition at line 174 of file ck_features.h.

◆ CK_GEMV_DISPATCH

#define CK_GEMV_DISPATCH (   ...)    gemv_ref(__VA_ARGS__)

Dispatch to best available GEMV kernel.

Definition at line 189 of file ck_features.h.

◆ CK_HAS_AI_ACCEL

#define CK_HAS_AI_ACCEL   0

Definition at line 150 of file ck_features.h.

◆ CK_HAS_BEST_VECTOR

#define CK_HAS_BEST_VECTOR   0

Definition at line 139 of file ck_features.h.

◆ CK_QGEMM_DISPATCH

#define CK_QGEMM_DISPATCH (   ...)    qgemm_ref(__VA_ARGS__)

Dispatch to best available quantized GEMM kernel, for INT8/INT4 quantization with VNNI/AMX acceleration.

Definition at line 205 of file ck_features.h.
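As a rough picture of what a quantized reference kernel behind this macro could look like, the sketch below multiplies INT8 operands, accumulates in 32 bits, and rescales to float. The real `qgemm_ref` signature and quantization scheme are not shown on this page; `demo_qgemm_ref`, its parameters, and the single per-tensor `scale` are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical INT8 reference kernel:
 *   y[i] = scale * sum_k W[i*K + k] * x[k]
 * W is row-major, M rows of K int8 values. */
static void demo_qgemm_ref(float *y, const int8_t *W, const int8_t *x,
                           size_t M, size_t K, float scale)
{
    for (size_t i = 0; i < M; ++i) {
        int32_t acc = 0; /* widen to 32 bits before accumulating */
        for (size_t k = 0; k < K; ++k)
            acc += (int32_t)W[i * K + k] * (int32_t)x[k];
        y[i] = scale * (float)acc;
    }
}

/* Mirrors the documented fallback form of the dispatch macro. */
#define DEMO_QGEMM_DISPATCH(...) demo_qgemm_ref(__VA_ARGS__)
```

Widening each product to 32 bits matches what dot-product instructions such as VNNI do in hardware, which is why a scalar loop of this shape is a natural fallback.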

◆ CK_VECTOR_WIDTH

#define CK_VECTOR_WIDTH   32 /* Scalar fallback */

Definition at line 138 of file ck_features.h.
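One possible use of the width constant is sizing loop blocks. The sketch below assumes `CK_VECTOR_WIDTH` is expressed in bits, which the 32-bit scalar fallback and the 128/256/512 values in `ck_get_capabilities()` suggest but this page does not state. `DEMO_F32_LANES` and `demo_sum` are hypothetical names.

```c
#include <stddef.h>

#ifndef CK_VECTOR_WIDTH
#define CK_VECTOR_WIDTH 32 /* documented scalar fallback */
#endif

/* Assumption: the width is in bits, so each 32 bits holds one float lane. */
#define DEMO_F32_LANES ((size_t)(CK_VECTOR_WIDTH / 32))

/* Sums n floats in lane-sized blocks, then handles the leftover tail. */
static float demo_sum(const float *x, size_t n)
{
    float acc[DEMO_F32_LANES] = { 0 };
    size_t i = 0;

    /* Full blocks: one partial sum per lane. */
    for (; i + DEMO_F32_LANES <= n; i += DEMO_F32_LANES)
        for (size_t l = 0; l < DEMO_F32_LANES; ++l)
            acc[l] += x[i + l];

    /* Combine the per-lane partial sums. */
    float total = 0.0f;
    for (size_t l = 0; l < DEMO_F32_LANES; ++l)
        total += acc[l];

    /* Tail: elements that did not fill a whole block. */
    for (; i < n; ++i)
        total += x[i];
    return total;
}
```

With the fallback value of 32 there is a single lane, so the blocked loop degenerates to a plain scalar sum.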

Function Documentation

◆ ck_get_capabilities()

static ck_capability_t ck_get_capabilities ( void  )
inline static

Get current platform capabilities.

Definition at line 226 of file ck_features.h.

static inline ck_capability_t ck_get_capabilities(void)
{
    /* Defaults describe the scalar reference path. */
    ck_capability_t cap = {
        .name = "unknown",
        .width = 32,
        .has_fma = 0,
        .has_ai_accel = 0,
        .best_kernel = "gemm_ref"
    };

#if defined(CK_HAS_AMX)
    cap.name = "AMX (Intel Sapphire Rapids+)";
    cap.width = 512;
    cap.has_fma = 1;
    cap.has_ai_accel = 1;
    cap.best_kernel = "gemm_amx";
#elif defined(CK_HAS_AVX512)
    cap.name = "AVX-512 (Intel Skylake-X+)";
    cap.width = 512;
    cap.has_fma = 1;
    cap.has_ai_accel = 1;
    cap.best_kernel = "gemm_avx512";
#elif defined(CK_HAS_AVX2_FMA)
    cap.name = "AVX2+FMA (Intel Haswell+)";
    cap.width = 256;
    cap.has_fma = 1;
    cap.has_ai_accel = 0;
    cap.best_kernel = "gemm_avx2";
#elif defined(CK_HAS_AVX)
    cap.name = "AVX (Intel Sandy Bridge+)";
    cap.width = 256;
    cap.has_fma = 0;
    cap.has_ai_accel = 0;
    cap.best_kernel = "gemm_avx";
#elif defined(CK_HAS_NEON)
    cap.name = "NEON (ARM)";
    cap.width = 128;
    cap.has_fma = 1;
    cap.has_ai_accel = 0;
    cap.best_kernel = "gemm_neon";
#elif defined(CK_HAS_ALTIVEC)
    cap.name = "AltiVec (PowerPC)";
    cap.width = 128;
    cap.has_fma = 1;
    cap.has_ai_accel = 0;
    cap.best_kernel = "gemm_altivec";
#endif

    return cap;
}

ck_capability_t: CPU capability information structure. Definition: ck_features.h:215
ck_capability_t::best_kernel: const char * best_kernel. Definition: ck_features.h:220
ck_capability_t::name: const char * name. Definition: ck_features.h:216

References ck_capability_t::best_kernel, ck_capability_t::has_ai_accel, ck_capability_t::has_fma, ck_capability_t::name, and ck_capability_t::width.

Referenced by main().
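A caller such as `main()` might turn the returned structure into a one-line report. The sketch below is self-contained, so it mirrors the structure under the hypothetical name `demo_capability_t`. The `int` field types are assumed (only the two `const char *` members are confirmed on this page), and `demo_get_capabilities` reproduces only the default values from the listing above.

```c
#include <stdio.h>

/* Hypothetical mirror of ck_capability_t; the int field types are assumed. */
typedef struct {
    const char *name;
    int width;
    int has_fma;
    int has_ai_accel;
    const char *best_kernel;
} demo_capability_t;

/* Returns only the documented defaults (no CK_HAS_* macro defined). */
static demo_capability_t demo_get_capabilities(void)
{
    demo_capability_t cap = {
        .name = "unknown",
        .width = 32,
        .has_fma = 0,
        .has_ai_accel = 0,
        .best_kernel = "gemm_ref"
    };
    return cap;
}

/* Formats a one-line summary into buf; returns snprintf's result. */
static int demo_describe(const demo_capability_t *cap, char *buf, size_t n)
{
    return snprintf(buf, n, "%s: %d-bit, fma=%d, ai_accel=%d, kernel=%s",
                    cap->name, cap->width, cap->has_fma,
                    cap->has_ai_accel, cap->best_kernel);
}
```

Printing such a summary at startup gives a quick check of which code path a given build selected.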