FP32 <-> FP16 SIMD conversion utilities.
#include <stdint.h>
#include <stddef.h>
#include <math.h>
Functions
| void | ck_fma_f32_to_f16 (const float *a, const float *b, const float *c, uint16_t *dst, int n) | FMA in FP32, store result as FP16: dst = a * b + c. |
| void | ck_fp16_to_fp32_2d (const uint16_t *src, float *dst, int rows, int cols, int src_stride, int dst_stride) | Convert 2D FP16 matrix to FP32 with strided access. |
| void | ck_fp16_to_fp32_row (const uint16_t *src, float *dst, int n) | Convert FP16 row to FP32 (auto-select best implementation). |
| static float | ck_fp16_to_fp32_scalar (uint16_t h) | |
| void | ck_fp32_to_fp16_2d (const float *src, uint16_t *dst, int rows, int cols, int src_stride, int dst_stride) | Convert 2D FP32 matrix to FP16 with strided access. |
| void | ck_fp32_to_fp16_inplace (float *data, void *scratch, int n) | Convert FP32 to FP16 in-place using scratch buffer. |
| void | ck_fp32_to_fp16_row (const float *src, uint16_t *dst, int n) | Convert FP32 row to FP16 (auto-select best implementation). |
| static uint16_t | ck_fp32_to_fp16_scalar (float f) | |
| void | ck_scale_f32_to_f16 (const float *src, float scale, uint16_t *dst, int n) | Scale FP32 array and store as FP16: dst = scale * src. |
After changing this file, run: make test && make llamacpp-parity-full
These conversion functions use F16C hardware instructions (available on Intel Ivy Bridge and later, and on AMD Piledriver and later) for fast FP16/FP32 conversion. FP16 (IEEE 754 half-precision) halves the memory needed for KV cache storage relative to FP32, at the cost of roughly 0.1% relative precision loss.
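For reference, the half-precision layout (1 sign bit, 5 exponent bits, 10 stored mantissa bits) can be decoded portably without F16C. The following is a minimal sketch, not this file's implementation: the name `half_bits_to_float` is hypothetical, and the body of `ck_fp16_to_fp32_scalar()` is not shown on this page and may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of fp16_convert.c):
 * decode IEEE 754 binary16 bits into a float. */
static float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16; /* sign moved to bit 31 */
    uint32_t exp  = (h >> 10) & 0x1Fu;             /* 5-bit exponent, bias 15 */
    uint32_t mant = h & 0x3FFu;                    /* 10-bit stored mantissa */
    uint32_t bits;
    float out;

    /* The bit copy below assumes a 32-bit IEEE 754 float. */
    assert(sizeof(float) == sizeof(uint32_t));

    if (exp == 0) {
        /* Zero or subnormal: value = mant * 2^-24, exact in FP32. */
        float v = (float)mant * (1.0f / 16777216.0f);
        return sign ? -v : v;
    }
    if (exp == 0x1Fu) {
        /* Infinity (mant == 0) or NaN (mant != 0). */
        bits = sign | 0x7F800000u | (mant << 13);
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    }
    memcpy(&out, &bits, sizeof out);
    return out;
}
```

Every FP16 value is exactly representable in FP32, so this direction is lossless; precision is only lost when narrowing FP32 to FP16.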
Relative to FP32, an FP16 KV cache doubles the context length that fits in L3 cache.
Definition in file fp16_convert.c.
void ck_fma_f32_to_f16 (const float *a, const float *b, const float *c, uint16_t *dst, int n)
FMA in FP32, store result as FP16: dst = a * b + c.
| a | First FP32 operand array |
| b | Second FP32 operand array |
| c | Third FP32 operand array |
| dst | Destination FP16 array |
| n | Number of elements |
Definition at line 350 of file fp16_convert.c.
References ck_fp32_to_fp16_scalar().
void ck_fp16_to_fp32_2d (const uint16_t *src, float *dst, int rows, int cols, int src_stride, int dst_stride)
Convert 2D FP16 matrix to FP32 with strided access.
| src | Source FP16 matrix [rows, src_stride] |
| dst | Destination FP32 matrix [rows, dst_stride] |
| rows | Number of rows |
| cols | Number of columns (actual data per row) |
| src_stride | Source stride (elements per row) |
| dst_stride | Destination stride (elements per row) |
Definition at line 298 of file fp16_convert.c.
References ck_fp16_to_fp32_row().
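The relationship between `cols` and the two strides can be sketched as a loop over rows, matching the fact that this function references `ck_fp16_to_fp32_row()`. This is an illustrative sketch only: `half_to_float_2d` and `half_bits_to_float` are hypothetical names, and leaving padding elements untouched is an assumption of the sketch rather than documented behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar decoder: IEEE 754 binary16 bits to float. */
static float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    float out;

    if (exp == 0) {                      /* zero or subnormal */
        float v = (float)mant * (1.0f / 16777216.0f);
        return sign ? -v : v;
    }
    if (exp == 0x1Fu)                    /* Inf or NaN */
        bits = sign | 0x7F800000u | (mant << 13);
    else                                 /* normal: rebias 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    memcpy(&out, &bits, sizeof out);
    return out;
}

/* Hypothetical 2D wrapper: strides are in elements, not bytes.
 * Only the first `cols` elements of each row are converted. */
static void half_to_float_2d(const uint16_t *src, float *dst,
                             int rows, int cols,
                             int src_stride, int dst_stride)
{
    assert(cols <= src_stride && cols <= dst_stride);
    for (int r = 0; r < rows; ++r) {
        const uint16_t *s = src + (size_t)r * (size_t)src_stride;
        float *d = dst + (size_t)r * (size_t)dst_stride;
        for (int c = 0; c < cols; ++c)
            d[c] = half_bits_to_float(s[c]);
    }
}
```

Taking separate source and destination strides lets a padded FP16 buffer be unpacked into a differently padded FP32 buffer without an intermediate copy.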
void ck_fp16_to_fp32_row (const uint16_t *src, float *dst, int n)
Convert FP16 row to FP32 (auto-select best implementation).
| src | Source FP16 array |
| dst | Destination FP32 array (caller-allocated) |
| n | Number of elements |
Definition at line 250 of file fp16_convert.c.
References ck_fp16_to_fp32_scalar().
Referenced by ck_fp16_to_fp32_2d().
static inline float ck_fp16_to_fp32_scalar (uint16_t h)
void ck_fp32_to_fp16_2d (const float *src, uint16_t *dst, int rows, int cols, int src_stride, int dst_stride)
Convert 2D FP32 matrix to FP16 with strided access.
| src | Source FP32 matrix [rows, src_stride] |
| dst | Destination FP16 matrix [rows, dst_stride] |
| rows | Number of rows |
| cols | Number of columns (actual data per row) |
| src_stride | Source stride (elements per row) |
| dst_stride | Destination stride (elements per row) |
Definition at line 277 of file fp16_convert.c.
References ck_fp32_to_fp16_row().
void ck_fp32_to_fp16_inplace (float *data, void *scratch, int n)
Convert FP32 to FP16 in-place using scratch buffer.
Useful when a buffer is needed in FP32 for computation and should then be downcast in place. Writes the FP16 values to scratch, then copies them back into data.
| data | FP32 array to convert (will contain FP16 in lower bits) |
| scratch | Temporary buffer, must be at least n * sizeof(uint16_t) bytes |
| n | Number of elements |
Definition at line 325 of file fp16_convert.c.
References ck_fp32_to_fp16_row().
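The convert-to-scratch-then-copy-back flow described above can be sketched as follows. The names `float_to_half_inplace` and `float_to_half_bits` are hypothetical, round-to-nearest-even is assumed, and the sketch assumes the packed FP16 values end up in the first n * sizeof(uint16_t) bytes of `data`; the actual layout produced by `ck_fp32_to_fp16_inplace()` should be checked against the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoder: float to binary16 bits, round-to-nearest-even. */
static uint16_t float_to_half_bits(float f)
{
    uint32_t x, h, rem, half;
    memcpy(&x, &f, sizeof x);
    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t mag  = x & 0x7FFFFFFFu;

    if (mag > 0x7F800000u)  return (uint16_t)(sign | 0x7E00u); /* NaN */
    if (mag >= 0x47800000u) return (uint16_t)(sign | 0x7C00u); /* Inf/overflow */
    if (mag < 0x33000000u)  return (uint16_t)sign;             /* below 2^-25 */

    if (mag >= 0x38800000u) {            /* result is a normal half */
        h = (mag - 0x38000000u) >> 13;   /* rebias and drop 13 bits */
        rem = mag & 0x1FFFu;
        half = 0x1000u;
    } else {                             /* result is a subnormal half */
        uint32_t shift = 126u - (mag >> 23);
        uint32_t m = (mag & 0x7FFFFFu) | 0x800000u;
        h = m >> shift;
        rem = m & ((1u << shift) - 1u);
        half = 1u << (shift - 1u);
    }
    if (rem > half || (rem == half && (h & 1u)))
        h++;                             /* a carry may reach the exponent */
    return (uint16_t)(sign | h);
}

/* Hypothetical in-place downcast: scratch must hold n uint16_t values. */
static void float_to_half_inplace(float *data, void *scratch, int n)
{
    uint16_t *tmp = (uint16_t *)scratch;
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        tmp[i] = float_to_half_bits(data[i]);
    /* Packed FP16 now occupies the first n * 2 bytes of data. */
    memcpy(data, tmp, (size_t)n * sizeof(uint16_t));
}
```

After the call, the buffer should be read back through `memcpy` or a `uint16_t` view obtained from the allocation, not through the original `float *`, since its contents are no longer valid FP32 values.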
void ck_fp32_to_fp16_row (const float *src, uint16_t *dst, int n)
Convert FP32 row to FP16 (auto-select best implementation).
| src | Source FP32 array |
| dst | Destination FP16 array (caller-allocated) |
| n | Number of elements |
Definition at line 230 of file fp16_convert.c.
References ck_fp32_to_fp16_scalar().
Referenced by ck_fp32_to_fp16_2d() and ck_fp32_to_fp16_inplace().
static inline uint16_t ck_fp32_to_fp16_scalar (float f)
Definition at line 66 of file fp16_convert.c.
Referenced by ck_fma_f32_to_f16(), ck_fp32_to_fp16_row(), and ck_scale_f32_to_f16().
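No description is given for this scalar fallback, so the following is only a sketch of what a portable FP32 to FP16 narrowing typically involves. It assumes round-to-nearest-even; the name `float_to_half_bits` is hypothetical and the real `ck_fp32_to_fp16_scalar()` may handle rounding, subnormals, or NaN payloads differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoder (not the library's code):
 * float to IEEE 754 binary16 bits with round-to-nearest-even. */
static uint16_t float_to_half_bits(float f)
{
    uint32_t x, h, rem, half;

    assert(sizeof(float) == sizeof(uint32_t)); /* 32-bit float assumed */
    memcpy(&x, &f, sizeof x);

    uint32_t sign = (x >> 16) & 0x8000u;  /* sign moved to bit 15 */
    uint32_t mag  = x & 0x7FFFFFFFu;      /* magnitude bits */

    if (mag > 0x7F800000u)                /* NaN -> quiet NaN */
        return (uint16_t)(sign | 0x7E00u);
    if (mag >= 0x47800000u)               /* >= 65536 or Inf -> Inf */
        return (uint16_t)(sign | 0x7C00u);
    if (mag < 0x33000000u)                /* below 2^-25 -> signed zero */
        return (uint16_t)sign;

    if (mag >= 0x38800000u) {
        /* Normal half: rebias exponent (127 -> 15), keep top 10 mantissa bits. */
        h = (mag - 0x38000000u) >> 13;
        rem = mag & 0x1FFFu;              /* the 13 discarded bits */
        half = 0x1000u;                   /* exactly half an FP16 ulp */
    } else {
        /* Subnormal half: shift the 24-bit significand into units of 2^-24. */
        uint32_t shift = 126u - (mag >> 23);        /* 14..24 */
        uint32_t m = (mag & 0x7FFFFFu) | 0x800000u; /* add implicit 1 */
        h = m >> shift;
        rem = m & ((1u << shift) - 1u);
        half = 1u << (shift - 1u);
    }

    /* Round up above the midpoint, or at the midpoint when h is odd.
     * A carry out of the mantissa correctly bumps the exponent,
     * so values in [65520, 65536) round to infinity. */
    if (rem > half || (rem == half && (h & 1u)))
        h++;
    return (uint16_t)(sign | h);
}
```

For example, under these assumptions 0.1f maps to 0x2E66 and 65520.0f rounds up to infinity (0x7C00), since 65504 is the largest finite FP16 value.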
void ck_scale_f32_to_f16 (const float *src, float scale, uint16_t *dst, int n)
Scale FP32 array and store as FP16: dst = scale * src.
| src | Source FP32 array |
| scale | Scalar multiplier |
| dst | Destination FP16 array |
| n | Number of elements |
Definition at line 398 of file fp16_convert.c.
References ck_fp32_to_fp16_scalar().
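Both `ck_scale_f32_to_f16()` and `ck_fma_f32_to_f16()` follow the same pattern: do the arithmetic in FP32, then narrow to FP16 once per element. The sketch below illustrates that pattern with hypothetical names (`scale_to_half`, `muladd_to_half`, `float_to_half_bits`) and assumes round-to-nearest-even. Note that it uses a plain multiply followed by an add, not a fused multiply-add instruction, so its intermediate rounding may differ from the library's FMA path.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoder: float to binary16 bits, round-to-nearest-even. */
static uint16_t float_to_half_bits(float f)
{
    uint32_t x, h, rem, half;
    memcpy(&x, &f, sizeof x);
    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t mag  = x & 0x7FFFFFFFu;

    if (mag > 0x7F800000u)  return (uint16_t)(sign | 0x7E00u); /* NaN */
    if (mag >= 0x47800000u) return (uint16_t)(sign | 0x7C00u); /* Inf/overflow */
    if (mag < 0x33000000u)  return (uint16_t)sign;             /* below 2^-25 */

    if (mag >= 0x38800000u) {            /* normal half */
        h = (mag - 0x38000000u) >> 13;
        rem = mag & 0x1FFFu;
        half = 0x1000u;
    } else {                             /* subnormal half */
        uint32_t shift = 126u - (mag >> 23);
        uint32_t m = (mag & 0x7FFFFFu) | 0x800000u;
        h = m >> shift;
        rem = m & ((1u << shift) - 1u);
        half = 1u << (shift - 1u);
    }
    if (rem > half || (rem == half && (h & 1u)))
        h++;
    return (uint16_t)(sign | h);
}

/* Hypothetical: dst[i] = half(scale * src[i]), product computed in FP32. */
static void scale_to_half(const float *src, float scale, uint16_t *dst, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        dst[i] = float_to_half_bits(scale * src[i]);
}

/* Hypothetical: dst[i] = half(a[i] * b[i] + c[i]), computed in FP32.
 * Separate multiply and add; not guaranteed to be a fused operation. */
static void muladd_to_half(const float *a, const float *b, const float *c,
                           uint16_t *dst, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        dst[i] = float_to_half_bits(a[i] * b[i] + c[i]);
}
```

Combining the arithmetic with the store avoids writing an intermediate FP32 array, and narrowing only the final result keeps the FP16 rounding error to a single step per element.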