Softmax forward/backward kernels with SIMD (SSE/AVX/AVX512)
#include <math.h>
Functions
| void | backward_causal_softmax_head_major (float *d_scores, const float *weights, int num_heads, int num_tokens, int aligned_context_window) |
| void | causal_softmax_head_major (float *scores, int num_heads, int num_tokens, int aligned_context_window) |
| void | causal_softmax_head_major_exact (float *scores, int num_heads, int num_tokens, int aligned_context_window) |
After changes: make test && make llamacpp-parity-full
Softmax: y[i] = exp(x[i] - max(x)) / sum_j exp(x[j] - max(x))
Definition in file softmax_kernels.c.
void backward_causal_softmax_head_major(float *d_scores, const float *weights, int num_heads, int num_tokens, int aligned_context_window)
Definition at line 382 of file softmax_kernels.c.
Referenced by backward_causal_softmax_head_major_bf16().
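This page gives no description for the backward kernel, so the following is only a sketch under stated assumptions: that `weights` holds the forward softmax output, that `d_scores` holds the upstream gradient on entry and is overwritten with the gradient with respect to the pre-softmax scores, and that each row has a stride of `aligned_context_window` floats. Under those assumptions, the standard softmax gradient for row i is dx[j] = w[j] * (dy[j] - sum_k dy[k] * w[k]) over the unmasked entries j <= i. The function name `backward_causal_softmax_head_major_sketch` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar sketch; layout and in-place semantics are assumptions,
 * not taken from softmax_kernels.c. */
void backward_causal_softmax_head_major_sketch(float *d_scores, const float *weights,
                                               int num_heads, int num_tokens,
                                               int aligned_context_window)
{
    assert(d_scores != NULL && weights != NULL);
    assert(num_tokens <= aligned_context_window);
    const size_t stride = (size_t)aligned_context_window;

    for (int h = 0; h < num_heads; ++h) {
        for (int i = 0; i < num_tokens; ++i) {
            /* Assumed layout: element [h][i][j] at (h * stride + i) * stride + j. */
            const size_t base = ((size_t)h * stride + (size_t)i) * stride;
            float *d_row = d_scores + base;
            const float *w_row = weights + base;

            /* dot = sum over the causal prefix of upstream gradient * weight. */
            float dot = 0.0f;
            for (int j = 0; j <= i; ++j) {
                dot += d_row[j] * w_row[j];
            }

            /* Softmax Jacobian-vector product for the unmasked entries. */
            for (int j = 0; j <= i; ++j) {
                d_row[j] = w_row[j] * (d_row[j] - dot);
            }

            /* Masked positions (j > i) receive zero gradient in this sketch. */
            for (int j = i + 1; j < aligned_context_window; ++j) {
                d_row[j] = 0.0f;
            }
        }
    }
}
```

A quick sanity check for any implementation of this formula is that each output row sums to zero whenever the corresponding `weights` row sums to one.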
void causal_softmax_head_major(float *scores, int num_heads, int num_tokens, int aligned_context_window)
Causal softmax (in-place, row-wise)
Tests:
test_softmax.py::TestSoftmaxForward::test_causal_softmax
test_softmax.py::TestSoftmaxForward::test_causal_vs_softmax
test_attention.py::TestAttentionForward::test_softmax_correctness
Applies a causal mask (entries with j > i are set to 0) and a row-wise softmax to the scores matrix. Operates in place on a scores matrix of shape [num_heads, T, T].
After changes: make test && make llamacpp-parity-full
Definition at line 144 of file softmax_kernels.c.
Referenced by attention_forward_causal_head_major(), attention_forward_causal_head_major_gqa(), and causal_softmax_head_major_bf16().
void causal_softmax_head_major_exact(float *scores, int num_heads, int num_tokens, int aligned_context_window)
Causal softmax (exact version using stdlib expf)
Tests:
test_softmax.py::TestSoftmaxForward::test_causal_softmax_exact
test_softmax.py::TestSoftmaxForward::test_exact_vs_fast
Exact causal softmax that calls the standard library expf; intended as a numerical accuracy reference.
After changes: make test
Definition at line 339 of file softmax_kernels.c.
Referenced by attention_forward_causal_head_major_exact() and attention_forward_causal_head_major_gqa_exact().