loss_kernels_bf16.c File Reference

Loss function kernels for BF16 tensors.

#include <stddef.h>
#include <stdint.h>
#include "bf16_utils.h"
#include "ckernel_engine.h"


Functions

void softmax_cross_entropy_loss_bf16 (const uint16_t *logits, const int32_t *targets, int tokens, int vocab_size, uint16_t *d_logits, float *loss_out, float *scratch_logits, float *scratch_d_logits)
 

Detailed Description

Loss function kernels for BF16 tensors.

CK-ENGINE KERNEL RULES:

  1. NO malloc/free - memory via bump allocator, pointers passed in
  2. NO OpenMP - parallelization at orchestrator/codegen layer
  3. API must define: inputs, outputs, workspace, and memory layouts
  4. Pure computation - deterministic, no side effects

After changes: make test && make llamacpp-parity-full
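
Rule 1 implies that the caller owns both FP32 workspaces passed as scratch_logits and scratch_d_logits. The function body converts tokens * vocab_size elements into and out of them, so each must hold at least that many floats. The following is a minimal sketch of how a caller might carve them out of a bump arena. The names demo_arena, demo_arena_alloc, demo_reserve_loss_scratch and the 64-byte DEMO_ALIGN are hypothetical and are not the engine's actual allocator API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ALIGN 64u  /* assumed alignment; the engine's real value is not documented here */

/* Hypothetical bump arena over caller-owned memory (illustration only). */
typedef struct {
    uint8_t *base;  /* start of the backing buffer */
    size_t   size;  /* total bytes in the buffer */
    size_t   used;  /* bytes handed out so far */
} demo_arena;

/* Return `bytes` bytes aligned to `align` (a power of two), or NULL if full. */
static void *demo_arena_alloc(demo_arena *a, size_t bytes, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    const uintptr_t base = (uintptr_t)a->base;
    const uintptr_t aligned =
        (base + a->used + (align - 1)) & ~(uintptr_t)(align - 1);
    const size_t start = (size_t)(aligned - base);
    if (start > a->size || bytes > a->size - start) {
        return NULL;
    }
    a->used = start + bytes;
    return a->base + start;
}

/* Reserve both FP32 workspaces. Returns 1 on success, 0 on failure;
 * on failure the arena is left exactly as it was. */
static int demo_reserve_loss_scratch(demo_arena *a, int tokens, int vocab_size,
                                     float **scratch_logits,
                                     float **scratch_d_logits)
{
    if (tokens <= 0 || vocab_size <= 0) {
        return 0;
    }
    /* Reject sizes whose byte count would overflow size_t. */
    if ((size_t)vocab_size > SIZE_MAX / sizeof(float) / (size_t)tokens) {
        return 0;
    }
    const size_t bytes = (size_t)tokens * (size_t)vocab_size * sizeof(float);
    const size_t saved = a->used;
    float *p = demo_arena_alloc(a, bytes, DEMO_ALIGN);
    float *q = demo_arena_alloc(a, bytes, DEMO_ALIGN);
    if (!p || !q) {
        a->used = saved;  /* roll back a partial reservation */
        return 0;
    }
    *scratch_logits = p;
    *scratch_d_logits = q;
    return 1;
}
```

A caller would then pass the two returned pointers straight through as the last two arguments of softmax_cross_entropy_loss_bf16(). Checking the return value matters because the kernel silently writes 0.0f to loss_out and returns when either scratch pointer is NULL.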

Definition in file loss_kernels_bf16.c.

Function Documentation

◆ softmax_cross_entropy_loss_bf16()

void softmax_cross_entropy_loss_bf16(const uint16_t *logits,
                                     const int32_t  *targets,
                                     int             tokens,
                                     int             vocab_size,
                                     uint16_t       *d_logits,
                                     float          *loss_out,
                                     float          *scratch_logits,
                                     float          *scratch_d_logits)

Definition at line 25 of file loss_kernels_bf16.c.

{
    if (!logits || !targets || !d_logits || tokens <= 0 || vocab_size <= 0) {
        if (loss_out) *loss_out = 0.0f;
        return;
    }
    if (!scratch_logits || !scratch_d_logits) {
        if (loss_out) *loss_out = 0.0f;
        return;
    }

    const size_t count = (size_t)tokens * (size_t)vocab_size;

    bf16_tensor_to_float(logits, scratch_logits, count);
    softmax_cross_entropy_loss(scratch_logits, targets, tokens, vocab_size, scratch_d_logits, loss_out);
    float_tensor_to_bf16(scratch_d_logits, d_logits, count);
}
static void float_tensor_to_bf16(const float *src, uint16_t *dst, size_t count)
    Definition: bf16_utils.h:271
static void bf16_tensor_to_float(const uint16_t *src, float *dst, size_t count)
    Definition: bf16_utils.h:250
void softmax_cross_entropy_loss(const float *logits, const int32_t *targets, int tokens, int vocab_size, float *d_logits, float *loss_out)
    Definition: loss_kernels.c:21
int vocab_size
    Definition: true_bpe.h:185

References bf16_tensor_to_float(), float_tensor_to_bf16(), softmax_cross_entropy_loss(), and vocab_size.
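
The function body widens the BF16 logits to FP32 before computing the loss and narrows the FP32 gradients back to BF16 afterwards. The real helpers are defined in bf16_utils.h and are not reproduced on this page, so the following is only a sketch of what such conversions commonly look like. It assumes BF16 is stored as the upper 16 bits of an IEEE-754 single and that narrowing uses round-to-nearest-even; the engine's helpers may round differently. All demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen one BF16 value: place its 16 bits in the top half of a float. */
static float demo_bf16_to_float(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);  /* bit copy, avoids aliasing problems */
    return f;
}

/* Narrow one float to BF16 with round-to-nearest-even (assumed mode). */
static uint16_t demo_float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: force a mantissa bit so truncation cannot yield infinity. */
        return (uint16_t)((bits >> 16) | 0x0040u);
    }
    /* Add 0x7fff plus the lowest kept bit: exact ties round to even. */
    bits += 0x7fffu + ((bits >> 16) & 1u);
    return (uint16_t)(bits >> 16);
}

/* Element-wise widening, mirroring the shape of bf16_tensor_to_float(). */
static void demo_bf16_tensor_to_float(const uint16_t *src, float *dst,
                                      size_t count)
{
    assert(count == 0 || (src != NULL && dst != NULL));
    for (size_t i = 0; i < count; ++i) {
        dst[i] = demo_bf16_to_float(src[i]);
    }
}

/* Element-wise narrowing, mirroring the shape of float_tensor_to_bf16(). */
static void demo_float_tensor_to_bf16(const float *src, uint16_t *dst,
                                      size_t count)
{
    assert(count == 0 || (src != NULL && dst != NULL));
    for (size_t i = 0; i < count; ++i) {
        dst[i] = demo_float_to_bf16(src[i]);
    }
}
```

Widening is exact, but narrowing keeps only 8 significant bits, so gradients written to d_logits carry less precision than the FP32 values held in scratch_d_logits. The loss itself is returned through loss_out as a float and does not go through this narrowing step.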