relu_kernels_bf16.c File Reference

ReLU activation kernels for BF16 tensors.

#include <stddef.h>
#include <stdint.h>
#include "bf16_utils.h"
#include "ckernel_engine.h"


Functions

void relu_backward_bf16 (const uint16_t *input, const uint16_t *d_output, uint16_t *d_input, size_t n)
 
void relu_forward_bf16 (const uint16_t *input, uint16_t *output, size_t n)
 
void relu_forward_inplace_bf16 (uint16_t *data, size_t n)
 

Detailed Description

ReLU activation kernels for BF16 tensors.

CK-ENGINE KERNEL RULES:

  1. NO malloc/free - memory via bump allocator, pointers passed in
  2. NO OpenMP - parallelization at orchestrator/codegen layer
  3. API must define: inputs, outputs, workspace, and memory layouts
  4. Pure computation - deterministic, no side effects

After changes: make test && make llamacpp-parity-full
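
Rule 1 means a kernel never allocates: the caller carves buffers out of an arena and passes plain pointers in. The engine's real bump allocator is not shown in this file, so the following is only a minimal sketch of the idea. The names sketch_arena, sketch_arena_alloc and sketch_carve_buffers are hypothetical and are not part of the C-Kernel-Engine API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena type; not the engine's actual allocator. */
typedef struct {
    uint8_t *base;  /* start of caller-provided backing storage */
    size_t   cap;   /* total bytes available */
    size_t   used;  /* bytes handed out so far */
} sketch_arena;

/* Hand out `bytes` bytes at an offset aligned to `align` (a power of two).
 * Offsets are aligned relative to `base`, so `base` itself must already
 * satisfy the largest alignment requested. Returns NULL when full. */
void *sketch_arena_alloc(sketch_arena *a, size_t bytes, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + (align - 1)) & ~(align - 1);
    if (start > a->cap || bytes > a->cap - start) {
        return NULL;
    }
    a->used = start + bytes;
    return a->base + start;
}

/* Carve an input and an output BF16 buffer of n elements each.
 * Returns 1 on success, 0 if the arena ran out of space. */
int sketch_carve_buffers(sketch_arena *a, size_t n,
                         uint16_t **in, uint16_t **out)
{
    *in  = (uint16_t *)sketch_arena_alloc(a, n * sizeof(uint16_t), sizeof(uint16_t));
    *out = (uint16_t *)sketch_arena_alloc(a, n * sizeof(uint16_t), sizeof(uint16_t));
    return *in != NULL && *out != NULL;
}
```

The two pointers returned through in and out could then be passed to relu_forward_bf16() as input and output. Nothing is ever freed individually; resetting used to 0 would release the whole arena at once.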

ReLU: y = max(0, x)
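
Written element-wise, the forward rule and the gradient rule used by the backward kernel can be stated together. Here L is a hypothetical scalar loss introduced only for illustration; in the kernel's naming, d_output holds the values of ∂L/∂y and d_input receives ∂L/∂x. The choice of 0 for the gradient at x = 0 follows the strict comparison x > 0.0f in the listings.

```latex
y_i = \max(0, x_i), \qquad
\frac{\partial L}{\partial x_i} =
\begin{cases}
  \dfrac{\partial L}{\partial y_i} & \text{if } x_i > 0, \\[4pt]
  0 & \text{otherwise.}
\end{cases}
```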

Definition in file relu_kernels_bf16.c.

Function Documentation

◆ relu_backward_bf16()

void relu_backward_bf16(const uint16_t *input, const uint16_t *d_output, uint16_t *d_input, size_t n)

Definition at line 45 of file relu_kernels_bf16.c.

{
    /* Do nothing if any buffer pointer is NULL. */
    if (!input || !d_output || !d_input) {
        return;
    }
    for (size_t i = 0; i < n; ++i) {
        float x = bf16_to_float(input[i]);
        float dy = bf16_to_float(d_output[i]);
        /* Pass the gradient through only where the forward input was positive. */
        d_input[i] = float_to_bf16(x > 0.0f ? dy : 0.0f);
    }
}
static uint16_t float_to_bf16(float f): defined at bf16_utils.h:90
static float bf16_to_float(uint16_t v): defined at bf16_utils.h:38

References bf16_to_float() and float_to_bf16().
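
To make the masking rule concrete, here is a small self-contained sketch that reimplements the loop above with stand-in conversion helpers. The helpers sketch_bf16_to_float and sketch_float_to_bf16 are hypothetical: the real ones live in bf16_utils.h and are not shown here. In particular, truncation in the float-to-BF16 direction is an assumption, and the real helper may round instead. The example only uses values that are exactly representable in BF16, so the rounding mode does not affect the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for bf16_to_float(): BF16 is the upper 16 bits of a binary32 float. */
float sketch_bf16_to_float(uint16_t v)
{
    uint32_t bits = (uint32_t)v << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Stand-in for float_to_bf16(); truncation here is an assumption. */
uint16_t sketch_float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (uint16_t)(bits >> 16);
}

/* Same logic as the listing above, using the stand-in helpers. */
void sketch_relu_backward(const uint16_t *input, const uint16_t *d_output,
                          uint16_t *d_input, size_t n)
{
    if (!input || !d_output || !d_input) {
        return;
    }
    for (size_t i = 0; i < n; ++i) {
        float x = sketch_bf16_to_float(input[i]);
        float dy = sketch_bf16_to_float(d_output[i]);
        d_input[i] = sketch_float_to_bf16(x > 0.0f ? dy : 0.0f);
    }
}

/* Worked example: forward inputs -1.0, 0.0, 0.5, 2.0 with gradient 3.0 everywhere. */
void sketch_relu_backward_demo(uint16_t d_input[4])
{
    const uint16_t input[4]    = { 0xBF80, 0x0000, 0x3F00, 0x4000 };
    const uint16_t d_output[4] = { 0x4040, 0x4040, 0x4040, 0x4040 };
    sketch_relu_backward(input, d_output, d_input, 4);
    assert(d_input[0] == 0x0000);  /* x = -1.0: gradient blocked */
    assert(d_input[1] == 0x0000);  /* x =  0.0: gradient blocked (strict >) */
    assert(d_input[2] == 0x4040);  /* x =  0.5: gradient 3.0 passes through */
    assert(d_input[3] == 0x4040);  /* x =  2.0: gradient 3.0 passes through */
}
```

Note that the mask depends on the forward input, not on the forward output or the gradient, which is why the kernel takes input as a separate argument.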

◆ relu_forward_bf16()

void relu_forward_bf16(const uint16_t *input, uint16_t *output, size_t n)

Definition at line 23 of file relu_kernels_bf16.c.

{
    /* Do nothing if either buffer pointer is NULL. */
    if (!input || !output) {
        return;
    }
    for (size_t i = 0; i < n; ++i) {
        float x = bf16_to_float(input[i]);
        /* Clamp everything that is not strictly positive to 0. */
        output[i] = float_to_bf16(x > 0.0f ? x : 0.0f);
    }
}

References bf16_to_float() and float_to_bf16().
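
The forward loop can be traced on a handful of BF16 bit patterns. The sketch below is self-contained and uses hypothetical stand-ins (sketch_bf16_to_float, sketch_float_to_bf16) for the helpers in bf16_utils.h, whose implementation is not shown here; truncating conversion is an assumption. All values are exactly representable in BF16, so the outcome does not depend on the rounding mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for bf16_to_float(): BF16 is the upper 16 bits of a binary32 float. */
float sketch_bf16_to_float(uint16_t v)
{
    uint32_t bits = (uint32_t)v << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Stand-in for float_to_bf16(); truncation here is an assumption. */
uint16_t sketch_float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (uint16_t)(bits >> 16);
}

/* Same logic as the listing above, using the stand-in helpers. */
void sketch_relu_forward(const uint16_t *input, uint16_t *output, size_t n)
{
    if (!input || !output) {
        return;
    }
    for (size_t i = 0; i < n; ++i) {
        float x = sketch_bf16_to_float(input[i]);
        output[i] = sketch_float_to_bf16(x > 0.0f ? x : 0.0f);
    }
}

/* Worked example: -1.0, -0.0, 0.0, 0.5, 2.0 as BF16 bit patterns. */
void sketch_relu_forward_demo(uint16_t output[5])
{
    const uint16_t input[5] = { 0xBF80, 0x8000, 0x0000, 0x3F00, 0x4000 };
    sketch_relu_forward(input, output, 5);
    assert(output[0] == 0x0000);  /* -1.0 clamps to +0.0 */
    assert(output[1] == 0x0000);  /* -0.0 is not > 0, so it becomes +0.0 */
    assert(output[2] == 0x0000);  /*  0.0 stays 0.0 */
    assert(output[3] == 0x3F00);  /*  0.5 is unchanged */
    assert(output[4] == 0x4000);  /*  2.0 is unchanged */
}
```

One consequence of the strict comparison is that the sign bit of a negative zero is cleared: the output is always +0.0 for non-positive inputs.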

◆ relu_forward_inplace_bf16()

void relu_forward_inplace_bf16(uint16_t *data, size_t n)

Definition at line 34 of file relu_kernels_bf16.c.

{
    /* Do nothing if the buffer pointer is NULL. */
    if (!data) {
        return;
    }
    for (size_t i = 0; i < n; ++i) {
        float x = bf16_to_float(data[i]);
        /* Overwrite each element with its clamped value. */
        data[i] = float_to_bf16(x > 0.0f ? x : 0.0f);
    }
}

References bf16_to_float() and float_to_bf16().
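
Since the in-place loop applies the same per-element expression as the out-of-place forward kernel, the two should agree bit for bit. The sketch below checks that on a small input. It is self-contained and relies on hypothetical stand-ins (sketch_bf16_to_float, sketch_float_to_bf16) for the bf16_utils.h helpers, with truncating conversion as an assumption; the function sketch_inplace_matches_forward is likewise illustrative only. In keeping with the no-malloc rule, the caller supplies both scratch buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for bf16_to_float(): BF16 is the upper 16 bits of a binary32 float. */
float sketch_bf16_to_float(uint16_t v)
{
    uint32_t bits = (uint32_t)v << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Stand-in for float_to_bf16(); truncation here is an assumption. */
uint16_t sketch_float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (uint16_t)(bits >> 16);
}

/* Out-of-place variant, same logic as relu_forward_bf16() in this file. */
void sketch_relu_forward(const uint16_t *input, uint16_t *output, size_t n)
{
    if (!input || !output) {
        return;
    }
    for (size_t i = 0; i < n; ++i) {
        float x = sketch_bf16_to_float(input[i]);
        output[i] = sketch_float_to_bf16(x > 0.0f ? x : 0.0f);
    }
}

/* In-place variant, same logic as the listing above. */
void sketch_relu_forward_inplace(uint16_t *data, size_t n)
{
    if (!data) {
        return;
    }
    for (size_t i = 0; i < n; ++i) {
        float x = sketch_bf16_to_float(data[i]);
        data[i] = sketch_float_to_bf16(x > 0.0f ? x : 0.0f);
    }
}

/* Returns 1 when the in-place result (left in scratch_a) equals the
 * out-of-place result (left in scratch_b), 0 otherwise. */
int sketch_inplace_matches_forward(const uint16_t *src, size_t n,
                                   uint16_t *scratch_a, uint16_t *scratch_b)
{
    assert(src && scratch_a && scratch_b);
    memcpy(scratch_a, src, n * sizeof(uint16_t));
    sketch_relu_forward_inplace(scratch_a, n);
    sketch_relu_forward(src, scratch_b, n);
    return memcmp(scratch_a, scratch_b, n * sizeof(uint16_t)) == 0;
}
```

The in-place form saves one buffer when the pre-activation values are not needed afterwards. If a backward pass is planned, the original input must be kept, because relu_backward_bf16() masks on the forward input.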