Back to Sglang

sgl-kernel Metal kernels

sgl-kernel/csrc/metal/README.md

0.5.131.4 KB
Original Source

sgl-kernel Metal kernels

Custom Apple Metal kernels for the MLX backend on Apple Silicon. Shader sources (*.metal) and C++ host / nanobind sources (*.cpp) in this directory are compiled by sgl-kernel/setup_metal.py into the native Metal extension and the sgl_metal_kernels.metallib archive, then exposed through public Python wrappers in python/sgl_kernel/metal.py.

Kernels

KernelDescriptionTested on
rope_pool_fusedFused NeoX RoPE for Q/K plus K/V scatter into the MLX KV pool.Apple Silicon / MLX

Adding a new Metal kernel

  1. Add the shader under csrc/metal/<kernel>.metal.
  2. Add the C++ host / nanobind binding under csrc/metal/<kernel>.cpp, exporting the native entry point for the wrapper in python/sgl_kernel/metal.py.
  3. Append both files to metal_shader_sources and cxx_sources in sgl-kernel/setup_metal.py.
  4. Add a Python wrapper in python/sgl_kernel/metal.py that validates input shapes/dtypes and invokes the native AOT entry point without forcing MLX evaluation.
  5. Add a test under sgl-kernel/tests/ and update the Kernels table above with a short description and the hardware / OS / MLX version the kernel was validated on.