sgl-kernel/csrc/metal/README.md
Custom Apple Metal kernels for the MLX backend on Apple Silicon. Shader sources (`*.metal`) and C++ host / nanobind sources (`*.cpp`) in this directory are compiled by `sgl-kernel/setup_metal.py` into the native Metal extension and the `sgl_metal_kernels.metallib` archive, then exposed through public Python wrappers in `python/sgl_kernel/metal.py`.

| Kernel | Description | Tested on |
|---|---|---|
| `rope_pool_fused` | Fused NeoX RoPE for Q/K plus K/V scatter into the MLX KV pool. | Apple Silicon / MLX |

To add a new kernel:

1. Add the Metal shader at `csrc/metal/<kernel>.metal`.
2. Add the C++ host / nanobind source at `csrc/metal/<kernel>.cpp`, exporting the native entry point for the wrapper in `python/sgl_kernel/metal.py`.
3. Register both files in `metal_shader_sources` and `cxx_sources` in `sgl-kernel/setup_metal.py`.
4. Add a public wrapper in `python/sgl_kernel/metal.py` that validates input shapes/dtypes and invokes the native AOT entry point without forcing MLX evaluation.
5. Add tests under `sgl-kernel/tests/` and update the Kernels table above with a short description and the hardware / OS / MLX version the kernel was validated on.
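
The wrapper requirement (validate shapes/dtypes, then call the native entry point without forcing evaluation) might look roughly like the sketch below. Everything in it is hypothetical: `FakeArray`, `make_wrapper`, `example_kernel`, the `(tokens, heads, head_dim)` layout, and the string dtypes are illustrative stand-ins, not the real `rope_pool_fused` signature or the MLX API. The point it illustrates is that the checks read only `shape` and `dtype` metadata, which a lazy array can report without being evaluated.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Tuple


@dataclass(frozen=True)
class FakeArray:
    """Hypothetical stand-in for a lazy array: metadata only, no data."""
    shape: Tuple[int, ...]
    dtype: str


def _check(name: str, arr: Any, ndim: int, dtypes: Sequence[str]) -> None:
    # Only shape/dtype metadata is inspected; element values are never read.
    if len(arr.shape) != ndim:
        raise ValueError(f"{name}: expected {ndim} dims, got shape {tuple(arr.shape)}")
    if arr.dtype not in dtypes:
        raise TypeError(f"{name}: unsupported dtype {arr.dtype}")


def make_wrapper(native_fn: Callable[..., Any]) -> Callable[..., Any]:
    """Build a public wrapper around a (hypothetical) native entry point."""

    def example_kernel(q: Any, k: Any, positions: Any) -> Any:
        # Assumed layout: q and k are (tokens, heads, head_dim), positions is (tokens,).
        _check("q", q, 3, ("float16", "bfloat16"))
        _check("k", k, 3, ("float16", "bfloat16"))
        _check("positions", positions, 1, ("int32",))
        if q.dtype != k.dtype:
            raise TypeError("q and k must share a dtype")
        if not (q.shape[0] == k.shape[0] == positions.shape[0]):
            raise ValueError("q, k and positions must agree on the token dimension")
        if q.shape[-1] != k.shape[-1]:
            raise ValueError("q and k must agree on head_dim")
        # Hand the still-lazy inputs straight to the native call; no eval here.
        return native_fn(q, k, positions)

    return example_kernel
```

In a real wrapper the `native_fn` argument would be replaced by the entry point exported from `csrc/metal/<kernel>.cpp`, and the dtype comparison would use the backend's dtype objects rather than strings.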