third_party/xla/docs/errors/error_1001.md
Category: Compile Time: Scoped Vmem OOM
This error indicates that the program requires more Scoped Vector Memory (Vmem) than was allocated.
Sample Error Messages:
RESOURCE_EXHAUSTED: Ran out of memory in memory space vmem while allocating on stack for %my-custom-kernel = bf16[2048,4096]{1,0:T(8,128)(2,1)} custom-call(...) ...
XLA Backends: TPU
TPUs have Vector Memory (Vmem), a local scratchpad memory used exclusively by the TensorCore (TC). The compiler manages Vmem for different types of allocations:
A Compile Time Scoped Vmem OOM occurs when the instruction-scoped allocations exceed the allocation limit for that instruction. This limit is controlled
and
These errors are typically caused by an internal compiler bug or by a custom kernel exceeding its allocation limit.
Carefully analyze the error message to determine whether the error stems from a custom kernel or a standard HLO instruction. An error caused by a custom kernel should have the following signature:
Ran out of memory in memory space vmem while allocating on stack for %my-custom-call = <output-shape> custom-call(<params>), custom_call_target="tpu_custom_call" ...
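The check described above can be automated when triaging logs. The following is a minimal sketch, not part of XLA: the function name `classify_vmem_oom` and its return labels are hypothetical, and the only strings it relies on are the ones shown in the signature above. Because error messages are often truncated before the `custom_call_target` attribute, the sketch reports that case separately rather than guessing.

```python
# Substrings taken from the error signature shown above.
VMEM_OOM_MARKER = "Ran out of memory in memory space vmem"
CUSTOM_KERNEL_TARGET = 'custom_call_target="tpu_custom_call"'


def classify_vmem_oom(message: str) -> str:
    """Classify an error message by its likely origin.

    Hypothetical helper for log triage. Returns one of:
      "not_vmem_oom"               - the Vmem OOM marker is absent
      "standard_hlo"               - Vmem OOM, but not on a custom-call
      "custom_kernel"              - Vmem OOM on a tpu_custom_call
      "custom_call_unknown_target" - custom-call, target not visible
                                     (for example, a truncated message)
    """
    if VMEM_OOM_MARKER not in message:
        return "not_vmem_oom"
    if "custom-call(" not in message:
        return "standard_hlo"
    if CUSTOM_KERNEL_TARGET in message:
        return "custom_kernel"
    return "custom_call_unknown_target"


if __name__ == "__main__":
    sample = (
        "RESOURCE_EXHAUSTED: Ran out of memory in memory space vmem "
        "while allocating on stack for %my-custom-call = bf16[2048,4096] "
        'custom-call(%p0), custom_call_target="tpu_custom_call"'
    )
    print(classify_vmem_oom(sample))
```

A result of `custom_call_unknown_target` suggests retrieving the full, untruncated message before deciding how to proceed.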
If the error originates from a custom kernel, use the following techniques to reduce the kernel's memory requirement: