src/plugins/intel_gpu/docs/debugging_guide.md
This document describes debugging practices that can help diagnose issues in the GPU plugin.
During execution, OpenCL may return an out-of-resources error (`CL_OUT_OF_RESOURCES`, abbreviated OOR below). This can happen for two primary reasons:
Verify Memory Consumption
First, assess whether memory usage is within a reasonable range.
If memory consumption is unexpectedly high, investigate:
Identify the Kernel That Triggers the OOR
If memory usage appears normal, the next step is to locate the specific kernel that causes the failure. Use the opencl-intercept-layer with:
With these options enabled, the OOR error will be associated with a specific enqueue call, allowing you to determine which layer caused it.
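Once a log has been captured, finding the failing call can be scripted. The following is a minimal sketch, not part of the plugin or of the intercept layer: it assumes only that the log is plain text and that the error name `CL_OUT_OF_RESOURCES` appears in it. The helper name `find_oor_context` and the sample log lines are hypothetical; the real log format may differ.

```python
from collections import deque


def find_oor_context(log_lines, marker="CL_OUT_OF_RESOURCES", context=3):
    """Return (line_number, preceding_lines) for the first line that
    contains `marker`, or None if the marker never appears."""
    recent = deque(maxlen=context)  # sliding window of the last `context` lines
    for number, line in enumerate(log_lines, start=1):
        if marker in line:
            return number, list(recent)
        recent.append(line.rstrip("\n"))
    return None


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "enqueue kernel_a",
    "enqueue kernel_b",
    "enqueue kernel_c",
    "error: CL_OUT_OF_RESOURCES",
    "enqueue kernel_d",
]

print(find_oor_context(sample_log, context=2))
```

For a real log, pass an open file object instead of the sample list; the lines preceding the match are where the enqueue call that failed would be expected.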
Validate by Replacing the Kernel with a Dummy Implementation
Once you identify the problematic layer, you can try replacing the kernel with a dummy implementation. For OpenVINO kernels, you can temporarily comment out the corresponding OpenCL code.
If the OOR disappears after replacing the kernel, this indicates the error originates from within the kernel execution itself. You can then bisect the kernel code to isolate the exact section responsible.
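The bisection itself is usually done by hand, but the search strategy can be sketched in a few lines. This is an illustration under stated assumptions, not a tool shipped with the plugin: the kernel body is treated as an ordered list of sections, the function name `first_failing_section` and the callback `fails_with_first_n` are hypothetical, and the callback stands in for the manual step of rebuilding the kernel with only the first `n` sections enabled and checking whether the OOR reproduces. The sketch also assumes that the kernel passes with no sections enabled, fails with all of them, and keeps failing once the offending section is included.

```python
def first_failing_section(section_count, fails_with_first_n):
    """Return the 1-based index of the first section whose inclusion
    triggers the failure, using binary search."""
    # Invariant: enabling the first `low` sections passes,
    # enabling the first `high` sections fails.
    low, high = 0, section_count
    while high - low > 1:
        mid = (low + high) // 2
        if fails_with_first_n(mid):
            high = mid
        else:
            low = mid
    return high


# Stand-in for the manual rebuild-and-run step: here section 6 of 8 is the culprit.
def simulated_run(n):
    return n >= 6


print(first_failing_section(8, simulated_run))
```

With eight sections this needs three rebuilds instead of up to eight, which matters when each rebuild and run is slow.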
Note: When modifying or partially removing code, keep in mind that the OpenCL compiler may optimize away code it considers unused, so an edit can remove more of the kernel than you intended. Be cautious when interpreting results after such modifications.