Cooperative Matrix Multiplication

This example demonstrates how to use cooperative matrix operations (the shader-level feature that maps to hardware such as Tensor Cores on NVIDIA GPUs) to perform efficient matrix multiplication on the GPU.

For the full description of the cooperative matrix feature (supported configurations, WGSL types and operations, validation rules, and backend support), see the central API spec:

  • docs/api-specs/cooperative_matrix.md

Example specifics

This example computes C = A * B + C where:

  • A is a 64×64 matrix
  • B is a 64×64 matrix
  • C is a 64×64 matrix (accumulator/result)

The example:

  • Tiles the 64×64 matrices into cooperative matrix tiles (e.g. 8×8) and performs a tiled matmul
  • Uses a compute shader and compares GPU results against a CPU reference implementation
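
The tiling described above can be sketched on the CPU. This is a minimal illustration, not the code shipped with the example: the function names, the row-major layout, and the 8×8 tile size are assumptions made for the sketch. It computes C = A * B + C once naively and once tile by tile, which is the same decomposition a cooperative matrix shader performs with one tile per multiply-accumulate.

```rust
/// Matrix dimension used by the example (64×64), assumed row-major here.
const N: usize = 64;
/// Tile edge length; 8 is only the illustrative size mentioned above.
const TILE: usize = 8;

/// Naive reference: returns A * B + C for N×N row-major matrices.
fn matmul_reference(a: &[f32], b: &[f32], c: &[f32]) -> Vec<f32> {
    let mut out = c.to_vec();
    for i in 0..N {
        for j in 0..N {
            let mut acc = out[i * N + j];
            for k in 0..N {
                acc += a[i * N + k] * b[k * N + j];
            }
            out[i * N + j] = acc;
        }
    }
    out
}

/// Tiled variant: each (ti, tj) output tile accumulates the products of
/// the A tiles in its row band with the B tiles in its column band.
fn matmul_tiled(a: &[f32], b: &[f32], c: &[f32]) -> Vec<f32> {
    let mut out = c.to_vec();
    for ti in (0..N).step_by(TILE) {
        for tj in (0..N).step_by(TILE) {
            for tk in (0..N).step_by(TILE) {
                // One tile-sized multiply-accumulate: out_tile += a_tile * b_tile.
                for i in ti..ti + TILE {
                    for j in tj..tj + TILE {
                        let mut acc = out[i * N + j];
                        for k in tk..tk + TILE {
                            acc += a[i * N + k] * b[k * N + j];
                        }
                        out[i * N + j] = acc;
                    }
                }
            }
        }
    }
    out
}
```

Both functions add the products for a given output element in the same order, so on the CPU they agree exactly. A GPU implementation may accumulate in a different order or precision, so its results are usually compared with a tolerance.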

Requirements

  • A GPU and backend that expose Features::EXPERIMENTAL_COOPERATIVE_MATRIX
  • A configuration returned from adapter.cooperative_matrix_properties() that matches the tile size and element types used by this example
  • See docs/api-specs/cooperative_matrix.md for details on hardware / backend support
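
The configuration check in the second requirement can be sketched as follows. The `ScalarKind` and `CoopConfig` types and the `find_matching` helper are hypothetical stand-ins invented for this illustration; they are not wgpu types, and the actual shape of the values returned by `adapter.cooperative_matrix_properties()` is described in the API spec. The sketch only shows the idea of searching the supported list for the tile shape and element types a shader was written against.

```rust
/// Hypothetical element types; stand-ins, not wgpu's own enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ScalarKind {
    F16,
    F32,
}

/// Hypothetical description of one supported configuration:
/// an M×K matrix A times a K×N matrix B, accumulated into M×N.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CoopConfig {
    m: u32,
    n: u32,
    k: u32,
    /// Element type of the A and B operands.
    ab_type: ScalarKind,
    /// Element type of the accumulator and result.
    cr_type: ScalarKind,
}

/// Returns the first supported configuration equal to the one the
/// shader needs, or `None` if the adapter offers no such configuration.
fn find_matching(supported: &[CoopConfig], wanted: &CoopConfig) -> Option<CoopConfig> {
    supported.iter().copied().find(|cfg| cfg == wanted)
}
```

If no configuration matches, the sensible outcome is to skip the example rather than dispatch a shader the hardware cannot run.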

Running

```bash
cargo run --bin wgpu-examples -- cooperative_matrix
```

Notes

  • This is an experimental feature and may not work on all hardware
  • The shader uses the standard create_shader_module with full validation
  • Results are verified against a CPU reference implementation
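
The verification step can be sketched as an element-wise comparison. This is an assumption about how such a check might look, not the comparison the example actually performs: the `results_match` name and the use of an absolute tolerance are choices made for this illustration.

```rust
/// Returns true when every GPU value is within `tol` of the CPU value.
/// A NaN on either side makes the comparison false, so a shader that
/// produces NaN is reported as a mismatch rather than silently passing.
fn results_match(gpu: &[f32], cpu: &[f32], tol: f32) -> bool {
    gpu.len() == cpu.len()
        && gpu
            .iter()
            .zip(cpu)
            .all(|(g, c)| (g - c).abs() <= tol)
}
```

An absolute tolerance is enough when the inputs are small and well scaled. For reduced-precision element types such as f16, a looser or relative tolerance may be more appropriate.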