sgl-model-gateway/examples/wasm/wasm-guest-ratelimit/README.md
This example demonstrates rate limiting middleware for sgl-model-gateway using the WebAssembly Component Model.
This middleware provides rate limiting and returns 429 Too Many Requests when the limit is exceeded.

Important: This is a simplified demonstration. Since WASM components are stateless and share nothing across worker threads, each worker thread maintains its own counter. For production, implement rate limiting at the router/host level with shared state.
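To illustrate why per-worker counters matter, here is a minimal sketch that compares isolated counters with a single shared counter. It is not taken from the example's source: the function names and the round-robin dispatch are assumptions made for illustration, and the gateway's real scheduling may differ.

```rust
/// Hypothetical model: each worker keeps its own counter, and requests are
/// dispatched round-robin (an assumption, not the gateway's documented behavior).
/// Returns how many requests are admitted within one window.
pub fn admitted_with_isolated_counters(workers: usize, limit: u64, requests: u64) -> u64 {
    if workers == 0 {
        return 0; // no worker can admit anything
    }
    let mut counters = vec![0u64; workers];
    let mut admitted = 0u64;
    for i in 0..requests {
        let w = (i % workers as u64) as usize; // pick the next worker in turn
        if counters[w] < limit {
            counters[w] += 1;
            admitted += 1;
        }
    }
    admitted
}

/// Hypothetical model: one counter shared by all workers, as a host-level
/// limiter would provide.
pub fn admitted_with_shared_counter(limit: u64, requests: u64) -> u64 {
    requests.min(limit)
}
```

Under these assumptions, four workers with a limit of 100 admit up to 400 requests per window, while a shared counter admits 100. This is the gap that host-level rate limiting closes.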
# Build
cd examples/wasm/wasm-guest-ratelimit
./build.sh
# Deploy (replace file_path with actual path)
curl -X POST http://localhost:3000/wasm \
-H "Content-Type: application/json" \
-d '{
"modules": [{
"name": "ratelimit-middleware",
"file_path": "/absolute/path/to/wasm_guest_ratelimit.component.wasm",
"module_type": "Middleware",
"attach_points": [{"Middleware": "OnRequest"}]
}]
}'
Modify constants in src/lib.rs:
const RATE_LIMIT_REQUESTS: u64 = 100; // requests per window
const RATE_LIMIT_WINDOW_MS: u64 = 60_000; // time window in ms
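As a rough illustration of how these two constants could drive a limiter, the following sketch implements a fixed-window counter. The `FixedWindow` type, the `Decision` enum, and the `check` function are hypothetical names introduced here. The actual logic in src/lib.rs may be structured differently, for example by keying counters per client.

```rust
// Values mirror the constants shown above.
const RATE_LIMIT_REQUESTS: u64 = 100; // requests per window
const RATE_LIMIT_WINDOW_MS: u64 = 60_000; // time window in ms

/// Hypothetical fixed-window counter (illustrative, not the example's code).
#[derive(Debug, Default)]
pub struct FixedWindow {
    window_start_ms: u64,
    count: u64,
}

/// Hypothetical outcome of a rate-limit check.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// Let the request continue through the gateway.
    Continue,
    /// Reject the request with the given HTTP status code.
    Reject(u16),
}

impl FixedWindow {
    /// Returns true if a request arriving at `now_ms` is within the limit.
    pub fn allow(&mut self, now_ms: u64) -> bool {
        // Start a fresh window once the current one has fully elapsed.
        if now_ms.saturating_sub(self.window_start_ms) >= RATE_LIMIT_WINDOW_MS {
            self.window_start_ms = now_ms;
            self.count = 0;
        }
        if self.count < RATE_LIMIT_REQUESTS {
            self.count += 1;
            true
        } else {
            false
        }
    }
}

/// Maps the limiter result to a middleware decision: 429 when over the limit.
pub fn check(window: &mut FixedWindow, now_ms: u64) -> Decision {
    if window.allow(now_ms) {
        Decision::Continue
    } else {
        Decision::Reject(429)
    }
}
```

A fixed window is the simplest scheme to reason about, but it allows bursts at window boundaries. A sliding window or token bucket smooths that out at the cost of more state.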
# Send more requests than the limit (with RATE_LIMIT_REQUESTS = 100, the first 100 succeed, then 429)
for i in {1..105}; do
curl -s -o /dev/null -w "%{http_code}\n" \
http://localhost:3000/v1/models \
-H "Authorization: Bearer secret-api-key-12345"
done
The middleware runs in the OnRequest phase.