candle-mamba2: Mamba2 implementation

Candle implementation of Mamba2 [1] inference. Mamba2 introduces the State Space Duality (SSD) framework, which unifies structured state space models (SSMs) and attention variants.
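To give a rough feel for the duality, the sketch below evaluates a toy single-channel SSM with a scalar decay per step in two ways: as a linear-time recurrence, and as a quadratic, attention-like sum with a decay mask. It is not taken from the Candle implementation. The function names `ssm_recurrent` and `ssm_quadratic` and the toy inputs are hypothetical, chosen only for illustration, and the code uses plain `std` rather than Candle tensors.

```rust
/// Recurrent (linear-time) form:
///   h_t = a_t * h_{t-1} + B_t * x_t,   y_t = C_t . h_t
fn ssm_recurrent(a: &[f64], b: &[Vec<f64>], c: &[Vec<f64>], x: &[f64]) -> Vec<f64> {
    let n = b[0].len();
    let mut h = vec![0.0_f64; n]; // hidden state, starts at zero
    let mut y = Vec::with_capacity(x.len());
    for t in 0..x.len() {
        for i in 0..n {
            h[i] = a[t] * h[i] + b[t][i] * x[t];
        }
        let yt: f64 = c[t].iter().zip(&h).map(|(ci, hi)| ci * hi).sum();
        y.push(yt);
    }
    y
}

/// Dual (quadratic, attention-like) form:
///   y_t = sum over s <= t of (C_t . B_s) * (a_{s+1} * ... * a_t) * x_s
fn ssm_quadratic(a: &[f64], b: &[Vec<f64>], c: &[Vec<f64>], x: &[f64]) -> Vec<f64> {
    let len = x.len();
    let mut y = vec![0.0_f64; len];
    for t in 0..len {
        for s in 0..=t {
            // Causal decay mask entry; the empty product (s == t) is 1.0.
            let decay: f64 = a[s + 1..t + 1].iter().product();
            // "Attention score" between query-like C_t and key-like B_s.
            let score: f64 = c[t].iter().zip(&b[s]).map(|(ci, bi)| ci * bi).sum();
            y[t] += score * decay * x[s];
        }
    }
    y
}

fn main() {
    // Hypothetical toy inputs: sequence length 4, state size 2.
    let a = vec![0.9, 0.5, 0.8, 0.7];
    let b = vec![vec![1.0, 0.0], vec![0.5, -1.0], vec![2.0, 0.3], vec![-0.4, 1.5]];
    let c = vec![vec![1.0, 2.0], vec![0.0, 1.0], vec![-1.0, 0.5], vec![0.3, 0.3]];
    let x = vec![3.0, -1.0, 0.5, 2.0];

    let y_rec = ssm_recurrent(&a, &b, &c, &x);
    let y_quad = ssm_quadratic(&a, &b, &c, &x);
    for (p, q) in y_rec.iter().zip(&y_quad) {
        assert!((p - q).abs() < 1e-12);
    }
    println!("recurrent: {:?}", y_rec);
    println!("quadratic: {:?}", y_quad);
}
```

Both functions compute the same sequence. The recurrent form carries a fixed-size state from token to token, which suits step-by-step inference, while the quadratic form mirrors masked attention.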

Running the example

```bash
cargo run --example mamba2 --release -- --prompt "Mamba is the"
```

Supported models

| Model | HuggingFace ID |
|-------|----------------|
| Mamba2-130m | AntonV/mamba2-130m-hf |
| Mamba2-370m | AntonV/mamba2-370m-hf |
| Mamba2-780m | AntonV/mamba2-780m-hf |
| Mamba2-1.3b | AntonV/mamba2-1.3b-hf |
| Mamba2-2.7b | AntonV/mamba2-2.7b-hf |

Verification

For the commands below, outputs match the Mamba2ForCausalLM reference implementation from the PyTorch transformers library.

mamba2-130m

```bash
cargo run --example mamba2 --release -- \
  --prompt "Mamba is the" \
  --which mamba2-130m \
  --sample-len 20 \
  --repeat-penalty 1.0
```

Expected output:

Mamba is the most popular and popular game in the world. It is a game where you can play with your friends

mamba2-370m

```bash
cargo run --example mamba2 --release -- \
  --prompt "Mamba is the" \
  --which mamba2-370m \
  --sample-len 20 \
  --repeat-penalty 1.0
```

Expected output:

Mamba is the first game in the series to feature a new character, the Mamba, who is a female version