mamba

Mamba-2 selective state space block.

Implements the Mamba-2 architecture (Dao & Gu, 2024) using its State Space Duality (SSD) chunked-matmul algorithm. The recurrence is split along the sequence axis T into chunks of size Q: attention-style dense matmuls handle the steps inside each chunk, and a much shorter associative scan runs over the chunk-boundary states. This keeps activation memory at O(B·T·H·N/Q + B·n_chunks·H·N·P) and turns the inner loop into tensor-core-friendly GEMMs.
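The chunking idea can be illustrated with a toy sketch. It is not this module's implementation: it assumes a single head with a scalar state (N = P = 1), uses plain Python lists instead of tensors, and replaces the associative scan with a sequential loop over chunks. The names `ssd_reference` and `ssd_chunked` are hypothetical. Inside each chunk the outputs come from a lower-triangular decay matrix `L` (the attention-style part), while only one carried state crosses each chunk boundary.

```python
import math


def ssd_reference(a, b, c, x):
    """Step-by-step recurrence: h_t = a_t * h_{t-1} + b_t * x_t, y_t = c_t * h_t."""
    h, ys = 0.0, []
    for a_t, b_t, c_t, x_t in zip(a, b, c, x):
        h = a_t * h + b_t * x_t
        ys.append(c_t * h)
    return ys


def ssd_chunked(a, b, c, x, Q):
    """Same outputs as ssd_reference, computed chunk by chunk (toy, scalar state)."""
    T = len(x)
    ys = [0.0] * T
    h_prev = 0.0  # state carried across chunk boundaries
    for start in range(0, T, Q):
        n = min(start + Q, T) - start

        # L[t][s] = a[s+1] * ... * a[t] for s <= t: decay from step s to step t.
        L = [[0.0] * n for _ in range(n)]
        for t in range(n):
            L[t][t] = 1.0
            for s in range(t):
                L[t][s] = L[t - 1][s] * a[start + t]

        # cum[t] = a[start] * ... * a[start + t]: decay applied to the carried state.
        cum, p = [], 1.0
        for t in range(n):
            p *= a[start + t]
            cum.append(p)

        # Intra-chunk term (dense, attention-like) plus carried-state term.
        for t in range(n):
            intra = sum(L[t][s] * b[start + s] * x[start + s] for s in range(t + 1))
            ys[start + t] = c[start + t] * (intra + cum[t] * h_prev)

        # Update the chunk-boundary state; only this value crosses chunks.
        h_prev = cum[n - 1] * h_prev + sum(
            L[n - 1][s] * b[start + s] * x[start + s] for s in range(n)
        )
    return ys


if __name__ == "__main__":
    a = [0.9, 0.8, 0.7, 0.6, 0.5, 0.95, 0.85]
    b = [1.0, 0.5, -0.3, 0.2, 0.7, -1.1, 0.4]
    c = [0.3, -0.2, 0.9, 1.0, 0.1, 0.6, -0.5]
    x = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 1.5]
    ref = ssd_reference(a, b, c, x)
    chunked = ssd_chunked(a, b, c, x, Q=3)
    print(all(math.isclose(r, y, abs_tol=1e-12) for r, y in zip(ref, chunked)))
```

In a tensor implementation the per-chunk sums become batched matrix multiplies, and the loop over chunks only touches T/Q boundary states rather than T per-step states.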

MambaBlock

Bases: Module

Mamba-2 selective state space block.

Architecture: RMSNorm -> in_proj (gate + x + dt + B + C) -> short conv -> SSM -> gate -> out_proj -> residual.
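As a rough illustration of the `in_proj` step in the pipeline above, the sketch below splits one projected vector into its five named parts. The helper name `split_in_proj`, the segment widths (`d_inner` for gate and x, `n_heads` for dt, `d_state` for B and C), and the segment order (taken from the order listed above) are all assumptions made for illustration; the actual layout used by `MambaBlock` may differ.

```python
def split_in_proj(proj, d_inner, n_heads, d_state):
    """Split a flat projection into gate, x, dt, B, C (hypothetical layout)."""
    # Assumed widths: gate and x share the inner width, dt has one value per
    # head, and B and C each have one value per state dimension.
    widths = [
        ("gate", d_inner),
        ("x", d_inner),
        ("dt", n_heads),
        ("B", d_state),
        ("C", d_state),
    ]
    expected = sum(width for _, width in widths)
    if len(proj) != expected:
        raise ValueError(f"expected {expected} values, got {len(proj)}")

    parts, offset = {}, 0
    for name, width in widths:
        parts[name] = proj[offset:offset + width]
        offset += width
    return parts


if __name__ == "__main__":
    # 2 * 4 (gate, x) + 2 (dt) + 3 (B) + 3 (C) = 16 values
    parts = split_in_proj(list(range(16)), d_inner=4, n_heads=2, d_state=3)
    for name, values in parts.items():
        print(name, values)
```

Packing all five parts into one projection means a single matmul produces every input that the convolution, SSM, and gate stages consume.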