mamba

`mamba` ¶

Mamba-2 language model.

A pure selective state space model (SSM) following the Mamba-2 architecture (Dao & Gu, 2024). It uses no positional embeddings: position information is implicit in the SSM's recurrent state.
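
To illustrate how a recurrent state can carry position without any positional embedding, here is a minimal sketch of a diagonal linear recurrence written with `jax.lax.scan`. It is a simplified toy, not the Mamba-2 kernel used by this module: the function name `ssm_scan`, the argument layout, and the fixed `a`, `b`, `c` values are all assumptions made for the example. In a selective SSM these coefficients would be computed from the input.

```python
import jax
import jax.numpy as jnp


def ssm_scan(x, a, b, c):
    """Toy recurrence: h_t = a_t * h_{t-1} + b_t * x_t, y_t = <c_t, h_t>.

    x: (T,) inputs, a: (T,) decay factors, b and c: (T, N) projections.
    Hypothetical helper for illustration only.
    """
    def step(h, inputs):
        x_t, a_t, b_t, c_t = inputs
        h = a_t * h + b_t * x_t      # update the hidden state, shape (N,)
        y_t = jnp.dot(c_t, h)        # read out a scalar output
        return h, y_t

    h0 = jnp.zeros(b.shape[-1])
    _, y = jax.lax.scan(step, h0, (x, a, b, c))
    return y


T, N = 3, 2
a = jnp.full((T,), 0.5)   # constant decay, chosen for the example
b = jnp.ones((T, N))
c = jnp.ones((T, N))

# The same single input placed at the first or the last time step.
y_early = ssm_scan(jnp.array([1.0, 0.0, 0.0]), a, b, c)
y_late = ssm_scan(jnp.array([0.0, 0.0, 1.0]), a, b, c)
```

The final output differs between the two runs because the early input has been decayed twice by the time the sequence ends, so the order of the inputs is reflected in the state even though no position index is ever supplied.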

Inherits `loss` and `unembed` from GPT. Overrides `__call__` to drop the block-size assertion (SSMs have no fixed context-length limit), and overrides `setup`, `embed`, and `decode` for the SSM-specific structure.
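
The override pattern described above can be sketched with two stand-in classes. Everything here is hypothetical: `TinyGPT`, `TinyMamba`, the `forward` placeholder, and the constructor arguments are invented for the example and do not reflect the real `GPT` or `Mamba` signatures. The sketch only shows a subclass replacing `__call__` so that the length check is skipped.

```python
import jax.numpy as jnp


class TinyGPT:
    """Hypothetical stand-in for a base class with a fixed context length."""

    def __init__(self, block_size):
        self.block_size = block_size

    def forward(self, idx):
        # Placeholder for the real embed -> decode -> unembed pipeline.
        return jnp.zeros(idx.shape)

    def __call__(self, idx):
        seq_len = idx.shape[-1]
        assert seq_len <= self.block_size, "sequence longer than block_size"
        return self.forward(idx)


class TinyMamba(TinyGPT):
    """Hypothetical subclass that drops the block-size assertion."""

    def __call__(self, idx):
        # No length check: the recurrence can run over any sequence length.
        return self.forward(idx)


long_idx = jnp.zeros((1, 8), dtype=jnp.int32)
out = TinyMamba(block_size=4)(long_idx)   # accepted despite 8 > 4
```

Calling `TinyGPT(block_size=4)(long_idx)` with the same input would fail the assertion, which is the behavior the subclass opts out of.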

`Mamba` ¶

Bases: GPT

Mamba-2 language model — SSM-only, no attention.

Overrides GPT's `setup`, `embed`, and `decode` to use `MambaBlock` layers and skip positional embeddings. `loss` and `unembed` are inherited unchanged.

`embed(idx: jax.Array, deterministic: bool = False, **kwargs: Any) -> Any` ¶

Token embeddings only — no positional encoding needed for SSMs.
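
As a rough illustration of a token-only embedding, the sketch below looks up rows of an embedding table and adds nothing else. The helper `embed_tokens`, the table name `wte`, and the sizes are assumptions made for the example. The `deterministic` flag and extra keyword arguments from the real signature are left out.

```python
import jax
import jax.numpy as jnp


def embed_tokens(wte, idx):
    """Map (B, T) integer token ids to (B, T, D) vectors by table lookup.

    Hypothetical helper: no positional term is added to the result.
    """
    return jnp.take(wte, idx, axis=0)


vocab_size, d_model = 10, 4
wte = jax.random.normal(jax.random.PRNGKey(0), (vocab_size, d_model))

idx = jnp.array([[3, 7, 3]])   # token 3 appears at positions 0 and 2
x = embed_tokens(wte, idx)
```

Because nothing position-dependent is added, the vectors for token 3 at positions 0 and 2 are identical. Any distinction between them has to come from the layers that follow.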
