lact

LaCT language model.

Implements "Test-Time Training Done Right" (arXiv:2505.23884): a GPT-style transformer in which every block is a LaCTBlock (SWA + TTT + FFN, i.e. sliding-window attention, test-time training, and a feed-forward network). Inherits embed, unembed, loss, and __call__ from GPT unchanged; only setup, components, and sharding are new.

The fast-weight machinery lives inside LaCTBlock; this class just stitches the embedding / residual stream / unembedding around it.
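The stitching described above can be pictured with a minimal JAX sketch. Everything in it is hypothetical and for illustration only: `init_params`, `toy_block`, and `forward` are not part of this library, and the three residual sub-layers are plain linear maps standing in for SWA, TTT, and FFN rather than the real LaCTBlock.

```python
import jax
import jax.numpy as jnp


def init_params(key, vocab=32, dim=16, n_blocks=2):
    # Hypothetical parameter layout, not the library's.
    k_emb, k_out, k_blk = jax.random.split(key, 3)
    return {
        "wte": 0.02 * jax.random.normal(k_emb, (vocab, dim)),
        # One (dim, dim) matrix per stand-in sub-layer, three per block.
        "blocks": 0.02 * jax.random.normal(k_blk, (n_blocks, 3, dim, dim)),
        "unembed": 0.02 * jax.random.normal(k_out, (dim, vocab)),
    }


def toy_block(block_w, x):
    # Stand-in for a LaCTBlock: three residual sub-layers in sequence.
    # In the real block these would be SWA, TTT, and FFN.
    for w in block_w:
        x = x + jnp.tanh(x @ w)
    return x


def forward(params, idx):
    x = params["wte"][idx]            # embed: (B, T) -> (B, T, D)
    for block_w in params["blocks"]:  # residual stream through each block
        x = toy_block(block_w, x)
    return x @ params["unembed"]      # unembed: (B, T, D) -> (B, T, vocab)
```

Calling `forward(init_params(jax.random.PRNGKey(0)), idx)` with an integer array `idx` of shape `(B, T)` returns logits of shape `(B, T, vocab)`. The outer function never looks inside the block, which is the separation this class relies on.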

LaCT

Bases: GPT

LaCT language model — SWA + chunked-TTT layers throughout.

embed(idx: jax.Array, deterministic: bool = False, **kwargs: Any) -> Any

Token embeddings only — RoPE handles positions inside each block.
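To illustrate why no positional table is needed at the embedding stage, here is a small sketch under stated assumptions: `embed_tokens` and `apply_rope` are hypothetical helpers, and the "rotate-half" RoPE variant with base 10000 is an assumed choice, not necessarily what the blocks in this library use.

```python
import jax.numpy as jnp


def embed_tokens(wte, idx):
    # Pure table lookup: the same token id gives the same vector at any position.
    return wte[idx]


def apply_rope(x, base=10000.0):
    # x: (..., T, D) with D even. Rotates feature pairs (i, i + D/2)
    # by an angle that grows with the position index.
    seq_len, dim = x.shape[-2], x.shape[-1]
    half = dim // 2
    inv_freq = 1.0 / (base ** (jnp.arange(half) / half))        # (half,)
    angles = jnp.arange(seq_len)[:, None] * inv_freq[None, :]   # (T, half)
    cos, sin = jnp.cos(angles), jnp.sin(angles)
    x1, x2 = x[..., :half], x[..., half:]
    return jnp.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```

In this sketch, position enters only when `apply_rope` is called, which a block would typically do on its queries and keys. The embedding output itself carries no positional signal.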