rmsnorm

RMSNorm
Bases: Module

Root-mean-square layer norm. Setting centered=True switches to the Qwen 3.5 / OLMo-2 convention: the multiplier is 1 + weight, with weight initialized to 0. The effective scale still starts at 1, but storing the offset lets HF checkpoints round-trip.

RMSNormGated
Bases: Module

RMSNorm with a multiplicative silu(gate) applied after the weight. Used inside the gated-delta-net token mixer (Qwen 3.5 linear-attention layers). The weight is initialized to ones, matching HF Qwen3_5RMSNormGated.
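The arithmetic behind both classes can be sketched in plain Python on a single feature vector. This is an illustration of the conventions described above, not the module's actual implementation: the function names `rms_norm`, `silu`, and `rms_norm_gated`, the `eps` default, and the list-based inputs are all assumptions made for the example.

```python
import math


def rms_norm(x, weight, eps=1e-6, centered=False):
    """Scale x by the inverse of its root mean square, then by a learned multiplier.

    eps is an assumed default; the real module may use a different value.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # centered=True: the stored weight is an offset from 1 (initialized to 0),
    # so the effective multiplier is 1 + weight.
    scale = [1.0 + w for w in weight] if centered else list(weight)
    return [v * inv_rms * s for v, s in zip(x, scale)]


def silu(g):
    """SiLU activation g * sigmoid(g), written to avoid overflow for large |g|."""
    if g >= 0:
        return g / (1.0 + math.exp(-g))
    e = math.exp(g)
    return g * e / (1.0 + e)


def rms_norm_gated(x, gate, weight, eps=1e-6):
    """RMSNorm followed by an elementwise silu(gate) multiplier, applied after the weight."""
    normed = rms_norm(x, weight, eps=eps)
    return [n * silu(g) for n, g in zip(normed, gate)]


x = [3.0, 4.0]
plain = rms_norm(x, weight=[1.0, 1.0])                    # weight init at ones
centered = rms_norm(x, weight=[0.0, 0.0], centered=True)  # weight init at zeros
gated = rms_norm_gated(x, gate=[0.0, 0.0], weight=[1.0, 1.0])
```

At initialization the two conventions produce the same output, so `plain` and `centered` agree; only the stored parameter differs. In the gated variant a gate of 0 zeroes the output, since silu(0) = 0.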