layers
RotaryPosEncoding(d_model: int, base: int = 10000, seq_dim: int = 1, partial_rotary_factor: float = 1.0)
Unified RoPE supporting standard, Qwen, and NeoX variants.
Differences between variants are controlled by constructor args:
- base: 10000 (standard/NeoX) or 1e6 (Qwen)
- partial_rotary_factor: 1.0 (standard/Qwen) or <1.0 (NeoX)
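To illustrate how these two arguments might shape the encoding, here is a minimal pure-Python sketch. It is not the implementation of RotaryPosEncoding, whose internals are not shown here. The helper names `rotary_inv_freq` and `apply_rope` are hypothetical, and the half-split pairing of dimensions (dimension i paired with i + rot_dim/2) is an assumption; an interleaved pairing is equally common. The sketch handles a single feature vector and ignores seq_dim.

```python
import math

def rotary_inv_freq(d_model, base=10000, partial_rotary_factor=1.0):
    """Hypothetical helper: return (rot_dim, inverse frequencies)."""
    # Only the first rot_dim features are rotated; keep it even so they pair up.
    rot_dim = int(d_model * partial_rotary_factor)
    rot_dim -= rot_dim % 2
    # A larger base (e.g. 1e6) gives lower frequencies, i.e. longer wavelengths.
    inv_freq = [base ** (-(2 * i) / rot_dim) for i in range(rot_dim // 2)]
    return rot_dim, inv_freq

def apply_rope(x, pos, base=10000, partial_rotary_factor=1.0):
    """Hypothetical helper: rotate one feature vector x at position pos."""
    rot_dim, inv_freq = rotary_inv_freq(len(x), base, partial_rotary_factor)
    half = rot_dim // 2
    out = list(x)  # features beyond rot_dim pass through unchanged
    for i, freq in enumerate(inv_freq):
        angle = pos * freq
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[i], x[i + half]  # assumed half-split pairing
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
full = apply_rope(x, pos=3)                                # all 8 dims rotated
partial = apply_rope(x, pos=3, partial_rotary_factor=0.5)  # first 4 dims only
```

With `partial_rotary_factor=0.5`, the last four entries of `partial` equal those of `x`, while `full` rotates every pair. In both cases the vector norm is preserved, since each pair undergoes a plane rotation.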