rope
rope
¶
Attention module with Rotary Positional Encoding (RoPE).
Inherits KV cache support from SelfAttention. During cached decode, the base class automatically injects the correct absolute position into kwargs["positions"] so RoPE applies the right rotation.