
rope

Attention module with Rotary Positional Encoding (RoPE).

Inherits KV cache support from SelfAttention. During cached decode, the base class automatically injects the token's absolute position into kwargs["positions"], so RoPE applies the same rotation it would have applied had the full sequence been processed at once.
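The rotation itself can be sketched as follows. This is a minimal, pure-Python illustration of the RoPE idea, not the module's actual implementation: `apply_rope` is a hypothetical helper that rotates consecutive feature pairs by position-dependent angles, which is why passing the correct absolute position during cached decode matters.

```python
import math

def apply_rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles.

    Hypothetical standalone sketch of the RoPE rotation; the real module
    applies an equivalent transform to queries and keys inside attention.
    """
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        # Rotation frequency decays geometrically with the pair index,
        # so early pairs encode fine positions and later pairs coarse ones.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out

# Position 0 rotates by zero angles, leaving the vector unchanged.
assert apply_rope([1.0, 0.0, 1.0, 0.0], 0) == [1.0, 0.0, 1.0, 0.0]
```

Because each pair is rotated by an angle proportional to its position, the dot product between a rotated query and a rotated key depends only on their position difference, which is the property that makes RoPE a relative positional encoding.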