base

`base`

Bases: Module

Compute token and positional embeddings given inputs.

Compute decoded residual channels given embeddings.

Compute output distribution.

Compute cross-entropy loss given logits and targets.
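The first three stages above can be sketched with plain JAX arrays to show how the shapes flow from token indices to logits. This is only an illustration under stated assumptions: the names `wte`, `wpe`, `w_out`, `embed`, `decode`, and `unembed` are hypothetical, the sizes are arbitrary, and `decode` is a placeholder, since the actual layers of `base` are not described here.

```python
import jax
import jax.numpy as jnp

# Illustrative sizes only: batch, sequence length, vocab size, embedding width.
B, T, V, D = 2, 5, 11, 8

key_tok, key_pos, key_out, key_idx = jax.random.split(jax.random.PRNGKey(0), 4)
wte = jax.random.normal(key_tok, (V, D)) * 0.02    # hypothetical token embedding table
wpe = jax.random.normal(key_pos, (T, D)) * 0.02    # hypothetical positional embedding table
w_out = jax.random.normal(key_out, (D, V)) * 0.02  # hypothetical output projection


def embed(idx):
    # Token embeddings (B, T, D) plus positional embeddings (T, D), broadcast over the batch.
    return wte[idx] + wpe[jnp.arange(idx.shape[1])]


def decode(x):
    # Placeholder for the residual stack; a real model would apply its blocks here.
    return x


def unembed(h):
    # Project residual channels (B, T, D) to vocabulary logits (B, T, V).
    return h @ w_out


idx = jax.random.randint(key_idx, (B, T), 0, V)
logits = unembed(decode(embed(idx)))
```

The resulting `logits` array has shape (B, T, V), matching the return value documented below.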

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `idx` | `Array` | Input token indices of shape (B, T). | required |
| `targets` | `Optional[Array]` | Target token indices of shape (B, T). Set a position to -1 to exclude it from the loss. | `None` |
| `padding_mask` | `Optional[Array]` | Boolean array of shape (B, T). True for valid tokens, False for padding tokens. | `None` |
| `deterministic` | `bool` | If True, dropout is disabled; if False, dropout is applied. | `False` |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `logits` | `Array` | Output logits of shape (B, T, vocab_size). |
| `loss` | `Optional[Array]` | Cross-entropy loss if `targets` is provided, else `None`. |
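To make the "-1 means ignore" convention for `targets` concrete, here is a minimal sketch of a cross-entropy loss that skips those positions. The function name `masked_cross_entropy` and the choice to average over the non-ignored positions are assumptions for illustration; the source does not state how `base` reduces the loss.

```python
import jax
import jax.numpy as jnp


def masked_cross_entropy(logits, targets, ignore_index=-1):
    """Mean cross-entropy over positions whose target is not `ignore_index`.

    Hypothetical helper: logits has shape (B, T, V), targets has shape (B, T).
    """
    valid = targets != ignore_index
    # Replace ignored targets with a safe index so the gather stays in range.
    safe_targets = jnp.where(valid, targets, 0)
    log_probs = jax.nn.log_softmax(logits, axis=-1)
    nll = -jnp.take_along_axis(log_probs, safe_targets[..., None], axis=-1)[..., 0]
    valid_f = valid.astype(logits.dtype)
    # Guard against division by zero when every position is ignored.
    return jnp.sum(nll * valid_f) / jnp.maximum(jnp.sum(valid_f), 1.0)


example_logits = jnp.zeros((1, 2, 4))   # uniform distribution over 4 classes
example_targets = jnp.array([[1, -1]])  # second position is ignored
example_loss = masked_cross_entropy(example_logits, example_targets)
```

With uniform logits over four classes, the one counted position contributes log(4), so the ignored position has no effect on the result.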