padded

padded.py Padded Dataset - datasets with pre-padded sequences and masks.

PaddedDataset(spec: ExecutionSpec, block_size: int, name: str, suffix: str = '')

Bases: Dataset

get_batch(batch_size: int, split: str = 'train', deterministic_key: Optional[int] = None) -> dict[str, np.ndarray]

Get batches from the padded dataset, along with the corresponding masks.
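
The source does not show what `get_batch` returns beyond `dict[str, np.ndarray]`, so the following is only a minimal sketch of the general idea, not the library's implementation. The function name `sample_padded_batch`, the dictionary keys `"x"` and `"mask"`, the padding value `0`, and the use of `deterministic_key` as an RNG seed are all assumptions made for illustration.

```python
from typing import Optional

import numpy as np


def sample_padded_batch(
    tokens: np.ndarray,
    mask: np.ndarray,
    batch_size: int,
    deterministic_key: Optional[int] = None,
) -> dict[str, np.ndarray]:
    """Hypothetical stand-in for get_batch; not the library's code."""
    # Assumption: deterministic_key seeds the sampler, so the same key
    # always selects the same rows; None draws a fresh random batch.
    rng = np.random.default_rng(deterministic_key)
    rows = rng.integers(0, tokens.shape[0], size=batch_size)
    # Assumption: the key names "x" and "mask" are illustrative only.
    return {"x": tokens[rows], "mask": mask[rows]}


# Toy data: 4 sequences already padded with 0 to block_size = 5.
tokens = np.array([
    [5, 3, 9, 0, 0],
    [7, 1, 0, 0, 0],
    [2, 8, 6, 4, 0],
    [4, 4, 4, 4, 4],
])
# 1.0 marks real tokens, 0.0 marks padding positions.
mask = (tokens != 0).astype(np.float32)

batch = sample_padded_batch(tokens, mask, batch_size=2, deterministic_key=0)
repeat = sample_padded_batch(tokens, mask, batch_size=2, deterministic_key=0)
```

Under these assumptions, each array in `batch` has shape `(batch_size, block_size)`, and calling the function twice with the same `deterministic_key` selects the same rows, which is the kind of behavior a reproducible evaluation batch would need.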