hellaswag

HellaSwag rollout evaluation (Zellers et al., 2019).

4-way multiple choice: given a context, pick the most plausible of four candidate sentence endings.
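To make the task shape concrete, here is a minimal sketch of how one item could be turned into a lettered prompt and how a model's completion could be scored. It does not use or reproduce this module's implementation: `HellaSwagItem`, `format_prompt`, `parse_choice`, and `score` are hypothetical names, the prompt wording is an assumption, and the example item is made up for illustration rather than taken from the dataset.

```python
import re
from dataclasses import dataclass
from typing import Optional

LETTERS = "ABCD"  # one letter per candidate ending


@dataclass
class HellaSwagItem:
    """Hypothetical container for one 4-way item (not the library's type)."""
    ctx: str             # context the model must continue
    endings: list[str]   # exactly four candidate endings
    label: int           # index (0-3) of the correct ending


def format_prompt(item: HellaSwagItem) -> str:
    """Render the context and the four endings as a lettered question."""
    lines = [f"Context: {item.ctx}", "Which ending is most plausible?"]
    for letter, ending in zip(LETTERS, item.endings):
        lines.append(f"{letter}. {ending}")
    lines.append("Answer:")
    return "\n".join(lines)


def parse_choice(completion: str) -> Optional[int]:
    """Return the index of a leading standalone letter A-D, or None."""
    match = re.match(r"\s*\(?([A-D])\b", completion)
    return LETTERS.index(match.group(1)) if match else None


def score(item: HellaSwagItem, completion: str) -> float:
    """Exact-match accuracy for one item: 1.0 if correct, else 0.0."""
    return 1.0 if parse_choice(completion) == item.label else 0.0


# Made-up item for illustration only.
example = HellaSwagItem(
    ctx="A woman picks up a paintbrush. She",
    endings=[
        "throws the canvas into a lake.",
        "dips it in paint and starts on the canvas.",
        "uses it to tighten a bolt on a bicycle.",
        "eats the brush with a fork.",
    ],
    label=1,
)

if __name__ == "__main__":
    print(format_prompt(example))
    print(score(example, " B. dips it in paint"))
```

Parsing only a leading letter keeps the scorer strict: a completion such as "Because ..." is not mistaken for choice B, and an unparseable answer simply scores 0.0.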

HellaSwagEval()

Bases: RolloutEvaluation

HellaSwag evaluation (validation split).