arc_challenge

ARC-Challenge rollout evaluation (Clark et al., 2018).

Grade-school science multiple-choice questions; we use the validation split. The number of choices and the label scheme vary between questions (labels are sometimes A-D, sometimes 1-4). We map each answerKey to the letter index of the matching choice among those presented.
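The answerKey mapping described above could look roughly like the following. This is a minimal sketch, not the evaluation's actual code: the helper name `answer_letter` and its parameters `labels` and `answer_key` are hypothetical, and it assumes each question carries its choice labels as a list in presentation order.

```python
from string import ascii_uppercase


def answer_letter(labels, answer_key):
    """Return the letter for the position of answer_key among labels.

    Hypothetical helper: `labels` is the list of choice labels in the
    order they are presented, `answer_key` is the dataset's answerKey.
    """
    # Find where the gold label sits among the presented choices.
    # list.index raises ValueError if the key matches no label.
    index = labels.index(answer_key)
    # Re-express that position as a letter: 0 -> "A", 1 -> "B", ...
    return ascii_uppercase[index]


# Letter-labelled question: the key already names its own position.
print(answer_letter(["A", "B", "C", "D"], "C"))
# Digit-labelled question: "3" is the third choice, so it becomes "C".
print(answer_letter(["1", "2", "3", "4"], "3"))
# Three-choice question: the same lookup works for any choice count.
print(answer_letter(["A", "B", "C"], "B"))
```

Looking the key up by position, rather than assuming a fixed A-D scheme, keeps the same code path working for digit labels and for questions with fewer or more than four choices.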

ARCChallengeEval()

Bases: RolloutEvaluation

ARC-Challenge rollout evaluation (validation split).