blimp
BLiMP (Benchmark of Linguistic Minimal Pairs) evaluation.
Tests grammatical knowledge via minimal pair acceptability judgments.
Blimp(subset: str | None = None)
Bases: PerplexityComparisonEvaluation
BLiMP evaluation using perplexity comparison.
Each sample pairs a grammatically correct sentence with a minimally different incorrect one. The model should assign lower perplexity to the correct sentence.
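The comparison described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper names `perplexity` and `prefers_correct` are hypothetical, and it assumes you already have per-token natural-log probabilities for each sentence from some language model.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity from per-token natural-log probabilities.

    Computed as exp of the negative mean log-probability.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def prefers_correct(
    correct_logprobs: Sequence[float],
    incorrect_logprobs: Sequence[float],
) -> bool:
    """True if the correct sentence gets strictly lower perplexity."""
    return perplexity(correct_logprobs) < perplexity(incorrect_logprobs)
```

Whether the actual evaluation normalizes by token count, as this sketch does, is an assumption; summed log-probabilities are another common choice for minimal pairs.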
get(indx: int) -> Tuple[str, list[str], int]
Get sample at index.
Returns:
| Type | Description |
|---|---|
| Tuple[str, list[str], int] | (prefix, list_of_continuations, correct_index) |
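As a rough sketch of how a tuple of this shape might be scored, the function below picks the continuation with the lowest perplexity and checks it against `correct_index`. The names `Sample`, `is_sample_correct`, and the `ppl` callable are hypothetical, and joining `prefix` and each continuation by plain string concatenation is an assumption about how the two parts fit together.

```python
from typing import Callable, List, Tuple

# Shape documented for the return value of get().
Sample = Tuple[str, List[str], int]


def is_sample_correct(sample: Sample, ppl: Callable[[str], float]) -> bool:
    """Return True if the lowest-perplexity continuation is the correct one.

    `ppl` is a stand-in for any function that maps a string to a
    perplexity score; it is not part of the documented API.
    """
    prefix, continuations, correct_index = sample
    # Assumption: prefix and continuation are joined by concatenation.
    scores = [ppl(prefix + continuation) for continuation in continuations]
    predicted = min(range(len(scores)), key=scores.__getitem__)
    return predicted == correct_index
```

With a real evaluation object, the sample would come from a call such as `Blimp().get(0)`; the scoring function is left to the caller here so the sketch stays independent of any particular model.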