A standard benchmark for NLG evaluation
Table of Contents
1. Downstream Datasets
| Card Name | Selected SeqLen | Comments |
|---|---|---|
| wikisql | 512 | |
| spider | 512 | |
| allenai/commongen | 256 | |
| e2enlg | 512 | |
| UCL-DARK/openai-tldr-filtered | 2048 | filter |
| cnndailymail | 4096 | filter |
| samsum | 2048 | filter |
| piqa | 256 | |
| truthfulqa | 256 | |
| allenai/ai2arc | 256 | |
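
The table above can be mirrored in code as a small lookup so that evaluation scripts read sequence lengths from one place. The following is a minimal sketch, not this repository's implementation: the names `CardConfig`, `CARDS`, `whitespace_token_count`, and `apply_length_filter` are hypothetical, and it assumes that the `filter` comment means examples longer than the selected SeqLen are dropped. The whitespace token counter is only a stand-in for whatever tokenizer the evaluated model uses.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple


class CardConfig(NamedTuple):
    """Per-dataset settings copied from the table (hypothetical structure)."""
    seq_len: int        # selected SeqLen from the table
    filter_long: bool   # True where the Comments column says "filter"


# Values are taken directly from the table; the dict itself is illustrative.
CARDS: Dict[str, CardConfig] = {
    "wikisql": CardConfig(512, False),
    "spider": CardConfig(512, False),
    "allenai/commongen": CardConfig(256, False),
    "e2enlg": CardConfig(512, False),
    "UCL-DARK/openai-tldr-filtered": CardConfig(2048, True),
    "cnndailymail": CardConfig(4096, True),
    "samsum": CardConfig(2048, True),
    "piqa": CardConfig(256, False),
    "truthfulqa": CardConfig(256, False),
    "allenai/ai2arc": CardConfig(256, False),
}


def whitespace_token_count(text: str) -> int:
    """Stand-in token counter; a real run would use the model's tokenizer."""
    return len(text.split())


def apply_length_filter(
    card: str,
    texts: Iterable[str],
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> List[str]:
    """Drop over-length examples for cards marked "filter" (assumed meaning)."""
    cfg = CARDS[card]
    if not cfg.filter_long:
        return list(texts)
    return [t for t in texts if count_tokens(t) <= cfg.seq_len]
```

For example, `apply_length_filter("samsum", dialogues)` would keep only the dialogues that fit within 2048 tokens under the chosen counter, while cards without the `filter` comment pass through unchanged.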