A standard evaluation benchmark for natural language generation (NLG)
Table of Contents
1. Downstream Datasets
| Card Name | Selected SeqLen | Comments |
|---|---|---|
| wikisql | 512 | |
| spider | 512 | |
| allenai/common_gen | 256 | |
| e2e_nlg | 512 | |
| UCL-DARK/openai-tldr-filtered | 2048 | filter |
| cnn_dailymail | 4096 | filter |
| samsum | 2048 | filter |
| piqa | 256 | |
| truthful_qa | 256 | |
| allenai/ai2_arc | 256 | |
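
The table above can be encoded as a small lookup for an evaluation script. The following is a minimal sketch, not the bench's actual implementation. It assumes that the `filter` comment means "drop examples longer than the selected SeqLen", which the table does not state explicitly. The names `CardConfig`, `CARDS`, `select_examples`, and the whitespace token counter are hypothetical; a real run would presumably count tokens with the model's own tokenizer.

```python
from typing import Callable, Dict, List, NamedTuple


class CardConfig(NamedTuple):
    seq_len: int       # "Selected SeqLen" column
    filter_long: bool  # True where the Comments column says "filter"


# Values copied from the table; the structure itself is an illustration.
CARDS: Dict[str, CardConfig] = {
    "wikisql": CardConfig(512, False),
    "spider": CardConfig(512, False),
    "allenai/common_gen": CardConfig(256, False),
    "e2e_nlg": CardConfig(512, False),
    "UCL-DARK/openai-tldr-filtered": CardConfig(2048, True),
    "cnn_dailymail": CardConfig(4096, True),
    "samsum": CardConfig(2048, True),
    "piqa": CardConfig(256, False),
    "truthful_qa": CardConfig(256, False),
    "allenai/ai2_arc": CardConfig(256, False),
}


def whitespace_token_count(text: str) -> int:
    # Stand-in tokenizer (assumption): counts whitespace-separated words.
    return len(text.split())


def select_examples(
    card: str,
    texts: List[str],
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> List[str]:
    """Return the examples kept for `card`.

    Assumed semantics: cards marked "filter" drop examples whose token
    count exceeds the selected sequence length; other cards keep everything.
    """
    cfg = CARDS[card]
    if not cfg.filter_long:
        return list(texts)
    return [t for t in texts if count_tokens(t) <= cfg.seq_len]
```

For example, `select_examples("samsum", dialogues)` would keep only the dialogues of at most 2048 tokens under the stand-in counter, while `select_examples("piqa", questions)` would return every question unchanged.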