A standard benchmark for NLG evaluation
Table of Contents
1. Downstream Datasets
Card Name | Selected SeqLen | Comments |
---|---|---|
wikisql | 512 | |
spider | 512 | |
allenai/common_gen | 256 | |
e2e_nlg | 512 | |
UCL-DARK/openai-tldr-filtered | 2048 | filter |
cnn_dailymail | 4096 | filter |
samsum | 2048 | filter |
piqa | 256 | |
truthful_qa | 256 | |
allenai/ai2_arc | 256 | |
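
The table can be read as a per-card configuration. The sketch below is one possible way to encode it, not the repository's actual code. It rests on two assumptions: that the `filter` comment means examples longer than the selected sequence length are dropped, and that length is measured in tokens. The names `SEQ_LEN`, `FILTERED`, `whitespace_len`, and `prepare` are hypothetical, and the whitespace counter is only a stand-in for the model's tokenizer.

```python
from typing import Callable, Iterable, List

# Selected sequence lengths, copied from the table above.
SEQ_LEN = {
    "wikisql": 512,
    "spider": 512,
    "allenai/common_gen": 256,
    "e2e_nlg": 512,
    "UCL-DARK/openai-tldr-filtered": 2048,
    "cnn_dailymail": 4096,
    "samsum": 2048,
    "piqa": 256,
    "truthful_qa": 256,
    "allenai/ai2_arc": 256,
}

# Cards marked "filter" in the Comments column.
FILTERED = {"UCL-DARK/openai-tldr-filtered", "cnn_dailymail", "samsum"}


def whitespace_len(text: str) -> int:
    """Stand-in token counter (assumption); replace with the model tokenizer."""
    return len(text.split())


def prepare(
    card: str,
    texts: Iterable[str],
    count_tokens: Callable[[str], int] = whitespace_len,
) -> List[str]:
    """Return the examples to keep for a given card.

    Assumed meaning of "filter": drop examples whose length exceeds the
    selected sequence length. Other cards are returned unchanged.
    """
    max_len = SEQ_LEN[card]
    if card in FILTERED:
        return [t for t in texts if count_tokens(t) <= max_len]
    return list(texts)
```

For example, `prepare("samsum", dialogues)` would keep only the dialogues that fit within 2048 tokens under the chosen counter, while `prepare("piqa", questions)` returns every question as-is.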