Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing

Wiegreffe, Sarah; Marasovic, Ana

Part of Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks 2021) round1

Authors

Sarah Wiegreffe, Ana Marasovic

Abstract

Explainable Natural Language Processing (ExNLP) has increasingly focused on collecting human-annotated textual explanations. These explanations are used downstream in three ways: as data augmentation to improve performance on a predictive task, as supervision to train models to produce explanations for their predictions, and as ground truth for evaluating model-generated explanations. In this review, we identify 65 datasets with three predominant classes of textual explanations (highlights, free-text, and structured), organize the literature on annotating each type, identify strengths and shortcomings of existing collection methodologies, and give recommendations for collecting ExNLP datasets in the future.
