CTRLsum: TOWARDS GENERIC CONTROLLABLE TEXT SUMMARIZATION

- 8 mins

Authors : Junxian He (CMU), Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong (Salesforce Research)

arXiv 2020
Paper : https://arxiv.org/pdf/2012.04281.pdf
Code : https://github.com/salesforce/ctrl-sum


Summary

Personal Thoughts


Abstract

Current summarization systems yield generic summaries that are disconnected from users’ preferences and expectations. To address this limitation, we present CTRLsum, a novel framework for controllable summarization. Our approach enables users to control multiple aspects of generated summaries by interacting with the summarization system through textual input in the form of a set of keywords or descriptive prompts. Using a single unified model, CTRLsum is able to achieve a broad scope of summary manipulation at inference time without requiring additional human annotations or pre-defining a set of control aspects during training. We quantitatively demonstrate the effectiveness of our approach on three domains of summarization datasets and five control aspects: 1) entity-centric and 2) length-controllable summarization, 3) contribution summarization on scientific papers, 4) invention purpose summarization on patent filings, and 5) question-guided summarization on news articles in a reading comprehension setting. Moreover, when used in a standard, uncontrolled summarization setting, CTRLsum achieves state-of-the-art results on the CNN/DailyMail dataset.

1. Introduction

2. CTRLsum

2. 1. Overview

2. 2. Automatic Keyword Extraction

Training

Inference

2. 3. Summarization: Training Details

2. 4. Summarization: Inference with Keywords

A trained CTRLsum model can handle new use cases without any additional fine-tuning.

For example, even though training never placed special focus on entities or length, both aspects can still be controlled at inference time.
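The control interface at inference is just the textual input: keywords are prepended to the article with separator tokens before being fed to the (BART-based) model. A minimal sketch of how such an input might be assembled; the `" | "` keyword delimiter and `" => "` document separator are modeled on the public ctrl-sum implementation and should be treated as assumptions here:

```python
def build_ctrlsum_input(keywords, article, kw_sep=" | ", doc_sep=" => "):
    """Prepend control keywords to the source article.

    Separator tokens are assumptions modeled on the public ctrl-sum
    code, not verified constants.
    """
    return kw_sep.join(keywords) + doc_sep + article

# Entity control: request a summary centered on a specific entity.
entity_input = build_ctrlsum_input(
    ["Obama"],
    "Barack Obama met with European leaders on Tuesday ...",
)

# Length control: supplying more keywords nudges the model
# toward longer, more detailed summaries.
long_input = build_ctrlsum_input(
    ["Obama", "European leaders", "climate agreement"],
    "Barack Obama met with European leaders on Tuesday ...",
)
```

The resulting string is tokenized and passed to the summarizer as-is, which is why unseen control aspects work: any keyword set the user types becomes a valid conditioning signal.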

2. 5. Summarization: Inference with Keywords and Prompts

This is the first work to use and evaluate prompt-based control in a summarization system. The prompt used for each task:

- Contribution summarization: "the main contributions of this paper are:(1)"
- Invention purpose summarization: "the purpose of the present invention is"
- Question-guided summarization: the question itself, in the form "Q: ... A:"
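Per the paper, a prompt is used on both sides of the model: it joins the keyword sequence on the encoder input, and the decoder is forced to begin its output with the prompt tokens, so the generated summary is a continuation of the prompt. A hedged sketch of that bookkeeping (the function name and separators are illustrative, not from the official code):

```python
def build_prompted_inference(article, prompt, keywords=(),
                             kw_sep=" | ", doc_sep=" => "):
    """Return (encoder_input, decoder_prefix) for prompt-guided summarization.

    The prompt acts as extra keywords on the encoder side and as the
    forced prefix that decoding must start from on the decoder side.
    Separators are assumptions modeled on the public ctrl-sum code.
    """
    control = list(keywords) + [prompt]
    encoder_input = kw_sep.join(control) + doc_sep + article
    decoder_prefix = prompt  # generation continues from this prefix
    return encoder_input, decoder_prefix

# Contribution summarization on a scientific paper:
enc, prefix = build_prompted_inference(
    "We propose CTRLsum, a framework for controllable summarization ...",
    "the main contributions of this paper are:(1)",
)
# The summary is then generated as a continuation of `prefix`.
```

Forcing the decoder prefix is what makes this work without task-specific training: the model simply completes a sentence it was steered into starting.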

3. Related Work

Previous work on controllable summarization often collects control codes such as entity or length as supervision to train the model conditioned on both the code and article together (Fan et al., 2018; Liu et al., 2018). These methods do not generalize for controlling aspects of the summarization that were not seen during training. Recently Saito et al. (2020a) use the number of word prototypes to control summary length in a similar way to how we use keywords. Interactive summarization provides a way for users to continuously control the information that is included in the summary (Bornstein et al., 1999; Leuski et al., 2003). More broadly, controllable text generation has been studied for styles (Hu et al., 2017; Fu et al., 2018; He et al., 2020b), topics (Tang et al., 2019; Huang et al., 2019), and templates (Guu et al., 2018; Wiseman et al., 2018; He et al., 2020a).

Keyword-guided text generation has been applied in other contexts with different motivations. Gehrmann et al. (2018) utilize copying words at test time to mask copying operations in a summarization task. Li et al. (2018) and Saito et al. (2020b) use keywords as extra input to improve the uncontrolled summarization performance. Wang et al. (2016), Mou et al. (2016), and Yao et al. (2019) use textual input to plan poetry, dialogue, and stories respectively. Lexically-constrained decoding specifies certain lexicons as hard constraints in the target text (Hokamp & Liu, 2017; Post & Vilar, 2018). Prefix-constrained decoding was used in machine translation (Knowles & Koehn, 2016; Wuebker et al., 2016) and also to demonstrate the multi-task ability present in large pretrained models (McCann et al., 2018; Radford et al., 2019; Keskar et al., 2019; Brown et al., 2020).

4. Experiments

4. 1. Experimental Details

4. 2. Entity Control

4. 3. Length Control

4. 4. Contribution and Purpose Summarization

4. 5. Question-Guided Summarization

4. 6. Automatic Summarization

4. 7. Human Evaluation

Controlled Summarization

Rated on a 1 to 5 scale

Uncontrolled Summarization

Rated on a 1 to 5 scale

See the Appendix for experimental settings, ablation studies, and various additional examples.

5. Conclusion

In this paper we propose a generic framework to perform multi-aspect controllable summarization. The model is conditioned on keywords to predict summaries during training. At inference time the control tokens, in the form of keywords or prompts, enable users to interact with models in a very flexible way. Experiments on five different control aspects demonstrate the efficacy of our method.

Dongju Park

Research Scientist / Engineer @ NAVER CLOVA
