CoCon: A Self-Supervised Approach For Controlled Text Generation


Authors : Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu / Nanyang Tech, Amazon AI, Mila, Polytechnique Montreal

ICLR 2021 Poster
Paper : https://openreview.net/pdf?id=VD_ozqvBy4W
Code : -


Summary

Personal Thoughts


Abstract

Pretrained Transformer-based language models (LMs) display remarkable natural language generation capabilities. With their immense potential, controlling text generation of such LMs is getting attention. While there are studies that seek to control high-level attributes (such as sentiment and topic) of generated text, there is still a lack of more precise control over its content at the word- and phrase-level. Here, we propose Content-Conditioner (CoCon) to control an LM’s output text with a content input, at a fine-grained level. In our self-supervised approach, the CoCon block learns to help the LM complete a partially-observed text sequence by conditioning with content inputs that are withheld from the LM. Through experiments, we show that CoCon can naturally incorporate target content into generated texts and control high-level text attributes in a zero-shot manner.

1. Introduction

3. Content Conditioner (CoCon)

3. 1. Self-Supervised Learning

CoCon's self-supervised learning approach is inspired by the natural diversity of content in human-written text.

Given a text sequence \(\text{x}=\{x_1,...,x_{t-1},x_t,...,x_l\}\), the sequence is split into two parts:

\(\text{x}^a = \{x_1,...,x_{t-1}\}\), \(\text{x}^b = \{x_t,...,x_{l}\}\) where \(\text{x} = [\text{x}^a;\text{x}^b]\)
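A minimal sketch of this split is shown below; the breakpoint sampling range (`min_prefix`) is an illustrative choice, not a detail from the paper.

```python
import random

def split_sequence(x, min_prefix=8):
    """Split a token sequence x = (x_1, ..., x_l) into a prefix x^a and a
    continuation x^b at a breakpoint t, i.e. x = [x^a; x^b].
    `min_prefix` is an illustrative choice, not taken from the paper."""
    t = random.randint(min_prefix, len(x) - 1)  # breakpoint index t
    return x[:t], x[t:]

# Toy example with token ids standing in for a tokenized sentence.
tokens = list(range(32))
x_a, x_b = split_sequence(tokens)
print(len(x_a), len(x_b))
```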

In the real world, many different continuations \(\text{x}^b\) can plausibly follow a given \(\text{x}^a\).

Coupled with the sampling process, this means the LM alone can hardly reconstruct the original \(\text{x}\) without information about \(\text{x}^b\). This motivates the self-supervised objective: during training, \(\text{x}^b\) itself is given to the CoCon block as the content input while being withheld from the LM, so CoCon learns to help the LM complete \(\text{x}^a\) into the original \(\text{x}\).
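The sketch below illustrates this training step, assuming a placeholder split of the pretrained LM into `lm_lower` / `lm_upper` around a toy single-layer `CoConBlock`. The cross-attention layer, residual connection, layer sizes, and the `lm_lower` / `lm_upper` stand-ins are assumptions for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoConBlock(nn.Module):
    """Toy stand-in for the CoCon block: a single cross-attention layer that lets
    the sequence's intermediate states attend to the content input's states.
    Layer choices here are illustrative, not the paper's exact architecture."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, h_seq, h_content):
        out, _ = self.attn(h_seq, h_content, h_content)  # query: sequence, key/value: content
        return h_seq + out  # residual connection (assumed)

def self_supervised_step(lm_lower, lm_upper, cocon, x_a, x_b):
    """One hedged training step with teacher forcing: the LM sees [x^a; x^b],
    the CoCon block conditions the intermediate states on the withheld content
    c = x^b, and the loss is next-token prediction over the x^b positions.
    `lm_lower` / `lm_upper` stand for the pretrained LM split around the CoCon
    block; they are placeholders, not the authors' code."""
    x = torch.cat([x_a, x_b], dim=1)   # (B, l) token ids, l = l_a + l_b
    h = lm_lower(x)                    # (B, l, d) intermediate states
    h_content = lm_lower(x_b)          # (B, l_b, d) content representation of x^b
    h_cond = cocon(h, h_content)       # inject content information
    logits = lm_upper(h_cond)          # (B, l, V) next-token logits
    t = x_a.size(1)                    # breakpoint index
    pred = logits[:, t - 1:-1, :]      # positions that predict x_t ... x_l
    target = x[:, t:]
    return F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))

# Minimal smoke test with toy stand-ins for the two halves of the LM.
vocab, d = 100, 64
lm_lower = nn.Embedding(vocab, d)      # "lower" layers: token ids -> hidden states
lm_upper = nn.Linear(d, vocab)         # "upper" layers: hidden states -> logits
cocon = CoConBlock(d_model=d, n_heads=4)
x_a = torch.randint(0, vocab, (2, 8))
x_b = torch.randint(0, vocab, (2, 12))
print(self_supervised_step(lm_lower, lm_upper, cocon, x_a, x_b).item())
```

At inference time, the same conditioning path lets an arbitrary content input (a target word, phrase, or attribute-bearing text) steer the LM's continuation, which is what enables the zero-shot control reported in the paper.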

4. Experiments

4. 1. Content Similarity

4. 2. Topic Relevance

4. 3. Sentiment Control

4. 4. Versatility of CoCon

See the paper for generated examples.

5. Conclusion

We proposed Content-Conditioner (CoCon) as an approach for more fine-grained control over neural text generation. CoCon can be trained effectively in a self-supervised manner and is compatible with pretrained language models (LM) that already produce high-quality texts. Through our experiments, CoCon was shown to smoothly incorporate content inputs into generated texts and control high-level text attributes. This new dimension of control over powerful LMs opens them up for even wider range of applications.


Dongju Park

Research Scientist / Engineer @ NAVER CLOVA