Institute of Global Studies
Tokyo University of Foreign Studies
Tokyo, Japan

Analyzing Usefulness of Dialogues from Closed Caption TV Corpus as an Example of Can-do Statements for Language Learning

This paper describes a clustering method that combines Doc2vec, singular value decomposition (SVD), and k-means to classify discourse segments extracted from a closed caption TV corpus using formulaic sequences related to can-do statements for language learning. We report the characteristics of the discourse segments in each resulting group and analyze whether the acquired segments can be used as sample dialogues for can-do statements.
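As a rough illustration of the final clustering step only, the following minimal sketch runs plain k-means on a handful of two-dimensional vectors. The vectors are hypothetical stand-ins for Doc2vec segment embeddings after SVD reduction; the embedding and reduction steps are not shown, and the data, the function name kmeans, the choice of k, and the first-k initialization are all assumptions made for this example rather than settings taken from the paper.

```python
import math


def kmeans(points, k, max_iterations=20):
    """Plain Lloyd's k-means. Returns (labels, centroids)."""
    # Assumption: deterministic initialization from the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _step in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical reduced vectors, one per discourse segment (toy data).
segment_vectors = [
    (0.9, 0.1), (0.1, 0.9),
    (1.0, 0.2), (0.0, 1.0),
    (0.8, 0.0), (0.2, 0.8),
]

labels, centroids = kmeans(segment_vectors, k=2)
for vector, label in zip(segment_vectors, labels):
    print(label, vector)
```

In practice the input would be one vector per discourse segment, and the segments sharing a label would then be inspected together to judge whether they suit a given can-do statement.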

