Testing the usefulness of RST and more general representations for discourse analysis across domains and applications

Ferracane, Elisa

Discourse analysis is a task with enormous potential, but it is often met with lukewarm results. This report explores how well Rhetorical Structure Theory (RST) and more general representations of discourse generalize across domains and tasks, and examines the validity of their underlying assumptions. Our first study attempts to uncover issues in RST discourse parsing by starting at the first step, discourse segmentation, and evaluating it in the medical domain. Errors on our novel, small-scale medical corpus reveal differences at lower linguistic levels that affect the discourse segmenter, and point to problem areas in the way RST was operationalized. Our second study focuses on more general representations of discourse that are learned by the model and are subject only to the simple constraint of forming a dependency tree. We find that these latent trees do not in fact represent discourse and instead focus on lexical cues. We propose a variant of this model that is able to learn deeper structures, but conclude that a different task, one that makes more use of discourse, may be needed to produce more discourse-like structures.