BIB-VERSION:: CS-TR-v2.0
ID:: UCB//S2K-93-24
ENTRY:: February 25, 1994
TITLE:: TextTiling: A Quantitative Approach to Discourse Segmentation
DATE::
AUTHOR:: Hearst, Marti A.
PAGES:: 10
ABSTRACT:: This paper presents TextTiling, a method for partitioning full-length text documents into coherent multiparagraph units. The layout of text tiles is meant to reflect the pattern of subtopics contained in an expository text. The approach uses lexical analyses based on tfidf, an information retrieval measurement, to determine the extent of the tiles, incorporating thesaural information via a statistical disambiguation algorithm. The tiles have been found to correspond well to human judgements of the major subtopic boundaries of science magazine articles.
RETRIEVAL:: postscript (in all.ps)
END:: UCB//S2K-93-24