

Efficient data mining from large text databases
pp. 123-139
in: Setsuo Arikawa, Ayumi Shinohara (eds), Progress in discovery science, Berlin, Springer, 2002
Abstract
In this paper, we consider the problem of discovering a simple class of combinatorial patterns from a large collection of unstructured text data. As our data mining framework, we adopt optimized pattern discovery, in which a mining algorithm finds the patterns that optimize a given statistical measure within a class of hypothesis patterns on a given data set. We present efficient algorithms for classes of proximity word association patterns and report experiments on keyword discovery from Web data.
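
To make the optimized pattern discovery framework concrete, the following is a minimal brute-force sketch, not the efficient algorithms of the paper. It assumes a simplified pattern class (ordered two-word patterns whose second word occurs within k tokens after the first) and uses empirical classification error as the statistical measure. The function names `matches` and `best_pattern` and the sample documents are hypothetical, chosen only for illustration.

```python
from itertools import product


def matches(tokens, pattern, k):
    """Return True if pattern = (w1, w2) occurs in order in tokens,
    with w2 appearing at most k positions after w1 (assumed semantics)."""
    w1, w2 = pattern
    for i, tok in enumerate(tokens):
        if tok == w1 and w2 in tokens[i + 1 : i + 1 + k]:
            return True
    return False


def best_pattern(positive, negative, k):
    """Exhaustively search all two-word patterns over the vocabulary and
    return the one with the fewest misclassified documents."""
    docs = [(d.split(), True) for d in positive]
    docs += [(d.split(), False) for d in negative]
    vocab = sorted({w for tokens, _ in docs for w in tokens})

    best, best_err = None, len(docs) + 1
    for pattern in product(vocab, repeat=2):
        # A document is misclassified when "pattern matches" disagrees with its label.
        err = sum(matches(tokens, pattern, k) != label for tokens, label in docs)
        if err < best_err:
            best, best_err = pattern, err
    return best, best_err


# Hypothetical toy data: the positive set stands in for the documents of interest.
positive = ["data mining from text", "efficient data stream mining"]
negative = ["mining coal data", "text data only"]
print(best_pattern(positive, negative, k=2))  # (('data', 'mining'), 0)
```

The exhaustive search examines every pair of vocabulary words, so its cost grows quadratically with vocabulary size; this is the kind of cost that motivates more efficient algorithms for large text collections.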