

Extended association algorithm based on ROC analysis for visual information navigator
pp. 640-649
in: Setsuo Arikawa, Ayumi Shinohara (eds), Progress in discovery science, Berlin, Springer, 2002
Abstract
It is very important to derive association rules at high speed from huge databases. However, typical fast mining algorithms for text databases tend to derive meaningless rules, such as rules involving stopwords, and many researchers try to remove these noisy rules with various filters. In our research, we improve the association algorithm and develop information navigation systems for text data with a visual interface, and we also apply a dictionary to remove noisy keywords from the derived association rules. To remove noisy keywords automatically, we propose an algorithm based on the true positive rate and the false positive rate in ROC analysis. Moreover, to remove stopwords automatically from raw association rules, we introduce several threshold values from the ROC analysis into our proposed mining algorithm. We evaluate the performance of our proposed mining algorithms on a bibliographic database.
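The abstract does not say how the true positive rate and false positive rate are computed for a keyword, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes that the documents containing a query keyword count as positives, that a candidate keyword's TPR is the fraction of positives containing it, and that its FPR is the fraction of the remaining documents containing it. The function names, the thresholds `min_tpr` and `max_fpr`, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def roc_rates(keyword: str, docs: List[Set[str]], relevant: Set[int]) -> Tuple[float, float]:
    """Return (TPR, FPR) of `keyword` as a predictor of membership in `relevant`.

    Assumption: `relevant` holds the indices of the positive documents.
    """
    tp = sum(1 for i, d in enumerate(docs) if keyword in d and i in relevant)
    fp = sum(1 for i, d in enumerate(docs) if keyword in d and i not in relevant)
    n_pos = len(relevant)
    n_neg = len(docs) - n_pos
    tpr = tp / n_pos if n_pos else 0.0
    fpr = fp / n_neg if n_neg else 0.0
    return tpr, fpr


def filter_keywords(
    query: str,
    docs: List[Set[str]],
    min_tpr: float = 0.5,  # hypothetical threshold
    max_fpr: float = 0.5,  # hypothetical threshold
) -> Dict[str, Tuple[float, float]]:
    """Keep keywords that co-occur with `query` often and appear elsewhere rarely."""
    relevant = {i for i, d in enumerate(docs) if query in d}
    vocab = set().union(*docs) - {query}
    kept = {}
    for kw in sorted(vocab):
        tpr, fpr = roc_rates(kw, docs, relevant)
        # A stopword appears almost everywhere, so its FPR is high and it is dropped.
        if tpr >= min_tpr and fpr <= max_fpr:
            kept[kw] = (tpr, fpr)
    return kept


# Hypothetical toy corpus: each document is a set of keywords.
DOCS = [
    {"the", "mining", "association"},
    {"the", "mining", "rules"},
    {"the", "graph", "theory"},
    {"the", "graph", "coloring"},
]

if __name__ == "__main__":
    print(filter_keywords("mining", DOCS))
```

In this toy corpus the word "the" occurs in every document, so its TPR and FPR are both 1.0 and the `max_fpr` threshold rejects it, while "association" and "rules" occur only alongside "mining" and are kept.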