karl bühler digital

Publication details

Publisher: Springer

Place: Berlin

Year: 2002

Pages: 586-599

ISBN (Hardback): 9783540433385

Full reference:

Hiroshi Sakamoto, Hiroki Arimura, Setsuo Arikawa, "Knowledge discovery from semistructured texts", in: Progress in discovery science, Berlin, Springer, 2002

Abstract

This paper surveys our recent results on knowledge discovery from semistructured texts, which contain heterogeneous structures represented by labeled trees. The aim of our study is to extract useful information from documents on the Web. First, we present theoretical results on learning rewriting rules between labeled trees. Second, we apply our method to learning HTML trees in the framework of wrapper induction. We also evaluate our algorithms on real-world HTML documents and present the results.
