Digital library of construction informatics
and information technology in civil engineering and construction


Paper w78-1999-2507:
Use of automatic keyphrase generation for creation of a construction thesaurus

Facilitated by the SciX project

Kosovac B, Vanier D J

Abstract: The paper describes the development of a thesaurus for the roofing domain. This work is part of a larger effort to investigate the potential of thesauri as an aid in product modeling. Extractor, a software module that extracts keyphrases from documents, was used to collect candidate thesaurus terms from Internet sources. The Internet has two principal advantages as a source of candidate terms: it reflects colloquial language, that is, the language actually used by building practitioners, and it covers the widest range of different 'user views' of the domain. The advantage of using Extractor or similar software is that it allows the huge text corpora available on the Internet to be processed and it eliminates irrelevant terms. The methodology was found to be highly useful, although it was not sufficient by itself for constructing a construction thesaurus, as considerable human intervention was required. Although limited time did not allow full exploitation of Extractor's capabilities, some possibilities for customizing the software and for partially automating the thesaurus construction process are suggested.
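As a rough illustration of the candidate-term collection step described in the abstract, the sketch below pools keyphrases that have already been extracted from several documents and keeps those that recur across documents. It is not Extractor's algorithm or interface: the function name collect_candidate_terms, the min_docs threshold, and the sample keyphrase lists are all hypothetical, and a document-frequency filter is only one plausible way to drop incidental terms.

```python
from collections import Counter


def collect_candidate_terms(keyphrases_per_doc, min_docs=2):
    """Return (term, document_count) pairs for keyphrases found in at least min_docs documents.

    keyphrases_per_doc: one list of keyphrase strings per document, as a
    keyphrase extraction tool might produce (hypothetical input format).
    """
    doc_freq = Counter()
    for phrases in keyphrases_per_doc:
        # Normalize case and whitespace; a set ensures each document counts once per term.
        normalized = {" ".join(p.lower().split()) for p in phrases}
        doc_freq.update(normalized)
    # Rank by descending document frequency, then alphabetically for stable output.
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    return [(term, count) for term, count in ranked if count >= min_docs]


# Illustrative keyphrase lists only; these are not data from the paper.
docs = [
    ["Roofing membrane", "flashing", "vapour barrier"],
    ["roofing  membrane", "Flashing", "shingles"],
    ["flashing", "insulation"],
]
candidates = collect_candidate_terms(docs)
print(candidates)
```

A list produced this way would only be a starting point: as the abstract notes, considerable human intervention was still required to turn candidate terms into a thesaurus.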



Full text: content.pdf (68,731 bytes) (available to registered users only)

Series: w78:1999
Class: class.retrieve (0.057636) class.collaboration (0.015453) class.man-man (0.010828)

Permission to reproduce these papers has been graciously provided by the Research Press of the National Research Council of Canada. The support of the editors, particularly Dr. Dana Vanier, is gratefully acknowledged.


hosted by University of Ljubljana



© itc.scix.net 2003
Powered by SciX Open Publishing Services 1.002, February 16, 2003