Digital library of construction informatics
and information technology in civil engineering and construction


Paper itcon-2000-2:
Use of Keyphrase Extraction Software for Creation of an AEC/FM Thesaurus

Facilitated by the SciX project

Kosovac B, Vanier D J, Froese T M

Abstract: The paper describes a method used to collect the terms needed to develop a thesaurus in the roofing domain. This work is part of a larger effort to investigate the potential of thesauri as an aid in product modeling and as a tool for information management in model-based systems. Extractor, a software module that extracts keyphrases from documents, was used to collect candidate thesaurus terms from Internet sources. The principal advantage of the Internet as a source of candidate terms is that it reflects the language actually used in communications concerning buildings and covers the widest range of views on the domain. The advantage of using Extractor or similar software is that it makes it possible to process the huge text corpora available on the Internet while eliminating irrelevant terms. The methodology was found to be highly useful, although it was not sufficient by itself for constructing a thesaurus for the architecture, engineering, construction and facilities management (AEC/FM) industries, because considerable human intervention was required. Some possibilities for customizing the software and for partially automating the thesaurus construction process are suggested.

Keywords: thesauri, Internet, automatic indexing software, thesaurus construction


Full text: http://www.itcon.org/2000/2 (available to registered users only)

Series: itcon:2000


hosted by University of Ljubljana



© itc.scix.net 2003
Powered by SciX Open Publishing Services 1.002, February 16, 2003