Paper w78-2000-302:
Extracting representation from structured text: initial steps

Drogemuller R, Woodbury R, Crawford J

Abstract: A great deal of work has been done in the past on natural language recognition within the field of artificial intelligence. The aim of this work was to allow natural language text to be read in by a computer and structured in a format that would allow automatic interpretation of the text. This was intended to reduce the "knowledge engineering bottleneck" that has been a significant constraint on the use of artificial intelligence techniques within many fields. Some similar work has also been done within the AEC industry, concentrating mainly on building codes. The research project described in this paper aims to simplify the analysis of structured text and its conversion into computer-interpretable forms by providing software support. The work is built around two documents: a glossary of building terms used in Australia and the Building Code of Australia. The paper presents the issues raised by "noise" in the source data, the structure and content of the documents to be analysed, and the desired computer-interpretable result. This work is motivated by: the need to maintain BCAider, a knowledge-based system that assists in checking building designs for compliance with the Building Code of Australia; continuing work on encoding regulations in computer-interpretable form; and the need for international glossaries to support information harmonisation efforts such as the IAI and STEP. The software suite under development uses a visual user interface to assist people with some understanding of language structure and knowledge engineering in converting structured text into computer-interpretable form. The current state of the design and development of this software suite is described, and the results of its use are presented.


