Identification and categorization of defects in construction specifications utilizing natural language processing
Madenli, O., Atasoy, G. and Dikmen, I.
It is advisable to refer to the publisher's version if you intend to cite from this work.

Abstract/Summary

Defective specification statements cause not only a faulty outcome but also disputes among project stakeholders, claims for project budget and time, project disruptions, and even litigation. Identifying defects in technical sections of construction specifications is challenging. This research aims to develop a structured defect framework and implement supervised natural language processing methods for identifying and categorizing defects in specifications. The dataset includes 175 specifications related to 21 different architectural works collected from 16 construction projects. Eight Machine Learning (ML) models, ranging from shallow to transformer-based, were trained and tested with combinations of different text representation techniques. Subsequently, a study with a GenAI tool, ChatGPT-4o, was conducted. The pre-trained RoBERTa model outperformed the other models in recognizing defects in construction specifications, with a macro F1 score of 91.2% and an accuracy of 98%. This research offers a data-driven methodology with practical tools to enhance the quality of specifications and decrease disputes by reducing defective specification statements during design, bidding, and pre-construction.
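The abstract reports both a macro F1 score and an accuracy figure, and the two can diverge when some classes are rare. The following is a minimal sketch of how the two metrics are computed, not the authors' evaluation code. The labels `ok` and `defect` and the toy predictions are hypothetical and do not come from the study's dataset.

```python
# Minimal sketch: accuracy vs. macro F1 on a hypothetical, imbalanced toy set.
# Labels and predictions are invented for illustration only.

def per_class_f1(y_true, y_pred, label):
    """F1 score for one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_f1(y_true, y_pred, label) for label in labels) / len(labels)

def accuracy(y_true, y_pred):
    """Fraction of statements whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# 18 non-defective statements and 2 defective ones; one defect is missed.
y_true = ["ok"] * 18 + ["defect"] * 2
y_pred = ["ok"] * 18 + ["ok", "defect"]

acc = accuracy(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={macro:.3f}")
```

In this toy case a single missed defect leaves accuracy at 0.95 while macro F1 drops to roughly 0.82, which illustrates why a macro-averaged score is informative alongside accuracy when defect categories are unevenly represented.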