Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/111785
Title: Automatic bridge inspection database construction through hybrid information extraction and large language models
Authors: Zhang, C
Lei, X 
Xia, Y
Sun, L
Issue Date: Dec-2024
Source: Developments in the built environment, Dec. 2024, v. 20, 100549
Abstract: Regular bridge inspections generate extensive reports that, while critical for maintenance, often remain underutilized due to their unstructured format. Traditional information extraction methods depend on intricate labeling systems that commonly require time-consuming and labor-intensive labeling. This paper presents a novel bridge inspection database construction method leveraging LLM-assisted information extraction. First, we introduce the pseudo-labelling method using a closed-source LLM to generate high-quality data. Then we propose the hybrid extraction pipeline to extract relevant information segments and process them by a generation-based IE model, fine-tuned on pseudo-labeled data. Finally, the extracted data is used to construct the bridge inspection database. The proposed method, validated with real-world data, not only demonstrates higher extraction precision than the closed-source LLM used for pseudo-labeling but also outperforms traditional methods in both data preparation time and extraction accuracy. This approach provides a scalable solution for more proactive and data-driven bridge maintenance strategies.
Keywords: Bridge inspection data
Information extraction
Large language model
Natural language processing
Pseudo label
Publisher: Elsevier Ltd
Journal: Developments in the built environment 
EISSN: 2666-1659
DOI: 10.1016/j.dibe.2024.100549
Rights: © 2024 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
The following publication Zhang, C., Lei, X., Xia, Y., & Sun, L. (2024). Automatic bridge inspection database construction through hybrid information extraction and large language models. Developments in the Built Environment, 20, 100549 is available at https://doi.org/10.1016/j.dibe.2024.100549.
Appears in Collections: Journal/Magazine Article

Files in This Item:
File: 1-s2.0-S2666165924002308-main.pdf
Size: 13.83 MB
Format: Adobe PDF
Open Access Status: open access
File Version: Version of Record

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.