Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/97760
DC Field | Value | Language
dc.contributor | Department of Building and Real Estate | en_US
dc.creator | Xiao, B | en_US
dc.creator | Wang, Y | en_US
dc.creator | Kang, SC | en_US
dc.date.accessioned | 2023-03-13T02:23:53Z | -
dc.date.available | 2023-03-13T02:23:53Z | -
dc.identifier.issn | 0733-9364 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/97760 | -
dc.language.iso | en | en_US
dc.publisher | American Society of Civil Engineers | en_US
dc.rights | © 2022 American Society of Civil Engineers. | en_US
dc.rights | This material may be downloaded for personal use only. Any other use requires prior permission of the American Society of Civil Engineers. This material may be found at https://dx.doi.org/10.1061/(ASCE)CO.1943-7862.0002297. | en_US
dc.subject | Deep learning | en_US
dc.subject | Image captioning | en_US
dc.subject | Construction machines | en_US
dc.subject | Feasibility study | en_US
dc.subject | Vision-based monitoring | en_US
dc.title | Deep learning image captioning in construction management : a feasibility study | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.volume | 148 | en_US
dc.identifier.issue | 7 | en_US
dc.identifier.doi | 10.1061/(ASCE)CO.1943-7862.0002297 | en_US
dcterms.abstract | Deep learning image captioning methods can generate one or several natural sentences to describe the contents of construction images. By deconstructing these sentences, the construction object and activity information can be retrieved integrally for automated scene analysis. However, the feasibility of deep learning image captioning in construction remains unclear. To fill this gap, this research investigates the feasibility of deep learning image captioning methods in construction management. First, a linguistic schema for annotating construction machine images was established, and a captioning data set was developed. Then, six deep learning image captioning methods from the computer vision community were selected and tested on the construction captioning data set. In the sentence-level evaluation, the transformer-self-critical sequence training (Tsfm-SCST) method obtained the best performance among the six methods, with a bilingual evaluation understudy (BLEU)-1 score of 0.606, BLEU-2 of 0.506, BLEU-3 of 0.427, BLEU-4 of 0.349, metric for evaluation of translation with explicit ordering (METEOR) of 0.287, recall-oriented understudy for gisting evaluation (ROUGE) of 0.585, consensus-based image description evaluation (CIDEr) of 1.715, and semantic propositional image caption evaluation (SPICE) score of 0.422. In the element-level evaluation, the Tsfm-SCST method achieved an average precision of 91.1%, recall of 83.3%, and an F1 score of 86.6% for recognition of construction machine objects by deconstructing the generated sentences. This research indicates that deep learning image captioning is feasible as a method of generating accurate and precise text descriptions from construction images, with potential applications in construction scene analysis and image documentation. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Journal of construction engineering and management, July 2022, v. 148, no. 7, 04022049 | en_US
dcterms.isPartOf | Journal of construction engineering and management | en_US
dcterms.issued | 2022-07 | -
dc.identifier.eissn | 1943-7862 | en_US
dc.identifier.artn | 04022049 | en_US
dc.description.validate | 202303 bcch | en_US
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | a1177-n01 | -
dc.identifier.SubFormID | 44077 | -
dc.description.fundingSource | Self-funded | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | Green (AAM) | en_US
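Illustrative note on the element-level evaluation mentioned in the abstract: the record does not describe how the generated sentences were deconstructed, so the following is only a minimal sketch under stated assumptions. The keyword lexicon `MACHINE_TERMS`, the substring-matching rule in `extract_machines`, and the sample captions are all hypothetical and are not taken from the paper or its data set. The sketch shows how per-class precision, recall, and F1 could be computed by comparing the machine terms found in generated captions with those found in reference captions.

```python
from collections import Counter

# Hypothetical lexicon of machine terms (an assumption, not the paper's schema).
MACHINE_TERMS = ("excavator", "dump truck", "bulldozer")


def extract_machines(caption):
    """Return the set of lexicon terms that occur in a caption (substring match)."""
    text = caption.lower()
    return {term for term in MACHINE_TERMS if term in text}


def element_level_scores(generated, reference):
    """Compute per-term precision, recall, and F1 over paired captions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gen, ref in zip(generated, reference):
        found, expected = extract_machines(gen), extract_machines(ref)
        for term in found & expected:
            tp[term] += 1  # term in both generated and reference caption
        for term in found - expected:
            fp[term] += 1  # term only in the generated caption
        for term in expected - found:
            fn[term] += 1  # term only in the reference caption

    scores = {}
    for term in MACHINE_TERMS:
        predicted = tp[term] + fp[term]
        actual = tp[term] + fn[term]
        precision = tp[term] / predicted if predicted else 0.0
        recall = tp[term] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[term] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Made-up captions for illustration only.
generated = [
    "an excavator is loading a dump truck",
    "a bulldozer is moving soil",
    "an excavator is digging",
]
reference = [
    "an excavator loads soil into a dump truck",
    "an excavator is moving soil",
    "an excavator is digging a trench",
]

scores = element_level_scores(generated, reference)
for term, s in scores.items():
    print(term, s)
```

Averaging the per-term values would give class-averaged figures of the kind quoted in the abstract, although whether the paper averages in this way is not stated in this record.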
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Xiao_Image_Captioning_Construction.pdf | Pre-Published version | 5.98 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Page views: 110 (as of Apr 14, 2025)
Downloads: 238 (as of Apr 14, 2025)
Scopus citations: 11 (as of Jun 21, 2024)
Web of Science citations: 10 (as of Oct 10, 2024)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.