Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/109615
DC Field | Value | Language
dc.contributor | Faculty of Science | -
dc.creator | Im, SK | -
dc.creator | Chan, KH | -
dc.date.accessioned | 2024-11-08T06:10:28Z | -
dc.date.available | 2024-11-08T06:10:28Z | -
dc.identifier.uri | http://hdl.handle.net/10397/109615 | -
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers | en_US
dc.rights | This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ | en_US
dc.rights | The following publication S. -K. Im and K. -H. Chan, "Context-Adaptive-Based Image Captioning by Bi-CARU," in IEEE Access, vol. 11, pp. 84934-84943, 2023 is available at https://doi.org/10.1109/ACCESS.2023.3302512. | en_US
dc.subject | Attention mechanism | en_US
dc.subject | Bi-CARU | en_US
dc.subject | CNN | en_US
dc.subject | Context-adaptive | en_US
dc.subject | Image captioning | en_US
dc.subject | NLP | en_US
dc.subject | RNN | en_US
dc.title | Context-adaptive-based image captioning by Bi-CARU | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 84934 | -
dc.identifier.epage | 84943 | -
dc.identifier.volume | 11 | -
dc.identifier.doi | 10.1109/ACCESS.2023.3302512 | -
dcterms.abstract | Image captions are abstract expressions of content representations using text sentences, helping readers better understand and analyse information across different media. With the advantage of encoder-decoder neural networks, captions can provide a rational structure for tasks such as image coding and caption prediction. This work introduces a Convolutional Neural Network (CNN) to Bidirectional Content-Adaptive Recurrent Unit (Bi-CARU) (CNN-to-Bi-CARU) model that employs a bidirectional structure to consider contextual features and capture the major features of the image. The features encoded from the image are passed into the forward and backward layers of CARU, respectively, to refine the word prediction, providing contextual text output for captioning. An attention layer is also introduced to collect the features produced by the context-adaptive gate in CARU, aiming to compute the weighting information for relationship extraction and determination. In experiments, the proposed CNN-to-Bi-CARU model outperforms other advanced models in the field, achieving better extraction of contextual information and more detailed representation of image captions. The model obtains a score of 41.28 on BLEU@4, 31.23 on METEOR, 61.07 on ROUGE-L, and 133.20 on CIDEr-D, making it competitive in image captioning on the MSCOCO dataset. | -
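The pipeline described in the abstract (bidirectional recurrent passes over encoded image features, followed by an attention layer that weights the per-step outputs) can be sketched in miniature. This is a toy illustration only: the plain tanh cell below is a hypothetical stand-in for CARU's content-adaptive gate, which is not reproduced here, and all function names and weights are illustrative rather than taken from the paper.

```python
import math

def step(h, x, w_h=0.5, w_x=1.0):
    # One recurrent step: a plain tanh cell standing in for CARU
    # (CARU's content-adaptive gating is omitted in this sketch).
    return math.tanh(w_h * h + w_x * x)

def bi_encode(xs):
    """Run forward and backward recurrent passes over the feature
    sequence and pair the hidden states, as in a Bi-CARU layer."""
    fwd, h = [], 0.0
    for x in xs:                 # forward pass
        h = step(h, x)
        fwd.append(h)
    bwd, h = [], 0.0
    for x in reversed(xs):       # backward pass
        h = step(h, x)
        bwd.append(h)
    bwd.reverse()                # realign with the forward direction
    return list(zip(fwd, bwd))

def attend(states):
    """Softmax attention over the bidirectional states: produce one
    weight per time step and an attention-weighted context vector."""
    scores = [f + b for f, b in states]          # toy scoring rule
    m = max(scores)                              # stabilised softmax
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    ctx_f = sum(w * f for w, (f, _) in zip(weights, states))
    ctx_b = sum(w * b for w, (_, b) in zip(weights, states))
    return weights, (ctx_f, ctx_b)
```

In the full model, `bi_encode` would operate on CNN-encoded image feature vectors and the attention weights would feed the caption decoder; here everything is scalar so the bidirectional-plus-attention control flow is visible at a glance.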
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | IEEE access, 2023, v. 11, p. 84934-84943 | -
dcterms.isPartOf | IEEE access | -
dcterms.issued | 2023 | -
dc.identifier.scopus | 2-s2.0-85167787587 | -
dc.identifier.eissn | 2169-3536 | -
dc.description.validate | 202411 bcch | -
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | OA_Scopus/WOS | en_US
dc.description.fundingSource | Others | en_US
dc.description.fundingText | Macao Polytechnic University Research Project | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | CC | en_US
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Im_Context-Adaptive-Based_Image_Captioning.pdf | - | 1.74 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record

Page views: 27 (as of Apr 14, 2025)
Downloads: 23 (as of Apr 14, 2025)
SCOPUS™ citations: 10 (as of Apr 3, 2026)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.