Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/109615
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | Faculty of Science | - |
| dc.creator | Im, SK | - |
| dc.creator | Chan, KH | - |
| dc.date.accessioned | 2024-11-08T06:10:28Z | - |
| dc.date.available | 2024-11-08T06:10:28Z | - |
| dc.identifier.uri | http://hdl.handle.net/10397/109615 | - |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers | en_US |
| dc.rights | This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ | en_US |
| dc.rights | The following publication S. -K. Im and K. -H. Chan, "Context-Adaptive-Based Image Captioning by Bi-CARU," in IEEE Access, vol. 11, pp. 84934-84943, 2023 is available at https://doi.org/10.1109/ACCESS.2023.3302512. | en_US |
| dc.subject | Attention mechanism | en_US |
| dc.subject | Bi-CARU | en_US |
| dc.subject | CNN | en_US |
| dc.subject | Context-adaptive | en_US |
| dc.subject | Image captioning | en_US |
| dc.subject | NLP | en_US |
| dc.subject | RNN | en_US |
| dc.title | Context-adaptive-based image captioning by Bi-CARU | en_US |
| dc.type | Journal/Magazine Article | en_US |
| dc.identifier.spage | 84934 | - |
| dc.identifier.epage | 84943 | - |
| dc.identifier.volume | 11 | - |
| dc.identifier.doi | 10.1109/ACCESS.2023.3302512 | - |
| dcterms.abstract | Image captions are abstract textual expressions of visual content, helping readers understand and analyse information across different media. Building on encoder-decoder neural networks, captioning provides a rational structure for tasks such as image encoding and caption prediction. This work introduces a Convolutional Neural Network to Bidirectional Content-Adaptive Recurrent Unit (CNN-to-Bi-CARU) model that uses a bidirectional structure to consider contextual features and capture the major features of an image. The feature encoded from the image is passed into the forward and backward CARU layers respectively to refine word prediction, producing contextual text output for captioning. An attention layer is also introduced to collect the features produced by the context-adaptive gate in CARU, computing the weighting information used to extract and determine relationships. In experiments, the proposed CNN-to-Bi-CARU model outperforms other advanced models in the field, extracting contextual information better and representing image captions in more detail. The model obtains scores of 41.28 on BLEU@4, 31.23 on METEOR, 61.07 on ROUGE-L, and 133.20 on CIDEr-D, making it competitive in image captioning on the MSCOCO dataset. | - |
| dcterms.accessRights | open access | en_US |
| dcterms.bibliographicCitation | IEEE access, 2023, v. 11, p. 84934-84943 | - |
| dcterms.isPartOf | IEEE access | - |
| dcterms.issued | 2023 | - |
| dc.identifier.scopus | 2-s2.0-85167787587 | - |
| dc.identifier.eissn | 2169-3536 | - |
| dc.description.validate | 202411 bcch | - |
| dc.description.oa | Version of Record | en_US |
| dc.identifier.FolderNumber | OA_Scopus/WOS | en_US |
| dc.description.fundingSource | Others | en_US |
| dc.description.fundingText | Macao Polytechnic University Research Project | en_US |
| dc.description.pubStatus | Published | en_US |
| dc.description.oaCategory | CC | en_US |
| Appears in Collections: | Journal/Magazine Article | - |
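The abstract above outlines the CNN-to-Bi-CARU pipeline: a CNN encoder, forward and backward CARU layers over the caption sequence, and an attention layer over the context-adaptive gate features. The following is a minimal PyTorch sketch of that shape only. This record does not give CARU's gating equations, so a standard GRU stands in for CARU here; the ResNet-50 backbone, the `CNNToBiRNNCaptioner` name, and all layer sizes are illustrative assumptions, not the paper's implementation.

```python
# Rough sketch of a CNN-to-bidirectional-RNN captioner with attention,
# assuming a GRU in place of CARU (whose equations are not in this record).
import torch
import torch.nn as nn
import torchvision.models as models

class CNNToBiRNNCaptioner(nn.Module):  # hypothetical name and sizes
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        # CNN encoder: a ResNet backbone (pretrained weights in practice)
        # reduced to its pooled feature vector.
        backbone = models.resnet50(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])
        self.feat_proj = nn.Linear(2048, embed_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Bidirectional recurrent decoder: forward and backward passes over
        # the token sequence, mirroring the Bi-CARU structure.
        self.birnn = nn.GRU(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        # Attention over the concatenated forward/backward states, standing
        # in for the weighting derived from CARU's context-adaptive gate.
        self.attn = nn.Linear(2 * hidden_dim, 1)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, images, captions):
        feats = self.encoder(images).flatten(1)          # (B, 2048)
        img_token = self.feat_proj(feats).unsqueeze(1)   # (B, 1, E)
        # Prepend the encoded image feature so both recurrent directions
        # condition on the image.
        x = torch.cat([img_token, self.embed(captions)], dim=1)
        states, _ = self.birnn(x)                        # (B, T+1, 2H)
        weights = torch.softmax(self.attn(states), dim=1)
        context = (weights * states).sum(dim=1, keepdim=True)
        return self.out(states + context)                # per-step logits
```

In the paper's setting such a model would be trained with teacher forcing on MSCOCO captions and evaluated with BLEU@4, METEOR, ROUGE-L, and CIDEr-D, as reported in the abstract above; that training loop is omitted here.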
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| Im_Context-Adaptive-Based_Image_Captioning.pdf | | 1.74 MB | Adobe PDF | View/Open |
Page views: 27 (as of Apr 14, 2025)
Downloads: 23 (as of Apr 14, 2025)
Scopus™ citations: 10 (as of Apr 3, 2026)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.