Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/106886
DC Field | Value | Language
dc.contributor | Department of Electrical and Electronic Engineering | en_US
dc.creator | Li, H | en_US
dc.creator | Yang, D | en_US
dc.creator | Huang, S | en_US
dc.creator | Lam, KM | en_US
dc.creator | Jin, L | en_US
dc.creator | Zhuang, Z | en_US
dc.date.accessioned | 2024-06-07T00:58:38Z | -
dc.date.available | 2024-06-07T00:58:38Z | -
dc.identifier.issn | 0925-2312 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/106886 | -
dc.language.iso | en | en_US
dc.publisher | Elsevier BV | en_US
dc.rights | © 2020 Elsevier B.V. All rights reserved. | en_US
dc.rights | © 2020. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/ | en_US
dc.rights | The following publication Li, H., Yang, D., Huang, S., Lam, K. M., Jin, L., & Zhuang, Z. (2020). Two-dimensional multi-scale perceptive context for scene text recognition. Neurocomputing, 413, 410-421 is available at https://doi.org/10.1016/j.neucom.2020.06.071. | en_US
dc.subject | Multi-scale perceptive context | en_US
dc.subject | Scene text recognition | en_US
dc.subject | Two-dimensional context | en_US
dc.title | Two-dimensional multi-scale perceptive context for scene text recognition | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 410 | en_US
dc.identifier.epage | 421 | en_US
dc.identifier.volume | 413 | en_US
dc.identifier.doi | 10.1016/j.neucom.2020.06.071 | en_US
dcterms.abstract | Inspired by speech recognition, most of the recent state-of-the-art works convert scene text recognition into sequence prediction. Like most speech recognition problems, context modeling is considered as a critical component in these methods for achieving better performance. However, they usually only consider using a holistic or single-scale local sequence context, in a single dimension. Actually, scene texts or sequence contexts may span arbitrarily across a two-dimensional (2-D) space and in any style, not limited to only horizontal. Moreover, contexts of various scales may synthetically contribute to text recognition, in particular for irregular text recognition. In our method, we consider the context in a 2-D manner, and simultaneously consider context reasoning at various scales, from local to global. Based on this, we propose a new Two-Dimensional Multi-Scale Perceptive Context (TDMSPC) module, which performs multi-scale context learning, along both the horizontal and vertical directions, and then merges them. This can generate shape and layout-dependent feature maps for scene text recognition. This proposed module can be handily inserted into existing sequence-based frameworks to replace their context learning mechanism. Furthermore, a new scene text recognition network, called TDMSPC-Net, is built, by using the TDMSPC module as a building block for the encoder, and adopting an attention-based LSTM as the decoder. Experiments on benchmark datasets show that the TDMSPC module can substantially boost the performance of existing sequence-based scene text recognizers, irrespective of the decoder or backbone network being used. The proposed TDMSPC-Net achieves state-of-the-art accuracy on all the benchmark datasets. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Neurocomputing, 6 Nov. 2020, v. 413, p. 410-421 | en_US
dcterms.isPartOf | Neurocomputing | en_US
dcterms.issued | 2020-11-06 | -
dc.identifier.scopus | 2-s2.0-85089400670 | -
dc.identifier.eissn | 1872-8286 | en_US
dc.description.validate | 202405 bcch | en_US
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | EIE-0126 | -
dc.description.fundingSource | Self-funded | en_US
dc.description.pubStatus | Published | en_US
dc.identifier.OPUS | 50283774 | -
dc.description.oaCategory | Green (AAM) | en_US
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Lam_Two-Dimensional_Multi-Scale_Perceptive.pdf | Pre-Published version | 3.57 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.