Enhancing human parsing with region-level learning

Zhou, Y; Mok, PY

doi:10.1049/cvi2.12222

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/109570

Title:	Enhancing human parsing with region-level learning
Authors:	Zhou, Y; Mok, PY
Issue Date:	Feb-2024
Source:	IET computer vision, Feb. 2024, v. 18, no. 1, p. 60-71
Abstract:	Human parsing is important in a wide range of industrial applications. Despite considerable progress, the performance of existing methods remains less than satisfactory because they learn the shared features of various parsing labels at the image level. This limits the representativeness of the learnt features, especially when the distribution of parsing labels is imbalanced or the scales of different labels differ substantially. To address this limitation, a Region-level Parsing Refiner (RPR) is proposed to enhance parsing performance by introducing region-level parsing learning. Region-level parsing focuses specifically on small regions of the body, for example, the head. The proposed RPR is an adaptive module that can be integrated with different existing human parsing models to improve their performance. Extensive experiments are conducted on two benchmark datasets, and the results demonstrate the effectiveness of the RPR model in improving both overall parsing performance and the parsing of rare labels. The method has been applied in a commercial application for extracting human body measurements and has been used in various online shopping platforms for clothing size recommendations. The code and dataset are released at https://github.com/applezhouyp/PRP.
Keywords:	Computer vision; Image processing; Image segmentation; Pose estimation
Publisher:	The Institution of Engineering and Technology
Journal:	IET computer vision
ISSN:	1751-9632
EISSN:	1751-9640
DOI:	10.1049/cvi2.12222
Rights:	© 2023 The Authors. IET Computer Vision published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made. The following publication Zhou, Y., & Mok, P. Y. (2024). Enhancing human parsing with region-level learning. IET Computer Vision, 18(1), 60-71 is available at https://doi.org/10.1049/cvi2.12222.
Appears in Collections:	Journal/Magazine Article

Files in This Item:

File	Description	Size	Format
Zhou_Enhancing_Human_Parsing.pdf		1.12 MB	Adobe PDF

Open Access Information

Status	open access
File Version	Version of Record

Page views:	42 (as of Apr 14, 2025)
Downloads:	29 (as of Apr 14, 2025)
Scopus citations:	2 (as of Sep 12, 2025)
