Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/80995
Title: Understanding human fashion images : a parse-pose based study
Authors: Zhou, Yanghong
Advisors: Mok, Pik-yin Tracy (ITC)
Keywords: Textile industry -- Technological innovations
Image processing -- Digital techniques
Issue Date: 2019
Publisher: The Hong Kong Polytechnic University
Abstract: The amount of fashion image data available on the Internet is ever increasing, and its rate of growth is itself accelerating. Managing and making effective use of such a large volume of image assets requires some understanding of the image contents. This study develops effective computer systems for understanding fashion images, which means crossing the so-called semantic gap between the pixel-level information stored in image files and the human understanding of the same images. The traditional approach to image understanding involves a sequence of processing steps. The overall effectiveness of these methods therefore relies on the performance of the individual processes, and they are not fully end-to-end solutions over raw image pixels. To understand fashion images, a number of challenges have to be addressed. Firstly, fashion products often have large variations in style, texture, and cutting. Secondly, the clothing classifications reported in the literature are too brief, so the value of the classified image contents is limited. Thirdly, clothing items are frequently subject to deformations and occlusions. Earlier works on clothing recognition mostly relied on handcrafted features, and the performance of these methods was therefore limited by the expressive power of those features. This research investigates an overall platform for fashion image understanding. It aims to understand the detailed and high-level information in the images, including segmenting the regions of interest, extracting size and shape information of the humans present in the images, recognising the fashion items, and further recognising the fine-grained attributes of those items. Human parsing is the basic building block of the proposed framework. It aims to segment a human photo into semantic fashion/body items, such as the face, arms, legs, dress, and background.
Based on a review of state-of-the-art human parsing research, an attention-based human parsing approach is proposed, which is first realised in a cascade network model and later in an end-to-end network model. As the basic building block of the proposed platform, human parsing solves the problem of cross-domain clothing retrieval and enables the implementation of clothing recognition and human shape modelling. Human parsing and pose estimation are highly correlated and complementary to each other. We therefore propose to fine-tune the regions of interest segmented by human parsing using pose estimation. The segmented semantic regions are the input for human and fashion information understanding. In terms of human information, we mainly extract the size and shape information of the human subjects in the input images. We use 3D modelling customisation technology to address this problem, because the segmented regions of human body parts separate humans from cluttered backgrounds and enable the extraction of accurate 2D contours of the human subjects in the input images. The extracted contours are employed to reconstruct a 3D human shape model, from which body sizes and shape parameters are calculated. In addition to understanding the human subjects' information, we investigate the understanding of fashion information from images. To do so, we first develop a new dataset and taxonomy of fashion products, based on industrial needs for fashion understanding. We next develop deep neural network models to recognise clothing category, fine-grained features and attributes from fashion photos. In the proposed framework, human parsing, pose estimation and clothing recognition are based on deep learning techniques.
Description: xxviii, 257 pages : color illustrations
PolyU Library Call No.: [THS] LG51 .H577P ITC 2019 ZhouY
URI: http://hdl.handle.net/10397/80995
Rights: All rights reserved.
Appears in Collections: Thesis

Files in This Item:
File | Description | Size | Format
991022232428103411_link.htm | For PolyU Users | 168 B | HTML
991022232428103411_pira.pdf | For All Users (Non-printable) | 9.65 MB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.