Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/118390
Title: Deep learning-based intelligent fashion image generation system
Authors: Liao, Fangjian
Degree: Ph.D.
Issue Date: 2025
Abstract: A fashion image generation engine enhances writing and elucidates concepts by providing visual representations that align with the associated text or idea. However, translating drawings into effective visual representations is often particularly challenging, and to my knowledge, no existing technique or tool on the market offers the same combination of capabilities and benefits to help designers accelerate creation. The tool aims not only to generate novel concepts for designers but also to spark inspiration. At the same time, the engine seeks to improve the efficiency of AI-generated content (AIGC) tools and bring new trends to the academic field.
Deep learning networks have made significant strides in various domains, including traditional object detection, object segmentation, image pose transformation, and text generation. However, applying these technologies to fashion poses unique and complex challenges because the setting differs. Research on image-to-image translation has primarily focused on translating the details and appearances of objects. Far less work addresses the translation of painting styles, and existing methods have limited ability to preserve the essence of a style during translation. In addition, effective translation of uninformative images such as illustrations remains difficult with existing methods. The design process includes several key steps, such as ideation, sketching, and refinement. Throughout this process, the illustrator must find creative ways to simplify or visually communicate complex ideas. Illustration work also often runs under tight deadlines, so managing time efficiently while maintaining high-quality output is a further challenge.
This thesis aims to develop an intelligent fashion image generation system that overcomes these limitations. First, an automatic drawing image generation engine that references tops and bottoms is proposed. The engine incorporates a deep neural network for keypoint detection and clothing segmentation, enabling fast and effective pixel-level clothing mapping. Keypoint mapping between the detected clothing and the model enables true virtual try-on. In addition, with the help of the segmentation model, the garment fits the model better when the front and back pieces of the clothing are presented as separate elements.
Second, a novel drawings-to-images-to-illustrations generation pipeline is proposed to spark creative inspiration. A state-of-the-art generation model generates images using the generated drawing images as references. Specifically, the boundary of each drawing image is extracted as a conditional image, and the generation model combines this conditional image with the drawing image to generate product images in the real domain. In the second stage, face refinement is applied to correct flaws in the facial region of the generated image. In addition, a novel fashion image-to-image generation method named Uni-Duallora is proposed to obtain illustrative images; it optimizes generative capabilities and reduces the number of additional parameters required.
Third, a novel method for pose-guided runway image generation is introduced. The method leverages the advantages of attention and affine transformation operations and introduces a novel confidence map into the attention operation to improve image synthesis performance. A hierarchical structure is applied to generate the final runway image conditioned on the given pose.
The contributions of this thesis lie in developing and advancing an intelligent fashion image generation engine. This comprehensive framework not only streamlines the workflow for professionals but also provides a source of inspiration. It also aims to extend the enjoyment of design to individuals without specialized expertise. Finally, the thesis presents a comprehensive survey of human pose transfer, including its limitations, and provides insights for further exploration.
Subjects: Deep learning (Machine learning)
Fashion drawing
Fashion design
Hong Kong Polytechnic University -- Dissertations
Pages: xii, 144 pages : color illustrations
Appears in Collections: Thesis

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.