Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/115855
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | School of Fashion and Textiles | - |
| dc.creator | Sun, Zhengwentai | - |
| dc.date.accessioned | 2025-11-07T22:35:21Z | - |
| dc.date.available | 2025-11-07T22:35:21Z | - |
| dc.identifier.uri | https://theses.lib.polyu.edu.hk/handle/200/13948 | - |
| dc.identifier.uri | http://hdl.handle.net/10397/115855 | - |
| dc.language.iso | English | - |
| dc.title | A deep learning approach for fashion image processing with controllable synthesis and flexible editing | - |
| dc.type | Thesis | - |
| dcterms.abstract | Fashion design typically involves composing elements and concepts: designers select and harmonize colors, patterns, and prints, and consider functional attributes such as collar type, sleeve length, and overall fit. This process, which reflects the designer's creativity and market preferences, usually requires iterative modification and can be time-consuming even for experts. Although recent advances in generative models offer an efficient and effective way of processing fashion images, applying these models to design remains challenging. Generative models primarily map random noise to an image, and this mapping is arbitrary and uncontrollable, often requiring multiple attempts before it yields a satisfactory image that meets specific requirements. | - |
| dcterms.abstract | One solution for generating desired garment images is to add detailed supervisory information. For instance, given a fashion garment dataset with detailed annotations of each design element, a generative model could learn a conditional mapping from specific elements to the desired garment image. An obvious drawback of this solution, however, is the tedious annotation it requires, which can be time-consuming and expensive. Moreover, such labels are usually discrete attributes, with each element assigned to a category. When such a model is applied to the design process, its flexibility is limited, since many design elements, e.g., colors and textures, are hard to categorize. | - |
| dcterms.abstract | To address these challenges in controllability and flexibility, this study develops generative models that incorporate a decoupling method into data collection and training. The overall idea is to decouple a garment image into different modalities of data, each representing different design elements. For instance, the HED model is used to extract sketches that represent spatial-level attributes such as collars, lengths, and overall shapes, while randomly cropped image patches capture the texture level (an illustrative sketch of this decoupling step follows the table below). These decoupled data, derived partially from the original garment images, are used to train generative models capable of reconstructing the original images. The trained model enables control over the synthesized garment image by selecting specific design elements at the inference stage. | - |
| dcterms.abstract | Building on this capability, this thesis introduces an image processing system comprising two models: a controllable generation model and a flexible editing model, each targeting different fashion image processing tasks. The first model, called SGDiff, focuses on control over texture: it leverages randomly cropped texture patches and text prompts to reconstruct garments, and once trained, it uses texture patches as a decoupled style condition to control the synthesized garment images. The second, an editing model called CoDE-GAN, modifies the shape of fashion images; it learns the editing function by reconstructing masked images from sketch maps (a minimal training-step sketch also follows the table below). The two models can work independently or together as one system, enabling effective and flexible control over the generation and editing of fashion images. Both models have been comprehensively evaluated to demonstrate their specific advantages in comparison with other state-of-the-art models. | - |
| dcterms.accessRights | open access | - |
| dcterms.educationLevel | M.Phil. | - |
| dcterms.extent | xiv, 127 pages : color illustrations | - |
| dcterms.issued | 2024 | - |
| Appears in Collections: | Thesis | |
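As a rough illustration of the decoupling step described in the abstract, the following Python sketch extracts an HED edge map (spatial-level attributes) and random texture patches from a single garment image. It uses the community `controlnet_aux` HED wrapper for convenience; the patch size, patch count, and file name are assumptions, and the thesis's own pipeline may differ.

```python
from PIL import Image
from controlnet_aux import HEDdetector
import torchvision.transforms as T

hed = HEDdetector.from_pretrained("lllyasviel/Annotators")  # pretrained HED edge detector

def decouple(garment: Image.Image, patch_size: int = 64, n_patches: int = 4):
    """Split one garment image into a sketch and a set of texture patches."""
    sketch = hed(garment)                                 # spatial level: collar, length, shape
    crop = T.RandomCrop(patch_size)
    patches = [crop(garment) for _ in range(n_patches)]   # texture level
    return sketch, patches

sketch, patches = decouple(Image.open("garment.jpg").convert("RGB"))
```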
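Similarly, here is a minimal sketch, not the thesis implementation, of the editing model's training signal: a generator is asked to reconstruct the masked region of a garment image conditioned on its sketch map, so that at inference a user-supplied sketch can steer shape edits. The generator `G`, the optimizer, and the tensor shapes are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def editing_step(G, optimizer, image, sketch, mask):
    """One training step: G fills the masked region, guided by the sketch map.

    Assumed shapes: image (B, 3, H, W); sketch (B, 1, H, W) edge map;
    mask (B, 1, H, W) with 1 inside the region to edit.
    """
    masked = image * (1.0 - mask)                   # erase the region to be edited
    inp = torch.cat([masked, sketch, mask], dim=1)  # condition on sketch + mask
    recon = G(inp)
    loss = F.l1_loss(recon * mask, image * mask)    # reconstruct only the hole
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this framing the sketch is the only shape signal inside the hole, which is what lets a user-drawn sketch drive shape edits at inference; the actual CoDE-GAN, being a GAN, also involves adversarial training, omitted here for brevity.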
Access: View full-text via https://theses.lib.polyu.edu.hk/handle/200/13948