Learning approaches for scene localization and quality scene reconstruction

Li, Chu Tak

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/97938

DC Field	Value	Language
dc.contributor	Department of Electronic and Information Engineering	-
dc.creator	Li, Chu Tak	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/12268	-
dc.language.iso	English	-
dc.title	Learning approaches for scene localization and quality scene reconstruction	-
dc.type	Thesis	-
dcterms.abstract	Vision-based autonomous driving techniques are popular in both academia and industry because commodity cameras are highly cost-effective and produce high-quality, information-rich images. Global Navigation Satellite Systems (GNSS) are widely used for ego-localization and related real-world applications. However, GNSS signals suffer from reflection and blockage caused by dense concrete buildings and tall trees, especially in densely populated urban areas such as Hong Kong. Other solutions use high-end sensors such as Lidar, Radar and 360 RGB-D cameras. Nevertheless, these solutions have their own limitations and are not widely used in commercial products. Therefore, various technologies, including the visual place recognition and reconstruction methods discussed in this thesis, will be required to achieve a comprehensive autonomous driving system.	-
dcterms.abstract	Place recognition, or localization, is an important element of an autonomous driving system. Accurate ego-location information is crucial both for removing accumulated past errors and for future planning. The challenges lie in variations in appearance, speed, lighting, perspective and objects. Therefore, we develop a fast algorithm for place recognition, in which fast tracking using historical information and effective representation of a frame are comprehensively studied to achieve satisfactory recognition performance while minimizing computational cost. We refer to this use of historical information as a tubing strategy, which emphasizes the temporal correlation between consecutive input frames.	-
dcterms.abstract	We take advantage of recent deep learning techniques and remove two main barriers to the use of Convolutional Neural Networks (CNNs), i.e., heavy computational cost and the need for large amounts of labelled data, so that deep learning techniques can be used for efficient place recognition. We study lightweight CNN models to offer efficient feature extraction and improve an existing automatic training data generation module by considering more variations in conditions. We further propose a way to adaptively use historical information to handle an unknown initial location and to achieve efficient recognition. The proposed methods outperform other state-of-the-art methods in terms of both recognition performance and complexity.	-
dcterms.abstract	To ensure the quality of the features extracted from images, we also study object removal by means of deep learning-based image inpainting for scene reconstruction. By removing unwanted objects such as moving vehicles and pedestrians from images, we obtain clean images for place recognition. We propose the Deep Generative Inpainting Network (DeepGIN) and an inpainting model with a Multi-Dilation Fusion Block (MDFB) and an auxiliary attention learning branch, both of which seek a better balance between pixel-wise accuracy and visual quality. We show that our proposed models can handle images in the wild by testing them on several publicly available datasets: Flickr-Faces-HQ (FFHQ), The Oxford Buildings and Places2. We demonstrate that our inpainting results can be used in other high-level computer vision tasks such as face verification and semantic segmentation. We believe that the inpainting results can also be used in place recognition.	-
dcterms.abstract	For future research, we aim to develop a more comprehensive recognition system in which our inpainting models serve as a pre-processing module to obtain better input images and our tubing strategy is applied at the post-processing stage to obtain better recognition performance. Apart from combining the techniques discussed in this thesis, we would like to develop an online learning strategy that keeps the understanding of a path up to date, further enhancing life-long recognition performance.	-
dcterms.accessRights	open access	-
dcterms.educationLevel	M.Phil.	-
dcterms.extent	xiv, 141 pages : color illustrations	-
dcterms.issued	2023	-
dcterms.LCSH	Automated vehicles -- Data processing	-
dcterms.LCSH	Automated vehicles -- Control	-
dcterms.LCSH	Machine learning	-
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	-
Appears in Collections:	Thesis

Access

View full-text via https://theses.lib.polyu.edu.hk/handle/200/12268
