Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/116138
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | Department of Mechanical Engineering | - |
| dc.creator | Gao, Shuang | - |
| dc.date.accessioned | 2025-11-24T22:35:36Z | - |
| dc.date.available | 2025-11-24T22:35:36Z | - |
| dc.identifier.uri | https://theses.lib.polyu.edu.hk/handle/200/13970 | - |
| dc.identifier.uri | http://hdl.handle.net/10397/116138 | - |
| dc.language.iso | English | - |
| dc.title | Research on semantic bird-eye-view map prediction for autonomous driving | - |
| dc.type | Thesis | - |
| dcterms.abstract | Semantic Bird's-Eye-View (BEV) map prediction is a widely used environmental perception technique in autonomous driving that converts forward-facing images captured by on-board cameras into top-down two-dimensional images. A semantic BEV map encompasses not only the spatial layout of the driving scene but also the semantic labels of the objects within it. This top-down representation facilitates the fusion of data from multiple sensors, eliminates the perspective distortion introduced by the camera imaging process, and intuitively conveys the spatial relationships between the autonomous vehicle and surrounding obstacles, providing a solid perceptual foundation for downstream autonomous driving tasks. Additionally, the embedded semantic information helps autonomous vehicles achieve a high-level understanding of their environment. Owing to these advantages, semantic BEV map prediction has become a primary research focus in environmental perception. | - |
| dcterms.abstract | The prevailing approach to generating semantic BEV maps relies on deep learning models. This approach is fundamentally data-driven, so its performance depends heavily on the quality of the training datasets. However, generating a semantic BEV map involves a view transformation, which makes labeling the corresponding BEV ground truth difficult: manual annotation introduces significant errors, leading to insufficient labeled samples and degraded label quality. These issues with training samples pose significant challenges for semantic BEV map prediction. In addition, the limited field of view (FOV) of cameras constrains how much environmental information a semantic BEV map can express. This dissertation focuses on semantic BEV map prediction, exploring semi-supervised learning methods that maintain segmentation accuracy while reducing the dependency on labeled samples during training. To tackle the limitation in information expression, the dissertation explores sequential image fusion algorithms that use historical observations to enhance the expressive capability of semantic BEV maps. The main research contributions are as follows: | - |
| dcterms.abstract | To reduce the dependence of the semantic BEV map prediction network on labeled samples, this dissertation proposes a semi-supervised semantic BEV map prediction network based on contrastive learning. The network innovatively integrates view transformation with contrastive learning, avoiding the complex data augmentation used in traditional contrastive learning networks. It can be trained end-to-end with both labeled and unlabeled samples, yielding accurate and stable semantic BEV maps. | - |
| dcterms.abstract | To address the limited environmental observation caused by the restricted FOV of cameras, this dissertation proposes a full-view semantic BEV map prediction network based on equidistant sequence fusion. The network uses equidistant image sequences to expand the observed range of the environment and, with an explainable view transformation module, generates accurate and clear full-view semantic BEV maps. | - |
| dcterms.abstract | To provide autonomous driving systems with perception of future scenes, this dissertation proposes a short-term future semantic BEV map prediction network based on long short-term memory (LSTM). By predicting future scenes, the network strengthens the perception system's ability to issue early warnings to autonomous vehicles, enabling timely reactions to changes in the driving environment. | - |
| dcterms.accessRights | open access | - |
| dcterms.educationLevel | Ph.D. | - |
| dcterms.extent | xxi, 134 pages : color illustrations | - |
| dcterms.issued | 2025 | - |
Appears in Collections: Thesis
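The abstract's first contribution combines supervised training on labeled samples with a contrastive objective on unlabeled ones. The thesis does not give its exact formulation here, so the sketch below uses a generic InfoNCE-style loss (a common choice, assumed for illustration) on paired BEV feature vectors, e.g. features of the same scene before and after view transformation:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss: each anchor should match its own positive.

    anchors, positives: (n, d) feature matrices; row i of each is a pair.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))             # matched pairs lie on the diagonal

# Toy check: matched pairs give a lower loss than mismatched ones.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))                   # hypothetical BEV feature vectors
aligned = info_nce(feats, feats)                   # every anchor matches its positive
mismatched = info_nce(feats, feats[::-1])          # pairs deliberately shuffled
print(aligned < mismatched)
```

In a semi-supervised setup of this kind, a weighted sum of a supervised segmentation loss on labeled samples and this contrastive term on unlabeled samples would be minimized; the weighting and feature extraction are design choices the abstract does not specify.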
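The second contribution fuses BEV maps predicted from images captured at equal travel distances to cover a larger area than a single camera FOV. The thesis's fusion algorithm is not specified in the abstract; a minimal sketch, assuming each frame's BEV grid is offset by a fixed number of cells along the driving direction and overlaps are merged by taking the maximum evidence, is:

```python
import numpy as np

def fuse_equidistant(bev_frames, step_cells):
    """Fuse a sequence of BEV grids captured at equal travel distances.

    Frame k is assumed to start step_cells rows ahead of frame k-1 along the
    driving direction; overlapping cells keep the strongest evidence (max).
    """
    H, W = bev_frames[0].shape
    n = len(bev_frames)
    fused = np.zeros((H + (n - 1) * step_cells, W))
    for k, frame in enumerate(bev_frames):
        rows = slice(k * step_cells, k * step_cells + H)
        fused[rows] = np.maximum(fused[rows], frame)
    return fused

# Toy demo: three 4x3 frames, ego advancing 2 cells between captures.
frames = [np.zeros((4, 3)) for _ in range(3)]
frames[2][0, 0] = 1.0          # an obstacle seen only in the last frame
fused = fuse_equidistant(frames, step_cells=2)
print(fused.shape)             # (8, 3)
```

A real system would warp frames by the full ego pose (rotation included) rather than a pure row shift; the fixed `step_cells` offset here stands in for the equidistant sampling the abstract describes.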
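The third contribution predicts a short-term future BEV map from past observations with an LSTM. The thesis's network architecture is not given in the abstract; the sketch below shows only the underlying mechanism with a hand-rolled LSTM cell over flattened toy BEV grids and randomly initialized (untrained) weights:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal LSTM cell operating on flattened BEV feature vectors."""
    def __init__(self, input_dim, hidden_dim, rng):
        scale = 1.0 / np.sqrt(hidden_dim)
        # One stacked weight matrix for the input, forget, cell, and output gates.
        self.W = rng.uniform(-scale, scale, (4 * hidden_dim, input_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)
        self.hidden_dim = hidden_dim

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h, c

def predict_next_bev(frames, cell, W_out):
    """Run past BEV frames through the LSTM, then decode a next-frame logit map."""
    h = np.zeros(cell.hidden_dim)
    c = np.zeros(cell.hidden_dim)
    for x in frames:               # frames: list of flattened past BEV grids
        h, c = cell.step(x, h, c)
    return W_out @ h               # logits for the predicted future BEV grid

rng = np.random.default_rng(0)
H, W_grid = 8, 8                   # toy BEV resolution
input_dim = H * W_grid
cell = LSTMCell(input_dim, hidden_dim=32, rng=rng)
W_out = rng.uniform(-0.1, 0.1, (input_dim, 32))
frames = [rng.random(input_dim) for _ in range(4)]   # four past observations
pred = predict_next_bev(frames, cell, W_out)
print(pred.shape)                  # (64,)
```

Trained end to end against future BEV labels, such a recurrent head would give the perception stack the early-warning capability the abstract describes; the decoder, loss, and horizon are all unspecified in the source.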
Access: view the full text via https://theses.lib.polyu.edu.hk/handle/200/13970
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.