Modelling and assessing reinforcement learning-assisted GNSS multipath estimation and mitigation for urban navigation

Qi, X; Xu, B

doi:10.1109/TVT.2025.3625730

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/118564

Title:	Modelling and assessing reinforcement learning-assisted GNSS multipath estimation and mitigation for urban navigation
Authors:	Qi, X; Xu, B
Issue Date:	2025
Source:	IEEE transactions on vehicular technology, Date of Publication: 27 October 2025, Early Access, https://doi.org/10.1109/TVT.2025.3625730
Abstract:	Multipath has long been considered one of the major error sources for Global Navigation Satellite Systems (GNSS) in urban areas. In recent years, many multipath mitigation techniques based on offline machine learning have been proposed. However, collecting and labeling training data for offline learning-based techniques is challenging for the multipath problem. Moreover, because multipath depends strongly on the environment, it is difficult for the training data to fully capture multipath characteristics across locations. As a result, the performance of pre-trained models degrades on real data. This paper proposes a multipath parameter estimation algorithm based on maximum likelihood estimation (MLE) using online reinforcement learning (RL). The proposed algorithm searches for optimal parameter estimates by interacting with the environment online, thus avoiding the problems associated with offline learning-based techniques. Specifically, the MLE of multipath parameters is formulated as an optimization problem, and a modified Q-learning-based algorithm is employed to solve the problem iteratively. Comprehensive performance evaluations of the proposed reinforcement learning-based multipath parameter estimation (RL-MPE) show that: 1) The proposed algorithm estimates multipath parameters more accurately than random search and the Bayesian optimization method known as the Tree-structured Parzen Estimator (TPE). 2) RL-MPE performs better on short-delay multipath signals than the Multipath Estimating Delay Lock Loop (MEDLL) when using a comparable number of correlators. 3) RL-MPE achieves significantly better multipath mitigation performance on real urban data than the pre-trained random forest (RF)-based multipath parameter estimator.
Keywords:	Global Navigation Satellite System (GNSS); Maximum likelihood estimation (MLE); Multipath mitigation; Online reinforcement learning (RL); Q-learning
Publisher:	Institute of Electrical and Electronics Engineers
Journal:	IEEE transactions on vehicular technology
ISSN:	0018-9545
EISSN:	1939-9359
DOI:	10.1109/TVT.2025.3625730
Appears in Collections:	Journal/Magazine Article
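
The idea summarized in the abstract, casting the MLE of multipath parameters as an optimization problem and searching for the optimum with Q-learning, can be illustrated with a minimal sketch. This is not the authors' modified algorithm, whose details are not given in this record. It assumes an ideal triangular code autocorrelation, a unit-amplitude line-of-sight peak at zero delay, a single reflection, i.i.d. Gaussian noise (so that the MLE reduces to least squares), and plain tabular Q-learning over a coarse parameter grid with the cost decrease as the reward. All names, grids, and hyperparameters (`TAPS`, `AMPS`, `DELAYS`, `q_learning_search`, and so on) are hypothetical.

```python
import random

def tri(tau):
    """Ideal triangular code autocorrelation; tau in chips (assumption)."""
    return max(0.0, 1.0 - abs(tau))

# Hypothetical correlator taps and parameter grids (all in chips or ratios).
TAPS = [-1.0 + 0.25 * k for k in range(9)]   # correlator offsets, -1 to 1
AMPS = [0.1 * i for i in range(11)]          # relative multipath amplitude
DELAYS = [0.1 * j for j in range(16)]        # multipath excess delay

# Actions: change amplitude index, change delay index, or stay.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1), (0, 0)]

def model(amp, delay):
    """Correlator outputs: unit LOS peak at 0 plus one delayed reflection."""
    return [tri(t) + amp * tri(t - delay) for t in TAPS]

def cost(state, meas):
    """Sum of squared residuals; proportional to the negative
    log-likelihood under the Gaussian-noise assumption."""
    pred = model(AMPS[state[0]], DELAYS[state[1]])
    return sum((m - p) ** 2 for m, p in zip(meas, pred))

def move(state, action):
    """Apply an action and clip the result to the grid."""
    i = min(max(state[0] + action[0], 0), len(AMPS) - 1)
    j = min(max(state[1] + action[1], 0), len(DELAYS) - 1)
    return (i, j)

def q_learning_search(meas, episodes=200, steps=40,
                      alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Generic tabular Q-learning; returns the lowest-cost state visited."""
    rng = random.Random(seed)
    q = {}                                   # state -> list of action values
    best = (0, 0)
    best_cost = cost(best, meas)
    for _ in range(episodes):
        state = (rng.randrange(len(AMPS)), rng.randrange(len(DELAYS)))
        c = cost(state, meas)
        if c < best_cost:
            best, best_cost = state, c
        for _ in range(steps):
            qs = q.setdefault(state, [0.0] * len(ACTIONS))
            if rng.random() < eps:           # explore
                a = rng.randrange(len(ACTIONS))
            else:                            # exploit
                a = max(range(len(ACTIONS)), key=qs.__getitem__)
            nxt = move(state, ACTIONS[a])
            c_next = cost(nxt, meas)
            reward = c - c_next              # positive when the fit improves
            nq = q.setdefault(nxt, [0.0] * len(ACTIONS))
            qs[a] += alpha * (reward + gamma * max(nq) - qs[a])
            state, c = nxt, c_next
            if c < best_cost:
                best, best_cost = state, c
    return best, best_cost

if __name__ == "__main__":
    # Noise-free synthetic measurement generated from grid point (5, 3).
    meas = model(AMPS[5], DELAYS[3])
    best, best_cost = q_learning_search(meas)
    print("estimated amplitude:", AMPS[best[0]])
    print("estimated delay (chips):", DELAYS[best[1]])
    print("residual cost:", best_cost)
```

Because the agent only evaluates the likelihood of the measurement at hand, nothing here is trained offline, which is the property the abstract emphasizes. A real receiver would also have to estimate the line-of-sight parameters and carrier phase, which this sketch omits.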

Open Access Information

Status	embargoed access
Embargo End Date	0000-00-00 (to be updated)