Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/101452
Title: PMR: prototypical modal rebalance for multimodal learning
Authors: Fan, Y 
Xu, W 
Wang, H
Wang, J
Guo, S 
Issue Date: 2023
Source: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17-24 June 2023, p. 20029-20038
Abstract: Multimodal learning (MML) aims to jointly exploit the common priors of different modalities to compensate for their inherent limitations. However, existing MML methods often optimize a uniform objective for all modalities, leading to the notorious "modality imbalance" problem and counterproductive MML performance. To address this problem, some existing methods modulate the learning pace based on the fused modality, which is dominated by the better modality and therefore yields only limited improvement on the worse modality. To better exploit multimodal features, we propose Prototypical Modality Rebalance (PMR), which stimulates the slow-learning modality without interference from the other modalities. Specifically, we introduce prototypes, which represent the general features of each class, to build non-parametric classifiers for uni-modal performance evaluation. We then accelerate the slow-learning modality by enhancing its clustering toward the prototypes. Furthermore, to alleviate suppression by the dominant modality, we introduce a prototype-based entropy regularization term during the early training stage to prevent premature convergence. Our method relies only on the representations of each modality, imposing no restrictions on model structure or fusion method, which gives it broad application potential across various scenarios. The source code is available at https://github.com/fanyunfeng-bit/Modal-Imbalance-PMR.
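The abstract names two prototype-based ingredients: a non-parametric classifier built from class prototypes, and an entropy regularizer that holds back the dominant modality early in training. Below is a minimal PyTorch sketch of how such terms could be written; the function names, the squared-Euclidean distance metric, and the loss shapes are illustrative assumptions, not the authors' implementation (see the linked repository for that).

    import torch
    import torch.nn.functional as F

    def prototype_logits(features, prototypes):
        # Non-parametric classifier: score each sample by its negative
        # squared Euclidean distance to every class prototype.
        # features: (B, D); prototypes: (C, D) -> logits: (B, C)
        return -torch.cdist(features, prototypes) ** 2

    def prototypical_ce(features, prototypes, labels):
        # Prototypical cross-entropy: pulls features of the slow-learning
        # modality toward their class prototypes (clustering acceleration).
        return F.cross_entropy(prototype_logits(features, prototypes), labels)

    def prototype_entropy(features, prototypes):
        # Prototype-based entropy of the posterior over prototypes: a higher
        # value means less confident predictions, so encouraging it for the
        # dominant modality early in training counteracts premature convergence.
        probs = F.softmax(prototype_logits(features, prototypes), dim=1)
        return -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()

In a training loop these terms would be added to the usual fused-modality loss with schedule-dependent weights; how the prototypes are initialized and updated (e.g., as running means of per-class features) and how the modality-wise weights are set are specified in the paper, not here.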
Keywords: Multi-modal learning
Publisher: IEEE
ISBN: 979-8-3503-0129-8 (Electronic)
979-8-3503-0130-4 (Print on Demand (PoD))
DOI: 10.1109/CVPR52729.2023.01918
Rights: ©2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The following publication Y. Fan, W. Xu, H. Wang, J. Wang and S. Guo, "PMR: Prototypical Modal Rebalance for Multimodal Learning," 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 2023, pp. 20029-20038 is available at https://doi.org/10.1109/CVPR52729.2023.01918.
Appears in Collections: Conference Paper

Files in This Item:
File: Fan_PMR_Prototypical_Modal.pdf
Description: Pre-Published version
Size: 1.25 MB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Page views: 134 (as of Apr 14, 2025)
Downloads: 128 (as of Apr 14, 2025)
SCOPUS citations: 8 (as of Jun 21, 2024)
Web of Science citations: 7 (as of Oct 10, 2024)