Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/118830
Title: HM3 : hierarchical multi-objective model merging for pretrained models
Authors: Zhou, Y 
Wu, X 
Wu, J 
Feng, L
Tan, KC 
Issue Date: 2025
Source: The Thirty-ninth Annual Conference on Neural Information Processing Systems, NeurIPS 2025, San Diego, USA, Dec 01 2025, https://openreview.net/forum?id=JeP0lpusYw
Abstract: Model merging is a technique that combines multiple large pretrained models into a single model, enhancing performance and broadening task adaptability without original data or additional training. However, most existing model merging methods focus primarily on exploring the parameter space, merging models with identical architectures. Despite its potential, merging in the architecture space remains in its early stages due to the vast search space and challenges related to layer compatibility. This paper designs a hierarchical model merging framework named HM3, formulating a bilevel multi-objective model merging problem across both parameter and architecture spaces. At the parameter level, HM3 integrates existing merging methods to quickly identify optimal parameters. Based on these, an actor-critic strategy with efficient policy discretization is employed at the architecture level to explore inference paths with the Markov property in the layer-granularity search space for reconstructing these optimal models. By training reusable policy and value networks, HM3 learns Pareto-optimal models to provide customized solutions for various tasks. Experimental results on language and vision tasks demonstrate that HM3 outperforms methods focusing solely on the parameter or architecture space.
Publisher: OpenReview.net
Description: The Thirty-ninth Annual Conference on Neural Information Processing Systems, NeurIPS 2025, San Diego, USA, Dec 01 2025
Rights: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
The following publication Zhou, Y., Wu, X., Wu, J., Feng, L., & Tan, K. C. (2025). HM3: Hierarchical multi-objective model merging for pretrained models. In The Thirty-ninth Annual Conference on Neural Information Processing Systems is available at https://openreview.net/forum?id=JeP0lpusYw.
Appears in Collections: Conference Paper

Files in This Item:
File: 29077_HM3_Hierarchical_Multi_O.pdf
Size: 867.53 kB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record
