Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/116348
Title: H2R bridge: transferring vision-language models to few-shot intention meta-perception in human–robot collaboration
Authors: Wu, D; Zhao, Q; Fan, J; Qi, J; Zheng, P; Hu, J
Issue Date: Jun-2025
Source: Journal of Manufacturing Systems, June 2025, v. 80, p. 524-535
Abstract: Human–robot collaboration enhances efficiency by enabling robots to work alongside human operators in shared tasks. Accurately understanding human intentions is critical for achieving a high level of collaboration. Existing methods rely heavily on case-specific data and struggle with new tasks and unseen categories, while only limited data is typically available under real-world conditions. To bolster the proactive cognitive abilities of collaborative robots, this work introduces a Visual-Language-Temporal approach that conceptualizes intent recognition as a multimodal learning problem with HRC-oriented prompts. A large model with prior knowledge is fine-tuned to acquire industrial domain expertise and then enables efficient transfer through few-shot learning in data-scarce scenarios. Comparisons with state-of-the-art methods across various datasets demonstrate that the proposed approach sets new benchmarks. Ablation studies confirm the efficacy of the multimodal framework, and few-shot experiments further underscore its meta-perceptual potential. This work addresses the challenges of perceptual data scarcity and training cost, building a human–robot bridge (H2R Bridge) for semantic communication, and is expected to facilitate proactive HRC and the further integration of large models in industrial applications.
Keywords: Few-shot learning; Human–robot collaboration; Intent recognition; Vision-language models
Publisher: Elsevier
Journal: Journal of Manufacturing Systems
ISSN: 0278-6125
DOI: 10.1016/j.jmsy.2025.03.016
Appears in Collections: Journal/Magazine Article
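
The abstract describes a prompt-based vision-language approach to few-shot intent recognition. As a rough illustration only, and not the authors' H2R Bridge implementation, the Python sketch below shows how HRC-oriented text prompts can drive intent classification with an off-the-shelf CLIP-style model from Hugging Face transformers; the intent labels and prompt template are hypothetical placeholders.

# Minimal sketch: prompt-based intent classification with a CLIP-style
# vision-language model. Intent labels and the prompt template are
# illustrative placeholders, not taken from the paper.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical HRC-oriented prompts: one description per operator intent.
intents = ["picking up a tool", "assembling a part", "handing over a workpiece"]
prompts = [f"a photo of a human operator {intent}" for intent in intents]

def classify_intent(image: Image.Image) -> str:
    """Predict the operator intent via image-text similarity."""
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image holds the similarity of the image to each prompt.
    probs = outputs.logits_per_image.softmax(dim=-1)
    return intents[probs.argmax().item()]

# Example: print(classify_intent(Image.open("frame.jpg")))

A few-shot variant in the spirit of the abstract would first fine-tune this backbone on a handful of labeled frames per intent before running classification.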