Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/116348
Title: H2R bridge: transferring vision-language models to few-shot intention meta-perception in human–robot collaboration
Authors: Wu, D; Zhao, Q; Fan, J; Qi, J; Zheng, P; Hu, J
Issue Date: Jun-2025
Source: Journal of Manufacturing Systems, June 2025, v. 80, p. 524-535
Abstract: Human–robot collaboration enhances efficiency by enabling robots to work alongside human operators in shared tasks. Accurately understanding human intentions is critical for achieving a high level of collaboration. Existing methods rely heavily on case-specific data and struggle with new tasks and unseen categories, while only limited data is typically available under real-world conditions. To bolster the proactive cognitive abilities of collaborative robots, this work introduces a Visual-Language-Temporal approach that conceptualizes intent recognition as a multimodal learning problem with HRC-oriented prompts. A large model with prior knowledge is fine-tuned to acquire industrial domain expertise and then enables efficient transfer through few-shot learning in data-scarce scenarios. Comparisons with state-of-the-art methods across various datasets demonstrate that the proposed approach sets new benchmarks. Ablation studies confirm the efficacy of the multimodal framework, and few-shot experiments further underscore its meta-perceptual potential. This work addresses the challenges of perceptual data scarcity and training cost, building a human–robot bridge (H2R Bridge) for semantic communication, and is expected to facilitate proactive HRC and the further integration of large models in industrial applications.
Keywords: Few-shot learning; Human–robot collaboration; Intent recognition; Vision-language models
Publisher: Elsevier
Journal: Journal of Manufacturing Systems
ISSN: 0278-6125
DOI: 10.1016/j.jmsy.2025.03.016
Appears in Collections: Journal/Magazine Article
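
The abstract describes a prompt-based vision-language approach to few-shot intent recognition. As a rough illustration only, and not the authors' H2R Bridge implementation, the Python sketch below shows how HRC-oriented text prompts can drive intent classification with an off-the-shelf CLIP-style model from Hugging Face transformers; the intent labels and prompt template are hypothetical placeholders.

# Minimal sketch: prompt-based intent classification with a CLIP-style
# vision-language model. Intent labels and the prompt template are
# illustrative placeholders, not taken from the paper.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical HRC-oriented prompts: one description per operator intent.
intents = ["picking up a tool", "assembling a part", "handing over a workpiece"]
prompts = [f"a photo of a human operator {intent}" for intent in intents]

def classify_intent(image: Image.Image) -> str:
    """Predict the operator intent via image-text similarity."""
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image holds the similarity of the image to each prompt.
    probs = outputs.logits_per_image.softmax(dim=-1)
    return intents[probs.argmax().item()]

# Example: print(classify_intent(Image.open("frame.jpg")))

A few-shot variant in the spirit of the abstract would first fine-tune this backbone on a handful of labeled frames per intent before running classification.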