Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/105554
Title: GGP: glossary guided post-processing for word embedding learning
Authors: Yang, R 
Cao, J 
Wen, Z 
Issue Date: 2020
Source: In Twelfth International Conference on Language Resources and Evaluation: May 11-16, 2020, Palais du Pharo, Marseille, France, Conference proceedings, p. 4726-4730. Paris, France: ELRA – European Language Resources Association, 2020
Abstract: Word embedding learning is the task of mapping each word to a low-dimensional, continuous vector based on a large corpus. To enhance corpus-based word embedding models, researchers use domain knowledge to learn more distinguishable representations via joint-optimization and post-processing models. However, joint-optimization models require much training time, and existing post-processing models mostly consider semantic knowledge, while the learned embedding models capture less functional information. A glossary is a comprehensive linguistic resource, and in previous work it has usually been used to enhance word representations via joint-optimization methods. In this paper, we post-process pre-trained word embedding models by incorporating the glossary, capturing more topical and functional information. We propose the GGP (Glossary Guided Post-processing word embedding) model, which consists of a global post-processing function that fine-tunes each word vector and an auto-encoding model that learns sense representations; it further constrains each post-processed word representation and the composition of its sense representations to be similar. We evaluate our model by comparing it with two state-of-the-art models on six word topical/functional similarity datasets, and the results show that it outperforms competitors by an average of 4.1% across all datasets. Our model also outperforms GloVe by more than 7%.
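The three components named in the abstract (a global post-processing function, an auto-encoder over senses, and a similarity constraint between the two) can be illustrated with a minimal sketch. The abstract does not give the functional forms, so everything concrete here is an assumption made for illustration: the residual affine post-processing map, the tanh encoder with a linear decoder, the mean as the composition of senses, the squared Euclidean distance, the weight alpha, and all function and parameter names. It is not the authors' implementation.

```python
import math

def matvec(M, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def mean(vectors):
    # Element-wise mean of a non-empty list of equal-length vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def post_process(v, W, b):
    # ASSUMPTION: one affine map shared by all words, applied residually.
    return add(v, add(matvec(W, v), b))

def encode(g, W_enc):
    # ASSUMPTION: tanh encoder turning a gloss vector into a sense vector.
    return [math.tanh(x) for x in matvec(W_enc, g)]

def decode(h, W_dec):
    # ASSUMPTION: linear decoder reconstructing the gloss vector.
    return matvec(W_dec, h)

def ggp_style_loss(v, gloss_vectors, W, b, W_enc, W_dec, alpha=1.0):
    # One sense vector per glossary entry of the word.
    senses = [encode(g, W_enc) for g in gloss_vectors]
    # Auto-encoding term: each gloss should be recoverable from its sense.
    recon = sum(sq_dist(decode(s, W_dec), g)
                for s, g in zip(senses, gloss_vectors))
    # Constraint term: post-processed word vector should be close to the
    # composition (here, the mean) of its sense vectors.
    align = sq_dist(post_process(v, W, b), mean(senses))
    return recon + alpha * align

# Toy usage with 2-dimensional vectors and hand-picked parameters.
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
loss = ggp_style_loss(
    v=[1.0, 0.0],
    gloss_vectors=[[0.0, 0.0]],  # hypothetical encoded gloss
    W=zeros, b=[0.0, 0.0],
    W_enc=identity, W_dec=identity,
    alpha=2.0,
)
```

In a real training setup the parameters would be learned by gradient descent over all words in the glossary; the sketch only evaluates the objective for a single word so the role of each term is visible.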
Keywords: Word embedding
Post-processing model
Representation learning
Publisher: Association for Computational Linguistics (ACL)
ISBN: 979-10-95546-34-4
Description: 12th International Conference on Language Resources and Evaluation (LREC 2020), May 11-16, 2020, Marseille, France
Rights: © European Language Resources Association (ELRA), licensed under CC-BY-NC
The following publication Ruosong Yang, Jiannong Cao, and Zhiyuan Wen. 2020. GGP: Glossary Guided Post-processing for Word Embedding Learning. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 4726–4730, Marseille, France. European Language Resources Association is available at https://aclanthology.org/2020.lrec-1.581.
Appears in Collections: Conference Paper

Files in This Item:
File: 2020.lrec-1.581.pdf
Size: 243.68 kB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.