Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/117478
DC Field | Value | Language
dc.contributor | Department of Computing | -
dc.creator | de Souza, JV | en_US
dc.creator | Amamou, H | en_US
dc.creator | Chen, R | en_US
dc.creator | Salari, E | en_US
dc.creator | Gubelmann, R | en_US
dc.creator | Niklaus, C | en_US
dc.creator | Serpa, T | en_US
dc.creator | de Freitas Lima, MM | en_US
dc.creator | Pinto, PT | en_US
dc.creator | Kshirsagar, S | en_US
dc.creator | Davoust, A | en_US
dc.creator | Handschuh, S | en_US
dc.creator | Avila, AR | en_US
dc.date.accessioned | 2026-02-26T03:46:05Z | -
dc.date.available | 2026-02-26T03:46:05Z | -
dc.identifier.issn | 0104-6500 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/117478 | -
dc.language.iso | en | en_US
dc.publisher | SpringerOpen | en_US
dc.rights | Copyright (c) 2025 José Victor de Souza, Hazem Amamou, Rubing Chen, Elmira Salari, Reto Gubelmann, Christina Niklaus, Talita Serpa, Marcela Marques de Freitas Lima, Paula Tavares Pinto, Shruti Kshirsagar, Alan Davoust, Siegfried Handschuh, Anderson Raymundo Avila | en_US
dc.rights | This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). | en_US
dc.rights | The following publication de Souza, J. V., Amamou, H., Chen, R., Salari, E., Gubelmann, R., Niklaus, C., Serpa, T., Lima, M. M. de F., Pinto, P. T., Kshirsagar, S., Davoust, A., Handschuh, S., & Avila, A. R. (2025). Cross-Lingual Keyword Extraction for Pesticide Terminology in Brazilian Portuguese and English. Journal of the Brazilian Computer Society, 31(1), 972-989 is available at https://doi.org/10.5753/jbcs.2025.5815. | en_US
dc.subject | BERT embeddings | en_US
dc.subject | Multilingual extraction | en_US
dc.subject | Pesticides | en_US
dc.subject | Word alignment | en_US
dc.title | Cross-lingual keyword extraction for pesticide terminology in Brazilian Portuguese and English | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 972 | en_US
dc.identifier.epage | 989 | en_US
dc.identifier.volume | 31 | en_US
dc.identifier.issue | 1 | en_US
dc.identifier.doi | 10.5753/jbcs.2025.5815 | en_US
dcterms.abstract | Agriculture plays a crucial role in Brazil's economy. As the country intensifies its activities in the sector, the use of pesticides also increases. Hence, the risks associated with pesticide-laden food consumption have become a concern for chemistry researchers. An issue affecting regulatory standardization of pesticides in Brazil is the difficulty in translating pesticide names, particularly from English. For example, the word malathion can be translated from English to Portuguese as malatiom or malatião, resulting in inconsistent labeling. This issue extends to the broader problem of translating highly technical terms between languages, in particular for low-resource languages. In this work, we investigate terminological variation in the chemistry of organophosphorus pesticides. Our goal is to study strategies for domain-specific multilingual keyword extraction. To that end, two corpora were built based on pesticide-related scientific documents in Brazilian Portuguese and English, which led to a total of 84 and 210 texts, respectively, representing the low- and high-resource languages in this study. We then assessed 6 methods for keyword extraction: Simple Maths, TF-IDF, YAKE, TextRank, MultipartiteRank, and KeyBERT. We relied on a multilingual contextual BERT embedding to retrieve corresponding pesticide names in the target language. Fine-tuning was also explored to improve the multilingual representation further. Moreover, we evaluated the use of large language models (LLMs) combined with the recent retrieval-augmented generation (RAG) framework. As a result, we found that the contextual approach, combined with fine-tuning, provided the best results, contributing to enhancing Pesticide Terminology Extraction in a multilingual scenario. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Journal of the Brazilian Computer Society, 17 Jan. 2025, v. 31, no. 1, p. 972-989 | en_US
dcterms.isPartOf | Journal of the Brazilian Computer Society | en_US
dcterms.issued | 2025-01-17 | -
dc.identifier.scopus | 2-s2.0-105019700300 | -
dc.identifier.eissn | 1678-4804 | en_US
dc.description.validate | 202602 bcch | -
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | OA_Scopus/WOS | -
dc.description.fundingSource | Others | en_US
dc.description.fundingText | This research was funded, in part, by the São Paulo Research Foundation (FAPESP), processes 2021/08830-9 and 2019/14752-0, the National Council for Scientific and Technological Development (CNPq), process 130524/2021-2, and the Leading House for the Latin American Region (University of St. Gallen). | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | CC | en_US
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
5815-Article Text-32536-1-10-20251009.pdf | | 1.08 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.