LGDE: Local Graph-based Dictionary Expansion

dc.contributor.author: Schindler, Juni
dc.contributor.author: Jha, Sneha
dc.contributor.author: Zhang, Xixuan
dc.contributor.author: Buehling, Kilian
dc.contributor.author: Heft, Annett
dc.contributor.author: Barahona, Mauricio
dc.date.accessioned: 2025-06-02T08:44:17Z
dc.date.available: 2025-06-02T08:44:17Z
dc.date.issued: 2025
dc.description.abstract: We present Local Graph-based Dictionary Expansion (LGDE), a method for data-driven discovery of the semantic neighbourhood of words using tools from manifold learning and network science. At the heart of LGDE lies the creation of a word similarity graph from the geometry of word embeddings followed by local community detection based on graph diffusion. The diffusion in the local graph manifold allows the exploration of the complex nonlinear geometry of word embeddings to capture word similarities based on paths of semantic association, over and above direct pairwise similarities. Exploiting such semantic neighbourhoods enables the expansion of dictionaries of pre-selected keywords, an important step for tasks in information retrieval, such as database queries and online data collection. We validate LGDE on two user-generated English-language corpora and show that LGDE enriches the list of keywords with improved performance relative to methods based on direct word similarities or co-occurrences. We further demonstrate our method through a real-world use case from communication science, where LGDE is evaluated quantitatively on the expansion of a conspiracy-related dictionary from online data collected and analysed by domain experts. Our empirical results and expert user assessment indicate that LGDE expands the seed dictionary with more useful keywords due to the manifold-learning-based similarity network.
dc.identifier.citation: Schindler, J., Jha, S., Zhang, X., Buehling, K., Heft, A., & Barahona, M. (2025). LGDE: Local Graph-based Dictionary Expansion. Computational Linguistics, 1–32. https://doi.org/10.1162/coli_a_00562
dc.identifier.doi: 10.1162/coli_a_00562
dc.identifier.issn: 0891-2017
dc.identifier.issn: 1530-9312
dc.identifier.uri: https://www.weizenbaum-library.de/handle/id/904
dc.language.iso: eng
dc.rights: open access
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: dictionary expansion
dc.subject: word embeddings
dc.subject: cosine similarity
dc.subject: manifold learning
dc.subject: graph diffusion
dc.subject: local community detection
dc.title: LGDE: Local Graph-based Dictionary Expansion
dc.type: Article
dc.type.status: publishedVersion
dcmi.type: Text
dcterms.bibliographicCitation.url: https://direct.mit.edu/coli/article/doi/10.1162/coli_a_00562/128944/LGDE-Local-Graph-based-Dictionary-Expansion
local.researchgroup: Dynamiken der digitalen Mobilisierung
local.researchgroup: Digitalisierung und transnationale Öffentlichkeit
local.researchtopic: Digitale Märkte und Öffentlichkeiten auf Plattformen
local.researchtopic: Demokratie – Partizipation – Öffentlichkeit
Files
Original bundle
Name: Heft_ea_LDGE-local-graph-based-dictionary.pdf
Size: 2.96 MB
Format: Adobe Portable Document Format