Holographic Embeddings for Text and Graphs

Graduate thesis, Brandeis University, 2024

Embeddings are everywhere in NLP: obtaining and using word and sentence embeddings is a core step in pre-training and fine-tuning language models and in many downstream NLP tasks. Traditionally, natural language is embedded in Euclidean space, but words and sentences can also be embedded in hyperbolic, spherical, and other non-Euclidean spaces. Each space has unique strengths and can more naturally encode distinct features of natural language. While there is no “best” embedding set or space across all NLP tasks and domains, there are situations where certain spaces and embeddings greatly outperform others. Further, there are underexplored yet interesting uses of embeddings for which holographic embeddings are uniquely suitable. Holographic embeddings and holographic space offer a way to combine existing embeddings from different spaces, together with perturbations, to improve performance on certain tasks or supra-domains. To this end, I put forth multiple proposals for generating, experimenting with, and evaluating the performance of holographic word and sentence embeddings. I experiment with word-level and sentence-level NLP tasks as well as AMR-based and graph-based storage and retrieval tasks. This work shows the strong performance of holographic embeddings compared to existing embeddings, with the added benefit of (de)compositionality. Finally, I explore how holographic embeddings can be used with linguistic theories and representations through applications to Generative Lexicon (GL) Theory and Uniform Meaning Representation (UMR).
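The thesis defines its own constructions and experiments; as a minimal sketch of the circular-convolution binding that holographic (reduced) representations are commonly built on, and which underlies the (de)compositionality mentioned above, the snippet below binds a role vector to a filler vector and approximately recovers the filler by circular correlation. The names `bind`/`unbind` and the dimensionality `d` are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

d = 1024  # illustrative dimensionality
rng = np.random.default_rng(0)

def rand_vec(d):
    """Random vector scaled so circular convolution roughly preserves norm."""
    return rng.normal(0.0, 1.0 / np.sqrt(d), d)

def bind(a, b):
    """Circular convolution: composes two vectors into one of the same size."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def unbind(c, a):
    """Circular correlation: approximately recovers b from bind(a, b)."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(c)))

role, filler, other = rand_vec(d), rand_vec(d), rand_vec(d)
trace = bind(role, filler)        # composed ("holographic") vector
recovered = unbind(trace, role)   # noisy copy of the filler

cos = lambda x, y: x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos(recovered, filler))  # high similarity: decomposition succeeds
print(cos(recovered, other))   # near zero: unrelated vector
```

Because binding keeps the vector size fixed, many such role–filler pairs can be superposed and later queried individually, which is what makes this style of embedding attractive for graph-based storage and retrieval.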

Timothy Obiso. 2024. Holographic Embeddings for Text and Graphs. Brandeis University.

https://scholarworks.brandeis.edu/esploro/outputs/graduate/Holographic-Embeddings-for-Text-and-Graphs/9924354787101921