Prepositional Polysemy through the lens of contextualized word embeddings

DOI: 10.34847/nkl.58fc5ie6 Public
Author: ORCID Lauren Fonteyn

In recent years, contextualized embeddings generated by neural language models have grown extremely popular in the Machine Learning community (e.g. Baroni, Dinu & Kruszewski 2014), but linguists generally seem more wary of using them. In this talk, I would like to highlight why it may be worth exploring to what extent linguists can (i) employ these models as tools, and (ii) help reveal what sort of information these models capture.

The first part of this talk discusses the practical application of contextualized embeddings. Focusing on embeddings created by the Bidirectional Encoder Representations from Transformers model, also known as 'BERT' (Devlin et al. 2019), I hope to demonstrate how contextualized embeddings can help counter two types of retrieval inefficiency scenarios that may arise with purely form-based corpus queries. In the first scenario, the formal query yields a large number of hits, which contain a reasonable number of relevant examples that can be labeled and used as input for a sense disambiguation classifier. In the second scenario, the contextualized embeddings of exemplary tokens are used to retrieve more relevant examples from a large, unlabeled dataset. As a practical case study, I will focus on the English preposition into (e.g. She got into her car / I'm so into you).
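To make the second retrieval scenario more concrete, here is a minimal sketch, not the pipeline used in the talk, of how a contextualized vector for a target occurrence of into could be extracted with the Hugging Face transformers library, and how cosine similarity to a hand-labeled exemplar could be used to rank unlabeled corpus hits. The model checkpoint and the example sentences are illustrative assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed_target(sentence: str, target: str = "into") -> torch.Tensor:
    """Return the last-layer BERT vector of the first occurrence of `target`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state.squeeze(0)  # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return hidden[tokens.index(target)]  # assumes `target` is a single WordPiece

# Rank unlabeled hits by similarity to an exemplar of the non-spatial sense
# of 'into' (sentences are illustrative, not corpus data from the study).
exemplar = embed_target("I'm so into you.")
candidates = ["She got into her car.", "He is really into jazz these days."]
ranked = sorted(
    ((torch.cosine_similarity(exemplar, embed_target(s), dim=0).item(), s)
     for s in candidates),
    reverse=True,
)
for score, sentence in ranked:
    print(f"{score:.3f}  {sentence}")
```

Hits whose embeddings sit closest to the exemplar would be inspected (or labeled) first, which is the sense in which embedding-based retrieval can counter a low yield from a purely form-based query.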

Subsequently, I will briefly turn to the question of whether these models can be employed as analytical tools to study meaning. In this second part of the talk, I will focus on the principled polysemy model of the English preposition over proposed by Tyler & Evans (2001) to investigate whether the sense network that emerges from this theoretical model of meaning representation can be reconstructed by BERT.

What emerges from these explorations is that BERT clearly captures fine-grained, local semantic similarities between tokens. Even with an entirely unsupervised application of BERT, discrete, coherent token groupings can be discerned that correspond relatively well with the sense categories proposed by linguists. Furthermore, embeddings of over also clearly encode information about conceptual domains, as concrete, spatial uses of prepositions are neatly distinguished from more abstract, metaphorical extensions (into the conceptual domain of time, or other non-spatial domains). However, there are no indications that BERT embeddings also encode information about the abstract image-schema resemblances between tokens across those domains. These findings highlight the fact that such imagistic similarities may not be straightforwardly captured in contextualized embeddings. Such findings can provide an interesting basis for further experimental research (testing to what extent different operational models of meaning representation are complementary when assessed against elicited behavioral data), as well as for a discussion of how we can bring about a "greater cross-fertilization of theoretical and computational approaches" to the study of meaning (Boleda 2020: 213; Baroni & Lenci 2011).
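As an illustration of the kind of unsupervised analysis referred to here, the sketch below (a hypothetical setup, not the study's own) clusters token embeddings of over; in practice, the resulting groupings would then be cross-tabulated with sense annotations derived from Tyler & Evans (2001). Random data stands in for the embedding matrix, and the number of clusters is an arbitrary choice.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# In a real run, `embeddings` would hold one BERT vector per corpus token of
# 'over' (extracted as in the earlier sketch); random data stands in here.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 768))

# Cosine-based agglomerative clustering; n_clusters=8 is purely illustrative.
clusterer = AgglomerativeClustering(n_clusters=8, metric="cosine", linkage="average")
cluster_ids = clusterer.fit_predict(embeddings)
print(np.bincount(cluster_ids))  # cluster sizes, to compare against sense labels
```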

Baroni, Marco, Georgiana Dinu & Germán Kruszewski. 2014. Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 238–247.
Boleda, Gemma. 2020. Distributional Semantics and Linguistic Theory. Annual Review of Linguistics 6(1). 213–234.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee & Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL-HLT 2019. 4171–4186.
Tyler, Andrea & Vyvyan Evans. 2001. Reconsidering Prepositional Polysemy Networks: The Case of Over. Language 77(4). 724–765.

File
Visualization

ID: 10.34847/nkl.58fc5ie6/b92984f514cc7a6ac142375fb7f0bee13c7a7fe9

Embed URL: https://api.nakala.fr/embed/10.34847/nkl.58fc5ie6/b92984f514cc7a6ac142375fb7f0bee13c7a7fe9

Download URL: https://api.nakala.fr/data/10.34847/nkl.58fc5ie6/b92984f514cc7a6ac142375fb7f0bee13c7a7fe9

License
Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0)
Cite
Fonteyn, Lauren (2021) "Prepositional Polysemy through the lens of contextualized word embeddings" [Audiovisual] NAKALA. https://doi.org/10.34847/nkl.58fc5ie6
Deposited by Revue OpenEdition CogniTextes on 02/03/2021
nakala:title xsd:string English Prepositional Polysemy through the lens of contextualized word embeddings
nakala:creator ORCID Lauren Fonteyn
nakala:created xsd:string 2020-12-18
nakala:type xsd:anyURI Video
nakala:license xsd:string Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0)
dcterms:language xsd:string English
dcterms:subject xsd:string AFLiCo
xsd:string CogniTextes
xsd:string Lecture Series
xsd:string Cognitive linguistics
xsd:string BERT