Prepositional Polysemy through the lens of contextualized word embeddings

DOI: 10.34847/nkl.58fc5ie6 Public
Author: ORCID Lauren Fonteyn

In recent years, contextualized embeddings generated by neural language models have grown extremely popular in the Machine Learning community (e.g. Baroni, Dinu & Kruszewski 2014), but linguists generally seem more wary of using them. In this talk, I would like to highlight why it may be worth exploring to what extent linguists can (i) employ these models as tools, and (ii) help reveal what sort of information these models capture.

The first part of this talk discusses the practical application of contextualized embeddings. Focusing on embeddings created by the Bidirectional Encoder Representations from Transformers model, also known as 'BERT' (Devlin et al. 2019), I hope to demonstrate how contextualized embeddings can help counter two types of retrieval inefficiency scenarios that may arise with purely form-based corpus queries. In the first scenario, the formal query yields a large number of hits, which contain a reasonable number of relevant examples that can be labeled and used as input for a sense disambiguation classifier. In the second scenario, the contextualized embeddings of exemplary tokens are used to retrieve more relevant examples from a large, unlabeled dataset. As a practical case study, I will focus on the English preposition into (e.g. She got into her car / I'm so into you).
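To make the second retrieval scenario more concrete, here is a minimal sketch, not the pipeline used in the talk, of how a contextualized vector for a target occurrence of into could be extracted with the Hugging Face transformers library, and how cosine similarity to a hand-labeled exemplar could be used to rank unlabeled corpus hits. The model checkpoint and the example sentences are illustrative assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed_target(sentence: str, target: str = "into") -> torch.Tensor:
    """Return the last-layer BERT vector of the first occurrence of `target`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state.squeeze(0)  # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return hidden[tokens.index(target)]  # assumes `target` is a single WordPiece

# Rank unlabeled hits by similarity to an exemplar of the non-spatial sense
# of 'into' (sentences are illustrative, not corpus data from the study).
exemplar = embed_target("I'm so into you.")
candidates = ["She got into her car.", "He is really into jazz these days."]
ranked = sorted(
    ((torch.cosine_similarity(exemplar, embed_target(s), dim=0).item(), s)
     for s in candidates),
    reverse=True,
)
for score, sentence in ranked:
    print(f"{score:.3f}  {sentence}")
```

Hits whose embeddings sit closest to the exemplar would be inspected (or labeled) first, which is the sense in which embedding-based retrieval can counter a low yield from a purely form-based query.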

Subsequently, I will briefly turn to the question of whether these models can be employed as analytical tools to study meaning. In this second part of the talk, I will focus on the principled polysemy model of the English preposition over proposed by Tyler & Evans (2001) to investigate whether the sense network that emerges from this theoretical model of meaning representation can be reconstructed by BERT.

What emerges from these explorations is that BERT clearly captures fine-grained, local semantic similarities between tokens. Even with an entirely unsupervised application of BERT, discrete, coherent token groupings can be discerned that correspond relatively well with the sense categories proposed by linguists. Furthermore, embeddings of over also clearly encode information about conceptual domains, as concrete, spatial uses of prepositions are neatly distinguished from more abstract, metaphorical extensions (into the conceptual domain of time, or other non-spatial domains). However, there are no indications that BERT embeddings also encode information about the abstract image-schema resemblances between tokens across those domains. These findings highlight the fact that such imagistic similarities may not be straightforwardly captured in contextualized embeddings. Such findings can provide an interesting basis for further experimental research (testing to what extent different operational models of meaning representation are complementary when assessed against elicited behavioral data), as well as for a discussion of how we can bring about a "greater cross-fertilization of theoretical and computational approaches" to the study of meaning (Boleda 2020: 213; Baroni & Lenci 2011).
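As an illustration of the kind of unsupervised analysis referred to here, the sketch below (a hypothetical setup, not the study's own) clusters token embeddings of over; in practice, the resulting groupings would then be cross-tabulated with sense annotations derived from Tyler & Evans (2001). Random data stands in for the embedding matrix, and the number of clusters is an arbitrary choice.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# In a real run, `embeddings` would hold one BERT vector per corpus token of
# 'over' (extracted as in the earlier sketch); random data stands in here.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 768))

# Cosine-based agglomerative clustering; n_clusters=8 is purely illustrative.
clusterer = AgglomerativeClustering(n_clusters=8, metric="cosine", linkage="average")
cluster_ids = clusterer.fit_predict(embeddings)
print(np.bincount(cluster_ids))  # cluster sizes, to compare against sense labels
```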

Baroni, Marco, Georgiana Dinu & Germán Kruszewski. 2014. Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 238–247.
Boleda, Gemma. 2020. Distributional Semantics and Linguistic Theory. Annual Review of Linguistics 6(1). 213–234.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee & Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL-HLT 2019. 4171–4186.
Tyler, Andrea & Vyvyan Evans. 2001. Reconsidering Prepositional Polysemy Networks: The Case of Over. Language 77(4). 724–765.

File
Visualization

ID: 10.34847/nkl.58fc5ie6/b92984f514cc7a6ac142375fb7f0bee13c7a7fe9

Embed URL: https://api.nakala.fr/embed/10.34847/nkl.58fc5ie6/b92984f514cc7a6ac142375fb7f0bee13c7a7fe9

Download URL: https://api.nakala.fr/data/10.34847/nkl.58fc5ie6/b92984f514cc7a6ac142375fb7f0bee13c7a7fe9

License
Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0)
Cite
Fonteyn, Lauren (2021) "Prepositional Polysemy through the lens of contextualized word embeddings" [Audiovisual] NAKALA. https://doi.org/10.34847/nkl.58fc5ie6
Deposited by Revue OpenEdition CogniTextes on 02/03/2021
nakala:title xsd:string English Prepositional Polysemy through the lens of contextualized word embeddings
nakala:creator ORCID Lauren Fonteyn
nakala:created xsd:string 2020-12-18
nakala:type xsd:anyURI Video
nakala:license xsd:string Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0)
dcterms:language xsd:string English
dcterms:subject xsd:string AFLiCo
xsd:string CogniTextes
xsd:string Lecture Series
xsd:string Cognitive linguistics
xsd:string BERT