Research interests (last updated Apr 2023)
Put briefly, my interests have centered around the development of representations for the meaning of words, phrases and documents ("natural language") that can be acquired from corpora (or at least from normal language users). I am interested in such representations -- and the models used to construct them -- from three main angles:
- (a) various linguistic and psycholinguistic phenomena (such as ambiguity, semantic relations, inference, and cognitive processing cost);
- (b) how language data can provide insight into processes of (human) negotiation and knowledge transfer (opening a window to computational social sciences and humanities);
- (c) how unstructured language data can be transformed to, and interact with, structured formats such as knowledge graphs.
The following paragraphs make these core interests more concrete and link to the most recent (or most canonical) papers.
Modeling Word Meaning and Semantic Relations
Historically, I started out with developing a "distributional" framework for building word embeddings from dependency graphs for tasks such as synonymy detection and prediction of priming effects, following the intuition that these models can outperform pure bag-of-words models (Padó and Lapata CL 2007). We then used these dependency-based models for the representation of selectional preferences (Erk, Padó, Padó CL 2010).
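As a minimal illustration of the dependency-based idea (toy data, not the actual Padó and Lapata 2007 model): a word is represented by counts over its syntactic co-occurrents, i.e. (relation, head) pairs rather than linear neighbours, and synonyms surface as nearest neighbours under cosine similarity.

```python
from collections import Counter
from math import sqrt

# Toy dependency-parsed corpus: (word, relation, head) triples.
triples = [
    ("car", "dobj", "drive"), ("car", "nmod", "red"),
    ("automobile", "dobj", "drive"), ("automobile", "nmod", "red"),
    ("banana", "dobj", "eat"), ("banana", "nmod", "yellow"),
]

def vector(word):
    # A word's vector = counts over its (relation, head) contexts.
    return Counter((rel, head) for w, rel, head in triples if w == word)

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    norm = lambda x: sqrt(sum(c * c for c in x.values()))
    return dot / (norm(u) * norm(v)) if u and v else 0.0

# Synonyms share syntactic contexts, so their cosine is high.
print(cosine(vector("car"), vector("automobile")))  # 1.0 on this toy data
print(cosine(vector("car"), vector("banana")))      # 0.0
```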
Given the rapid advances of word embedding models in recent years, my focus has now shifted to analysing systematic effects inherent in such models. We have looked at bias (Dayanik, Vu, Padó NEJLT 2022), at interactions between properties of tasks, data, and models (Papay, Klinger, Padó EMNLP 2021) and at representation bias in sentence embeddings -- asking what it is that such models pay attention to (Nikolaev and Padó SIGTYP 2022, Nikolaev and Padó EACL 2023, Nikolaev and Padó IWCS 2023).
I'm also interested in cross-lingual models, and have, fairly early on, investigated strategies to induce bilingual embedding spaces from comparable corpora. Back then, we found that this process can profit not only from cross-lingual synonymy (=translation) but also from "looser" semantic relations (Peirsman and Padó TSNLP 2011). Later, I looked into transferring embeddings using bilingual dictionaries (Utt and Padó TACL 2014). Lately, we have worked on multilingual data in the context of semantic frame identification (see below).
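A much-simplified sketch of the dictionary-based transfer setting (invented 2-dimensional vectors and a least-squares linear map; the actual methods in the papers above differ): a seed dictionary of translation pairs is used to learn a mapping between two monolingual embedding spaces.

```python
import numpy as np

# Toy monolingual embedding spaces for English and German (invented).
en = {"dog": [1.0, 0.0], "cat": [0.9, 0.1], "house": [0.0, 1.0]}
de = {"Hund": [0.0, 1.0], "Katze": [0.1, 0.9], "Haus": [1.0, 0.0]}
seed = [("dog", "Hund"), ("cat", "Katze"), ("house", "Haus")]

# Learn a linear map W with  en_vec @ W ≈ de_vec  (least squares).
X = np.array([en[e] for e, _ in seed])
Y = np.array([de[g] for _, g in seed])
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def translate(word):
    # Map the English vector into German space and return the
    # nearest German word by cosine similarity.
    v = np.array(en[word]) @ W
    def cos(u):
        u = np.array(u)
        return float(v @ u / (np.linalg.norm(v) * np.linalg.norm(u)))
    return max(de, key=lambda g: cos(de[g]))

print(translate("dog"))  # -> Hund
```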
Semantics and Morphology
Derivation (build + er -> builder, build + ing -> building) sits at the boundary between morphology and semantics, which makes it an interesting area to explore with analysis methods from both sides. We have built DErivBase, a large derivational lexicon for German (Zeller, Šnajder, Padó ACL 2013). We have added information about semantic (in-)transparency to the resource, finding that both lexical (distributional) information and structural (paradigmatic) information contribute (Zeller, Padó, Šnajder COLING 2014). We have linked the difficulty of predicting the meaning of derived words to aspects of their argument structure (Padó et al. COLING 2016) and have worked on disambiguating novel nominalizations (Lapesa et al. Word Structure 2018) as well as characterizing differences between nominalization paradigms (Varvara, Lapesa, Padó Morphology 2021).
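To make the derivation setting concrete, here is a miniature rule-based sketch in the spirit of DErivBase (the rules and semantic labels are invented for illustration, not the actual DErivBase rule inventory):

```python
# Each rule maps a base verb to a derived form plus a coarse semantic label.
RULES = [
    (lambda v: v + "er", "agent/instrument noun"),   # build -> builder
    (lambda v: v + "ing", "event/result noun"),      # build -> building
]

def derive(verb):
    # Apply each derivation rule to the base verb and record the
    # coarse semantic category of the derived form.
    return {rule(verb): label for rule, label in RULES}

print(derive("build"))
# {'builder': 'agent/instrument noun', 'building': 'event/result noun'}
```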
Discourse and Document Level Properties
Lexical knowledge also interfaces in interesting ways with structures at the discourse and document levels. We performed a corpus-based investigation of the concept of "formal vs. informal" address (tu/vous, tú/usted, du/Sie, etc.) in English literature. Since English does not overtly mark formality, we took a bilingual approach (Faruqui and Padó EACL 2012). A second task I have worked on is the detection of reported speech ("quotations"), for which we developed a probabilistic model (Scheible, Klinger, Padó ACL 2016) and a corpus-agnostic neural model (Papay and Padó RANLP 2019). My more recent activities mostly center around questions from political science (see below).
Compositionality and Polysemy
An early study I contributed to looked at how the meaning of predicates is influenced by their arguments, which we modeled by way of expectations (Erk and Padó EMNLP 2008) -- this can be seen as a precursor of today's contextualized embeddings.
An interesting alternative way to look at word meaning in context is to characterize it in terms of single-word substitutions that are only possible in particular contexts (the "lexical substitution" paradigm). We constructed a large, "all-words" lexical substitution corpus for English (Kremer et al. 2014) and showed that substitution data is helpful for sense discrimination (Alagic et al. AAAI 2018).
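The substitution idea can be illustrated with hypothetical substitute sets (not from the Kremer et al. corpus): token occurrences that license overlapping substitutes are likely instances of the same sense.

```python
# Each context of "bright" is characterized by the substitutes it licenses.
subs = {
    "a bright student": {"clever", "smart", "gifted"},
    "a bright pupil":   {"clever", "smart", "talented"},
    "a bright light":   {"shining", "luminous", "strong"},
}

def jaccard(a, b):
    # Set overlap normalized by set union.
    return len(a & b) / len(a | b)

same_sense = jaccard(subs["a bright student"], subs["a bright pupil"])
diff_sense = jaccard(subs["a bright student"], subs["a bright light"])
print(same_sense, diff_sense)  # 0.5 0.0
```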
Psycholinguistics of Semantic Processing
My interest in meaning and ambiguity also extends to the psycholinguistic side. In an investigation of the time course of metonymic sentence interpretation, we have found effects from subject (actor) choice that are consistent with a primarily world knowledge-based interpretation process (Zarcone, Padó, Lenci J. CogSci 2014) or at least with an interaction between lexical and world knowledge (Zarcone et al. Frontiers in Psychology 2017).
Semantic Lexicons and Semantic Roles
Even though the paradigm of inducing lexical semantic knowledge from corpora is our best shot at large-scale resource building, it has its own share of problems. There may be fundamental disagreement on what semantic dimensions to use for the description of meaning, be it with respect to frameworks for semantic role annotation (Ellsworth et al. LREC 2006) or general-purpose semantic verb classifications (Culo et al. LRE 2008). Formal semantic analysis can also be useful in the context of Semantic Web / Information Extraction (Augenstein, Rudolph, Padó ESWC 2012).
Following up on my PhD thesis, where I worked on cross-lingual projection of frame-semantic information (Padó and Lapata JAIR 2009), I have a renewed interest in the analysis of semantic frames. We investigated their ability to capture monolingual paraphrases (Sikos and Padó Constructions and Frames 2018), and to transfer across languages (Sikos, Roth, Padó LiLT 2022).
Text and Understanding
I am involved in two broad application areas where NLP is mainly an enabling technology to investigate domain-specific questions: Digital Humanities and Computational Political Science.
In digital humanities, a phenomenon I have looked at is emotion, with a focus on the relationship between events and emotions, taking a cue from semantic role research. Along these lines, we have elicited event-related emotions (Troiano, Klinger, Padó ACL 2019), looked at the relationship between literary genres and emotions (Kim, Klinger, Padó LaTeCH/CLfL 2017) and emotions in translation (Troiano, Klinger, Padó COLING 2020). We've also looked at emotions diachronically, analyzing travelogues over time (Ehrlicher et al. Liinc 2019).
The second focus is historical newspaper data and the technical challenges it raises. This includes article segmentation (Betz et al. LaTeCH-CLfL 2019) and named entity recognition (Riedl and Padó ACL 2019).
Computational Political Science
In political science, I have worked on NLP methods to automatically construct discourse networks from newspaper texts. This is a relatively complex span/relation extraction task (Padó et al. ACL 2019). We have created a large annotated corpus for the migration domain, DebateNet (Blokker et al. LRE 2022), and have investigated a number of technical aspects: integrating automatic and manual annotation (Haunss et al. Politics & Governance 2020), fairness (Dayanik and Padó ACL 2020), and hierarchical classification (Dayanik et al. ACL Findings 2022).
I am also interested in computational models of the positioning of political parties and other actors, looking at the capabilities of different model classes (Ceron, Blokker and Padó CoNLL 2022) and at fine-grained scaling (Ceron, Nikolaev and Padó ACL Findings 2023).
Knowledge graphs and relation extraction
In order to represent multiple layers of linguistic analysis (e.g., syntax and semantics), we have developed an OWL-DL model that addresses granularity and reliability and supports comfortable querying (Burchardt et al. LiLT 2008).
I have also looked at the interface between distributional and symbolic knowledge and found that distributional representations are surprisingly good at predicting very fine-grained properties of entities, such as the GDP of countries (Gupta et al. EMNLP 2015). We have generalized this idea to represent categories in terms of the distributions of their instances rather than in terms of the class name (Westera et al. J. CogSci 2021). In a 2022 paper, we went in the opposite direction and showed how knowledge about admissible relational structure can be fed back into probabilistic graphical models, specifically linear-chain CRFs (Papay, Klinger, Padó ICLR 2022).
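A small sketch of the attribute-prediction idea (invented vectors and target values; the actual work used real corpus-derived embeddings): fit a linear regressor from entity vectors to a numeric attribute and check that the predictions track the gold values.

```python
import numpy as np

# Toy entity vectors and a numeric attribute (think log GDP of countries).
vecs = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.8]])
log_gdp = np.array([10.0, 9.5, 4.0, 3.5])

# Least-squares regression with an intercept column.
X = np.hstack([vecs, np.ones((len(vecs), 1))])
w, *_ = np.linalg.lstsq(X, log_gdp, rcond=None)
pred = X @ w

# If the attribute is (roughly) linear in the embedding space,
# the predicted ranking matches the gold ranking.
print(np.round(pred, 2))
```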
The framework of "Textual Entailment" (TE) tries to cast the semantic processing needs of NLP applications in terms of common sense entailment decisions. We have approached Machine Translation evaluation through TE (Padó et al. MT 2009) and have defined a generic platform that supports various kinds of approaches to computing TE (Padó et al. JNLE 2014). Furthermore, we have shown that a substantial number of "difficult" entailments could be solved with discourse knowledge, in particular coreference and bridging (Mirkin et al. ACL 2010).