Research interests (last updated Aug 2020)

Put briefly, my interests are centered around the development of representations for the meaning of words, phrases and documents ("natural language") that can be acquired from corpora (or at least from normal language users). I am interested in such representations from three main angles: (a) various linguistic and psycholinguistic phenomena (such as ambiguity, semantic relations, inference, and cognitive processing cost); (b) how language data can provide insight into processes of (human) negotiation and knowledge transfer (notably in computational social sciences and humanities); (c) how unstructured language data can be transformed to, and interact with, structured formats such as knowledge graphs.

The following paragraphs sketch the most important research directions I follow and link to the most recent (or most canonical) papers.

Meaning Representations

Distributional Modeling of Word Meaning and Semantic Relations

Historically, I started out with developing a framework for building vector spaces from dependency graphs for tasks such as synonymy detection and prediction of priming effects, following the intuition that these models can outperform pure bag-of-words models (Padó and Lapata CL 2007). We then used these dependency-based models for the representation of selectional preferences. The resulting models can rival the performance of deep semantic models for modelling selectional preferences while being much easier to construct (Erk, Padó, and Padó CL 2010).
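To make the intuition concrete, here is a minimal toy sketch of a dependency-based vector space. It is not the framework from Padó and Lapata (CL 2007): the triples, the feature scheme (relation plus neighbouring word), and the function names are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy input: (head, relation, dependent) triples,
# of the kind a dependency parser might produce.
TRIPLES = [
    ("drink", "obj", "coffee"),
    ("drink", "obj", "tea"),
    ("sip", "obj", "coffee"),
    ("sip", "obj", "tea"),
    ("drive", "obj", "car"),
    ("park", "obj", "car"),
    ("drink", "subj", "student"),
    ("drive", "subj", "student"),
]


def build_space(triples):
    """Map each word to counts over (relation, neighbour) context features."""
    space = defaultdict(Counter)
    for head, rel, dep in triples:
        space[dep][(rel, head)] += 1            # the dependent sees its head
        space[head][(rel + "-inv", dep)] += 1   # the head sees its dependent
    return space


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


space = build_space(TRIPLES)
sim_coffee_tea = cosine(space["coffee"], space["tea"])
sim_coffee_car = cosine(space["coffee"], space["car"])
```

In this toy data, "coffee" and "tea" occur as objects of the same verbs and so receive identical vectors, while "coffee" and "car" share no dependency contexts. A bag-of-words model would ignore the relation labels and count only which words appear nearby.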

I'm also interested in cross-lingual distributional models, and have investigated strategies to induce bilingual vector spaces from comparable corpora which can be used to translate semantic models (like the selectional preference models) into new languages. A surprisingly nice result is that this translation can profit not only from cross-lingual synonymy (=translation) but also from "looser" semantic relations (Peirsman and Padó TSNLP 2011). Later, I looked into "translating" distributional models using just bilingual dictionaries (Utt and Padó TACL 2014). Lately, we have worked on multilingual data in the context of semantic frame identification (see below).
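One simple way to picture dictionary-based "translation" of a distributional model is to map each dimension of a source-language vector onto its dictionary translations. The sketch below is only an illustration under assumed names and toy data (the dictionary, the weights, and the even split across translations are all hypothetical); it is not the method of Utt and Padó (TACL 2014).

```python
from collections import defaultdict

# Hypothetical toy German-to-English dictionary (illustrative entries only).
DE_EN = {
    "Kaffee": ["coffee"],
    "trinken": ["drink", "sip"],
    "Auto": ["car"],
}


def translate_vector(vec, dictionary):
    """Re-express a sparse vector over source-language dimensions
    in terms of target-language dimensions.

    Assumption for this sketch: the weight of a dimension is split evenly
    among its translations; dimensions without a dictionary entry are dropped.
    """
    out = defaultdict(float)
    for word, weight in vec.items():
        translations = dictionary.get(word, [])
        for target in translations:
            out[target] += weight / len(translations)
    return dict(out)


# Hypothetical context vector for some German target word.
german_vec = {"trinken": 4.0, "Kaffee": 2.0, "Wasser": 1.0}
english_vec = translate_vector(german_vec, DE_EN)
```

Here the weight of "trinken" is shared between "drink" and "sip", and "Wasser" is lost because the toy dictionary has no entry for it. Coverage gaps of this kind are one reason why dictionary quality matters for such an approach.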

Linguistic Aspects

Semantics and Morphology

In about 2012 I discovered my interest in (morphological) derivation, which is quite prominent in German. Derivation is situated at the boundary between morphology and semantics and there is an interesting area to explore there with analysis methods from both sides. We have built DErivBase, a large derivational lexicon for German (Zeller, Snajder and Padó ACL 2013). We have added information about semantic (in-)transparency to the resource, finding that both lexical (distributional) information and structural (paradigmatic) information contribute (Zeller, Padó, and Snajder COLING 2014). We have linked the difficulty of predicting the meaning of derived words to aspects of their argument structure (Padó et al. COLING 2016) and have worked on disambiguating novel nominalizations (Lapesa et al. Word Structure 2018).

Discourse and Document Level Properties

Lexical knowledge also interfaces in interesting ways with structures at the discourse and document levels. We performed a corpus-based investigation of the concept of "formal vs. informal" address (tu/vous, tú/usted, du/Sie, etc.) in English literature. Since English does not overtly mark formality, we took a bilingual approach (Faruqui and Padó EACL 2012). A second task I have worked on is the detection of reported speech ("quotations"), for which we developed a probabilistic model (Scheible et al. ACL 2016) and a corpus-agnostic neural model (Papay and Padó RANLP 2019). Finally, I have done some work on native language identification, i.e., given some English text, determining what the writer's first language was (Stehwien and Padó IJCOL 2016).

Compositionality and Polysemy

An early study I contributed to looked at how the meaning of a predicate is influenced by its arguments, which we modeled by way of expectations (Erk and Padó EMNLP 2008).

An interesting alternative way to look at word meaning in context is to characterize it in terms of single-word substitutions that are only possible in particular contexts (the "lexical substitution" paradigm). We have constructed a large, "all-words" lexical substitution corpus for English (Kremer et al. 2014) and showed that substitution data is helpful for sense discrimination (Alagic et al. AAAI 2018).

Another lasting problem is polysemy (that is, systematic sense ambiguity), as opposed to homonymy (idiosyncratic sense ambiguity). We found that the homonymy/polysemy distinction can be made fairly well on the basis of ontological information from WordNet and CoreLex (Utt and Padó IWCS 2011). We were also able to show that regular polysemy can be learned from corpora and represented in a distributional model (Boleda et al. *SEM 2012).

Psycholinguistics of Semantic Processing

My interest in meaning and ambiguity also extends to the psycholinguistic side. In an investigation of the time course of metonymic sentence interpretation, we have found effects from subject (actor) choice that are consistent with a primarily world knowledge-based interpretation process (Zarcone et al. J. Cog Sci 2014) or at least with an interaction between lexical and world knowledge (Zarcone et al. Frontiers in Psychology 2017).

Another aspect that I'm interested in is the relationship between the distributional models from computational linguistics and grounded cognition (Ziemke et al. TopiCS 2014).

Semantic Lexicons and Semantic Roles

Even though the paradigm of inducing semantic lexical knowledge from corpora is our best shot at large-scale resource building, it has its own share of problems. There may be fundamental disagreement on what semantic dimensions to use for the description of meaning, be it with respect to frameworks for semantic role annotation (Ellsworth et al. LREC 2006) or general-purpose semantic verb classifications (Culo, Erk, Padó and Schulte im Walde LRE 2008). Formal semantic analysis can also be useful in the context of Semantic Web / Information Extraction (Augenstein et al. ESWC 2012).

Following up on my PhD thesis, where I worked on cross-lingual projection of frame-semantic information (Padó and Lapata JAIR 2009), I have a renewed interest in the analysis of semantic frames with regard to their ability to capture paraphrases within languages (Sikos and Padó Constructions and Frames 2018) and across languages (Sikos and Padó WS 2018).

Text and Understanding

I am involved in two broad application areas where NLP is mainly an enabling technology to investigate domain-specific questions: Digital Humanities and Computational Political Science.

Digital Humanities

The first focus of my work in this direction is emotions. We have investigated implicit (event-based) emotions (Troiano et al. ACL 2019), the relationship between literary genres and emotions (Kim et al. LaTeCH/CLfL 2017), and structured analysis that captures emotions together with experiencers, causes, and stimuli (Klinger et al. 2020).

The second focus is historical data. We have performed various semantic analyses on historical data, such as article segmentation in historical newspapers (Betz et al. LaTeCH-CLfL 2019) and emotions in travel writing (Ehrlicher et al. Liinc 2019). We have also addressed specific challenges of historical texts, such as named entity recognition on noisy OCR data (Riedl and Padó 2019).

Computational Political Science

In political science, I have worked on NLP methods to automatically construct discourse networks from newspaper texts. This is a relatively complex span/relation extraction task (Padó et al. ACL 2019). Nevertheless, automatic support can speed up annotation, and the core network is surprisingly easy to predict (Haunss et al. Politics & Governance 2020). A remaining issue, however, is fairness, even just with regard to frequency (Dayanik and Padó ACL 2020).

Semantic Processing

Knowledge Graphs and Relation Extraction

In order to represent multiple layers of linguistic analysis (e.g., syntax and semantics), we have developed an OWL-DL model that addresses granularity and reliability and supports comfortable querying (Burchardt et al. LiLT 2008).

More recently, I have looked at the interface between distributional and symbolic knowledge and found that distributional representations are surprisingly good at predicting very fine-grained properties of entities, such as the GDP of countries (Gupta et al. EMNLP 2015).

Textual Entailment

The framework of "Textual Entailment" (TE) tries to cast the semantic processing needs of NLP applications in terms of common sense entailment decisions. We have approached Machine Translation evaluation through TE (Padó et al. MT 2009) and have defined a generic platform that supports various kinds of approaches to computing TE (Padó et al. JNLE 2014). Furthermore, we have shown that a substantial number of "difficult" entailments could be solved with discourse knowledge, in particular coreference and bridging (Mirkin et al. ACL 2010).