Research interests (last updated Dec 2016)

Put briefly, my interests center on the development of representations for the meaning of natural language words, phrases, and documents that can be acquired from corpora (or at least from language users without a linguistics degree) but are still able to account for various phenomena of language use and understanding (such as ambiguity, semantic relations, inference, and cognitive processing cost). The following paragraphs sketch the most important research directions I follow and link to the most recent (or most canonical) papers.

Distributional Modeling of Word Meaning and Semantic Relations

Historically, I started out by developing a framework for building vector spaces from dependency graphs for tasks such as synonymy detection and prediction of priming effects, following the intuition that these models can outperform pure bag-of-words models (Pado and Lapata CL 2007). We then used these dependency-based models to represent selectional preferences. The resulting models can rival the performance of deep semantic models on this task while being much easier to construct (Erk, Pado, and Pado CL 2010). More recently, I have looked at the interface between distributional and symbolic knowledge and found that distributional representations are surprisingly good at predicting very fine-grained properties of entities, such as the GDP of countries (Gupta, Boleda, Baroni, and Pado EMNLP 2015).
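To make the contrast with bag-of-words models concrete, here is a minimal sketch of a dependency-based vector space. It is an illustration only, not the framework from the cited paper: the toy triples, the relation labels, and the function names are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy parses as (head, relation, dependent) triples.
TRIPLES = [
    ("eat", "subj", "dog"),
    ("eat", "obj", "bone"),
    ("eat", "subj", "cat"),
    ("eat", "obj", "fish"),
    ("chase", "subj", "dog"),
    ("chase", "obj", "cat"),
]

def build_dependency_space(triples):
    """Map each word to a sparse vector whose dimensions are
    (relation, neighbouring word) pairs instead of plain context words."""
    space = defaultdict(Counter)
    for head, rel, dep in triples:
        space[dep][(rel + "_of", head)] += 1  # the dependent "sees" its head
        space[head][(rel, dep)] += 1          # the head "sees" its dependent
    return space

def cosine(u, v):
    """Cosine similarity between two sparse vectors (Counters)."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

space = build_dependency_space(TRIPLES)
print(cosine(space["dog"], space["cat"]))   # both occur as subjects of "eat"
print(cosine(space["dog"], space["bone"]))  # subject vs. object of "eat"
```

In a pure bag-of-words space, "dog" and "bone" would share the context word "eat"; here they share no dimension, because the syntactic relation is part of each dimension.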

I'm also interested in cross-lingual distributional models, and have investigated strategies to induce bilingual vector spaces from comparable corpora, which can be used to translate semantic models (like the selectional preference models) into new languages. A nice and unexpected result is that this translation can profit not only from cross-lingual synonymy (=translation) but also from "looser" semantic relations (Peirsman and Pado TSNLP 2011). More recently, I've looked into "translating" distributional models using just bilingual dictionaries and got surprisingly good results (Utt and Pado TACL 2014).

Semantics and Morphology

A more recent interest is (morphological) derivations, which are quite prominent in German (although not as prominent as in Slavic languages). They are situated at the boundary between morphology and semantics, which makes them an interesting area to explore with analysis methods from both sides. We have built DErivBase, a large derivational lexicon for German (Zeller, Snajder and Pado ACL 2013) and employed it to improve various lexical-semantic tasks (Pado, Snajder, and Zeller ACL 2013). We have added information about semantic (in-)transparency to the resource, finding that both lexical (distributional) information and structural (paradigmatic) information contribute (Zeller, Pado, and Snajder COLING 2014) and have been able to link the difficulty of predicting the meaning of derived words to aspects of their argument structure (Pado, Herbelot, Kisselew, and Snajder COLING 2016).

Discourse and Document Level Properties

Lexical knowledge also interfaces in interesting ways with structures at the discourse and document levels. In Faruqui and Pado EACL 2012 we performed a corpus-based investigation of the concept of "formal vs. informal" address (tu/vous, tú/usted, du/Sie, etc.) in literary conversations in English, a language that does not overtly mark formality. In Scheible, Klinger, and Pado ACL 2016, we followed up with a general-purpose model to detect reported speech (both direct and indirect speech).

Phenomena in Lexical Semantics: Compositionality and Polysemy

An important challenge in distributional modeling is the construction of models that capture the meaning not only of individual words, but also of whole predicate-argument combinations or, equivalently, of words in context, which turns out to be a closely related question. We have proposed two models for word meaning in (local) context. The first one combines the word's context-independent lexical meaning with the expectations of its context for the position the word occupies (Erk and Pado EMNLP 2008). The second one represents words as instance clouds and treats word combination as "activation" of one cloud by the other (Erk and Pado ACL 2010).
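The idea behind the first model can be illustrated with a small sketch. Everything in it is an assumption chosen for illustration: the three-dimensional toy vectors are invented, the context's expectations are approximated by the centroid of verbs that typically take the noun as object, and component-wise multiplication is just one possible combination function. It is not meant as the formulation used in the cited paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def contextualize(word_vec, expectation_vec):
    """Combine a word's lexical vector with what its context expects
    (assumption: component-wise product as the combination function)."""
    return [a * b for a, b in zip(word_vec, expectation_vec)]

# Invented toy vectors over the dimensions [sports, illness, motion].
catch = [4.0, 4.0, 2.0]      # ambiguous: catch a ball / catch a cold
grab = [5.0, 0.5, 3.0]       # paraphrase for the "ball" reading
contract = [0.5, 5.0, 0.5]   # paraphrase for the "cold" reading
throw = [5.0, 0.2, 4.0]
kick = [4.0, 0.1, 3.0]

# What "ball" expects of a verb that takes it as object (hypothetical).
ball_expects_verb = centroid([throw, kick])
catch_ball = contextualize(catch, ball_expects_verb)

print(cosine(catch, grab), cosine(catch, contract))            # out of context
print(cosine(catch_ball, grab), cosine(catch_ball, contract))  # in context
```

With these toy numbers, "catch" is about equally similar to both paraphrases out of context, while the contextualized vector for "catch" with object "ball" is clearly closer to "grab" than to "contract".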

An interesting way to look at word meaning in context is to characterize it in terms of single-word substitutions that are only possible in particular contexts (the "lexical substitution" paradigm). We have also constructed a large, "all-words" lexical substitution corpus for English (Kremer, Erk, Pado, and Thater 2014).

Another lasting problem is polysemy (that is, systematic sense ambiguity), as opposed to homonymy (idiosyncratic sense ambiguity). We found that the homonymy/polysemy distinction can be made fairly well on the basis of ontological information from WordNet and CoreLex (Utt and Pado IWCS 2011). We were also able to show that regular polysemy can be learned from corpora and represented in a distributional model (Boleda, Pado, and Utt *SEM 2012).

Semantic Psycholinguistics

My interest in meaning and ambiguity also extends to the psycholinguistic side. In an investigation of the time course of metonymic sentence interpretation, we have found effects from subject (actor) choice that are difficult to relate to a lexicon-based account and are more consistent with a primarily world knowledge-based interpretation process (Zarcone, Pado, and Lenci 2014).

Another aspect that I'm interested in is the relationship between grounded cognition and the distributional models that we tend to develop in computational linguistics. We have published some utopian (?) ideas on that (Ziemke, Pado, and Thill 2014).

Semantic Processing and Textual Entailment

I also work in the framework of "Textual Entailment", which tries to cast the semantic processing needs of NLP applications in terms of common sense entailment decisions at the surface level. For example, we have approached Machine Translation evaluation based on textual entailment features and have been able to predict human judgments of MT quality significantly better than surface-based methods can (Pado, Galley, Cer, Jurafsky, and Manning MT 2009).

In the context of the European project EXCITEMENT, I was concerned with the specification of a generic platform that supports various kinds of approaches to computing Textual Entailment. The goal is to achieve a level of sustainability and interoperability among systems that was reached in parsing a while ago but is still elusive in semantics. The result, the EXCITEMENT Open Platform, is now available (Pado, Noh, Stern, Wang, and Zanoli JNLE 2014).

A third strand that I have worked on is the relation between Textual Entailment and discourse. We have shown that a substantial number of "difficult" entailments could be solved with discourse knowledge, in particular coreference and bridging (Mirkin, Dagan, and Pado ACL 2010). This suggests that discourse processing and entailment tasks should be better interleaved in future work.

I have done a bit of work on German Named Entity Recognition (NER). One engineering result with practical impact is a Named Entity Recognizer that is among the best-performing ones for German (Faruqui and Pado KONVENS 2010). Another one is a shared task featuring a CC-licensed dataset for German NER built on the basis of Wikipedia (Benikova, Biemann, Kisselew, and Pado KONVENS 2014).

Semantic Lexicons and Semantic Roles

Even though the paradigm of inducing lexical-semantic knowledge from corpora is our best shot at large-scale resource building, it has its own share of problems. There may be fundamental disagreement on what semantic dimensions to use for the description of meaning, be it with respect to frameworks for semantic role annotation (Ellsworth, Erk, Kingsbury and Pado LREC 2006) or general-purpose semantic verb classifications (Culo, Erk, Pado and Schulte im Walde LRE 2008). In order to represent multiple layers of linguistic analysis (e.g., syntax and semantics), we have developed an OWL-DL model that addresses granularity and reliability and supports comfortable querying (Burchardt, Pado, Spohr, Frank, and Heid LiLT 2008). Formal semantic analysis can also be useful in the context of Semantic Web / Information Extraction (Augenstein, Pado, and Rudolph ESWC 2012).

My PhD thesis was concerned with a topic in the area of semantic roles, namely the cross-lingual projection of frame-semantic information. Starting from the available English resource (i.e., FrameNet), I used annotation projection in parallel corpora, based on graph-based tree alignment, to produce resources for other languages (Pado and Lapata JAIR 2009). The models are fairly language-independent, but my evaluation concentrated on German and French. I also investigated a semi-supervised approach to SRL that transfers annotations of verbal predicates to nominal predicates (Pado, Pennacchiotti, and Sporleder COLING 2008).
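The basic idea of annotation projection can be sketched as follows. Note the simplifying assumption: this sketch projects role spans through plain word alignments, which is considerably simpler than the graph-based tree alignment mentioned above. The sentence pair, the alignment, and the function name are hypothetical examples.

```python
def project_roles(source_roles, alignment):
    """Project role annotations from source to target tokens.

    source_roles: dict mapping a role name to a set of source token indices
    alignment:    iterable of (source_index, target_index) word alignment links
    Returns a dict mapping each role to the set of aligned target indices;
    roles whose tokens are all unaligned are dropped.
    """
    src_to_tgt = {}
    for s, t in alignment:
        src_to_tgt.setdefault(s, set()).add(t)

    projected = {}
    for role, span in source_roles.items():
        target_span = set()
        for s in span:
            target_span |= src_to_tgt.get(s, set())
        if target_span:
            projected[role] = target_span
    return projected

# Hypothetical sentence pair with a hand-made word alignment.
english = ["Mary", "will", "buy", "the", "car"]
german = ["Maria", "wird", "das", "Auto", "kaufen"]
alignment = [(0, 0), (1, 1), (2, 4), (3, 2), (4, 3)]

# English annotation in the style of FrameNet's Commerce_buy frame.
english_roles = {"Buyer": {0}, "Goods": {3, 4}}

german_roles = project_roles(english_roles, alignment)
for role, span in german_roles.items():
    print(role, [german[i] for i in sorted(span)])
```

Word-level projection of this kind is sensitive to alignment errors and gaps, which is one motivation for aligning larger syntactic units instead.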