Publications [Google Scholar]

Current work (2026)

A Computational Analysis of Character Archetypes in the Works of Calderón de la Barca.
Journal of Computational Literary Studies, 5(1), 2026.
Allison Keith, Antonio Rojas Castro, Kerstin Jung, Hanno Ehrlicher and Sebastian Padó.
[url] [BibTeX]

AmchiBias: Measuring Stereotypical Bias in Goan Identity Groups with a Minimal Pair Dataset in English and Konkani.
In: Proceedings of the ACL StereACuLT workshop. San Diego, CA, 2026.
Michelle Barbosa, Sebastian Padó and Franziska Weeber.
[url] [BibTeX]

Beyond Marginal Distributions: A Framework to Evaluate the Representativeness of Demographic-Aligned LLMs.
In: Findings of ACL. 2026.
Tristan Williams, Franziska Weeber, Sebastian Padó and Alan Akbik.
[url] [BibTeX]

Biasing a Translation Model Toward Portuguese Variants.
In: Proceedings of PROPOR. Salvador, Brazil, 2026.
Catarina Costa and Sebastian Padó.
[url] [BibTeX]

Comparing and Modeling Argumentation in German Political Communication across Arenas.
In: Proceedings of KONVENS. Hamburg, Germany, 2026. Accepted for publication
Nina Vikhrova, Johannes Kühling, Sebastian Haunss and Sebastian Padó.
[BibTeX]

Democratizing News Recommenders: Modeling Multiple Perspectives for News Candidate Generation with VQ-VAE.
In: Proceedings of FaCCT. Montreal, Canada, 2026.
Hardy Hardy, Sebastian Padó, Amelie Wuehrl and Tanise Ceron.
[url] [BibTeX]

Diverging Transformer Predictions for Human Sentence Processing: A Comprehensive Analysis of Agreement Attraction Effects.
2026. Manuscript.
Titus von der Malsburg and Sebastian Padó.
[url] [BibTeX]

Do Political Opinions Transfer Between Western Languages? An Analysis of Unaligned and Aligned Multilingual LLMs.
In: Proceedings of EACL. Rabat, Morocco, 2026.
Franziska Weeber, Tanise Ceron and Sebastian Padó.
[url] [BibTeX]

Finding Sense in Nonsense with Generated Contexts: Perspectives from Humans and Language Models.
In: Proceedings of STARSEM. San Diego, 2026. Honorable mention for best paper.
Katrina Olsen and Sebastian Padó.
[url] [BibTeX]

Investigating features of the gracioso of Pedro Calderón de la Barca.
Revista de Humanidades Digitales, 2026.
Allison Keith, Antonio Rojas-Castro, Hanno Ehrlicher and Sebastian Padó.
[BibTeX]

One Persona, Many Cues, Different Results: How Sociodemographic Cues Impact LLM Personalization.
In: Proceedings of ACL. San Diego, CA, 2026.
Franziska Weeber, Vera Neplenbroek, Jan Batzner and Sebastian Padó.
[url] [BibTeX]

Similar, but why? A Toolkit for Explaining Text Similarity.
In: Proceedings of the EACL Demonstration Session. Rabat, Morocco, 2026.
Juri Opitz, Lucas Möller, Andrianos Michail, Sebastian Padó and Simon Clematide.
[url] [BibTeX]

Understanding the Impact of Linguistic Realization Choices on LLM Stance with Causal Tracing.
In: Proceedings of KONVENS. Hamburg, Germany, 2026. Accepted for publication
Langchen Huang, Sebastian Padó and Franziska Weeber.
[url] [BibTeX]

Journal articles [preprints provided for non open access articles]

Investigating features of the gracioso of Pedro Calderón de la Barca.
Revista de Humanidades Digitales, 2026.
Allison Keith, Antonio Rojas-Castro, Hanno Ehrlicher and Sebastian Padó.
[BibTeX]

Computational Analysis of Gender Depiction in the Comedias of Calderón de la Barca.
Journal of Computational Literary Studies, 4(1), 2025.
Allison Keith, Antonio Rojas Castro, Hanno Ehrlicher, Kerstin Jung and Sebastian Padó.
[url] [BibTeX]

Explaining Caption-Image Interactions in CLIP models with Second-Order Attributions.
Transactions on Machine Learning Research, 2025.
Lucas Möller, Pascal Tilli, Ngoc Thang Vu and Sebastian Padó.
[url] [abstract] [BibTeX]

Dual encoder architectures like CLIP models map two types of inputs into a shared embedding space and predict similarities between them. Despite their wide application, it is, however, not understood how these models compare their two inputs. Common first-order feature-attribution methods explain importances of individual features and can, thus, only provide limited insights into dual encoders, whose predictions depend on interactions between features. In this paper, we first derive a second-order method enabling the attribution of predictions by any differentiable dual encoder onto feature-interactions between its inputs. Second, we apply our method to CLIP models and show that they learn fine-grained correspondences between parts of captions and regions in images. They match objects across input modes and also account for mismatches. This intrinsic visual-linguistic grounding ability, however, varies heavily between object classes and exhibits pronounced out-of-domain effects, and we can identify individual errors as well as systematic failure categories.

Explaining Neural News Recommendation with Attributions onto Reading Histories.
ACM Transactions on Intelligent Systems and Technology, 16(1):7:1-25, 2025. Special Issue on Responsible Recommender Systems
Lucas Möller and Sebastian Padó.
[url] [abstract] [BibTeX]

An important aspect of responsible recommendation systems is the transparency of the prediction mechanisms. This is a general challenge for deep-learning-based systems such as the currently predominant neural news recommender architectures which are optimized to predict clicks by matching candidate news items against users' reading histories. Such systems achieve state-of-the-art click-prediction performance, but the rationale for their decisions is difficult to assess. At the same time, the economic and societal impact of these systems makes such insights very much desirable. In this paper, we ask the question to what extent the recommendations of current news recommender systems are actually based on content-related evidence from reading histories. We approach this question from an explainability perspective. Building on the concept of integrated gradients, we present a neural news recommender that can accurately attribute individual recommendations to news items and words in input reading histories while maintaining a top scoring click-prediction performance. Using our method as a diagnostic tool, we find that: (a), a substantial number of users' clicks on news are not explainable from reading histories, and many history-explainable items are actually skipped; (b), while many recommendations are based on content-related evidence in histories, for others the model does not attend to reasonable evidence, and recommendations stem from a spurious bias in user representations. Our code is publicly available at https://github.com/lucasmllr/xnrs.

Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs.
Transactions of the Association for Computational Linguistics, 12:1378-1400, 2024.
Tanise Ceron, Neele Falk, Ana Barić, Dmitry Nikolaev and Sebastian Padó.
[url] [abstract] [BibTeX]

Due to the widespread use of large language models (LLMs) in ubiquitous systems, we need to understand whether they embed a specific 'worldview' and what these views reflect. Recent studies report that, prompted with political questionnaires, LLMs show left-liberal leanings. However, it is as yet unclear whether these leanings are reliable (robust to prompt variations) and whether the leaning is consistent across policies and political leaning. We propose a series of tests which assess the reliability and consistency of LLMs' stances on political statements based on a dataset of voting-advice questionnaires collected from seven EU countries and annotated for policy domains. We study LLMs ranging in size from 7B to 70B parameters and find that their reliability increases with parameter count. Larger models show overall stronger alignment with left-leaning parties but differ among policy programs: They evince a (left-wing) positive stance towards environment protection, social welfare state and liberal society but also (right-wing) law and order, with no consistent preferences in foreign policy and migration.

Language Learning, Representation, and Processing in Humans and Machines: Introduction to the Special Issue.
Computational Linguistics, 2024. Introduction to the Special Issue
Marianna Apidianaki, Abdellah Fourtassi and Sebastian Padó.
[url] [abstract] [BibTeX]

Large Language Models (LLMs) and humans acquire knowledge about language without direct supervision. LLMs do so by means of specific training objectives, while humans rely on sensory experience and social interaction. This parallelism has created a feeling in NLP and cognitive science that a systematic understanding of how LLMs acquire and use the encoded knowledge could provide useful insights for studying human cognition. Conversely, methods and findings from the field of cognitive science have occasionally inspired language model development. Yet, the differences in the way that language is processed by machines and humans—in terms of learning mechanisms, amounts of data used, grounding and access to different modalities—make a direct translation of insights challenging. The aim of this edited volume has been to create a forum of exchange and debate along this line of research, inviting contributions that further elucidate similarities and differences between humans and LLMs.

On the Relationship between Frames and Emotionality in Text.
Northern European Journal of Language Technology, 9(1), 2023.
Enrica Troiano, Roman Klinger and Sebastian Padó.
[url] [abstract] [BibTeX]

Emotions, which are responses to salient events, can be realized in text implicitly, for instance with mere references to facts (e.g., “That was the beginning of a long war”). Interpreting affective meanings thus relies on the readers’ background knowledge, but that is hardly modeled in computational emotion analysis. Much work in the field is focused on the word level and treats individual lexical units as the fundamental emotion cues in written communication. We shift our attention to word relations. We leverage Frame Semantics, a prominent theory for the description of predicate-argument structures, which matches the study of emotions: frames build on a “semantics of understanding” whose assumptions rely precisely on people’s world knowledge. Our overarching question is whether and to what extent the events that are represented by frames possess an emotion meaning. To carry out a large corpus-based correspondence analysis, we automatically annotate texts with emotions as well as with FrameNet frames and roles, and we analyze the correlations between them. Our main finding is that substantial groups of frames have an emotional import. With an extensive qualitative analysis, we show that they capture several properties of emotions that are purported by theories from psychology. These observations boost insights on the two strands of research that we bring together: emotion analysis can profit from the event-based perspective of frame semantics; in return, frame semantics gains a better grip of its position vis-a-vis emotions, an integral part of word meanings.

Between welcome culture and border fence: The European refugee crisis in German newspaper reports.
Language Resources and Evaluation, 57:121-153, 2023.
Nico Blokker, Andre Blessing, Erenay Dayanık, Jonas Kuhn, Sebastian Padó and Gabriella Lapesa.
[url] [abstract] [BibTeX]

Newspaper reports provide a rich source of information on the unfolding of public debates, which can serve as basis for inquiry in political science. Such debates are often triggered by critical events, which attract public attention and incite the reactions of political actors: crisis sparks the debate. However, due to the challenges of reliable annotation and modeling, few large-scale datasets with high-quality annotation are available. This paper introduces DebateNet2.0, which traces the political discourse on the 2015 European refugee crisis in the German quality newspaper taz. The core units of our annotation are political claims (requests for specific actions to be taken) and the actors who advance them (politicians, parties, etc.). Our contribution is twofold. First, we document and release DebateNet2.0 along with its companion R package, mardyR. Second, we outline and apply a Discourse Network Analysis (DNA) to DebateNet2.0, comparing two crucial moments of the policy debate on the “refugee crisis”: the migration flux through the Mediterranean in April/May and the one along the Balkan route in September/October. We guide the reader through the methods involved in constructing a discourse network from a newspaper, demonstrating that there is not one single discourse network for the German migration debate, but multiple ones, depending on the research question through the associated choices regarding political actors, policy fields and time spans.

Clasificación de Tragedias y Comedias en las Comedias Nuevas de Calderón de la Barca.
Revista de Humanidades Digitales, 7:80-103, 2022. Spanish version of Lehmann and Padó (ZfDG 2022) modulo reviewer comments
Jörg Lehmann and Sebastian Padó.
[url] [abstract] [BibTeX]

In this study, we aim at distinguishing comedies and tragedies among 112 dramas written by Calderón de la Barca, using procedures established by distributional semantics. Fifteen of these comedias nuevas have already been classified by qualitative researchers as either tragedies or comedies, respectively; for another 82 dramas the classification was unknown. Four independent document embedding methods are explored, which differ from each other in matrix creation and reduction, and in the calculation of similarity or distance matrices. The best results – measured against the pre-established classification of these dramas – are obtained through the classification procedure that applied the strongest matrix reduction. In addition, a contrastive vocabulary analysis with word embeddings is carried out, based either on word lists produced by the four tested methods, or on the log-likelihood probability distribution for two sub-corpora containing only dramas already determined to be comedies or tragedies. This step permits the identification of 130 terms that are each discriminative either of comedies or of tragedies. The outcome shows that the explored methods identify tragedies with greater accuracy than comedies, indicating that tragedies have more distinctive features. It also becomes apparent that one could more appropriately consider classifications such as tragedy and comedy as poles between which gradual differences can be observed, whereby the ensuing transitional area contains comedias nuevas that have been described in prior research as tragicomedias or comedias mitológicas.

Determinants of Grader Agreement: An Analysis of Multiple Short Answer Corpora.
Language Resources and Evaluation, 56:387-416, 2022.
Ulrike Padó and Sebastian Padó.
[url] [abstract] [BibTeX]

The 'short answer' question format is a widely used tool in educational assessment, in which students write one to three sentences in response to an open question. The answers are subsequently rated by expert graders. The agreement between these graders is crucial for reliable analysis, both in terms of educational strategies and in terms of developing automatic models for short answer grading (SAG), an active research topic in NLP. This makes it important to understand the properties that influence grader agreement (such as question difficulty, answer length, and answer correctness). However, the twin challenges towards such an understanding are the wide range of SAG corpora in use (which differ along a number of dimensions) and the hierarchical structure of potentially relevant properties (which can be located at the corpus, answer, or question levels). This article uses generalized mixed effects models to analyze the effect of various such properties on grader agreement in six major SAG corpora for two main assessment tasks (language and content assessment). Overall, we find broad agreement among corpora, with a number of properties behaving similarly across corpora (e.g., shorter answers and correct answers are easier to grade). Some properties show more corpus-specific behavior (e.g., the question difficulty level), and some corpora are more in line with general tendencies than others. In sum, we obtain a nuanced picture of how the major short answer grading corpora are similar and dissimilar from which we derive suggestions for corpus development and analysis.

Editorial: Perspectives for Natural Language Processing between AI, Linguistics and Cognitive Science.
Frontiers in Artificial Intelligence, 2022. Editorial for the Frontiers Research Topic
Alessandro Lenci and Sebastian Padó.
[url] [BibTeX]

Classification of comedies and tragedies written in Calderón de la Barca's Comedias Nuevas.
Zeitschrift für Digitale Geisteswissenschaft, 7, 2022. English version of Lehmann and Padó (RHD 2022) modulo reviewer comments
Jörg Lehmann and Sebastian Padó.
[url] [abstract] [BibTeX]

In this study, we aim at distinguishing comedies and tragedies among 112 dramas written by Calderón de la Barca, using procedures established by distributional semantics. 15 each of these comedias nuevas have already been classified by qualitative researchers as either tragedies or comedies, respectively; for another 82 dramas the classification was unknown. Four independent document embedding methods are explored, which differ from each other in matrix creation and reduction, and in the calculation of similarity or distance matrices. The best results – measured against the pre-established classification of these dramas – are obtained through the classification procedure that applied the strongest matrix reduction. In addition, a contrastive vocabulary analysis with word embeddings is carried out, based either on word lists produced by the four tested methods, or on the log-likelihood probability distribution for two sub-corpora containing only dramas already determined to be comedies or tragedies. This step permits the identification of 130 terms that are each discriminative either of comedies or of tragedies. The outcome shows that the explored methods identify tragedies with greater accuracy than comedies, indicating that tragedies show stronger lexical cohesion. It also becomes apparent that one could more appropriately consider classifications such as ›tragedy‹ and ›comedy‹ as poles between which gradual differences can be observed, whereby the ensuing transitional area contains comedias nuevas that have been described in prior research as tragicomedias or comedias mitológicas.

Improving Multilingual Frame Identification by Estimating Frame Transferability.
Linguistic Issues in Language Technology, 19, 2022.
Jen Sikos, Michael Roth and Sebastian Padó.
[url] [abstract] [BibTeX]

A recent research direction in computational linguistics involves efforts to make the field, which used to focus primarily on English, more multilingual and inclusive. However, resource creation often remains a bottleneck for many languages, in particular at the semantic level. In this article, we consider the case of frame-semantic annotation. We investigate how to perform frame selection for annotation in a target language by taking advantage of existing annotations in different, supplementary languages, with the goal of reducing the required annotation effort in the target language. We measure success by training and testing frame identification models for the target language. We base our selection methods on measuring frame transferability in the supplementary language, where we estimate which frames will transfer poorly, and therefore should receive more annotation, in the target language. We apply our approach to English, German, and French – three languages which have annotations that are similar in size as well as frames with overlapping lexicographic definitions. We find that transferability is indeed a useful indicator and supports a setup where a limited amount of target language data is sufficient to train frame identification systems.

Bias Identification and Attribution in NLP Models With Regression and Effect Sizes.
Northern European Journal of Language Technology, 8(1), 2022.
Erenay Dayanık, Thang Vu and Sebastian Padó.
[url] [abstract] [BibTeX]

In recent years, there has been an increasing awareness that many NLP systems incorporate biases of various types (e.g., regarding gender or race) which can have significant negative consequences. At the same time, the techniques used to statistically analyze such biases are still relatively simple. Typically, studies test for the presence of a significant difference between two levels of a single bias variable (e.g., male vs. female) without attention to potential confounders, and do not quantify the importance of the bias variable. This article proposes to analyze bias in the output of NLP systems using multivariate regression models. They provide a robust and more informative alternative which (a) generalizes to multiple bias variables, (b) can take covariates into account, (c) can be combined with measures of effect size to quantify the size of bias. Jointly, these effects contribute to a more robust statistical analysis of bias that can be used to diagnose system behavior and extract informative examples. We demonstrate the benefits of our method by analyzing a range of current NLP models on one regression and one classification task (emotion intensity prediction and coreference resolution, respectively).

Distributional models of category concepts based on names of category members.
Cognitive Science, 45(9):e13029, 2021.
Matthijs Westera, Abhijeet Gupta, Gemma Boleda and Sebastian Padó.
[url] [abstract] [BibTeX]

Cognitive scientists have long used distributional semantic representations of categories. The predominant approach uses distributional representations of category-denoting nouns, like "city" for the category city. We propose a novel scheme that represents categories as prototypes over representations of names of its members, such as "Barcelona", "Mumbai", and "Wuhan" for the category city. This name-based representation empirically outperforms the noun-based representation on two experiments (modelling human judgments of category relatedness and predicting category membership) with particular improvements for ambiguous nouns. We discuss the model complexity of both classes of models and argue that the name-based model has superior explanatory potential with regard to concept acquisition.

Grounding Semantic Transparency In Context: A Distributional Semantic Study on German Event Nominalizations.
Morphology, 31:409-446, 2021.
Rossella Varvara, Gabriella Lapesa and Sebastian Padó.
[url] [abstract] [BibTeX]

We present the results of a large-scale corpus-based comparison of two German event nominalization patterns: deverbal nouns in -ung (e.g., die Evaluierung, 'the evaluation') and nominal infinitives (e.g., das Evaluieren, 'the evaluating'). Among the many available event nominalization patterns for German, we selected these two because they are both highly productive and challenging from the semantic point of view. Both patterns are known to keep a tight relation with the event denoted by the base verb, but with different nuances. Our study targets a better understanding of the differences in their semantic import. The key notion of our comparison is that of semantic transparency, and we propose a usage-based characterization of the relationship between derived nominals and their bases. Using methods from distributional semantics, we bring to bear two concrete measures of transparency which highlight different nuances: the first one, cosine, detects nominalizations which are semantically similar to their bases; the second one, distributional inclusion, detects nominalizations which are used in a subset of the contexts of the base verb. We find that the inclusion measure helps in characterizing the difference between the two types of nominalizations, in relation with the traditionally considered variable of relative frequency (Hay, 2001). We further benefit from our distributional analysis to frame our comparison in the broader coordinates of the inflection vs. derivation cline.

Analysis of Political Debates through Newspaper Reports: Methods and Outcomes.
Datenbank-Spektrum, 20(2), 2020.
Gabriella Lapesa, Andre Blessing, Nico Blokker, Erenay Dayanık, Sebastian Haunss, Jonas Kuhn and Sebastian Padó.
[url] [abstract] [BibTeX]

Discourse network analysis is an aspiring development in political science which analyzes political debates in terms of bipartite actor/claim networks. It aims at understanding the structure and temporal dynamics of major political debates as instances of politicized democratic decision making. We discuss how such networks can be constructed on the basis of large collections of unstructured text, namely newspaper reports. We sketch a hybrid methodology of manual analysis by domain experts complemented by machine learning and exemplify it on the case study of the German public debate on immigration in the year 2015. The first half of our article sketches the conceptual building blocks of discourse network analysis and demonstrates its application. The second half discusses the potential of the application of NLP methods to support the creation of discourse network datasets.

Integrating Manual and Automatic Annotation for the Creation of Discourse Network Data Sets.
Politics and Governance, 8(2), 2020.
Sebastian Haunss, Jonas Kuhn, Sebastian Padó, Andre Blessing, Nico Blokker, Erenay Dayanık and Gabriella Lapesa.
[url] [abstract] [BibTeX]

This article investigates the integration of machine learning in the political claim annotation workflow with the goal to partially automate the annotation and analysis of large text corpora. It introduces the MARDY annotation environment and presents results from an experiment in which the annotation quality of annotators with and without machine learning based annotation support is compared. The design and setting aim to measure and evaluate: a) annotation speed; b) annotation quality; and c) applicability to the use case of discourse network generation. While the results indicate only slight increases in terms of annotation speed, the authors find a moderate boost in annotation quality. Additionally, with the help of manual annotation of the actors and filtering out of the false positives, the machine learning based annotation suggestions allow the authors to fully recover the core network of the discourse as extracted from the articles annotated during the experiment. This is due to the redundancy which is naturally present in the annotated texts. Thus, assuming a research focus not on the complete network but the network core, an AI-based annotation can provide reliable information about discourse networks with much less human intervention than compared to the traditional manual approach.

Measuring Historical Emotions and Their Evolution: An Interdisciplinary Endeavour to Investigate The ‘Emotions of Encounter’.
Liinc Em Revista, 15(1):70-84, 2019.
Hanno Ehrlicher, Roman Klinger, Jörg Lehmann and Sebastian Padó.
[url] [abstract] [BibTeX]

The empirical study of emotions in Spanish travelogues and reports requires cultural knowledge as well as the use of linguistic annotation and quantitative methods. We report on an interdisciplinary project in which we perform emotion annotation on a selection of texts spanning several centuries to analyze the differences across different time slices. We show that indeed the emotional connotation changes qualitatively and quantitatively. Next to this evaluation, we sketch strategies for future automation. This scalable reading approach combines quantitative with qualitative insights and identifies developments over time that call for deeper investigation.

Disambiguation of newly derived nominalizations in context: A Distributional Semantics approach.
Word Structure, 11(3):315-350, 2018.
Gabriella Lapesa, Lea Kawaletz, Ingo Plag, Marios Andreou, Max Kisselew and Sebastian Padó.
[url] [BibTeX]

FrameNet's 'Using' Relation As Source of Concept-driven Paraphrases.
Constructions and Frames, 10(1):38-60, 2018. Preprint at https://nlpado.de/~sebastian/pub/papers/cf18_sikos.pdf
Jennifer Sikos and Sebastian Padó.
[url] [BibTeX]

Complement Coercion: The Joint Effects of Type and Typicality.
Frontiers in Psychology, 8:1987, 2017.
Alessandra Zarcone, Ken McRae, Alessandro Lenci and Sebastian Padó.
[url] [BibTeX]

Native Language Identification Across Text Types: How Special Are Scientists?
Italian Journal of Computational Linguistics, 2(1):32-45, 2016.
Sabrina Stehwien and Sebastian Padó.
[url] [BibTeX]

Design and Realization of a Modular Architecture for Textual Entailment.
Journal of Natural Language Engineering, 21(2):167-200, 2015. Preprint at https://nlpado.de/~sebastian/pub/papers/jnle13_pado.pdf
Sebastian Padó, Tae-Gil Noh, Asher Stern, Rui Wang and Roberto Zanoli.
[url] [BibTeX]

On the importance of a rich embodiment in the grounding of concepts: Perspectives from embodied cognitive science and computational linguistics.
Topics in Cognitive Science, 6(3):545-558, 2014.
Serge Thill, Sebastian Padó and Tom Ziemke.
[url] [BibTeX]

Logical metonymy resolution in a words-as-cues framework: Evidence from self-paced reading and probe recognition.
Cognitive Science, 38(5):973-996, 2014.
Alessandra Zarcone, Sebastian Padó and Alessandro Lenci.
[url] [BibTeX]

Crosslingual and Multilingual Construction of Syntax-Based Vector Space Models.
Transactions of the Association for Computational Linguistics, 2:245-258, 2014.
Jason Utt and Sebastian Padó.
[url] [BibTeX]

High-Precision Sentence Alignment by Bootstrapping from Wood Standard Annotations.
Prague Bulletin of Mathematical Linguistics, 99:5-16, 2013.
Éva Mújcricza-Majdt, Huiqin Körkel-Qu, Stefan Riezler and Sebastian Padó.
[url] [BibTeX]

Semantic relations in bilingual vector spaces.
ACM Transactions on Speech and Language Processing, 8(2):3:1-3:21, 2011. Preprint at https://nlpado.de/~sebastian/pub/papers/tslp11_peirsman.pdf
Yves Peirsman and Sebastian Padó.
[url] [BibTeX]

A Flexible, Corpus-driven Model of Regular and Inverse Selectional Preferences.
Computational Linguistics, 36(4):723-763, 2010.
Katrin Erk, Sebastian Padó and Ulrike Padó.
[url] [BibTeX]

Cross-lingual Annotation Projection of Role-semantic Information.
Journal of Artificial Intelligence Research, 36:307-340, 2009.
Sebastian Padó and Mirella Lapata.
[url] [BibTeX]

Measuring Machine Translation Quality as Semantic Equivalence: A Metric based on Entailment Features.
Machine Translation, 23(2-3):181-193, 2009. Preprint at https://nlpado.de/~sebastian/pub/papers/mt09_pado.pdf
Sebastian Padó, Daniel Cer, Michel Galley, Christopher D. Manning and Daniel Jurafsky.
[url] [BibTeX]

Comparing and Combining Semantic Verb Classifications.
Language Resources and Evaluation, 42(3):161-199, 2008. Preprint at https://nlpado.de/~sebastian/pub/papers/lre08_culo.pdf
Oliver Čulo, Katrin Erk, Sebastian Padó and Sabine Schulte im Walde.
[url] [BibTeX]

Formalising Multi-layer Corpora in OWL DL.
Linguistic Issues in Language Technology, 1(1):1-33, 2008.
Aljoscha Burchardt, Sebastian Padó, Dennis Spohr, Anette Frank and Ulrich Heid.
[url] [BibTeX]

Dependency-based Construction of Semantic Spaces.
Computational Linguistics, 33(2):161-199, 2007.
Sebastian Padó and Mirella Lapata.
[url] [BibTeX]

Conference papers

Biasing a Translation Model Toward Portuguese Variants.
In: Proceedings of PROPOR. Salvador, Brazil, 2026.
Catarina Costa and Sebastian Padó.
[url] [BibTeX]

Generalizability of Media Frames: Corpus creation and analysis across countries.
In: Proceedings of STARSEM. Suzhou, China, 2025.
Agnese Daffara, Sourabh Dattawad, Sebastian Padó and Tanise Ceron.
[url] [BibTeX]

Interpretable Text Embeddings and Text Similarity Explanation: A Survey.
In: Proceedings of EMNLP. Suzhou, China, 2025.
Juri Opitz, Lucas Möller, Andrianos Michail, Sebastian Padó and Simon Clematide.
[url] [BibTeX]

A Computational Analysis of Character Archetypes in the Works of Calderón de la Barca.
In: Annual Conference of Computational Literary Studies. Cracow, Poland, 2025.
Allison Keith, Antonio Rojas Castro, Kerstin Jung, Hanno Ehrlicher and Sebastian Padó.
[url] [BibTeX]

Approximate Attributions for Off-the-Shelf Siamese Transformers.
In: Proceedings of EACL. St Julian's, Malta, 2024.
Lucas Möller, Dmitry Nikolaev and Sebastian Padó.
[url] [BibTeX]

Automatic Analysis of Political Debates and Manifestos: Successes and Challenges.
In: Proceedings of the 1st International Conference on Robust Argumentation Machines, volume 14638, series Lecture Notes in Computer Science. Springer, Bielefeld, Germany, 2024.
Tanise Ceron, Ana Barić, André Blessing, Sebastian Haunss, Jonas Kuhn, Gabriella Lapesa, Sebastian Padó, Sean Papay and Patricia Zauchner.
[url] [BibTeX]

Multi-Dimensional Machine Translation Evaluation: Model Evaluation and Resource for Korean.
In: Proceedings of LREC-COLING. Torino, Italy, 2024.
Dojun Park and Sebastian Padó.
[url] [BibTeX]

Towards Understanding the Relationship between In-context Learning and Compositional Generalization.
In: Proceedings of LREC-COLING. Torino, Italy, 2024.
Sungjun Han and Sebastian Padó.
[url] [BibTeX]

How can business-to-business salespeople get out more of their social media posts?
In: Proceedings of the Annual Conference of the European Marketing Academy. Bucharest, Romania, 2024.
Marla-Sophie Schmid, Christina Kühnl, Florian Omiecienski and Sebastian Padó.
[url] [BibTeX]

Media Bias Detection Across Families of Language Models.
In: Proceedings of NAACL. Mexico City, Mexico, 2024.
Iffat Maab, Edison Marrese-Taylor, Sebastian Padó and Yutaka Matsuo.
[url] [BibTeX]

Toeing the party line: Election manifestos as a key to understand political discourse on Twitter.
In: Findings of EMNLP. Miami, FL, 2024.
Maximilian Maurer, Tanise Ceron, Sebastian Padó and Gabriella Lapesa.
[url] [BibTeX]

Representation biases in sentence transformers.
In: Proceedings of EACL. Dubrovnik, Croatia, 2023.
Dmitry Nikolaev and Sebastian Padó.
[url] [abstract] [BibTeX]

Variants of the BERT architecture specialised for producing full-sentence representations often achieve better performance on downstream tasks than sentence embeddings extracted from vanilla BERT. However, there is still little understanding of what properties of inputs determine the properties of such representations. In this study, we construct several sets of sentences with pre-defined lexical and syntactic structures and show that SOTA sentence transformers have a strong nominal-participant-set bias: cosine similarities between pairs of sentences are more strongly determined by the overlap in the set of their noun participants than by having the same predicates, lengthy nominal modifiers, or adjuncts. At the same time, the precise syntactic-thematic functions of the participants are largely irrelevant.

Political claim identification and categorization in a multilingual setting: First experiments.
In: Proceedings of KONVENS. Ingolstadt, Germany, 2023.
Urs Zaberer, Sebastian Padó and Gabriella Lapesa.
[url] [abstract] [BibTeX]

The identification and classification of political claims is an important step in the analysis of political newspaper reports; however, resources for this task are few and far between. This paper explores different strategies for the cross-lingual projection of political claims analysis. We conduct experiments on a German dataset, DebateNet2.0, covering the policy debate sparked by the 2015 refugee crisis. Our evaluation involves two tasks (claim identification and categorization), three languages (German, English, and French) and two methods: machine translation (the best method in our experiments) and multilingual embeddings.

Multilingual estimation of political-party positioning: From label aggregation to long-input Transformers.
In: Proceedings of EMNLP. Singapore, 2023.
Dmitry Nikolaev, Tanise Ceron and Sebastian Padó.
[url] [abstract] [BibTeX]

Scaling analysis is a technique in computational political science that assigns a political actor (e.g. politician or party) a score on a predefined scale based on a (typically long) body of text (e.g. a parliamentary speech or an election manifesto). For example, political scientists have often used the left-right scale to systematically analyse political landscapes of different countries. NLP methods for automatic scaling analysis can find broad application provided they (i) are able to deal with long texts and (ii) work robustly across domains and languages. In this work, we implement and compare two approaches to automatic scaling analysis of political-party manifestos: label aggregation, a pipeline strategy relying on annotations of individual statements from the manifestos, and long-input-Transformer-based models, which compute scaling values directly from raw text. We carry out the analysis of the Comparative Manifestos Project dataset across 41 countries and 27 languages and find that the task can be efficiently solved by state-of-the-art models, with label aggregation producing the best results.

The argument-adjunct distinction in BERT: A FrameNet-based investigation.
In: Proceedings of IWCS. Nancy, France, 2023.
Dmitry Nikolaev and Sebastian Padó.
[url] [abstract] [BibTeX]

The distinction between arguments and adjuncts is a fundamental assumption of several linguistic theories. In this study, we investigate to what extent this distinction is picked up by a Transformer-based language model. We use BERT as a case study, operationalizing arguments and adjuncts as core and non-core FrameNet frame elements, respectively, and tying them to activations of particular BERT neurons. We present evidence, from English and Korean, that BERT learns more dedicated representations for arguments than for adjuncts when fine-tuned on the FrameNet frame-identification task. We also show that this distinction is already present in a weaker form in the vanilla pre-trained model.

Additive manifesto decomposition: A policy domain aware method for understanding party positioning.
In: Findings of ACL. Toronto, Canada, 2023.
Tanise Ceron, Dmitry Nikolaev and Sebastian Padó.
[url] [abstract] [BibTeX]

Automatic extraction of party (dis)similarities from texts such as party election manifestos or parliamentary speeches plays an increasing role in computational political science. However, existing approaches are fundamentally limited to targeting only global party (dis)similarity: they condense the relationship between a pair of parties into a single figure, their similarity. In aggregating over all policy domains (e.g., health or foreign policy), they do not provide any qualitative insights into which domains parties agree or disagree on. This paper proposes a workflow for estimating policy domain aware party similarity that overcomes this limitation. The workflow covers (a) definition of suitable policy domains; (b) automatic labeling of domains, if no manual labels are available; (c) computation of domain-level similarities and aggregation at a global level; (d) extraction of interpretable party positions on major policy axes via multidimensional scaling. We evaluate our workflow on manifestos from the German federal elections. We find that our method (a) yields high correlation when predicting party similarity at a global level and (b) provides accurate party-specific positions, even with automatically labelled policy domains.

The Universe of Utterances According to BERT.
In: Proceedings of IWCS. Nancy, France, 2023.
Dmitry Nikolaev and Sebastian Padó.
[url] [abstract] [BibTeX]

It has been argued that BERT "rediscovers the traditional NLP pipeline", with lower layers extracting morphosyntactic features and higher layers creating holistic sentence-level representations. In this paper, we critically examine this assumption through a principal-component-guided analysis, extracting sets of inputs that correspond to specific activation patterns in BERT sentence representations. We find that even in higher layers, the model mostly picks up on a variegated bunch of low-level features, many related to sentence complexity, that presumably arise from its specific pre-training objectives.

Adverbs, surprisingly.
In: Proceedings of STARSEM. Toronto, Canada, 2023.
Dmitry Nikolaev, Collin Baker, Miriam R. L. Petruck and Sebastian Padó.
[url] [abstract] [BibTeX]

This paper begins with the premise that adverbs are neglected in computational linguistics. This view derives from two analyses: a literature review and a novel adverb dataset to probe a state-of-the-art language model, thereby uncovering systematic gaps in accounts for adverb meaning. We suggest that using Frame Semantics for characterizing word meaning, as in FrameNet, provides a promising approach to adverb analysis, given its ability to describe ambiguity, semantic roles, and null instantiation.

An Attribution Method for Siamese Encoders.
In: Proceedings of EMNLP. Singapore, 2023.
Lucas Möller, Dmitry Nikolaev and Sebastian Padó.
[url] [abstract] [BibTeX]

Despite the success of Siamese encoder models such as sentence transformers (ST), little is known about the aspects of inputs they pay attention to. A barrier is that their predictions cannot be attributed to individual features, as they compare two inputs rather than processing a single one. This paper derives a local attribution method for Siamese encoders by generalizing the principle of integrated gradients to models with multiple inputs. The solution takes the form of feature-pair attributions, and can be reduced to a token-token matrix for STs. Our method involves the introduction of integrated Jacobians and inherits the advantageous formal properties of integrated gradients: it accounts for the model's full computation graph and is guaranteed to converge to the actual prediction. A pilot study shows that in an ST few token-pairs can often explain large fractions of predictions, and it focuses on nouns and verbs. For accurate predictions, it however needs to attend to the majority of tokens and parts of speech.

Improving Neural Political Statement Classification with Class Hierarchical Information.
In: Findings of ACL, pages 2367-2382. Dublin, Ireland, 2022.
Erenay Dayanık, André Blessing, Nico Blokker, Sebastian Haunss, Jonas Kuhn, Gabriella Lapesa and Sebastian Padó.
[url] [abstract] [BibTeX]

Many tasks in text-based computational social science (CSS) involve the classification of political statements into categories based on a domain-specific codebook. In order to be useful for CSS analysis, these categories must be fine-grained. The typically skewed distribution of fine-grained categories, however, results in a challenging classification problem on the NLP side. This paper proposes to make use of the hierarchical relations among categories typically present in such codebooks: e.g., markets and taxation are both subcategories of economy, while borders is a subcategory of security. We use these ontological relations as prior knowledge to establish additional constraints on the learned model, thus improving performance overall and in particular for infrequent categories. We evaluate several lightweight variants of this intuition by extending state-of-the-art transformer-based text classifiers on two datasets and multiple languages. We find the most consistent improvement for an approach based on regularization.

Optimizing text representations to capture (dis)similarity between political parties.
In: Proceedings of CoNLL, pages 325-338. Abu Dhabi, UAE, 2022.
Tanise Ceron, Nico Blokker and Sebastian Padó.
[url] [abstract] [BibTeX]

Even though fine-tuned neural language models have been pivotal in enabling “deep” automatic text analysis, optimizing text representations for specific applications remains a crucial bottleneck. In this study, we look at this problem in the context of a task from computational social science, namely modeling pairwise similarities between political parties. Our research question is what level of structural information is necessary to create robust text representations, contrasting a strongly informed approach (which uses both claim span and claim category annotations) with approaches that forgo one or both types of annotation with document structure-based heuristics. Evaluating our models on the manifestos of German parties for the 2021 federal election, we find that heuristics that maximize within-party over between-party similarity along with a normalization step lead to reliable party similarity prediction, without the need for manual annotation.

Distributional Analysis of Polysemous Function Words.
In: Proceedings of the 13th International Tbilisi Symposium on Language, Logic and Computation 2019, volume 13206, series Lecture Notes in Computer Science. Springer, 2022.
Sebastian Padó and Daniel Hole.
[url] [abstract] [BibTeX]

In this paper, we are concerned with the phenomenon of function word polysemy. We adopt the framework of distributional semantics, which characterizes word meaning by observing occurrence contexts in large corpora and which is in principle well situated to model polysemy. Nevertheless, function words were traditionally considered as impossible to analyze distributionally due to their highly flexible usage patterns. We establish that contextualized word embeddings, the most recent generation of distributional methods, offer hope in this regard. Using the German reflexive pronoun 'sich' as an example, we find that contextualized word embeddings capture theoretically motivated word senses for 'sich' to the extent to which these senses are mirrored systematically in linguistic usage.

Constraining Linear-chain CRFs to Regular Languages.
In: Proceedings of ICLR. 2022. Long video presentation at https://www.youtube.com/watch?v=iVH5-cHWaiE
Sean Papay, Roman Klinger and Sebastian Padó.
[url] [abstract] [BibTeX]

A major challenge in structured prediction is to represent the interdependencies within output structures. When outputs are structured as sequences, linear-chain conditional random fields (CRFs) are a widely used model class which can learn local dependencies in the output. However, the CRF's Markov assumption makes it impossible for CRFs to represent distributions with nonlocal dependencies, and standard CRFs are unable to respect nonlocal constraints of the data (such as global arity constraints on output labels). We present a generalization of CRFs that can enforce a broad class of constraints, including nonlocal ones, by specifying the space of possible output structures as a regular language L. The resulting regular-constrained CRF (RegCCRF) has the same formal properties as a standard CRF, but assigns zero probability to all label sequences not in L. Notably, RegCCRFs can incorporate their constraints during training, while related models only enforce constraints during decoding. We prove that constrained training is never worse than constrained decoding, and show empirically that it can be substantially better in practice. Additionally, we demonstrate a practical benefit on downstream tasks by incorporating a RegCCRF into a deep neural model for semantic role labeling, exceeding state-of-the-art results on a standard dataset.

New Domain, Major Effort? How Much Data is Necessary to Adapt a Temporal Tagger To the Voice Assistant Domain.
In: Proceedings of IWCS, pages 144-154. Online, 2021.
Touhidul Alam, Alessandra Zarcone and Sebastian Padó.
[url] [abstract] [BibTeX]

Reliable tagging of Temporal Expressions (TEs, e.g., Book a table at L’Osteria for Sunday evening) is a central requirement for Voice Assistants (VAs). However, there is a dearth of resources and systems for the VA domain, since publicly-available temporal taggers are trained only on substantially different domains, such as news and clinical text. Since the cost of annotating large datasets is prohibitive, we investigate the trade-off between in-domain data and performance in DA-Time, a hybrid temporal tagger for the English VA domain which combines a neural architecture for robust TE recognition, with a parser-based TE normalizer. We find that transfer learning goes a long way even with as little as 25 in-domain sentences: DA-Time performs at the state of the art on the news domain, and substantially outperforms it on the VA domain.

RiQuA: A Corpus of Rich Quotation Annotation for English Literary Text.
In: Proceedings of LREC, pages 835-841. Online, 2020.
Sean Papay and Sebastian Padó.
[url] [abstract] [BibTeX]

We introduce RiQuA (RIch QUotation Annotations), a corpus that provides quotations, including their interpersonal structure (speakers and addressees) for English literary text. The corpus comprises 11 works of 19th-century literature that were manually doubly annotated for direct and indirect quotations. For each quotation, its span, speaker, addressee, and cue are identified (if present). This provides a rich view of dialogue structures not available from other corpora. We detail the process of creating this dataset, discuss the annotation guidelines, and analyze the resulting corpus in terms of inter-annotator agreement and its properties. RiQuA, along with its annotation guidelines and associated scripts, is publicly available for use, modification, and experimentation.

Lost in Backtranslation: Emotion Preservation in Neural Machine Translation.
In: Proceedings of COLING. Online, 2020.
Enrica Troiano, Roman Klinger and Sebastian Padó.
[url] [abstract] [BibTeX]

Machine translation provides powerful methods to convert text between languages, and is therefore a technology enabling a multilingual world. An important part of communication, however, takes place at the non-propositional level (e.g., politeness, formality, emotions), and it is far from clear whether current MT methods properly translate this information. This paper investigates the specific hypothesis that the non-propositional level of emotions is at least partially lost in MT. We carry out a number of experiments in a back-translation setup and establish that (1) emotions are indeed partially lost during translation; (2) this tendency can be reversed almost completely with a simple re-ranking approach informed by an emotion classifier, taking advantage of diversity in the n-best list; (3) the re-ranking approach can also be applied to change emotions, obtaining a model for emotion style transfer. An in-depth qualitative analysis reveals that there are recurring linguistic changes through which emotions are toned down or amplified, such as change of modality.

Masking Actor Information Leads to Fairer Political Claims Detection.
In: Proceedings of ACL, pages 4385-4391. Online, 2020.
Erenay Dayanık and Sebastian Padó.
[url] [abstract] [BibTeX]

A central concern in Computational Social Sciences (CSS) is fairness: where the role of NLP is to scale up text analysis to large corpora, the quality of automatic analyses should be as independent as possible of textual properties. We analyze the performance of a state-of-the-art neural model on the task of political claims detection (i.e., the identification of forward-looking statements made by political actors) and identify a strong frequency bias: claims made by frequent actors are recognized better. We propose two simple debiasing methods which mask proper names and pronouns during training of the model, thus removing personal information bias. We find that (a) these methods significantly decrease frequency bias while keeping the overall performance stable; and (b) the resulting models improve when evaluated in an out-of-domain setting.

Dissecting Span Identification Tasks with Performance Prediction.
In: Proceedings of EMNLP, pages 4881-4895. Online, 2020.
Sean Papay, Roman Klinger and Sebastian Padó.
[url] [abstract] [BibTeX]

Span identification (in short, span ID) tasks such as chunking, NER, or code switching, ask models to identify and classify relevant spans in a text. Despite being a staple of NLP, and sharing a common structure, there is little insight on how their properties influence their difficulty, and thus little guidance on what model families work well on span ID tasks, and why. We analyze span ID tasks via performance prediction, estimating how well neural architectures do on different tasks. Our contributions are: (a) we identify key properties of span ID tasks that can inform performance prediction; (b) we carry out a large-scale experiment on English data, building a model to predict performance for unseen span ID tasks that can support architecture choices; (c) we investigate the parameters of the meta model, yielding new insights on how model and task properties interact to affect span ID performance. We find, e.g., that span frequency is especially important for LSTMs, and that CRFs help when spans are infrequent and boundaries non-distinctive.

DEbateNet-mig15: Tracing the 2015 Immigration Debate in Germany Over Time.
In: Proceedings of LREC, pages 919-927. Online, 2020.
Gabriella Lapesa, André Blessing, Nico Blokker, Erenay Dayanık, Sebastian Haunss, Jonas Kuhn and Sebastian Padó.
[url] [abstract] [BibTeX]

DEbateNet-mig15 is a manually annotated dataset for German which covers the public debate on immigration in 2015. The building block of our annotation is the political science notion of a claim, i.e., a statement made by a political actor (a politician, a party, or a group of citizens) that a specific action should be taken (e.g., vacant flats should be assigned to refugees). We identify claims in newspaper articles, assign them to actors and fine-grained categories and annotate their polarity and date. The aim of this paper is two-fold: first, we release the full DEbateNet-mig15 corpus and document it by means of a quantitative and qualitative analysis; second, we demonstrate its application in a discourse network analysis framework, which enables us to capture the temporal dynamics of the political debate.

An Environment for the Relational Annotation of Political Debates.
In: Proceedings of ACL System Demonstrations. Florence, Italy, 2019.
André Blessing, Nico Blokker, Sebastian Haunss, Jonas Kuhn, Gabriella Lapesa and Sebastian Padó.
[url] [abstract] [BibTeX]

This paper describes the MARDY corpus annotation environment developed for a collaboration between political science and computational linguistics. The tool realizes the complete workflow necessary for annotating a large newspaper text collection with rich information about claims (demands) raised by politicians and other actors, including claim and actor spans, relations, and polarities. In addition to the annotation GUI, the tool supports the identification of relevant documents, text pre-processing, user management, integration of external knowledge bases, annotation comparison and merging, statistical analysis, and the incorporation of machine learning models as "pseudo-annotators".

Who Sides With Whom? Towards Computational Construction of Discourse Networks for Political Debates.
In: Proceedings of ACL. Florence, Italy, 2019.
Sebastian Padó, André Blessing, Nico Blokker, Erenay Dayanık, Sebastian Haunss and Jonas Kuhn.
[url] [abstract] [BibTeX]

Understanding the structures of political debates (which actors make what claims) is essential for understanding democratic political decision-making. The vision of computational construction of such discourse networks from newspaper reports brings together political science and natural language processing. This paper presents three contributions towards this goal: (a) a requirements analysis, linking the task to knowledge base population; (b) a first release of an annotated corpus of claims on the topic of migration, based on German newspaper reports; (c) initial modeling results.

Crowdsourcing and Validating Event-focused Emotion Corpora for German and English.
In: Proceedings of ACL. Florence, Italy, 2019.
Enrica Troiano, Sebastian Padó and Roman Klinger.
[url] [abstract] [BibTeX]

Sentiment analysis has a range of corpora available across multiple languages. For emotion analysis, the situation is more limited, which hinders potential research on cross-lingual modeling and the development of predictive models for other languages. In this paper, we fill this gap for German by constructing deISEAR, a corpus designed in analogy to the well-established English ISEAR emotion dataset. Motivated by Scherer's appraisal theory, we implement a crowdsourcing experiment which consists of two steps. In step 1, participants create descriptions of emotional events for a given emotion. In step 2, five annotators assess the emotion expressed by the texts. We show that transferring an emotion classification model from the original English ISEAR to the German crowdsourced deISEAR via machine translation does not, on average, cause a performance drop.

Quotation Detection and Classification with a Corpus-Agnostic Model.
In: Proceedings of RANLP. Varna, Bulgaria, 2019.
Sean Papay and Sebastian Padó.
[url] [abstract] [BibTeX]

The detection of quotations (i.e., reported speech, thought, and writing) has established itself as an NLP analysis task. However, state-of-the-art models have been developed on the basis of specific corpora and incorporate a high degree of corpus-specific assumptions and knowledge, which leads to fragmentation. In the spirit of task-agnostic modeling, we present a corpus-agnostic neural model for quotation detection and evaluate it on three corpora that vary in language, text genre, and structural assumptions. The model (a) approaches the state-of-the-art on the corpora when using established feature sets and (b) shows reasonable performance even when using solely word forms, which makes it applicable for non-standard (e.g., historical) corpora.

Frame Identification as Categorization: Exemplars vs Prototypes in Embeddingland.
In: Proceedings of IWCS, pages 295-306. Gothenburg, Sweden, 2019.
Jennifer Sikos and Sebastian Padó.
[url] [abstract] [BibTeX]

Categorization is a central capability of human cognition, and a number of theories have been developed to account for properties of categorization. Even though many tasks in semantics also involve categorization of some kind, theories of categorization do not play a major role in contemporary research in computational linguistics. This paper follows the idea that embedding-based models of semantics lend themselves well to being formulated in terms of classical categorization theories. The benefit is a space of model families that enables (a) the formulation of hypotheses about the impact of major design decisions, and (b) a transparent assessment of these decisions. We instantiate this idea on the task of frame-semantic frame identification. We define four models that cross two design variables: (a) the choice of prototype vs. exemplar categorization, corresponding to different degrees of generalization applied to the input, and (b) the presence vs. absence of a fine-tuning step, corresponding to generic vs. task-adaptive categorization. We find that for frame identification, generalization and task-adaptive categorization both yield substantial benefits. Our prototype-based, fine-tuned model, which combines the best choices over these variables, establishes a new state-of-the-art in frame identification.

Text-based Joint Prediction of Numeric and Categorical Attributes of Entities in Knowledge Bases.
In: Proceedings of RANLP. Varna, Bulgaria, 2019.
V Thejas, Abhijeet Gupta and Sebastian Padó.
[url] [abstract] [BibTeX]

Collaboratively constructed knowledge bases play an important role in information systems, but are essentially always incomplete. Thus, a large number of models has been developed for Knowledge Base Completion, the task of predicting new attributes of entities given partial descriptions of these entities. Virtually all of these models either concentrate on numeric attributes (Italy,GDP,2TE) or they concentrate on categorical attributes (Tim Cook,chairman,Apple). In this paper, we propose a simple feed-forward neural architecture to jointly predict numeric and categorical attributes based on embeddings learned from textual occurrences of the entities in question. Following insights from multi-task learning, our hypothesis is that due to the correlations among attributes of different kinds, joint prediction improves over separate prediction. Our experiments on seven FreeBase domains show that this hypothesis is true of the two attribute types: we find substantial improvements for numeric attributes in the joint model, while performance remains largely unchanged for categorical attributes. Our analysis indicates that this is the case because categorical attributes, many of which describe membership in various classes, provide useful 'background knowledge' for numeric prediction, while this is true to a lesser degree in the inverse direction.

Lexical Substitution for Evaluating Compositional Distributional Models.
In: Proceedings of NAACL, pages 206-211. New Orleans, LA, 2018.
Maja Buljan, Sebastian Padó and Jan Šnajder.
[url] [abstract] [BibTeX]

Compositional Distributional Semantic Models (CDSMs) model the meaning of phrases and sentences in vector space. They have been predominantly evaluated on limited, artificial tasks such as semantic sentence similarity on hand-constructed datasets. This paper argues for lexical substitution as a means to evaluate CDSMs. Lexical substitution is a more natural task, enables us to evaluate meaning composition at the level of individual words, and provides a common ground to compare CDSMs with dedicated lexical substitution models. We create a lexical substitution dataset for CDSM evaluation from an English-language corpus with manual “all-words” lexical substitution annotation. Our experiments indicate that the Practical Lexical Function CDSM outperforms simple component-wise CDSMs and performs on par with the context2vec lexical substitution model using the same context.

Leveraging Lexical Substitutes for Unsupervised Word Sense Induction.
In: Proceedings of AAAI. New Orleans, LA, 2018.
Domagoj Alagić, Jan Šnajder and Sebastian Padó.
[url] [abstract] [BibTeX]

Word sense induction is the most prominent unsupervised approach to lexical disambiguation. It clusters word instances, typically represented by their bag-of-words contexts. Therefore, uninformative and ambiguous contexts present a major challenge. In this paper, we investigate the use of an alternative instance representation based on lexical substitutes, i.e., contextually suitable, meaning-preserving replacements. Using lexical substitutes predicted by a state-of-the-art automatic system and a simple clustering algorithm, we outperform bag-of-words instance representations and compete with much more complex structured probabilistic models. Furthermore, we show that an oracle based on manually-labeled lexical substitutes yields yet substantially higher performance. Taken together, this provides evidence for a complementarity between word sense induction and lexical substitution that has not been given much consideration before.

A Named Entity Recognition Shootout for German.
In: Proceedings of ACL, pages 120-125. Melbourne, Australia, 2018.
Martin Riedl and Sebastian Padó.
[url] [abstract] [BibTeX]

We ask how to practically build a model for German named entity recognition (NER) that performs at the state of the art for both contemporary and historical texts, i.e., a big-data and a small-data scenario. The two best-performing model families are pitted against each other (linear-chain CRFs and BiLSTM) to observe the trade-off between expressiveness and data requirements. BiLSTM outperforms the CRF when large datasets are available and performs worse on the smallest dataset. BiLSTMs profit substantially from transfer learning, which enables them to be trained on multiple corpora, resulting in a new state-of-the-art model for German NER on two contemporary German corpora (CoNLL 2003 and GermEval 2014) and two historic corpora.

DERE: A task and domain-independent slot filling framework for declarative relation extraction.
In: Proceedings of EMNLP. Brussels, Belgium, 2018.
Heike Adel, Laura Ana Maria Bostan, Sean Papay, Sebastian Padó and Roman Klinger.
[url] [abstract] [BibTeX]

Most machine learning systems for natural language processing are tailored to specific tasks. As a result, comparability of models across tasks is missing and their applicability to new tasks is limited. This affects end users without machine learning experience as well as model developers. To address these limitations, we present DERE, a novel framework for declarative specification and compilation of template-based information extraction. It uses a generic specification language for the task and for data annotations in terms of spans and frames. This formalism enables the representation of a large variety of natural language processing challenges. The backend can be instantiated by different models, following different paradigms. The clear separation of frame specification and model backend will ease the implementation of new models and the evaluation of different models across different tasks. Furthermore, it simplifies transfer learning, joint learning across tasks and/or domains as well as the assessment of model generalizability. DERE is available as open-source software.

Integrating lexical-conceptual and distributional semantics: a case report.
In: Proceedings of the Amsterdam Colloquium, pages 75-84. Amsterdam, The Netherlands, 2017.
Tillmann Pross, Antje Rossdeutscher, Gabriella Lapesa and Sebastian Padó.
[url] [abstract] [BibTeX]

By means of a case study on German verbs prefixed with the preposition über (‘over’) we compare alternation-based lexical-conceptual and usage-based distributional approaches to verb meaning. Our investigation supports the view that when distributional vectors are rendered human-interpretable by approximation of their representation with its nearest neighbour words in the semantic vector space, they reflect conceptual commonalities between verbs similar to those targeted in lexical-conceptual semantics. Moreover, our case study shows that distributional representations reveal conceptual features of verb meaning that are difficult if not impossible to detect and represent in theoretical frameworks of lexical semantics and thus that a general theory of word meaning requires a combination and complementation of lexical and distributional methods.

Modeling Derivational Morphology in Ukrainian.
In: Proceedings of IWCS. Montpellier, France, 2017.
Mariia Melymuka, Gabriella Lapesa, Max Kisselew and Sebastian Padó.
[url] [abstract] [BibTeX]

We report on a study applying compositional distributional semantic models (CDSMs) to a set of Ukrainian derivational patterns. Ukrainian is an interesting language as it is morphologically rich, and low-resource. Our study aims at resolving inconsistent results from previous studies which employed CDSMs for derivation; we provide evidence for a cross-lingual advantage of CBOW over NMF representations, as well as a simple additive over a lexical function model. In addition, we present two case studies in which we test the capabilities of CDSMs to deal with pattern-level ambiguity and apply the same CDSMs to inflectional patterns.

Living a discrete life in a continuous world: Reference in cross-modal entity tracking.
In: Proceedings of IWCS. Montpellier, France, 2017.
Gemma Boleda, Sebastian Padó, Nghia The Pham and Marco Baroni.
[url] [abstract] [BibTeX]

Reference is a crucial property of language that allows us to connect linguistic expressions to the world. Modeling it requires handling both continuous and discrete aspects of meaning. Data-driven models excel at the former, but struggle with the latter, and the reverse is true for symbolic models. This paper (a) introduces a concrete referential task to test both aspects, called cross-modal entity tracking; (b) proposes a neural network architecture that uses external memory to build an entity library: On being presented with exposures of multimodal, distributed entities combined with attributes, the model learns which exposures refer to the same underlying entities and to aggregate the information present in these exposures. Our model shows promise: it beats traditional neural network architectures on the task. However, it is still outperformed by Memory Networks, another model with external memory.

Are doggies cuter than dogs? Emotional valence and concreteness in German derivational morphology.
In: Proceedings of IWCS. Montpellier, France, 2017.
Gabriella Lapesa, Sebastian Padó, Tillmann Pross and Antje Rossdeutscher.
[url] [abstract] [BibTeX]

The semantic behavior of derivational processes has been investigated with compositional distributional models relating the meaning of base, affix, and derivative (e.g., anti+capitalist → anticapitalist). While broadly successful, these approaches model how the distributional behavior generally is affected by derivation. Meanwhile, their predictions cannot be interpreted at the level of linguistic regularities. In this paper, we adopt an alternative approach and focus on the impact of derivation on finer-grained semantic properties of the base. We focus on (the psycholinguistically prominent) emotional valence, i.e., the speakers’ positive/negative evaluation of the word referent. We present two case studies on German derivational patterns, combining distributional and regression analysis. We are able to establish the broad presence of valence effects in German derivation as well as strong interactions with concreteness.

Does Free Word Order Hurt? Assessing the Practical Lexical Function Model for Croatian.
In: Proceedings of STARSEM. Vancouver, BC, 2017.
Zoran Medić, Jan Šnajder and Sebastian Padó.
[url] [abstract] [BibTeX]

The Practical Lexical Function (PLF) model is a model of computational distributional semantics that attempts to strike a balance between expressivity and learnability in predicting phrase meaning and shows competitive results. We investigate how well the PLF carries over to free word order languages, given that it builds on observations of predicate-argument combinations that are harder to recover in free word order languages. We evaluate variants of the PLF for Croatian, using a new lexical substitution dataset. We find that the PLF works about as well for Croatian as for English, but demonstrate that its strength lies in modeling verbs, and that the free word order affects the less robust PLF variant.

"Show me the cup": Reference with Continuous Representations.
In: Proceedings of CICLing. Budapest, Hungary, 2017.
Marco Baroni, Gemma Boleda and Sebastian Padó.
[url] [abstract] [BibTeX]

One of the most basic functions of language is to refer to objects in a shared scene. Modeling reference with continuous representations is challenging because it requires individuation, i.e., tracking and distinguishing an arbitrary number of referents. We introduce a neural network model that, given a definite description and a set of objects represented by natural images, points to the intended object if the expression has a unique referent, or indicates a failure, if it does not. The model, directly trained on reference acts, is competitive with a pipeline manually engineered to perform the same task, both when referents are purely visual, and when they are characterized by a combination of visual and linguistic properties.

Distributed Prediction of Relations for Entities: The Easy, The Difficult, and The Impossible.
In: Proceedings of STARSEM. Vancouver, BC, 2017.
Abhijeet Gupta, Gemma Boleda and Sebastian Padó.
[url] [abstract] [BibTeX]

Word embeddings are supposed to provide easy access to semantic relations such as “male of” (man–woman). While this claim has been investigated for concepts, little is known about the distributional behavior of relations of (Named) Entities. We describe two word embedding-based models that predict values for relational attributes of entities, and analyse them. The task is challenging, with major performance differences between relations. Contrary to many NLP tasks, high difficulty for a relation does not result from low frequency, but from (a) one-to-many mappings; and (b) lack of context patterns expressing the relation that are easy to pick up by word embeddings.

Instances and concepts in distributional space.
In: Proceedings of EACL, pages 79-85. Valencia, Spain, 2017.
Gemma Boleda, Abhijeet Gupta and Sebastian Padó.
[url] [abstract] [BibTeX]

Instances ("Mozart") are ontologically distinct from concepts or classes ("composer"). Natural language encompasses both, but instances have received comparatively little attention in distributional semantics. Our results show that instances and concepts differ in their distributional properties. We also establish that instantiation detection ("Mozart – composer") is generally easier than hypernymy detection ("chemist – scientist"), and that results on the influence of input representation do not transfer from hyponymy to instantiation.

Predictability of Distributional Semantics in Derivational Word Formation.
In: Proceedings of COLING, pages 1285-1296. Osaka, Japan, 2016.
Sebastian Padó, Aurélie Herbelot, Max Kisselew and Jan Šnajder.
[url] [abstract] [BibTeX]

Compositional distributional semantic models (CDSMs) have successfully been applied to the task of predicting the meaning of a range of linguistic constructions. Their performance on the semi-compositional word formation process of (morphological) derivation, however, has been extremely variable, with no large-scale empirical investigation to date. This paper fills that gap, performing an analysis of CDSM predictions on a large dataset (over 30,000 German derivationally related word pairs). We use linear regression models to analyze CDSM performance and obtain insights into the linguistic factors that influence how predictable the distributional context of a derived word is going to be. We identify various such factors, notably part of speech, argument structure, and semantic regularity.

Model Architectures for Quotation Detection.
In: Proceedings of ACL, pages 1736-1745. Berlin, Germany, 2016.
Christian Scheible, Roman Klinger and Sebastian Padó.
[url] [abstract] [BibTeX]

Quotation detection is the task of locating spans of quoted speech in text. The state of the art treats this problem as a sequence labeling task and employs linear-chain conditional random fields. We question the efficacy of this choice: The Markov assumption in the model prohibits it from making joint decisions about the begin, end, and internal context of a quotation. We perform an extensive analysis with two new model architectures. We find that (a), simple boundary classification combined with a greedy prediction strategy is competitive with the state of the art; (b), a semi-Markov model significantly outperforms all others, by relaxing the Markov assumption.

Smoothing Syntax-Based Semantic Spaces: Let The Winner Take It All.
In: Proceedings of KONVENS, pages 186-191. Bochum, Germany, 2016.
Sebastian Padó, Jan Šnajder, Jason Utt and Britta Zeller.
[url] [abstract] [BibTeX]

Syntax-based semantic spaces are more flexible and can potentially better model semantic relatedness than bag-of-words spaces. Their application is however limited by sparsity and restricted coverage. We address these problems by smoothing syntax-based with word-based spaces and investigate when to choose which prediction. We obtain the best results by picking the maximal predicted similarity for each word pair, taking advantage of the tendency of unreliable models to underestimate similarity. We show that smoothing can substantially improve coverage while maintaining prediction quality on two German benchmark tasks.

Improving Zero-Shot-Learning for German Particle Verbs by using Training-Space Restrictions and Local Scaling.
In: Proceedings of STARSEM. Berlin, Germany, 2016.
Maximilian Köper, Sabine Schulte im Walde, Max Kisselew and Sebastian Padó.
[url] [abstract] [BibTeX]

Recent models in distributional semantics consider derivational patterns (e.g., use → use+ful) as the result of a compositional process, where base term and affix are combined. We exploit such models for German particle verbs (PVs), and focus on the task of learning a mapping function between base verbs and particle verbs. Our models apply particle-verb motivated training-space restrictions relying on nearest neighbors, as well as recent advances from zero-shot-learning. The models improve the mapping between base terms and derived terms for a new PV derivation dataset, and also across existing derivation datasets for German and English.

Generalization in Native Language Identification - Learners versus Scientists.
In: Proceedings of CLiC-IT, pages 264-268. Trento, Italy, 2015.
Sabrina Stehwien and Sebastian Padó.
[url] [abstract] [BibTeX]

Native Language Identification (NLI) is the task of recognizing an author's native language from text in another language. In this paper, we consider three English learner corpora and one new, presumably more difficult, scientific corpus. We find that the scientific corpus is only about as hard to model as a less-controlled learner corpus, but cannot profit as much from corpus combination via domain adaptation. We show that this is related to an inherent topic bias in the scientific corpus: researchers from different countries tend to work on different topics.

Distributional vectors encode referential attributes.
In: Proceedings of EMNLP. Lisbon, Portugal, 2015.
Abhijeet Gupta, Gemma Boleda, Marco Baroni and Sebastian Padó.
[url] [abstract] [BibTeX]

Distributional methods have proven to excel at capturing fuzzy, graded aspects of meaning (Italy is more similar to Spain than to Germany). In contrast, it is difficult to extract the values of more specific attributes of word referents from distributional representations, attributes of the kind typically found in structured knowledge bases (Italy has 60 million inhabitants). In this paper, we pursue the hypothesis that distributional vectors also implicitly encode referential attributes. We show that a standard supervised regression model is in fact sufficient to retrieve such attributes to a reasonable degree of accuracy: When evaluated on the prediction of both categorical and numeric attributes of countries and cities, the model consistently reduces baseline error by 30% and is not far from the upper bound. Further analysis suggests that our model is able to "objectify" distributional representations for entities, anchoring them more firmly in the external world in measurable ways.

Obtaining a Better Understanding of Distributional Models of German Derivational Morphology.
In: Proceedings of IWCS, pages 58-63. London, UK, 2015.
Max Kisselew, Sebastian Padó, Alexis Palmer and Jan Šnajder.
[url] [abstract] [BibTeX]

Derivationally related words (read / read+er) usually have closely related meanings. It is an interesting challenge for distributional semantics to account for this relationship by predicting the meaning (represented as a vector) of a derived term (read+er) from the meaning of its base term (read). Previous work has framed this task as an instance of compositional meaning construction, but its properties are not yet well understood. Our goal is to better understand the factors influencing performance on this task via quantitative and qualitative analysis of two existing composition models on a set of German derivation patterns (e.g., -in, durch-). We begin by introducing a rank-based evaluation metric that provides a more relevant assessment of the models’ practical value and reveals the task to be challenging due to specific properties of German (compounding, capitalization). We also find that performance varies greatly between patterns and even among base-derived term pairs of the same pattern. A regression analysis shows that semantic coherence of the base and derived terms within a pattern, as well as coherence of the semantic shifts from base to derived terms, all significantly impact prediction quality. Finally, we investigate false positives, finding that different models capture complementary aspects of the semantic shifts.

Multi-Level Alignments As An Extensible Representation Basis for Textual Entailment Algorithms.
In: Proceedings of STARSEM, pages 193-198. Denver, CO, 2015.
Tae-Gil Noh, Sebastian Padó, Vered Shwartz, Ido Dagan, Vivi Nastase, Kathrin Eichler and Lili Kotlerman.
[url] [abstract] [BibTeX]

A major problem in research on Textual Entailment (TE) is the high implementation effort for TE systems. Recently, interoperable standards for annotation and preprocessing have been proposed. In contrast, the algorithmic level remains unstandardized, which makes component re-use in this area very difficult in practice. In this paper, we introduce multi-level alignments as a central, powerful representation for TE algorithms that encourages modular, reusable, multilingual algorithm development. We demonstrate that a pilot open-source implementation of multi-level alignment with minimal features competes with state-of-the-art open-source TE engines in three languages.

Combining Seemingly Incompatible Corpora for Implicit Semantic Role Labeling.
In: Proceedings of STARSEM, pages 40-50. Denver, CO, 2015.
Parvin Sadat Feizabadi and Sebastian Padó.
[url] [abstract] [BibTeX]

Implicit semantic role labeling, the task of retrieving locally unrealized arguments from wider discourse context, is a knowledge-intensive task. At the same time, the annotated corpora that exist are all small and scattered across different annotation frameworks, genres, and classes of predicates. Previous work has treated these corpora as incompatible with one another, and has concentrated on optimizing the exploitation of single corpora. In this paper, we show that corpus combination is effective after all when the differences between corpora are bridged with domain adaptation methods. When we combine the SemEval-2010 Task 10 and Gerber and Chai noun corpora, we obtain substantially improved performance on both corpora, for all roles and parts of speech. We also present new insights into the properties of the implicit semantic role labeling task.

Dissecting the Practical Lexical Function Model for Compositional Distributional Semantics.
In: Proceedings of STARSEM, pages 153-158. Denver, CO, 2015.
Abhijeet Gupta, Jason Utt and Sebastian Padó.
[url] [abstract] [BibTeX]

The Practical Lexical Function model (PLF) is a recently proposed compositional distributional semantic model which provides an elegant account of composition, striking a balance between expressiveness and robustness and performing at the state-of-the-art. In this paper, we identify an inconsistency in PLF between the objective function at training and the prediction at testing which leads to an overcounting of the predicate’s contribution to the meaning of the phrase. We investigate two possible solutions of which one (the exclusion of simple lexical vector at test time) improves performance significantly on two out of the three composition datasets.

Crowdsourcing Annotation of Non-Local Semantic Roles.
In: Proceedings of EACL, pages 226-230. Gothenburg, Sweden, 2014.
Parvin Sadat Feizabadi and Sebastian Padó.
[url] [abstract] [BibTeX]

This paper reports on a study of crowdsourcing the annotation of non-local (or implicit) frame-semantic roles, i.e., roles that are realized in the previous discourse context. We describe two annotation setups (marking and gap filling) and find that gap filling works considerably better, attaining an acceptable quality relatively cheaply. The produced data is available for research purposes.

A Language Model Sensitive to Discourse Context.
In: Proceedings of KONVENS, pages 201-206. Hildesheim, Germany, 2014.
Tae-Gil Noh and Sebastian Padó.
[url] [abstract] [BibTeX]

The paper proposes a meta language model that can dynamically incorporate the influence of wider discourse context. The model provides a conditional probability in the form of P(text|context), where the context can be text of arbitrary length, and is used to influence the probability distribution over documents. A preliminary evaluation using a 3-gram model as the base language model shows significant reductions in perplexity by incorporating discourse context.

Polysemy index for nouns: an experiment on Italian using the PAROLE SIMPLE CLIPS lexical database.
In: Proceedings of LREC, pages 2955-2963. Reykjavík, Iceland, 2014.
Francesca Frontini, Valeria Quochi, Sebastian Padó, Jason Utt and Monica Monachini.
[url] [abstract] [BibTeX]

An experiment is presented to induce a set of polysemous basic type alternations (such as ANIMAL-FOOD, or BUILDING-INSTITUTION) by deriving them from the sense alternations found in an existing lexical resource. The paper builds on previous work and applies those results to the Italian lexicon PAROLE SIMPLE CLIPS. The new results show how the set of frequent type alternations that can be induced from the lexicon is partly different from the set of polysemy relations selected and explicitly applied by lexicographers when building it. The analysis of mismatches shows that frequent type alternations do not always correspond to prototypical polysemy relations, nevertheless the proposed methodology represents a useful tool offered to lexicographers to systematically check for possible gaps in their resource.

Towards Semantic Validation of a Derivational Lexicon.
In: Proceedings of COLING, pages 1728-1739. Dublin, Ireland, 2014.
Britta Zeller, Sebastian Padó and Jan Šnajder.
[url] [abstract] [BibTeX]

Derivationally related lemmas like (friend – friendly – friendship) are derived from a common stem. Frequently, their meanings are also systematically related. However, there are also many examples of derivationally related lemma pairs whose meanings differ substantially, e.g., (object – objective). Most broad-coverage derivational lexicons do not reflect this distinction, mixing up semantically related and unrelated word pairs. In this paper, we investigate strategies to recover the above distinction by recognizing semantically related lemma pairs, a process we call semantic validation. We make two main contributions: First, we perform a detailed data analysis on the basis of a large German derivational lexicon. It reveals two promising sources of information (distributional semantics and structural information about derivational rules), but also systematic problems with these sources. Second, we develop a classification model for the task that reflects the noisy nature of the data. It achieves an improvement of 13.6% in precision and 5.8% in F1-score over a strong majority class baseline. Our experiments confirm that both information sources contribute to semantic validation, and that they are complementary enough that the best results are obtained from a combined model.

The EXCITEMENT Open Platform for Textual Inferences.
In: Proceedings of ACL (Demonstration Papers), pages 43-48. Baltimore, MD, 2014.
Ido Dagan, Omer Levy, Bernardo Magnini, Tae-Gil Noh, Sebastian Padó, Asher Stern and Roberto Zanoli.
[url] [abstract] [BibTeX]

This paper presents the Excitement Open Platform (EOP), a generic architecture and a comprehensive implementation for textual inference in multiple languages. The platform includes state-of-the-art algorithms, a large number of knowledge resources, and facilities for experimenting and testing innovative approaches. The EOP is distributed as open source software.

What Substitutes Tell Us - Analysis of an 'All-Words' Lexical Substitution Corpus.
In: Proceedings of EACL, pages 540-549. Gothenburg, Sweden, 2014.
Gerhard Kremer, Katrin Erk, Sebastian Padó and Stefan Thater.
[url] [abstract] [BibTeX]

We present the first large-scale English "all-words lexical substitution" corpus. The size of the corpus provides a rich resource for investigations into word meaning. We investigate the nature of lexical substitute sets, comparing them to WordNet synsets. We find them to be consistent with, but more fine-grained than, synsets. We also identify significant differences to results for paraphrase ranking in context reported for the SEMEVAL lexical substitution data. This highlights the influence of corpus construction approaches on evaluation results.

Entailment Graphs for Text Analytics in the Excitement Project.
In: Proceedings of Text, Speech and Dialogue, pages 11-18. Brno, Czech Republic, 2014.
Bernardo Magnini, Ido Dagan, Günter Neumann and Sebastian Padó.
[url] [abstract] [BibTeX]

In recent years, a relevant research line in Natural Language Processing has focused on detecting semantic relations among portions of text, including entailment, similarity, temporal relations, and, to a lesser degree, causality. The attention on such semantic relations has raised the demand to move towards more informative meaning representations, which express properties of concepts and relations among them. This demand triggered research on "statement entailment graphs", where nodes are natural language statements (propositions), comprising predicates with their arguments and modifiers, while edges represent entailment relations between nodes. We report initial research that defines the properties of entailment graphs and their potential applications. Particularly, we show how entailment graphs are profitably used in the context of the European project EXCITEMENT, where they are applied for the analysis of customer interactions across multiple channels, including speech, email, chat and social media, and multiple languages (English, German, Italian).

Derivational Smoothing for Syntactic Distributional Semantics.
In: Proceedings of ACL, pages 731-735. Sofia, Bulgaria, 2013.
Sebastian Padó, Jan Šnajder and Britta D. Zeller.
[url] [abstract] [BibTeX]

Syntax-based vector spaces are used widely in lexical semantics and are more versatile than word-based spaces (Baroni and Lenci, 2010). However, they are also sparse, with resulting reliability and coverage problems. We address this problem by derivational smoothing, which uses knowledge about derivationally related words (oldish – old) to improve semantic similarity estimates. We develop a set of derivational smoothing methods and evaluate them on two lexical semantics tasks in German. Even for models built from very large corpora, simple derivational smoothing can improve coverage considerably.

Fitting, not clashing! A distributional semantic model of logical metonymy.
In: Proceedings of IWCS, pages 404-410. Potsdam, Germany, 2013.
Alessandra Zarcone, Alessandro Lenci, Sebastian Padó and Jason Utt.
[url] [abstract] [BibTeX]

Logical metonymy interpretation (e.g. begin the book → writing) has received wide attention in linguistics. Experimental results have shown higher processing costs for metonymic conditions compared with non-metonymic ones (read the book). According to a widely held interpretation, it is the type clash between the event-selecting verb and the entity-denoting object (begin the book) that triggers coercion mechanisms and leads to additional processing effort. We propose an alternative explanation and argue that the extra processing effort is an effect of thematic fit. This is a more economical hypothesis that does not need to postulate a separate type clash mechanism: entity-denoting objects simply have a low fit as objects of event-selecting verbs. We test linguistic datasets from psycholinguistic experiments and find that a structured distributional model of thematic fit, which does not encode any explicit argument type information, is able to replicate all significant experimental findings. This result provides evidence for a graded account of coercion phenomena in which thematic fit accounts for both the trigger of the coercion and the retrieval of the covert event.

A Textual Entailment Dataset from German Web Forum Text.
In: Proceedings of IWCS, pages 288-299. Potsdam, Germany, 2013.
Britta Zeller and Sebastian Padó.
[url] [abstract] [BibTeX]

We present the first freely available large German dataset for Textual Entailment (TE). Our dataset builds on posts from German online forums concerned with computer problems and models the task of identifying relevant posts for user queries (i.e., descriptions of their computer problems) through TE. We use a sequence of crowdsourcing tasks to create realistic problem descriptions through summarisation and paraphrasing of forum posts. The dataset is represented in RTE-5 Search task style and consists of 172 positive and over 2800 negative pairs. We analyse the properties of the created dataset and evaluate its difficulty by applying two TE algorithms and comparing the results with results on the English RTE-5 Search task. The results show that our dataset is roughly comparable to the RTE-5 data in terms of both difficulty and balancing of positive and negative entailment pairs. Our approach to create task-specific TE datasets can be transferred to other domains and languages.

DErivBase: Inducing and Evaluating a Derivational Morphology Resource for German.
In: Proceedings of ACL, pages 1201-1211. Sofia, Bulgaria, 2013.
Britta D. Zeller, Jan Šnajder and Sebastian Padó.
[url] [abstract] [BibTeX]

Derivational models are still an under-researched area in computational morphology. Even for German, a rather resource-rich language, there is a lack of large-coverage derivational knowledge. This paper describes a rule-based framework for inducing derivational families (i.e., clusters of lemmas in derivational relationships) and its application to create a high-coverage German resource, DERIVBASE, mapping over 280k lemmas into more than 17k non-singleton clusters. We focus on the rule component and a qualitative and quantitative evaluation. Our approach achieves up to 93% precision and 71% recall. We attribute the high precision to the fact that our rules are based on information from grammar books.

Building and Evaluating a Distributional Memory for Croatian.
In: Proceedings of ACL, pages 784-789. Sofia, Bulgaria, 2013.
Jan Šnajder, Sebastian Padó and Željko Agić.
[url] [abstract] [BibTeX]

We report on the first structured distributional semantic model for Croatian, DM.HR. It is constructed after the model of the English Distributional Memory (Baroni and Lenci, 2010), from a dependency-parsed Croatian web corpus, and covers about 2M lemmas. We give details on the linguistic processing and the design principles. An evaluation shows state-of-the-art performance on a semantic similarity task with particularly good performance on nouns. The resource is freely available.

A corpus study of clause combination.
In: Proceedings of IWCS, pages 179-190. Potsdam, Germany, 2013.
Olga Nikitina and Sebastian Padó.
[url] [abstract] [BibTeX]

We present a corpus-based investigation of cases of clause combination that can be expressed either through coordination or through subordination. We analyse the data with a two-step computational model which first distinguishes subordination from coordination and then determines the direction for cases of subordination. We find that a wide range of features help with the prediction, notably frequency of predicate participants, presence of adjuncts and sharing of participants between the clause predicates.

Regular polysemy: A distributional model.
In: Proceedings of *SEM, pages 151-160. Montréal, Canada, 2012.
Gemma Boleda, Sebastian Padó and Jason Utt.
[url] [abstract] [BibTeX]

Many types of polysemy are not word specific, but are instances of general sense alternations such as Animal-Food. Despite their pervasiveness, regular alternations have been mostly ignored in empirical computational semantics. This paper presents (a) a general framework which grounds sense alternations in corpus data, generalizes them above individual words, and allows the prediction of alternations for new words; and (b) a concrete unsupervised implementation of the framework, the Centroid Attribute Model. We evaluate this model against a set of 2,400 ambiguous words and demonstrate that it outperforms two baselines.

Corpus-based Acquisition of German Event- and Object-Denoting Nouns.
In: Proceedings of KONVENS, pages 259-263. Vienna, Austria, 2012.
Stefan Gorzitze and Sebastian Padó.
[url] [abstract] [BibTeX]

This paper presents a simple distributional method for acquiring event-denoting and object-denoting nouns from corpora. Its core is a bootstrapping cycle that alternates between acquiring new instances and new features, using a simple log odds ratio for filtering. We acquire 3000 German nouns for each class with precisions of 93% (events) and 98% (objects), respectively.

French and German corpora for audience-based text classification.
In: Proceedings of LREC 2012. Istanbul, Turkey, 2012.
Amalia Todirascu, Sebastian Padó, Max Kisselew, Jennifer Krisch and Ulrich Heid.
[url] [abstract] [BibTeX]

This paper presents some of the results of the CLASSYN project which investigated the classification of text according to audience-related text types. We describe the design principles and the properties of the French and German linguistically annotated corpora that we have created. We report on tools used to collect the data and on the quality of the syntactic annotation. The CLASSYN corpora comprise two text collections to investigate general text type differences between scientific and popular science text in the two domains of medical and computer science.

Inferring covert events in logical metonymies: a probe recognition experiment.
In: Proceedings of CogSci. Sapporo, Japan, 2012.
Alessandra Zarcone, Alessandro Lenci and Sebastian Padó.
[abstract] [BibTeX]

It has been widely acknowledged that the interpretation of logical metonymies involves the interpretation of covert events (begin the book -> reading / writing). Whether this implicit content is part of our lexicon or rather comes from our world knowledge is currently a subject of debate. We present results from a probe recognition experiment, providing novel evidence in support of early metonymy processing, consistent with the hypothesis that covert events are retrieved from knowledge of typical events.

Corpus-Based Acquisition of Support Verb Constructions for Portuguese.
In: Proceedings of PROPOR 2012, pages 73-84. Coimbra, Portugal, 2012.
Britta Zeller and Sebastian Padó.
[url] [abstract] [BibTeX]

We present a resource-poor approach to automatically acquire support verb constructions (SVCs) for Portuguese with a two-stage procedure. First, we apply a cross-lingual approach with a bilingual parallel corpus: starting with a Portuguese full verb, we use the translations into another language and the corresponding backtranslations to identify Portuguese verb-noun pairs with the same meaning. Since not all of these are SVCs, the candidates are ranked and filtered in a second, monolingual step based on association statistics. We discuss two parametrisations of our procedure for a high-precision and a high-recall setting. In our experiments, these parametrisations achieve a maximum precision of 91% and a maximum recall of 86%, respectively.

Towards a model of formal and informal address in English.
In: Proceedings of EACL 2012. Avignon, France, 2012.
Manaal Faruqui and Sebastian Padó.
[url] [abstract] [BibTeX]

Informal and formal (T/V) address in dialogue is not distinguished overtly in modern English, e.g. by pronoun choice like in many other languages such as French (tu/vous). Our study investigates the status of the T/V distinction in English literary texts. Our main findings are: (a) human raters can label monolingual English utterances as T or V fairly well, given sufficient context; (b), a bilingual corpus can be exploited to induce a supervised classifier for T/V without human annotation. It assigns T/V at sentence level with up to 68% accuracy, relying mainly on lexical features; (c), there is a marked asymmetry between lexical features for formal speech (which are conventionalized and therefore general) and informal speech (which are text-specific).

LODifier: Generating Linked Data from Unstructured Text.
In: Proceedings of ESWC, pages 210-224. Heraklion, Greece, 2012.
Isabelle Augenstein, Sebastian Padó and Sebastian Rudolph.
[url] [abstract] [BibTeX]

The automated extraction of information from text and its transformation into a formal description is an important goal in both Semantic Web research and computational linguistics. The extracted information can be used for a variety of tasks such as ontology generation, question answering and information retrieval. LODifier is an approach that combines deep semantic analysis with named entity recognition, word sense disambiguation and controlled Semantic Web vocabularies in order to extract named entities and relations between them from text and to convert them into an RDF representation which is linked to DBpedia and WordNet. We present the architecture of our tool and discuss design decisions made. An evaluation of the tool on a story link detection task gives clear evidence of its practical potential.

Automatic Identification of Motion Verbs in WordNet and FrameNet for Locational Inference.
In: Proceedings of KONVENS, pages 70-79. Vienna, Austria, 2012.
Parvin Sadat Feizabadi and Sebastian Padó.
[url] [abstract] [BibTeX]

This paper discusses the automatic identification of motion verbs in the context of "locational inference", that is, the recovery of unrealized location roles from discourse context, a special case of missing argument recovery. We first report on a small corpus study on verb classes for which location roles are particularly relevant. This includes motion, orientation and position verbs. Then, we discuss the automatic recognition of these verbs on the basis of WordNet and FrameNet. For FrameNet, we obtain results up to 67% F-Score.

Generalized Event Knowledge in Logical Metonymy Resolution.
In: Proceedings of CogSci 2011. Boston, MA, 2011.
Alessandra Zarcone and Sebastian Padó.
[abstract] [BibTeX]

The interpretation of logical metonymies like "begin the book" has traditionally been explained by assuming the existence of complex lexical entries containing information about event knowledge (qualia roles: "reading the book/writing the book"). Qualia structure provides concrete constraints on interpretation, which are however too rigid to be cognitively plausible. We suggest "generalized event knowledge" as an alternative source of interpretation. Results from a first self-paced reading experiment, where we capitalize on the verb-final word order in German subordinate phrases to create rich expectations for events, are presented to support this hypothesis. Consequences of this hypothesis for the interpretation of logical metonymies are (a), it is primarily driven by pragmatic and world knowledge; (b), it may use the same (rather than distinct) mechanisms and resources as general incremental sentence comprehension does.

"I Thou Thee, Thou Traitor": Predicting Formal vs. Informal Address in English Literature.
In: Proceedings of ACL/HLT 2011, pages 467-472. Portland, OR, 2011.
Manaal Faruqui and Sebastian Padó.
[url] [abstract] [BibTeX]

In contrast to many languages (like Russian or French), modern English does not distinguish formal and informal ("T/V") address overtly, for example by pronoun choice. We describe an ongoing study which investigates to what degree the T/V distinction is recoverable in English text, and with what textual features it correlates. Our findings are: (a) human raters can label English utterances as T or V fairly well, given sufficient context; (b), lexical cues can predict T/V almost at human level.

Ontology-based Distinction between Polysemy and Homonymy.
In: Proceedings of IWCS 2011. Oxford, UK, 2011.
Jason Utt and Sebastian Padó.
[url] [abstract] [BibTeX]

We consider the problem of distinguishing polysemous from homonymous nouns. This distinction is often taken for granted, but is seldom operationalized in the shape of an empirical model. We present a first step towards such a model, based on WordNet augmented with ontological classes provided by CoreLex. This model provides a polysemy index for each noun which (a), accurately distinguishes between polysemy and homonymy; (b), supports the analysis that polysemy can be grounded in the frequency of the meaning shifts shown by nouns; and (c), improves a regression model that predicts when the "one-sense-per-discourse" hypothesis fails.

Acquiring entailment pairs across languages and domains: A Data Analysis.
In: Proceedings of IWCS 2011. Oxford, UK, 2011.
Manaal Faruqui and Sebastian Padó.
[url] [abstract] [BibTeX]

Entailment pairs are sentence pairs of a premise and a hypothesis, where the premise textually entails the hypothesis. Such sentence pairs are important for the development of Textual Entailment systems. In this paper, we take a closer look at a prominent strategy for their automatic acquisition from newspaper corpora, pairing first sentences of articles with their titles. We propose a simple logistic regression model that incorporates and extends this heuristic and investigate its robustness across three languages and three domains. We manage to identify two predictors which predict entailment pairs with a fairly high accuracy across all languages. However, we find that robustness across domains within a language is more difficult to achieve.

Exemplar-Based Models for Word Meaning In Context.
In: Proceedings of ACL 2010. Uppsala, Sweden, 2010.
Katrin Erk and Sebastian Padó.
[url] [abstract] [BibTeX]

This paper describes ongoing work on distributional models for word meaning in context. We abandon the usual one-vector-per-word paradigm in favor of an exemplar model that activates only relevant occurrences. On a paraphrasing task, we find that a simple exemplar model outperforms more complex state-of-the-art models.

Training and Evaluating a German Named Entity Recognizer with Semantic Generalization.
In: Proceedings of KONVENS 2010. Saarbrücken, Germany, 2010.
Manaal Faruqui and Sebastian Padó.
[url] [abstract] [BibTeX]

We present a freely available optimized Named Entity Recognizer (NER) for German. It alleviates the small size of available NER training corpora for German with distributional generalization features trained on large unlabelled corpora. We vary the size and source of the generalization corpus and find improvements of 6% F1-score (in-domain) and 9% (out-of-domain) over simple supervised training.

Cross-lingual Induction of Selectional Preferences with Bilingual Vector Spaces.
In: Proceedings of NAACL 2010, pages 921-929. Los Angeles, CA, 2010.
Yves Peirsman and Sebastian Padó.
[url] [abstract] [BibTeX]

We describe a cross-lingual method for the induction of selectional preferences for resource-poor languages, where no accurate monolingual models are available. The method uses bilingual vector spaces to "translate" foreign language predicate-argument structures into a resource-rich language like English. The only prerequisite for constructing the bilingual vector space is a large unparsed corpus in the resource-poor language, although the model can profit from (even noisy) syntactic knowledge. Our experiments show that the cross-lingual predictions correlate well with human ratings, clearly outperforming monolingual baseline models.

Assessing the Role of Discourse References in Entailment Inference.
In: Proceedings of ACL 2010, pages 1209-1219. Uppsala, Sweden, 2010.
Shachar Mirkin, Ido Dagan and Sebastian Padó.
[url] [abstract] [BibTeX]

Discourse references, notably coreference and bridging, play an important role in many text understanding applications, but their impact on textual entailment is yet to be systematically understood. On the basis of an in-depth analysis of entailment instances, we argue that discourse references have the potential of substantially improving textual entailment recognition, and identify a number of research directions towards this goal.

Robust Machine Translation Evaluation with Entailment Features.
In: Proceedings of ACL 2009, pages 297-305. Singapore, 2009.
Sebastian Padó, Michel Galley, Christopher D. Manning and Daniel Jurafsky.
[url] [abstract] [BibTeX]

Existing evaluation metrics for machine translation lack crucial robustness: their correlations with human quality judgments vary considerably across languages and genres. We believe that the main reason is their inability to properly capture meaning: A good translation candidate means the same thing as the reference translation, regardless of formulation. We propose a metric that evaluates MT output based on a rich set of features motivated by textual entailment, such as lexical-semantic (in-)compatibility and argument structure overlap. We compare this metric against a combination metric of four state-of-the-art scores (BLEU, NIST, TER, and METEOR) in two different settings. The combination metric outperforms the individual scores, but is bested by the entailment-based metric. Combining the entailment and traditional features yields further improvements.

Semantic role assignment for event nominalisations by leveraging verbal data.
In: Proceedings of COLING 2008, pages 665-672. Manchester, UK, 2008.
Sebastian Padó, Marco Pennacchiotti and Caroline Sporleder.
[url] [abstract] [BibTeX]

This paper presents a novel approach to the task of semantic role labelling for event nominalisations, which make up a considerable fraction of predicates in running text, but are underrepresented in terms of training data and difficult to model. We propose to address this situation by data expansion. We construct a model for nominal role labelling solely from verbal training data. The best quality results from salvaging grammatical features where applicable, and generalising over lexical heads otherwise.

Formalising Multi-layer Corpora in OWL DL - Lexicon Modelling, Querying and Consistency Control.
In: Proceedings of IJCNLP 2008. Hyderabad, India, 2008.
Aljoscha Burchardt, Sebastian Padó, Dennis Spohr, Anette Frank and Ulrich Heid.
[url] [abstract] [BibTeX]

We present a general approach to formally modelling corpora with multi-layered annotation, thereby inducing a lexicon model in a typed logical representation language, OWL DL. This model can be interpreted as a graph structure that offers flexible querying functionality beyond current XML-based query languages and powerful methods for consistency control. We illustrate our approach by applying it to the syntactically and semantically annotated SALSA/TIGER corpus.

A Structured Vector Space Model for Word Meaning in Context.
In: Proceedings of EMNLP 2008, pages 897-906. Honolulu, HI, 2008.
Katrin Erk and Sebastian Padó.
[url] [abstract] [BibTeX]

We address the task of computing vector space representations for the meaning of word occurrences, which can vary widely according to context. This task is a crucial step towards a robust, vector-based compositional account of sentence meaning. We argue that existing models for this task do not take syntactic structure sufficiently into account. We present a novel structured vector space model that addresses these issues by incorporating the selectional preferences for argument positions. This makes it possible to integrate syntax into the computation of word meaning in context. In addition, the model performs at and above the state of the art for modeling the contextual adequacy of paraphrases.

Flexible, corpus-based modelling of Human Plausibility Judgments.
In: Proceedings of EMNLP/CoNLL 2007, pages 400-409. Prague, Czech Republic, 2007.
Sebastian Padó, Ulrike Padó and Katrin Erk.
[url] [abstract] [BibTeX]

In this paper, we consider the computational modelling of human plausibility judgements for verb-relation-argument triples, a task equivalent to the computation of selectional preferences. Such models have applications both in psycholinguistics and in computational linguistics. By extending a recent model, we obtain a completely corpus-driven model for this task which achieves significant correlations with human judgements. It rivals or exceeds deeper, resource-driven models while exhibiting higher coverage. Moreover, we show that our model can be combined with deeper models to obtain better predictions than from either model alone.

Annotation précise du français en sémantique de rôles par projection cross-linguistique.
In: Actes de TALN 2007. Toulouse, France, 2007.
Sebastian Padó and Guillaume Pitel.
[url] [abstract] [BibTeX]

Within the FrameNet paradigm, this article addresses the problem of accurate, automatic annotation of semantic roles in a language without an existing FrameNet lexicon. We evaluate the method proposed by Padó and Lapata (2005, 2006), which is based on role projection and was initially applied to the English-German language pair. We test its generalizability with respect to (a) languages, by applying it to the English-French pair, and (b) the quality of the source, by using automatic annotation on the English side. The experiments show results on a par with those obtained for German, allowing us to conclude that this approach has great potential for reducing the amount of work needed to create such resources in many languages.

Shalmaneser - a flexible toolbox for semantic role assignment.
In: Proceedings of LREC 2006. Genoa, Italy, 2006.
Katrin Erk and Sebastian Padó.
[url] [abstract] [BibTeX]

This paper presents Shalmaneser, a software package for shallow semantic parsing, the automatic assignment of semantic classes and roles to free text. Shalmaneser is a toolchain of independent modules communicating through a common XML format. System output can be inspected graphically. Shalmaneser can be used either as a "black box" to obtain semantic parses for new datasets (classifiers for English and German frame-semantic analysis are included), or as a research platform that can be extended to new parsers, languages, or classification paradigms.

SALTO - A Versatile Multi-Level Annotation Tool.
In: Proceedings of LREC 2006. Genoa, Italy, 2006.
Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Padó and Manfred Pinkal.
[url] [abstract] [BibTeX]

In this paper, we describe the SALTO tool. It was originally developed for the annotation of semantic roles in the frame semantics paradigm, but can be used for graphical annotation of treebanks with general relational information in a simple drag-and-drop fashion. The tool additionally supports corpus management and quality control.

The SALSA corpus: a German corpus resource for lexical semantics.
In: Proceedings of LREC 2006. Genoa, Italy, 2006.
Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Padó and Manfred Pinkal.
[url] [abstract] [BibTeX]

This paper describes the SALSA corpus, a large German corpus manually annotated with role-semantic information, based on the syntactically annotated TIGER newspaper corpus (Brants et al., 2002). The first release, comprising about 20,000 annotated predicate instances (about half the TIGER corpus), is scheduled for mid-2006. In this paper we discuss the frame-semantic annotation framework and its cross-lingual applicability, problems arising from exhaustive annotation, strategies for quality control, and possible applications.

Optimal Constituent Alignment with Edge Covers for Semantic Projection.
In: Proceedings of COLING/ACL 2006, pages 1161-1168. Sydney, Australia, 2006.
Sebastian Padó and Mirella Lapata.
[url] [abstract] [BibTeX]

Given a parallel corpus, semantic projection attempts to transfer semantic role annotations from one language to another, typically by exploiting word alignments. In this paper, we present an improved method for obtaining constituent alignments between parallel sentences to guide the role projection task. Our extensions are twofold: (a) we model constituent alignment as minimum weight edge covers in a bipartite graph, which allows us to find a globally optimal solution efficiently; (b) we propose tree pruning as a promising strategy for reducing alignment noise. Experimental results on an English-German parallel corpus demonstrate improvements over state-of-the-art models.

Analysing models of semantic role assignment using confusability.
In: Proceedings of HLT/EMNLP 2005. Vancouver, Canada, 2005.
Katrin Erk and Sebastian Padó.
[url] [abstract] [BibTeX]

We analyze models for semantic role assignment by defining a meta-model that abstracts over features and learning paradigms. This meta-model is based on the concept of role confusability, is defined in information-theoretic terms, and predicts that roles realized by less specific grammatical functions are more difficult to assign. We find that confusability is strongly correlated with the performance of classifiers based on syntactic features, but not for classifiers including semantic features. This indicates that syntactic features approximate a description of grammatical functions, and that semantic features provide an independent second view on the data.

Cross-lingual Bootstrapping for Semantic Lexicons: The case of FrameNet.
In: Proceedings of AAAI 2005. Pittsburgh, PA, 2005.
Sebastian Padó and Mirella Lapata.
[url] [abstract] [BibTeX]

This paper considers the problem of unsupervised semantic lexicon acquisition. We introduce a fully automatic approach which exploits parallel corpora, relies on shallow text properties, and is relatively inexpensive. Given the English FrameNet lexicon, our method exploits word alignments to generate frame candidate lists for new languages, which are subsequently pruned automatically using a small set of linguistically motivated filters. Evaluation shows that our approach can produce high-precision multilingual FrameNet lexicons without recourse to bilingual dictionaries or deep syntactic and semantic analysis.

Cross-lingual projection of role-semantic information.
In: Proceedings of HLT/EMNLP 2005, pages 859-866. Vancouver, Canada, 2005.
Sebastian Padó and Mirella Lapata.
[url] [abstract] [BibTeX]

This paper considers the problem of automatically inducing role-semantic annotations in the FrameNet paradigm for new languages. We introduce a general framework for semantic projection which exploits parallel texts, is relatively inexpensive and can potentially reduce the amount of effort involved in creating semantic resources. We propose projection models that exploit lexical and syntactic information. Experimental results on an English-German parallel corpus demonstrate the advantages of this approach.

A powerful and versatile XML Format for representing role-semantic annotation.
In: Proceedings of LREC 2004. Lisbon, Portugal, 2004.
Katrin Erk and Sebastian Padó.
[url] [abstract] [BibTeX]

We present two XML formats for the description and encoding of semantic role information in corpora. The TIGER/SALSA XML format provides a modular representation for semantic roles and syntactic structure. The Text-SALSA XML format is a lightweight version of TIGER/SALSA XML designed for manual annotation with an XML editor rather than a special tool. Both formats can deal with underspecification, roles crossing the sentence boundary, compound splitting, and whole-sentence tags for meta-level comments.

Semantic Role Labelling With Similarity-Based Generalisation Using EM-based Clustering.
In: Proceedings of SENSEVAL 3. Barcelona, Spain, 2004.
Ulrike Baldewein, Katrin Erk, Sebastian Padó and Detlef Prescher.
[url] [abstract] [BibTeX]

We describe a system for semantic role assignment built as part of the Senseval III task, based on an off-the-shelf parser and Maxent and Memory-Based learners. We focus on generalisation using several similarity measures to increase the amount of training data available and on the use of EM-based clustering to improve role assignment. Our final score yields Precision=73.6%, Recall=59.4% (F=65.7).

The Influence of Argument Structure on Semantic Role Assignment.
In: Proceedings of EMNLP 2004, pages 103-110. Barcelona, Spain, 2004.
Sebastian Padó and Gemma Boleda Torrent.
[url] [abstract] [BibTeX]

We present a data and error analysis for semantic role labelling. In a first experiment, we build a generic statistical model for semantic role assignment in the FrameNet paradigm and show that there is a high variance in performance across frames. The main hypothesis of our paper is that this variance is to a large extent a result of differences in the underlying argument structure of the predicates in different frames. In a second experiment, we show that frame uniformity, which measures argument structure variation, correlates well with the performance figures, effectively explaining the variance.

Semantic Role Labelling for Chunk Sequences.
In: Proceedings of the CoNLL 2004 shared task. Boston, MA, 2004.
Ulrike Baldewein, Katrin Erk, Sebastian Padó and Detlef Prescher.
[url] [abstract] [BibTeX]

We describe a statistical approach to semantic role labelling that employs only shallow information. We use a Maximum Entropy learner, augmented by EM-based clustering to model the fit between a verb and its argument candidate. The instances to be classified are sequences of chunks that occur frequently as arguments in the training corpus. Our best model obtains an F score of 51.70 on the test set.

Querying both time-aligned and hierarchical corpora with NXT search.
In: Proceedings of LREC 2004. Lisbon, Portugal, 2004.
Ulrich Heid, Holger Voormann, Jan-Uwe Milde, Ulrike Gut, Katrin Erk and Sebastian Padó.
[url] [abstract] [BibTeX]

One problem of the (re-)usability and exchange of annotated corpora is in the lack of standards in corpus This paper reports on the NXT Search tool, which was used to query two corpora with very different annotation. With automatic data format conversion, both corpora can be accessed and searched with NXT Search.

Towards a Resource for Lexical Semantics: A Large German Corpus with Extensive Semantic Annotation.
In: Proceedings of ACL 2003. Sapporo, Japan, 2003.
Katrin Erk, Andrea Kowalski, Sebastian Padó and Manfred Pinkal.
[url] [abstract] [BibTeX]

We describe the ongoing construction of a large, semantically annotated corpus resource as a reliable basis for the large-scale acquisition of word-semantic information, e.g. the construction of domain-independent lexica. The backbone of the annotation are semantic roles in the frame semantics paradigm. We report experiences and evaluate the annotated data from the first project stage. On this basis, we discuss the problems of vagueness and ambiguity in semantic annotation.

Constructing Semantic Space Models from Parsed Corpora.
In: Proceedings of ACL 2003, pages 128-135. Sapporo, Japan, 2003.
Sebastian Padó and Mirella Lapata.
[url] [abstract] [BibTeX]

Traditional vector-based models use word co-occurrence counts from large corpora to represent lexical meaning. In this paper we present a novel approach for constructing semantic spaces that takes syntactic relations into account. We introduce a formalisation for this class of models and evaluate their adequacy on two modelling tasks: semantic priming and automatic discrimination of lexical relations.

Workshop papers

Regular-pattern-sensitive CRFs for Distant Label Interactions.
In: Proceedings of the First Workshop on Structure-aware Large Language Models. Vienna, Austria, 2025.
Sean Papay, Roman Klinger and Sebastian Padó.
[url] [BibTeX]

Artwork Interpretation with Vision Language Models: A Case Study on Emotions and Emotion Symbols.
In: Proceedings of the IJCNLP-AACL workshop on Multimodal Models for Low-Resource Contexts and Social Impact. Mumbai, India, 2025.
Sebastian Padó and Kerstin Thomas.
[url] [BibTeX]

Low-Resource Sign Language Glossing Profits From Data Augmentation.
In: Proceedings of the IJCNLP-AACL workshop on sign language processing. Mumbai, India, 2025.
Vania Lara-Ortiz and Sebastian Padó.
[url] [BibTeX]

Actor Identification in Discourse: A Challenge for LLMs?
In: Proceedings of the CODI workshop. St Julian's, Malta, 2024.
Ana Barić, Sebastian Padó and Sean Papay.
[url] [BibTeX]

Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing.
In: Proceedings of the BlackboxNLP workshop. Singapore, 2023.
Dmitry Nikolaev and Sebastian Padó.
[url] [abstract] [BibTeX]

The question of what kinds of linguistic information are encoded in different layers of Transformer-based language models is of considerable interest for the NLP community. Existing work, however, has overwhelmingly focused on word-level representations and encoder-only language models with the masked-token training objective. In this paper, we present experiments with semantic structural probing, a method for studying sentence-level representations via finding a subspace of the embedding space that provides suitable task-specific pairwise distances between data-points. We apply our method to language models from different families (encoder-only, decoder-only, encoder-decoder) and of different sizes in the context of two tasks, semantic textual similarity and natural-language inference. We find that model families differ substantially in their performance and layer dynamics, but that the results are largely model-size invariant.

Understanding the Relation of User and News Representations in Content-Based Neural News Recommendation.
In: Proceedings of the SIGIR Workshop on News Recommendation and Analytics. Madrid, Spain, 2022.
Lucas Möller and Sebastian Padó.
[url] [abstract] [BibTeX]

A number of models for neural content-based news recommendation have been proposed. However, there is limited understanding of the relative importances of the three main components of such systems (news encoder, user encoder, and scoring function) and the trade-offs involved. In this paper, we assess the hypothesis that the most widely used means of matching user and candidate news representations is not expressive enough. We allow our system to model more complex relations between the two by assessing more expressive scoring functions. Across a wide range of baseline and established systems this results in consistent improvements of around 6 points in AUC. Our results also indicate a trade-off between the complexity of news encoder and scoring function: A fairly simple baseline model scores well above 68% AUC on the MIND dataset and comes within 2 points of the published state-of-the-art, while requiring a fraction of the computational costs.

Why Justifications of Claims Matter for Understanding Party Positions.
In: Proceedings of the 2nd Workshop on Computational Linguistics for Political Text Analysis. 2022.
Nico Blokker, Tanise Ceron, André Blessing, Erenay Dayanık, Sebastian Haunss, Jonas Kuhn, Gabriella Lapesa and Sebastian Padó.
[url] [BibTeX]

Word order typology in Multilingual BERT: A case study in subordinate clause detection.
In: Proceedings of the ACL SIGTYP workshop. 2022.
Dmitry Nikolaev and Sebastian Padó.
[url] [BibTeX]

Multi-Task Learning with Sentiment, Emotion, and Target Detection to Recognize Hate Speech and Offensive Language.
In: Proceedings of the FIRE HASOC workshop. 2021.
Flor M. Plaza-del-Arco, Sercan Halat, Sebastian Padó and Roman Klinger.
[url] [BibTeX]

Emotion Ratings: How Intensity, Annotation Confidence and Agreements are Entangled.
In: Proceedings of the EACL WASSA workshop, pages 50-61. 2021.
Enrica Troiano, Sebastian Padó and Roman Klinger.
[url] [BibTeX]

Using Hierarchical Class Structure to Improve Fine-Grained Claim Classification.
In: Proceedings of the ACL Workshop on Structured Prediction. Bangkok, Thailand, 2021.
Erenay Dayanık, André Blessing, Nico Blokker, Sebastian Haunss, Jonas Kuhn, Gabriella Lapesa and Sebastian Padó.
[url] [BibTeX]

Disentangling Document Topic and Author Gender in Multiple Languages: Lessons for Adversarial Debiasing.
In: Proceedings of the EACL WASSA workshop, pages 40-49. 2021.
Erenay Dayanık and Sebastian Padó.
[url] [BibTeX]

Swimming with the Tide? Positional Claim Detection across Political Text Types.
In: Proceedings of the NLP+CSS workshop, pages 24-34. Online, 2020.
Nico Blokker, Erenay Dayanık, Gabriella Lapesa and Sebastian Padó.
[url] [BibTeX]

Clustering-Based Article Identification in Historical Newspapers.
In: Proceedings of the NAACL LaTeCH-CLfL workshop. Minneapolis, MN, 2019.
Martin Riedl, Daniela Betz and Sebastian Padó.
[url] [BibTeX]

Modeling Paths for Explainable Knowledge Base Completion.
In: Proceedings of the ACL BlackboxNLP workshop. Florence, Italy, 2019.
Josua Stadelmaier and Sebastian Padó.
[url] [BibTeX]

Using Embeddings to Compare FrameNet Frames Across Languages.
In: COLING Workshop on Linguistic Resources for Natural Language Processing. Santa Fe, NM, 2018.
Jennifer Sikos and Sebastian Padó.
[url] [BibTeX]

Addressing Low-Resource Scenarios with Character-aware Embeddings.
In: Proceedings of the Second Workshop on Subword and Character Level Models. New Orleans, LA, 2018.
Sean Papay, Sebastian Padó and Ngoc Thang Vu.
[url] [abstract] [BibTeX]

Most modern approaches to computing word embeddings assume the availability of text corpora with billions of words. In this paper, we explore a setup where only corpora with millions of words are available, and many words in any new text are out of vocabulary. This setup is both of practical interest – modeling the situation for specific domains and low-resource languages – and of psycholinguistic interest, since it corresponds much more closely to the actual experiences and challenges of human language learning and use. We evaluate skip-gram word embeddings and two types of character-based embeddings on word relatedness prediction. On large corpora, performance of both model types is equal for frequent words, but character awareness already helps for infrequent words. Consistently, on small corpora, the character-based models perform overall better than skip-grams. The concatenation of different embeddings performs best on small corpora and robustly on large corpora.

Annotation, Modelling and Analysis of Fine-Grained Emotions on a Stance and Sentiment Detection Corpus.
In: Proceedings of the EMNLP WASSA workshop. Copenhagen, Denmark, 2017.
Hendrik Schuff, Jeremy Barnes, Julian Mohme, Sebastian Padó and Roman Klinger.
[url] [abstract] [BibTeX]

There is a rich variety of data sets for sentiment analysis (viz., polarity and subjectivity classification). For the more challenging task of detecting discrete emotions following the definitions of Ekman and Plutchik, however, there are much fewer data sets, and notably no resources for the social media domain. This paper contributes to closing this gap by extending the SemEval 2016 stance and sentiment dataset with emotion annotation. We (a) analyse annotation reliability and annotation merging; (b) investigate the relation between emotion annotation and the other annotation layers (stance, sentiment); (c) report modelling results as a baseline for future work.

Towards Cross-Lingual Comparability of Derivational Lexicons: An Extraction Algorithm for CELEX.
In: Proceedings of DeriMo. Milan, Italy, 2017.
Elnaz Shafaei, Diego Frassinelli, Gabriella Lapesa and Sebastian Padó.
[url] [BibTeX]

Investigating the Relationship between Literary Genres and Emotional Plot Development.
In: Proceedings of the ACL LaTeCH-CLfL workshop. Vancouver, BC, 2017.
Evgeny Kim, Sebastian Padó and Roman Klinger.
[url] [abstract] [BibTeX]

Literary genres are commonly viewed as being defined in terms of content and stylistic features. In this paper, we focus on one particular class of lexical features, namely emotion information, and investigate the hypothesis that emotion-related information correlates with particular genres. Using genre classification as a testbed, we compare a model that computes lexicon-based emotion scores globally for complete stories with a model that tracks emotion arcs through stories on a subset of Project Gutenberg with five genres. Our main findings are: (a) the global emotion model is competitive with a large-vocabulary bag-of-words genre classifier (80% F1); (b) the emotion arc model shows a lower performance (59% F1) but shows complementary behavior to the global model, as indicated by very good performance of an oracle ensemble (94% F1); (c) genres differ in the extent to which stories follow the same emotional arcs, with particularly uniform behavior for anger (mystery) and fear (adventures, romance, humor, science fiction).

Evaluating and Improving a Derivational Lexicon with Graph-theoretical Methods.
In: Proceedings of DeriMo. Milan, Italy, 2017.
Sean Papay, Gabriella Lapesa and Sebastian Padó.
[url] [BibTeX]

Predicting the Direction of Derivation in English conversion.
In: Proceedings of the ACL SIGMORPHON workshop, pages 93-98. Berlin, Germany, 2016.
Max Kisselew, Laura Rimell, Alexis Palmer and Sebastian Padó.
[url] [abstract] [BibTeX]

Conversion is a word formation operation that changes the grammatical category of a word in the absence of overt morphology. Conversion is extremely productive in English (e.g., tunnel, talk). This paper investigates whether distributional information can be used to predict the diachronic direction of conversion for homophonous noun–verb pairs. We aim to predict, for example, that tunnel was used as a noun prior to its use as a verb. We test two hypotheses: (1) that derived forms are less frequent than their bases, and (2) that derived forms are more semantically specific than their bases, as approximated by information theoretic measures. We find that hypothesis (1) holds for N-to-V conversion, while hypothesis (2) holds for V-to-N conversion. We achieve the best overall account of the historical data by taking both frequency and semantic specificity into account. These results provide a new perspective on linguistic theories regarding the semantic specificity of derivational morphemes, and on the morphosyntactic status of conversion.

Measuring Semantic Content To Assess Asymmetry in Derivation.
In: Proceedings of the IWCS Workshop on Advances in Distributional Semantics. London, UK, 2015.
Sebastian Padó, Alexis Palmer, Max Kisselew and Jan Šnajder.
[url] [BibTeX]

Mapping conceptual features to referential properties.
In: Proceedings of the 3rd international ESSENCE workshop: Algorithms for processing meaning. Barcelona, Spain, 2015.
Abhijeet Gupta, Gemma Boleda, Marco Baroni and Sebastian Padó.
[BibTeX]

Morphological Priming in German: The Word is Not Enough (Or Is It?).
In: Proceedings of NetWords, pages 42-45. Pisa, Italy, 2015.
Sebastian Padó, Britta Zeller and Jan Šnajder.
[url] [BibTeX]

Same Same but Different: Type and Typicality in a Distributional Model of Complement Coercion.
In: Proceedings of NetWords, pages 91-94. Pisa, Italy, 2015.
Alessandra Zarcone, Sebastian Padó and Alessandro Lenci.
[url] [abstract] [BibTeX]

We aim to model the results from a self-paced reading experiment, which tested the effect of semantic type clash and typicality on the processing of German complement coercion. We present two distributional semantic models to test if they can model the effect of both type and typicality in the psycholinguistic study. We show that one of the models, without explicitly representing type information, can account both for the effect of type and typicality in complement coercion.

GermEval 2014 Named Entity Recognition Shared Task: Companion Paper.
In: Proceedings of the KONVENS GermEval workshop, pages 104-112. Hildesheim, Germany, 2014.
Darina Benikova, Chris Biemann, Max Kisselew and Sebastian Padó.
[BibTeX]

Using UIMA to Structure An Open Platform for Textual Entailment.
In: Proceedings of the 3rd Workshop on Unstructured Information Management Architecture, pages 26-33. Darmstadt, Germany, 2013.
Tae-Gil Noh and Sebastian Padó.
[url] [abstract] [BibTeX]

EXCITEMENT is a novel, open software platform for Textual Entailment (TE) which uses the UIMA framework. This paper discusses the design considerations regarding the roles of UIMA within the EXCITEMENT Open Platform (EOP). We focus on two points: (a) how to best design the representation of entailment problems within UIMA CAS and its type system, and (b) the integration and usage of UIMA components among non-UIMA components.

The Curious Case of Metonymic Verbs: A Distributional Characterization.
In: Proceedings of the IWCS workshop "Towards A Formal Distributional Semantics". Potsdam, Germany, 2013.
Jason Utt, Alessandro Lenci, Sebastian Padó and Alessandra Zarcone.
[url] [abstract] [BibTeX]

Logical metonymy combines an event-selecting verb with an entity-denoting noun (e.g., The writer began the novel), triggering a covert event interpretation (e.g., reading, writing). Experimental investigations of logical metonymy must assume a binary distinction between metonymic (i.e. event-selecting) verbs and non-metonymic verbs to establish a control condition. However, this binary distinction (whether a verb is metonymic or not) is mostly made on intuitive grounds, which introduces a potential confounding factor. We describe a corpus-based approach which characterizes verbs in terms of their behavior at the syntax-semantics interface. The model assesses the extent to which transitive verbs prefer event-denoting objects over entity-denoting objects. We then test this “eventhood” measure on psycholinguistic datasets, showing that it can distinguish not only metonymic from non-metonymic verbs, but that it can also capture more fine-grained distinctions among different classes of metonymic verbs, putting such distinctions into a new graded perspective.

Modeling covert event retrieval in logical metonymy: probabilistic and distributional accounts.
In: Proceedings of the NAACL Workshop on Cognitive Modeling in Computational Linguistics, pages 70-79. Montreal, QC, 2012.
Alessandra Zarcone, Jason Utt and Sebastian Padó.
[url] [abstract] [BibTeX]

Logical metonymies (The student finished the beer) represent a challenge to compositionality since they involve semantic content not overtly realized in the sentence (covert events -> drinking the beer). We present a contrastive study of two classes of computational models for logical metonymy in German, namely a probabilistic model and a distributional, similarity-based model. We build both models from the SDEWAC corpus and evaluate them against a dataset from a self-paced reading and a probe recognition study for their sensitivity to thematic fit effects via their accuracy in predicting the correct covert event in a metonymical context. The similarity-based models allow for better coverage while maintaining the accuracy of the probabilistic models.

A Distributional Memory for German.
In: Proceedings of the KONVENS workshop on recent developments and applications of lexical-semantic resources. Vienna, Austria, 2012.
Sebastian Padó and Jason Utt.
[url] [abstract] [BibTeX]

This paper describes the creation of a Distributional Memory (Baroni and Lenci 2010) resource for German. Distributional Memory is a generalized distributional resource for lexical semantics that does not have to commit to a particular vector space at the time of creation. We induce a resource from a German corpus, following the original design decisions as closely as possible, and discuss the steps necessary for a new language. We evaluate the German DM model on a synonym selection task, finding that it can compete with existing models.

Dependency-based Question Validation for German.
In: CLEF Working Notes. Amsterdam, Netherlands, 2011.
Svitlana Babych, Alexander Henn, Jan Pawellek and Sebastian Padó.
[url] [abstract] [BibTeX]

This article describes the Heidelberg contribution to the CLEF 2011 QA4MRE task for German. We focus on the objective of not using any external resources, building a system that represents questions, answers and texts as formulae in propositional logic derived from dependency structure. Background knowledge is extracted from the background corpora using several knowledge extraction strategies. We answer questions by attempting to infer answers from the test documents complemented by background knowledge, with a distance measure as fall-back. The main challenge is to specify the translation from dependency structure into a logical representation. For this step, we suggest different rule sets and evaluate various configuration parameters that tune accuracy and coverage. All runs exceed a random baseline but show different coverage/accuracy profiles (accuracy up to 44%, coverage up to 65%).

Soundex-based Translation Correction in Urdu-English Cross-Language Information Retrieval.
In: Proceedings of the IJCNLP Workshop on Cross-Lingual Information Retrieval, pages 25-29. Chiang Mai, Thailand, 2011.
Manaal Faruqui, Prasenjit Majumdar and Sebastian Padó.
[url] [abstract] [BibTeX]

Cross-language information retrieval is difficult for languages with few processing tools or resources such as Urdu. An easy way of translating content words is provided by Google Translate, but due to lexicon limitations named entities (NEs) are transliterated letter by letter. The resulting NE errors (zynydyny zdn for Zinedine Zidane) hurt retrieval. We propose to replace English non-words in the translation output. First, we determine phonetically similar English words with the Soundex algorithm. Then, we choose among them by a modified Levenshtein distance that models correct transliteration patterns. This strategy yields an improvement of 4% MAP (from 41.2 to 45.1, monolingual 51.4) on the FIRE-2010 dataset.

Multi-Way Classification of Semantic Relations Between Pairs of Nominals.
In: Proceedings of the 5th SIGLEX Workshop on Semantic Evaluation, pages 33-38. Uppsala, Sweden, 2010.
Iris Hendrickx, Su Nam Kim, Zornitsa Kozareva, Preslav Nakov, Diarmuid Ó Séaghdha, Sebastian Padó, Marco Pennacchiotti, Lorenza Romano and Stan Szpakowicz.
[url] [abstract] [BibTeX]

We describe the SemEval-2 task 8 (multi-way classification of semantic relations between pairs of nominals). The task was designed to compare different approaches to the problem and to provide a standard testbed for future research. This paper defines the task, describes the training and test data and the process of their creation, lists the participating systems (10 teams, 28 runs), and discusses their results.

"I like work: I can sit and look at it for hours" - Type clash vs. plausibility in covert event recovery.
In: Proceedings of the VERB 2010 workshop. Pisa, Italy, 2010.
Alessandra Zarcone and Sebastian Padó.
[BibTeX]

Textual Entailment Features for Machine Translation Evaluation.
In: Proceedings of the EACL Workshop on Machine Translation, pages 37-41. Athens, Greece, 2009.
Sebastian Padó, Michel Galley, Christopher D. Manning and Daniel Jurafsky.
[url] [abstract] [BibTeX]

We present two regression models for the prediction of pairwise preference judgments among MT hypotheses. Both models are based on feature sets that are motivated by textual entailment and incorporate lexical similarity as well as local syntactic features and specific semantic phenomena. One model predicts absolute scores; the other one direct pairwise judgments. We find that both models are competitive with regression models built over the scores of established MT evaluation metrics. Further data analysis clarifies the complementary behavior of the two feature sets.

Paraphrase assessment in structured vector space: Exploring parameters and datasets.
In: Proceedings of the EACL Workshop on Geometrical Methods for Natural Language Semantics, pages 57-65. Athens, Greece, 2009.
Katrin Erk and Sebastian Padó.
[url] [abstract] [BibTeX]

The appropriateness of paraphrases for words often depends on context: "grab" can replace "catch" in "catch a ball", but not in "catch a cold". Structured Vector Space (SVS) is a model that computes word meaning in context in order to assess the appropriateness of such paraphrases. This paper investigates "best-practice" parameter settings for SVS, and it presents a method to obtain large datasets for paraphrase assessment from corpora with WSD annotation.

Multi-word expressions in Textual Entailment: Much ado about nothing?
In: Proceedings of the ACL TextInfer workshop, pages 1-9. Singapore, 2009.
Marie-Catherine de Marneffe, Sebastian Padó and Christopher D. Manning.
[url] [abstract] [BibTeX]

Multi-word expressions (MWE) have seen much attention from the NLP community. In this paper, we investigate their impact on the recognition of textual entailment (RTE). Using the manual Microsoft Research annotations, we first manually count and classify MWEs in RTE data. We find few, most of which are arguably unlikely to cause processing problems. We then consider the impact of MWEs on a current RTE system. We are unable to confirm that entailment recognition suffers from wrongly aligned MWEs. In addition, MWE alignment is difficult to improve, since MWEs are poorly represented in state-of-the-art paraphrase resources, the only available sources for multi-word similarities. We conclude that RTE should concentrate on other phenomena impacting entailment, and that paraphrase knowledge is best understood as capturing general lexico-syntactic variation.

SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals.
In: Proceedings of the NAACL Workshop on Semantic Evaluations: Recent Achievements and Future Directions, pages 94-99. Boulder, CO, 2009.
Iris Hendrickx, Su Nam Kim, Zornitsa Kozareva, Preslav Nakov, Diarmuid Ó Séaghdha, Sebastian Padó, Marco Pennacchiotti, Lorenza Romano and Stan Szpakowicz.
[url] [abstract] [BibTeX]

We present a brief overview of the main challenges in the extraction of semantic relations from English text, and discuss the shortcomings of previous data sets and shared tasks. This leads us to introduce a new task, which will be part of SemEval-2010: multi-way classification of mutually exclusive semantic relations between pairs of common nominals. The task is designed to compare different approaches to the problem and to provide a standard testbed for future research, which can benefit many applications in Natural Language Processing.

The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages.
In: Proceedings of CoNLL-2009, pages 1-18. Boulder, CO, 2009.
Jan Hajič, Massimiliano Ciaramita, Richard Johansson, Daisuke Kawahara, Maria A. Martí, Lluís Màrquez, Adam Meyers, Joakim Nivre, Sebastian Padó, Jan Štěpánek, Pavel Straňák, Mihai Surdeanu, Nianwen Xue and Yi Zhang.
[url] [abstract] [BibTeX]

For the 11th straight year, the Conference on Computational Natural Language Learning has been accompanied by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2009, the shared task was dedicated to the joint parsing of syntactic and semantic dependencies in multiple languages. This shared task combines the shared tasks of the previous five years under a unique dependency-based formalism similar to the 2008 task. In this paper, we define the shared task, describe how the data sets were created and show their quantitative properties, report the results and summarize the approaches of the participating systems.

Deciding Entailment and Contradiction with Stochastic and Edit Distance-based Alignment.
In: Proceedings of the Text Analysis Conference. Gaithersburg, MD, 2008.
Sebastian Padó, Marie-Catherine de Marneffe, Bill MacCartney, Anna N. Rafferty, Eric Yeh and Christopher D. Manning.
[BibTeX]

Cross-lingual Parallelism and Translational Equivalence: The Case of FrameNet Frames.
In: Proceedings of the NODALIDA Workshop on Building Frame Semantics Resources for Scandinavian and Baltic Languages. Tartu, Estonia, 2007.
Sebastian Padó.
[url] [abstract] [BibTeX]

Annotation projection is a strategy for the cross-lingual transfer of annotations which can be used to bootstrap linguistic resources for low-density languages, such as role-semantic databases similar to FrameNet. In this paper, we investigate the main assumption underlying annotation projection, cross-lingual parallelism, which states that annotation is parallel across languages. Concentrating on the level of frames, we provide a qualitative and quantitative characterisation of the relationship between translation and cross-lingual parallelism on the basis of a trilingual English–French–German corpus. We link frame (non)-parallelism to different kinds of translational shifts and show that a simple heuristic can detect the majority of such shifts.

Inducing a Computational Lexicon from a Corpus with Syntactic and Semantic Annotation.
In: Proceedings of IWCS-7. Tilburg, The Netherlands, 2007.
Dennis Spohr, Aljoscha Burchardt, Sebastian Padó, Anette Frank and Ulrich Heid.
[url] [abstract] [BibTeX]

To date, linguistically annotated corpora are mainly exploited for feature-based training of automatic labelling systems. In this paper, we present a general approach for the Description Logics-based modelling of multi-layered annotated corpora which offers (i) flexible and enhanced querying functionality that goes beyond current XML-based query languages, (ii) a basis for consistency checking, and (iii) a general method for defining abstractions over corpus annotations. We apply this method to the syntactically and semantically annotated SALSA/TIGER corpus. By defining abstractions over the corpus data, we generalise from a large set of individual corpus annotations to a corresponding lexicon model. We discuss issues arising from modelling multi-layered corpus annotations in Description Logics and illustrate the benefits of our approach with concrete examples.

Towards a Computational Model of Gradience in Word Sense.
In: Proceedings of IWCS-7. Tilburg, The Netherlands, 2007.
Katrin Erk and Sebastian Padó.
[BibTeX]

To cause or not to cause: Cross-lingual semantic matching for paraphrase modelling.
In: Proceedings of the Cross-Language Knowledge Induction Workshop. Cluj-Napoca, Romania, 2005.
Sebastian Padó and Katrin Erk.
[BibTeX]

PropBank, SALSA and FrameNet: How Design Determines Product.
In: Proceedings of the Workshop on Building Lexical Resources From Semantically Annotated Corpora, LREC-2004. Lisbon, Portugal, 2004.
Michael Ellsworth, Katrin Erk, Paul Kingsbury and Sebastian Padó.
[url] [abstract] [BibTeX]

We compare three projects that annotate semantic roles: PropBank, FrameNet, and SALSA. The first part of our analysis is a comparison of the different word sense distinction criteria underlying the annotation. Then, we study the effects of these criteria at the level of actual phenomena that require annotation. In particular, we discuss metaphor, support constructions, words with multiple meaning aspects, phrases realizing more than one semantic role, and nonlocal semantic roles.

Building a Resource for Lexical Semantics.
In: Proceedings of the Workshop on Frame Semantics, XVII. International Congress of Linguists. Prague, Czech Republic, 2003.
Katrin Erk, Andrea Kowalski, Sebastian Padó and Manfred Pinkal.
[BibTeX]

The SALSA Annotation Tool.
In: Proceedings of the Workshop on Prospects and Advances in the Syntax/Semantics Interface. Nancy, France, 2003.
Katrin Erk, Andrea Kowalski and Sebastian Padó.
[BibTeX]

Towards a better understanding of frame element assignment errors.
In: Proceedings of the Workshop on Prospects and Advances in the Syntax/Semantics Interface. Nancy, France, 2003.
Sebastian Padó and Gemma Boleda Torrent.
[BibTeX]

Book chapters

Emotion Analysis for Literary Studies.
In: N. Reiter, A. Pichler and J. Kuhn, editors, Reflected Computational Text Analysis, pages 237-268. De Gruyter, 2020.
Roman Klinger, Evgeny Kim and Sebastian Padó.
[url] [abstract] [BibTeX]

Most approaches to emotion analysis in fictional texts focus on detecting the emotion class expressed over the course of a text, either with machine learning-based classification or with dictionaries. These approaches do not consider who experiences the emotion and what triggers it and therefore, as a necessary simplification, aggregate across different characters and events. This constitutes a research gap, as emotions play a crucial role in the interaction between characters and the events they are involved in. We fill this gap with the development of two corpora and associated computational models which represent individual events together with their experiencers and stimuli. The first resource, REMAN (Relational EMotion ANnotation), aims at a fine-grained annotation of all these aspects on the text level. The second corpus, FANFIC, contains complete stories, annotated on the experiencer-stimulus level, i.e., focuses on emotional relations among characters. FANFIC is therefore a character relation corpus while REMAN considers event descriptions in addition. Our experiments show that computational stimuli detection is particularly challenging. Furthermore, predicting roles in joint models has the potential to perform better than separate predictions. These resources provide a starting point for future research on the recognition of emotions and associated entities in text. They support qualitative literary studies and digital humanities research. The corpora are freely available at http://www.ims.uni-stuttgart.de/data/emotion.

Statistical Machine Translation Support Improves Human Adjective Translation.
In: O. Culo and S. Hansen-Schirra, editors, Crossroads between Contrastive Linguistics, Translation Studies and Machine Translation: TC3 II, pages 121-152. Language Science Press, 2017.
Gerhard Kremer, Matthias Hartung, Sebastian Padó and Stefan Riezler.
[url] [abstract] [BibTeX]

In this paper we present a study in computer-assisted translation, investigating whether non-professional translators can profit directly from automatically constructed bilingual phrase pairs. Our support is based on state-of-the-art statistical machine translation (SMT), consisting of a phrase table that is generated from large parallel corpora, and a large monolingual language model. In our experiment, human translators were asked to translate adjective–noun pairs in context in the presence of suggestions created by the SMT model. Our results show that SMT support results in an acceptable slowdown in translation time while significantly improving translation quality.

Textual Entailment.
In: R. Mitkov, editor, Oxford Handbook of Computational Linguistics, 2nd edition. Oxford University Press, 2016.
Sebastian Padó and Ido Dagan.
[url] [abstract] [BibTeX]

Textual entailment is a binary relation between two natural-language texts (called ‘text’ and ‘hypothesis’), where readers of the ‘text’ would agree the ‘hypothesis’ is most likely true (Peter is snoring → A man sleeps). Its recognition requires an account of linguistic variability (an event may be realized in different ways, e.g. Peter buys the car ↔ The car is purchased by Peter) and of relationships between events (e.g. Peter buys the car → Peter owns the car). Unlike logics-based inference, textual entailment also covers cases of probable but still defeasible entailment (A hurricane hit Peter’s town → Peter’s town was damaged). Since human common-sense reasoning often involves such defeasible inferences, textual entailment is of considerable interest for real-world language processing tasks, as a generic, application-independent framework for semantic inference. This chapter discusses the history of textual entailment, approaches to recognizing it, and its integration in various NLP tasks.

Semantics in Computational Lexicons.
In: C. Maienborn, K. von Heusinger and P. Portner, editors, Semantics: An International Handbook of Natural Language Meaning, pages 2887-2917. De Gruyter, 2012.
Anette Frank and Sebastian Padó.
[url] [abstract] [BibTeX]

This chapter gives an overview of work on the representation of semantic information in lexicon resources for computational natural language processing (NLP). It starts with a broad overview of the history and state of the art of different types of semantic lexicons in Computational Linguistics, and discusses their main use cases. Section 2 is devoted to questions of how to construct semantic lexicons for Computational Linguistics. We discuss diverse modelling principles for semantic lexicons and methods for their construction, ranging from largely manual resource creation to automated methods for learning lexicons from text, semi-structured or unstructured. Section 3 addresses issues related to the cross-lingual and multi-lingual creation of broad-coverage semantic lexicon resources. Section 4 discusses interoperability, i.e., the combination of lexical (and other) resources describing different meaning aspects. Section 5 concludes with an outlook on future research directions.

Machine Translation Evaluation and Optimization.
In: J. Olive, C. Christianson and J. McCary, editors, Handbook of Natural Language Processing and Machine Translation: DARPA Global Autonomous Language Exploitation, pages 745-843. Springer, 2011.
B. Dorr, Y. Al-Onaizan, M. Galley, N. Habash, D. Jones, S. Kulick, A. Lavie, G. Leusch, N. Madnani, C. Manning, M. Marcus, A. Mauser, M. Ostendorf, S. Padó, M. Przybocki, A. Rosti, R. Schwartz, M. Snover, C. Tate, S. Vogel and C. Voss.
[url] [BibTeX]

Using FrameNet for the Semantic Analysis of German: Annotation, Representation, and Automation.
In: H. C. Boas, editor, Multilingual FrameNets in Computational Lexicography - Methods and Applications, pages 209-244. De Gruyter, 2009.
Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Padó and Manfred Pinkal.
[url] [BibTeX]

Abstracts, preprints and other types of publications [some not peer reviewed]

Efficient Language Modeling for Low-Resource Settings with Hybrid RNN–Transformer Architectures.
2025. Manuscript.
Gabriel Lindenmaier, Sean Papay and Sebastian Padó.
[url] [BibTeX]

Democratizing News Recommenders: Modeling Multiple Perspectives for News Candidate Generation with VQ-VAE.
2025. Manuscript.
Hardy Hardy, Sebastian Padó, Amelie Wührl and Tanise Ceron.
[url] [BibTeX]

Meta Learning for Code Summarization.
2022. Manuscript.
Moiz Rauf, Sebastian Padó and Michael Pradel.
[url] [BibTeX]

Identifying Humor, Critique, and Gender: Computational Analysis of the Gracioso Archetype in Spanish Golden Age Theater.
In: Digital Humanities. Lisbon, Portugal, 2025.
Allison Keith, Antonio Rojas Castro, Kerstin Jung, Hanno Ehrlicher and Sebastian Padó.
[BibTeX]

A Cross-Lingual Evaluation of Political Opinions in Multilingual Large Language Models.
In: Proceedings of IC2S2. 2025.
Franziska Weeber, Tanise Ceron and Sebastian Padó.
[BibTeX]

Transformers fail to predict consistent effects for agreement attraction configurations.
In: Proceedings of AMLaP. Prague, Czech Republic, 2025.
Titus von der Malsburg and Sebastian Padó.
[BibTeX]

From Annotations in TEI to Natural Language Processing: A Computational Analysis of Characters in Calderón Drama Corpus.
In: Book of Abstracts, 'Texts, Languages and Communities' - TEI 2024. Buenos Aires, Argentina, 2024.
Hanno Ehrlicher, Antonio Rojas Castro, Sebastian Padó, Kerstin Jung and Allison Keith.
[BibTeX]

Codificación TEI y análisis de redes: a propósito de Calderón Drama Corpus (CalDraCor) v.2.0.
In: Book of Abstracts, 'Texts, Languages and Communities' - TEI 2024. Buenos Aires, Argentina, 2024.
Hanno Ehrlicher, Antonio Rojas Castro, Sebastian Padó, Kerstin Jung and Allison Keith.
[BibTeX]

How computers (attempt to) translate emotions.
Circuit – le magazine d'information des langagiers, 154. 2022.
Sebastian Padó, Enrica Troiano and Roman Klinger.
[url] [BibTeX]

Difference of first attestation dates as evidence for directionality in zero derivation.
In: Proceedings of the DGfS workshop on the semantics of derivational morphology. Freiburg im Breisgau, Germany, 2021.
Gianina Iordăchioaia, Gabriella Lapesa, Sabrina Meyer and Sebastian Padó.
[BibTeX]

Who's in the news? Methodological challenges and opportunities in studying 19th century writers in historical newspapers.
Europeanatech Insight, Issue 16: Newspapers. 2020.
Jana Keck, Moritz Knabben and Sebastian Padó.
[url] [BibTeX]

Learning Trilingual Dictionaries for Urdu - Roman Urdu - English.
In: Proceedings of the ACL Workshop on Widening NLP. Florence, Italy, 2019.
Moiz Rauf and Sebastian Padó.
[BibTeX]

Supporting Discourse Network Analysis through Machine Learning for Claim Detection and Classification.
In: Proceedings of the 4th European Conference on Social Networks. Zurich, Switzerland, 2019.
Sebastian Haunss, Nico Blokker, Sebastian Padó, Jonas Kuhn, Andre Blessing, Gabriella Lapesa and Erenay Dayanık.
[BibTeX]

Entities as a window into distributional semantics.
In: RANLP 2019. Varna, Bulgaria, 2019. Slides for invited talk.
Sebastian Padó.
[url] [BibTeX]

Distributional Semantics reveals cross-cultural differences in food concepts.
In: Proceedings of AMLaP. Moscow, Russia, 2019.
Diego Frassinelli, Gabriella Vigliocco and Sebastian Padó.
[BibTeX]

Digitale Modellierung von Figurenkomplexität am Beispiel des Parzival von Wolfram von Eschenbach.
In: Digital Humanities im Deutschsprachigen Raum. Cologne, Germany, 2018.
Manuel Braun, Roman Klinger, Sebastian Padó and Gabriel Viehhauser.
[BibTeX]

Type disambiguation of English -ment derivatives.
In: Proceedings of the 11th Mediterranean Morphology Meeting. Nicosia, Cyprus, 2017.
Gabriella Lapesa, Lea Kawaletz, Marios Andreou, Max Kisselew, Sebastian Padó and Ingo Plag.
[BibTeX]

'Over reference': A comparative study on German prefix verbs.
In: ESSLLI SemRefPlus Workshop: Referential semantics one step further: Incorporating insights from conceptual and distributional approaches to meaning. Bolzano, Italy, 2016.
Tillmann Pross, Antje Rossdeutscher, Sebastian Padó, Gabriella Lapesa and Max Kisselew.
[BibTeX]

Instance-based disambiguation of English -ment derivatives.
In: Proceedings of the conference on cognitive structures: Linguistic, Philosophical and Psychological Perspectives. Düsseldorf, Germany, 2016.
Marios Andreou, Lea Kawaletz, Max Kisselew, Gabriella Lapesa, Sebastian Padó and Ingo Plag.
[BibTeX]

Quantifying regularity in morphological processes: An ongoing study on nominalization in German.
In: ESSLLI DISSALT Workshop: Distributional Semantics and Semantic Theory. Bolzano, Italy, 2016.
Rossella Varvara, Gabriella Lapesa and Sebastian Padó.
[BibTeX]

Characterizing the pragmatic component of distributional vectors in terms of polarity: Experiments on German über verbs.
In: ESSLLI DISSALT Workshop: Distributional Semantics and Semantic Theory. Bolzano, Italy, 2016.
Gabriella Lapesa, Max Kisselew, Sebastian Padó, Tillmann Pross and Antje Rossdeutscher.
[BibTeX]

CRETA (Centrum für reflektierte Textanalyse) - Fachübergreifende Methodenentwicklung in den Digital Humanities.
In: Digital Humanities im Deutschsprachigen Raum. Leipzig, Germany, 2016.
Jonas Kuhn, Artemis Alexiadou, Manuel Braun, Thomas Ertl, Sabine Holtz, Cathleen Kantner, Catrin Misselhorn, Sebastian Padó, Sandra Richter, Achim Stein and Claus Zittel.
[BibTeX]

Logical metonymy: Disentangling object type and thematic fit.
In: Architecture and Mechanisms of Language Processing. Aix-en-Provence, France, 2013.
Alessandra Zarcone and Sebastian Padó.
[BibTeX]

Challenges in lexical semantics: Non-compositionality in SALSA corpus annotation.
In: Deutsche Gesellschaft für Sprachwissenschaft. Bielefeld, Germany, 2006.
Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Padó and Manfred Pinkal.
[BibTeX]

Consistency and Coverage: Challenges for exhaustive semantic annotation.
In: Deutsche Gesellschaft für Sprachwissenschaft. Bielefeld, Germany, 2006.
Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Padó and Manfred Pinkal.
[BibTeX]

PhD Thesis

Cross-Lingual Annotation Projection Models for Role-Semantic Information.
Institute for Computational Linguistics, Saarland University, 2007.
Sebastian Padó.
[a4] [a5]