B. Zeller, S. Padó, and J. Šnajder. Towards Semantic Validation of a Derivational Lexicon. Proceedings of COLING 2014, Dublin, Ireland.

Note: Data available here.

Derivationally related lemmas like (friend – friendly – friendship) are derived from a common stem. Frequently, their meanings are also systematically related. However, there are also many examples of derivationally related lemma pairs whose meanings differ substantially, e.g., (object – objective). Most broad-coverage derivational lexicons do not reflect this distinction, mixing up semantically related and unrelated word pairs. In this paper, we investigate strategies to recover the above distinction by recognizing semantically related lemma pairs, a process we call semantic validation. We make two main contributions: First, we perform a detailed data analysis on the basis of a large German derivational lexicon. It reveals two promising sources of information (distributional semantics and structural information about derivational rules), but also systematic problems with these sources. Second, we develop a classification model for the task that reflects the noisy nature of the data. It achieves an improvement of 13.6% in precision and 5.8% in F1-score over a strong majority class baseline. Our experiments confirm that both information sources contribute to semantic validation, and that they are complementary enough that the best results are obtained from a combined model.

  author    = {Zeller, Britta  and  Pad\'{o}, Sebastian  and  \v{S}najder, Jan},
  title     = {Towards Semantic Validation of a Derivational Lexicon},
  booktitle = {Proceedings of COLING 2014, the 25th International 
               Conference on Computational Linguistics: Technical Papers},
  month     = {August},
  year      = {2014},
  address   = {Dublin, Ireland},
  publisher = {Dublin City University and Association for Computational Linguistics},
  pages     = {1728--1739},
  url       = {http://www.aclweb.org/anthology/C14-1163}