Gupta, Boleda, Baroni, and Padó 2015

A. Gupta, G. Boleda, M. Baroni, and S. Padó: Distributional vectors encode referential attributes. Proceedings of EMNLP-15, Lisbon, Portugal.

Distributional methods have proven to excel at capturing fuzzy, graded aspects of meaning (Italy is more similar to Spain than to Germany). In contrast, it is difficult to extract the values of more specific attributes of word referents from distributional representations, attributes of the kind typically found in structured knowledge bases (Italy has 60 million inhabitants). In this paper, we pursue the hypothesis that distributional vectors also implicitly encode referential attributes. We show that a standard supervised regression model is in fact sufficient to retrieve such attributes to a reasonable degree of accuracy: When evaluated on the prediction of both categorical and numeric attributes of countries and cities, the model consistently reduces baseline error by 30%, and is not far from the upper bound. Further analysis suggests that our model is able to "objectify" distributional representations for entities, anchoring them more firmly in the external world in measurable ways.
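
To make the idea of retrieving a numeric attribute from a vector with supervised regression concrete, here is a minimal sketch. It is not the model or data from the paper: the 5-dimensional vectors, the attribute values, and the names `TRUE_WEIGHTS`, `make_entity`, `fit`, and `predict` are all hypothetical, and the regressor is a plain linear regression trained by gradient descent on synthetic data. The comparison against a predict-the-training-mean baseline only mirrors the kind of evaluation the abstract describes.

```python
import random

random.seed(0)
DIM = 5  # toy dimensionality (assumption); real distributional vectors are much larger

# Hypothetical linear relation, used only to generate synthetic data.
TRUE_WEIGHTS = [2.0, -1.0, 0.5, 1.5, -0.5]


def make_entity():
    """Return a synthetic (vector, attribute) pair for one made-up entity."""
    vec = [random.uniform(-1.0, 1.0) for _ in range(DIM)]
    attr = sum(w * x for w, x in zip(TRUE_WEIGHTS, vec)) + random.gauss(0.0, 0.05)
    return vec, attr


data = [make_entity() for _ in range(60)]
train, test = data[:40], data[40:]


def predict(weights, bias, vec):
    """Linear prediction of the attribute from a vector."""
    return bias + sum(w * x for w, x in zip(weights, vec))


def fit(examples, lr=0.1, epochs=500):
    """Fit a linear regression by full-batch gradient descent on squared error."""
    weights, bias = [0.0] * DIM, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * DIM, 0.0
        for vec, target in examples:
            err = predict(weights, bias, vec) - target
            grad_b += 2 * err / n
            for i, x in enumerate(vec):
                grad_w[i] += 2 * err * x / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def mean_abs_error(predictions, targets):
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


weights, bias = fit(train)
test_targets = [attr for _, attr in test]

# Regression model: predict each held-out attribute from its vector.
model_mae = mean_abs_error([predict(weights, bias, v) for v, _ in test], test_targets)

# Baseline: always predict the mean attribute value seen in training.
train_mean = sum(attr for _, attr in train) / len(train)
baseline_mae = mean_abs_error([train_mean] * len(test), test_targets)

print(f"model MAE:    {model_mae:.3f}")
print(f"baseline MAE: {baseline_mae:.3f}")
```

Because the synthetic attribute is linear in the vector by construction, the regression beats the baseline easily here; the toy only illustrates the setup and says nothing about how well real distributional vectors encode real attributes.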

@InProceedings{gupta-EtAl:2015:EMNLP,
  author    = {Gupta, Abhijeet  and  Boleda, Gemma  and  
               Baroni, Marco  and  Pad\'{o}, Sebastian},
  title     = {Distributional vectors encode referential attributes},
  booktitle = {Proceedings of the 2015 Conference on Empirical 
               Methods in Natural Language Processing},
  year      = {2015},
  address   = {Lisbon, Portugal},
  pages     = {12--21}
}

Sebastian Padó