M Faruqui and S Padó. Towards a model of formal and informal address in English. Proceedings of EACL 2012, Avignon.

Note: Data available here.

Informal and formal (T/V) address in dialogue is not distinguished overtly in modern English, whereas many other languages mark it explicitly, e.g. through pronoun choice as in French (tu/vous). Our study investigates the status of the T/V distinction in English literary texts. Our main findings are: (a) human raters can label monolingual English utterances as T or V fairly well, given sufficient context; (b) a bilingual corpus can be exploited to induce a supervised classifier for T/V without human annotation, which assigns T/V at sentence level with up to 68% accuracy, relying mainly on lexical features; (c) there is a marked asymmetry between lexical features for formal speech (which are conventionalized and therefore general) and informal speech (which are text-specific).

  author    = {Manaal Faruqui and Sebastian Pad\'{o}},
  title     = {Towards a model of formal and informal address in English},
  booktitle = {Proceedings of EACL-2012},
  address   = {Avignon, France},
  year      = {2012},
  pages     = {623--633}