C. Scheible, R. Klinger, and S. Padó (2016). Model Architectures for Quotation Detection. Proceedings of ACL. Berlin, Germany.


Quotation detection is the task of locating spans of quoted speech in text. The state of the art treats this problem as a sequence labeling task and employs linear-chain conditional random fields. We question the efficacy of this choice: the Markov assumption in the model prevents it from making joint decisions about the beginning, end, and internal context of a quotation. We perform an extensive analysis with two new model architectures. We find that (a) simple boundary classification combined with a greedy prediction strategy is competitive with the state of the art, and (b) a semi-Markov model significantly outperforms all others by relaxing the Markov assumption.


@InProceedings{scheible16:_model_archit_quotat_detec,
  author    = {Christian Scheible and Roman Klinger and Sebastian Padó},
  title     = {Model Architectures for Quotation Detection},
  booktitle = {Proceedings of ACL},
  year      = {2016},
  address   = {Berlin, Germany},
  pages     = {1736--1745},
}