M. Faruqui and S. Padó. Training and Evaluating a German Named Entity Recognizer with Semantic Generalization. Proceedings of KONVENS 2010, Saarbrücken, Germany.


We present a freely available optimized Named Entity Recognizer (NER) for German. It alleviates the small size of available NER training corpora for German with distributional generalization features trained on large unlabelled corpora. We vary the size and source of the generalization corpus and find improvements of 6% F1-score (in-domain) and 9% (out-of-domain) over simple supervised training.


The software and data are available by following this link.


@InProceedings{faruqui10:_training,
  author =       {Manaal Faruqui and Sebastian Pad\'{o}},
  title =        {Training and Evaluating a German Named Entity Recognizer 
                  with Semantic Generalization},
  booktitle =    {Proceedings of KONVENS 2010},
  year =         2010,
  address =      {Saarbr\"ucken, Germany}}