Computational Linguistics Software

I maintain and quasi-maintain some NLP software packages.


The software on this page and its subpages is made available under the GNU GPL. By downloading the software, you acknowledge the terms and conditions of the GPL. If you use the software, please cite the papers indicated on the respective pages.


  • 2010. German Named Entity Recognition. German classifiers for the Stanford CRF-based NER system (optimized in April 2010) and manually annotated EUROPARL data as an out-of-domain test set. Follow this link.
  • 2007. DependencyVectors. A software package to produce vector space models from dependency-parsed corpora; available from this page.
  • 2006. Shalmaneser. A role-semantic parser, pre-trained for English and German in the FrameNet paradigm. Stored externally -- please follow the link to
  • 2006. Significance testing. A small Java implementation of approximate randomization to test the significance of differences in evaluation statistics. It is applicable if the sufficient statistics of the metric can be computed for small, (approximately) independent strata of the data. This is possible, for example, for accuracies or F-scores, but not for correlations; in that case, use the bootstrap instead. Implementation available from this page.
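
To illustrate the idea behind approximate randomization, here is a minimal Python sketch. It is not the Java implementation offered on this page, and every name in it (approximate_randomization, f_score, total, rounds, seed) is hypothetical. It assumes that each stratum (for example, a sentence) contributes a tuple of sufficient statistics, here (true positives, false positives, false negatives), for each of the two systems being compared. In each round, the two systems' statistics are swapped within each stratum with probability 0.5, and the metric difference is recomputed from the summed statistics.

```python
import random


def f_score(stats):
    """F1 computed from summed (tp, fp, fn) counts."""
    tp, fp, fn = stats
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def total(strata):
    """Sum the per-stratum sufficient statistics component-wise."""
    return tuple(sum(column) for column in zip(*strata))


def approximate_randomization(stats_a, stats_b, metric, rounds=10000, seed=0):
    """Estimate a two-sided p-value for the difference in `metric`.

    stats_a and stats_b are equally long lists of per-stratum
    sufficient statistics for systems A and B (assumed input format).
    """
    rng = random.Random(seed)
    observed = abs(metric(total(stats_a)) - metric(total(stats_b)))
    at_least_as_large = 0
    for _ in range(rounds):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(stats_a, stats_b):
            # Under the null hypothesis the system labels are exchangeable,
            # so swap the two outputs for this stratum half of the time.
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        diff = abs(metric(total(shuffled_a)) - metric(total(shuffled_b)))
        if diff >= observed:
            at_least_as_large += 1
    # Add-one smoothing keeps the estimate away from an impossible p = 0.
    return (at_least_as_large + 1) / (rounds + 1)
```

A call such as approximate_randomization(stats_a, stats_b, f_score) returns the estimated p-value. The per-stratum swap is what requires the sufficient statistics to be computable for each stratum separately, which is why the approach fits accuracies and F-scores but not correlations.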