CRFsuite

A fast implementation of Conditional Random Fields (CRFs)

Introduction

CRFSuite is an implementation of Conditional Random Fields (CRFs) [Lafferty01, Sha03] for labeling sequential data. Among the various implementations of CRFs, this software provides following features.

  • Speed-oriented implementation written in pure C. The first priority of this software is to train and use CRF models as fast as possible even at the expense of its memory space and code generality. CRFsuite is 5.4 - 61.8 times faster than C++ implementations in training on the CoNLL 2000 chunking shared task. See the benchmark result for more information.
  • Simple data I/O format. Users can design a large number of arbitrary state features. Edge features are generated automatically from the set of labels in the training data.
  • Linear-chain (first-order Markovian) CRF.
  • Fast parameter estimation using Limited-memory BFGS (L-BFGS) method[Nocedal80], which is reported to outperform other algorithms such as Generalized Iterative Scaling (GIS) [Malouf02]. CRFsuite employs libLBFGS for implementing L-BFGS and OW-LQN methods.
  • L1 regularization (Laplacian prior) using Orthant-Wise Limited-memory Quasi-Newton (OW-LQN) method [Andrew07]. This is useful to obtain sparse models in which a number of ineffective features are removed from models (with zero weights).
  • L2 regularization (Gaussian prior). L2 regularization is reported to achieve better accuracy than L1, but the obtained model will be large.
  • Forward/backward algorithm using the scaling method[Rabiner90]. The scaling method seems faster than computing the forward/backward scores in logarithm domain.
  • An efficient file format for storing/accessing CRF models using Constant Quark Database (CQDB). It takes a little time to start up a tagger since a preparation is done only by reading an entire model file to a memory block. Retriving the weight of a feature is also very quick.

For more information about CRFsuite, please refer to these pages.

Download

The current release is CRFsuite version 0.5.

CRFsuite is distributed under the modified BSD license.

Please use the following BibTex entry when you cite CRFsuite in your papers.

@misc{CRFsuite,
	author = {Naoaki Okazaki},
	title = {CRFsuite: a fast implementation of Conditional Random Fields (CRFs)},
	url = {http://www.chokkan.org/software/crfsuite/},
	year = {2007}
}

Change log

CRFsuite 0.5 (2008-11-19)
  • Updated the L-BFGS routine to liblbfgs 1.6.
  • New parameters lbfgs.stop, lbfgs.delta, and lbfgs.linesearch were added.
  • Fixed a bug in which the frontend tools could not parse "item:value" format correctly.
  • Fixed a bug in computing the accuracy.
  • Fixed a bug when the tagger receives an item with no feature.
CRFsuite 0.4 (2008-02-05)
  • Website and documentation for CRFsuite.
  • Tutorial on the CoNLL 2000 chunking shared task.
  • Performance comparison on the CoNLL 2000 chunking shared task.
  • Bug fix in L2 regularization.
  • A number of small improvements for the public release.
CRFsuite 0.3 (2007-12-12)
  • Implemented scaling method for forward/backward algorithm.
  • Removed the code for computing the forward/backward algorithm in logarithm domain.
CRFsuite 0.2 (2007-11-30)
  • Orthant-Wise Limited-memory Quasi-Newton (OW-LQN) method for L1 regularization.
  • Configurable L-BFGS parameters (number of limited memories, epsilon).
CRFsuite 0.1 (2007-10-29)
  • Initial release.

References

[Andrew07] Galen Andrew and Jianfeng Gao. Scalable training of L1-regularized log-linear models”. Proceedings of the 24th International Conference on Machine Learning (ICML 2007). 33-40. 2007.

[Lafferty01] John Lafferty, Andrew McCallum, and Fernando Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data”. Proceedings of the 18th International Conference on Machine Learning. 282-289. 2001.

[Malouf02] Robert Malouf. A comparison of algorithms for maximum entropy parameter estimation”. Proceedings of the 6th conference on Natural language learning (CoNLL-2002). 49-55. 2002.

[Nocedal80] Jorge Nocedal. “Updating Quasi-Newton Matrices with Limited Storage”. Mathematics of Computation. 35. 151. 773-782. 1980.

[Rabiner90] Lawrence R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition”. Readings in speech recognition. 267-296. 1990. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

[Sha03] Fei Sha and Fernando Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data”. NAACL '03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology. 134-141. 2003.