CRFsuite is an implementation of Conditional Random Fields (CRFs) [Lafferty01, Sha03] for labeling sequential data. Among the various implementations of CRFs, this software provides the following features.
- Speed-oriented implementation written in pure C. The first priority of this software is to train and use CRF models as fast as possible, even at the expense of memory usage and code generality. CRFsuite is 2.2 to 56.4 times faster than C++ implementations in training on the CoNLL 2000 chunking shared task. See the benchmark result for more information.
- Very fast parameter estimation using Stochastic Gradient Descent (SGD). The iterative algorithm is based on Pegasos [Shalev-Shwartz07]. Calibration of the learning rate is inspired by sgd.
- Fast parameter estimation using the Limited-memory BFGS (L-BFGS) method [Nocedal80], which is reported to outperform other algorithms such as Generalized Iterative Scaling (GIS) [Malouf02]. CRFsuite employs libLBFGS for implementing the L-BFGS and OWL-QN methods.
- L1 regularization (Laplacian prior) using the Orthant-Wise Limited-memory Quasi-Newton (OWL-QN) method [Andrew07]. This is useful for obtaining sparse models, in which many ineffective features receive zero weights and are removed from the model.
- L2 regularization (Gaussian prior). L2 regularization is reported to achieve better accuracy than L1, but the obtained model will be large.
- Forward/backward algorithm using the scaling method [Rabiner90]. The scaling method seems to be faster than computing the forward/backward scores in the logarithm domain.
- Linear-chain (first-order Markovian) CRF.
- Performance evaluation during training. CRFsuite can output the precision, recall, and F1 scores of the model evaluated on test data.
- Simple data I/O format. Users can design a large number of arbitrary state features. Edge features are generated automatically from the set of labels in the training data.
- An efficient file format for storing/accessing CRF models using Constant Quark Database (CQDB). It takes little time to start up a tagger, since preparation consists only of reading an entire model file into a memory block. Retrieving the weight of a feature is also very quick.
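The scaling method mentioned in the feature list can be illustrated with a small sketch. This is not CRFsuite's C code; it is a minimal Python illustration, assuming strictly positive potentials (for example, exponentiated feature scores). The names `forward_scaled` and `brute_force_log_z` are hypothetical helpers introduced here. At each position the forward scores are divided by their sum, and the log partition function is recovered as the sum of the logarithms of those scaling coefficients, so logarithms are taken once per position rather than once per label pair.

```python
import math
from itertools import product


def forward_scaled(state, trans):
    """Forward pass of a linear-chain model with per-position scaling.

    state[t][j]: positive potential of label j at position t (assumed input).
    trans[i][j]: positive potential of moving from label i to label j.
    Returns (scaled forward scores, scaling coefficients, log partition).
    """
    num_labels = len(state[0])
    alphas, scales = [], []
    prev = None
    for t, potentials in enumerate(state):
        if t == 0:
            row = list(potentials)
        else:
            row = [
                potentials[j]
                * sum(prev[i] * trans[i][j] for i in range(num_labels))
                for j in range(num_labels)
            ]
        c = sum(row)                 # scaling coefficient for position t
        row = [a / c for a in row]   # each scaled row sums to 1
        alphas.append(row)
        scales.append(c)
        prev = row
    # The partition function is the product of the scaling coefficients.
    log_z = sum(math.log(c) for c in scales)
    return alphas, scales, log_z


def brute_force_log_z(state, trans):
    """Reference value: enumerate every label sequence (tiny inputs only)."""
    num_labels = len(state[0])
    total = 0.0
    for path in product(range(num_labels), repeat=len(state)):
        score = state[0][path[0]]
        for t in range(1, len(state)):
            score *= trans[path[t - 1]][path[t]] * state[t][path[t]]
        total += score
    return math.log(total)


if __name__ == "__main__":
    # Toy potentials for 3 positions and 2 labels (illustrative values).
    state = [[1.0, 2.0], [0.5, 3.0], [2.0, 1.0]]
    trans = [[1.0, 0.5], [2.0, 1.0]]
    _, _, log_z = forward_scaled(state, trans)
    print(log_z, brute_force_log_z(state, trans))
```

Because every scaled row sums to one, the values stay in a safe numeric range even for long sequences, which is the same goal the logarithm-domain computation serves.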
For more information about CRFsuite, please refer to these pages.
The current release is CRFsuite version 0.10.
- Source package (the source package requires libLBFGS 1.8 or later)
- Win32 binary (this binary requires Microsoft Visual C++ 2008 SP1 Redistributable Package to be installed on your computers)
- Linux 32bit binary
- Linux 64bit binary
CRFsuite is distributed under the modified BSD license.
Please use the following BibTeX entry when you cite CRFsuite in your papers.
@misc{CRFsuite,
author = {Naoaki Okazaki},
title = {CRFsuite: a fast implementation of Conditional Random Fields (CRFs)},
url = {http://www.chokkan.org/software/crfsuite/},
year = {2007}
}
- CRFsuite 0.10 (2010-01-29)
  - Applied a patch submitted by Hiroshi Manabe (Kodensha Co., Ltd.) that fixes memory leak problems in the tagger.
  - Added a new option -r (--reference) for the tagger to output reference labels in parallel with predicted labels.
- CRFsuite 0.9 (2009-09-24)
  - Fixed a build problem with liblbfgs 1.8.
  - Linux binaries for x86 32bit and 64bit architectures.
- CRFsuite 0.8 (2009-03-17)
  - Revised the format of model files; new model files are portable across CPUs with different byte orders, e.g., x86 (little endian) and SPARC (big endian). For example, one can train a CRF model on an x86 machine and use the obtained model transparently on different CPUs (e.g., SPARC or PowerPC). Note that this fix breaks the compatibility of model files; CRFsuite 0.8 cannot read model files generated by CRFsuite 0.7 or earlier.
  - Fixed a crash problem in tagging on some machine architectures.
- CRFsuite 0.7 (2009-03-10)
  - Updated RumAVL library to version 4.0.0. This fixes a crash problem occurring in feature generation on some architectures.
- CRFsuite 0.6 (2009-03-07)
  - A new training algorithm, Stochastic Gradient Descent (SGD), for maximizing L2-regularized log-likelihood. Add the "-p algorithm=sgd" option to use SGD for training. The iterative algorithm is based on Pegasos [Shalev-Shwartz07]. Calibration of the learning rate is inspired by sgd.
  - Updated the benchmark page with sgd and MALLET.
  - Updated the L-BFGS routine to liblbfgs 1.7.
  - Reduced memory usage in training.
  - Supported escape sequences in training/test data; "\:" and "\\" represent ':' and '\', respectively. The conversion script for CoNLL-2000, to_crfsuite.py, was also updated. Please regenerate the training and test sets by using the latest conversion script.
  - Restructured the source code so that training algorithms can easily be added in the future.
  - Added a parameter to configure the number of trials for line-search algorithms.
- CRFsuite 0.5 (2008-11-19)
  - Updated the L-BFGS routine to liblbfgs 1.6.
  - New parameters lbfgs.stop, lbfgs.delta, and lbfgs.linesearch were added.
  - Fixed a bug in which the frontend tools could not parse "item:value" format correctly.
  - Fixed a bug in computing the accuracy.
  - Fixed a bug when the tagger receives an item with no feature.
- CRFsuite 0.4 (2008-02-05)
  - Website and documentation for CRFsuite.
  - Tutorial on the CoNLL 2000 chunking shared task.
  - Performance comparison on the CoNLL 2000 chunking shared task.
  - Bug fix in L2 regularization.
  - A number of small improvements for the public release.
- CRFsuite 0.3 (2007-12-12)
  - Implemented scaling method for forward/backward algorithm.
  - Removed the code for computing the forward/backward algorithm in logarithm domain.
- CRFsuite 0.2 (2007-11-30)
  - Orthant-Wise Limited-memory Quasi-Newton (OWL-QN) method for L1 regularization.
  - Configurable L-BFGS parameters (number of limited memories, epsilon).
- CRFsuite 0.1 (2007-10-29)
  - Initial release.
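The escape sequences introduced in version 0.6 interact with the "item:value" format, and a short sketch may make the rule concrete. The following is not the code of to_crfsuite.py or of the CRFsuite frontend; `escape` and `parse_item` are hypothetical helpers, and the default value of 1.0 for an item without an explicit value is an assumption made for this illustration.

```python
def escape(name):
    # Escape backslashes first, then colons, so that the name can be
    # written safely in front of an optional ":value" suffix.
    return name.replace("\\", "\\\\").replace(":", "\\:")


def parse_item(field, default_value=1.0):
    # Split one field of the form "name" or "name:value".
    # A backslash followed by ':' or '\' yields that character literally;
    # the first unescaped ':' separates the name from its value.
    # default_value is an assumption for items that carry no value.
    chars = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field) and field[i + 1] in ":\\":
            chars.append(field[i + 1])
            i += 2
        elif ch == ":":
            # Raises ValueError if the text after ':' is not a number.
            return "".join(chars), float(field[i + 1:])
        else:
            chars.append(ch)
            i += 1
    return "".join(chars), default_value


if __name__ == "__main__":
    raw_name = "w[0]=http://example"
    field = escape(raw_name) + ":0.5"
    print(field)
    print(parse_item(field))
```

Escaping backslashes before colons matters: doing it in the other order would double the backslashes that were just inserted in front of each colon.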
[Andrew07] “Scalable training of L1-regularized log-linear models”. Proceedings of the 24th International Conference on Machine Learning (ICML 2007). 33-40. 2007.
[Lafferty01] “Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data”. Proceedings of the 18th International Conference on Machine Learning. 282-289. 2001.
[Malouf02] “A comparison of algorithms for maximum entropy parameter estimation”. Proceedings of the 6th conference on Natural language learning (CoNLL-2002). 49-55. 2002.
[Nocedal80] “Updating Quasi-Newton Matrices with Limited Storage”. Mathematics of Computation. 151. 773-782. 1980.
[Rabiner90] “A tutorial on hidden Markov models and selected applications in speech recognition”. Readings in speech recognition. 267-296. 1990. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[Sha03] “Shallow parsing with conditional random fields”. NAACL '03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology. 134-141. 2003.
[Shalev-Shwartz07] “Pegasos: Primal Estimated sub-GrAdient SOlver for SVM”. Proceedings of the 24th International Conference on Machine Learning (ICML 2007). 807-814. 2007.