CRFsuite - Tutorial

Tutorial on CoNLL 2000 chunking task

Training and testing data

This tutorial demonstrates the use of CRFsuite for text chunking, the task of dividing a text into syntactically correlated groups of words. We use the training and testing data distributed by the CoNLL 2000 shared task. The scripts needed for this tutorial are included under the example/CoNLL2000 directory in the CRFsuite distribution. First, change to the example directory and download the training and testing data from the shared task website:

$ cd example/CoNLL2000/
$ wget http://www.cnts.ua.ac.be/conll2000/chunking/train.txt.gz
$ wget http://www.cnts.ua.ac.be/conll2000/chunking/test.txt.gz
$ less train.txt.gz
... (snip) ...

London JJ B-NP
shares NNS I-NP
closed VBD B-VP
moderately RB B-ADVP
lower JJR I-ADVP
in IN B-PP
thin JJ B-NP
trading NN I-NP
. . O

At IN B-PP
Tokyo NNP B-NP
, , O
the DT B-NP
Nikkei NNP I-NP
index NN I-NP
of IN B-PP
225 CD B-NP
selected VBN I-NP
issues NNS I-NP
was VBD B-VP
up IN B-ADVP
112.16 CD B-NP
points NNS I-NP
to TO B-PP
35486.38 CD B-NP
. . O

... (snip) ...

The data consists of a set of sentences (sequences), with one word per line and an empty line between sentences. Each line contains a word (e.g., 'London', 'shares'), its part-of-speech tag (e.g., 'JJ', 'NNS'), and its chunk label (e.g., 'B-NP', 'I-NP'), separated by space characters. In this tutorial, we construct a CRF model that assigns a sequence of chunk labels to a given sequence of words and part-of-speech tags. Please refer to the CoNLL 2000 shared task website for more information about this task.
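
The format described above can be read with a few lines of Python. The following is only a minimal sketch, not part of the CRFsuite distribution: the function names read_conll and read_conll_gz are hypothetical, and the sketch assumes every non-empty line has exactly three whitespace-separated fields.

```python
import gzip
import io

def read_conll(lines):
    """Yield sentences as lists of (token, pos, chunk) tuples.

    `lines` is any iterable of text lines; an empty line ends a sentence.
    """
    sentence = []
    for line in lines:
        line = line.strip()
        if not line:
            if sentence:
                yield sentence
                sentence = []
            continue
        # Assumption: exactly three fields per line (word, POS tag, chunk label).
        token, pos, chunk = line.split()
        sentence.append((token, pos, chunk))
    if sentence:
        yield sentence

def read_conll_gz(path):
    """Read a gzip-compressed file such as train.txt.gz."""
    with gzip.open(path, 'rt') as f:
        yield from read_conll(f)

# A small excerpt of the data shown above.
sample = """London JJ B-NP
shares NNS I-NP
closed VBD B-VP

At IN B-PP
Tokyo NNP B-NP
"""
sentences = list(read_conll(io.StringIO(sample)))
print(len(sentences), sentences[0][0])
```

With the downloaded data in place, list(read_conll_gz('train.txt.gz')) would return all training sentences in the same structure.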

Feature (attribute) generation

The next step is to preprocess the training and testing data to extract attributes that express the characteristics of words (items) in the data. In general, this is the most important step in machine-learning applications because it strongly affects the labeling accuracy. In this tutorial, we express a word at position t (an offset from the beginning of a sequence) with the following 19 kinds of attributes:

  • ${x[t-2].token}, ${x[t-1].token}, ${x[t].token}, ${x[t+1].token}, ${x[t+2].token}
  • ${x[t-1].token}/${x[t].token}, ${x[t].token}/${x[t+1].token}
  • ${x[t-2].pos}, ${x[t-1].pos}, ${x[t].pos}, ${x[t+1].pos}, ${x[t+2].pos}
  • ${x[t-2].pos}/${x[t-1].pos}, ${x[t-1].pos}/${x[t].pos}, ${x[t].pos}/${x[t+1].pos}, ${x[t+1].pos}/${x[t+2].pos}
  • ${x[t-2].pos}/${x[t-1].pos}/${x[t].pos}, ${x[t-1].pos}/${x[t].pos}/${x[t+1].pos}, ${x[t].pos}/${x[t+1].pos}/${x[t+2].pos}

In this list, ${x[t].token} and ${x[t].pos} represent the word and part-of-speech tag, respectively, at position t in a sequence. These attributes express the characteristics of the word at position t by using information from surrounding words, e.g., ${x[t-1].token} and ${x[t+1].pos}. This attribute set is compatible with the feature template for CoNLL 2000 in the CRF++ distribution.

It is easy to convert the training/testing data into CRFsuite data. The CRFsuite distribution includes a Python script, to_crfsuite.py, that generates attributes from the CoNLL 2000 data. The commands below convert train.txt.gz and test.txt.gz into train.crfsuite.txt and test.crfsuite.txt, which are in the CRFsuite data format.

$ zcat train.txt.gz | ./to_crfsuite.py > train.crfsuite.txt
$ zcat test.txt.gz | ./to_crfsuite.py > test.crfsuite.txt
$ less train.crfsuite.txt
... (snip) ...

B-NP    U00=    U01=    U02=London      U03=shares      U04=closed      U05=/Lon
don     U06=London/shares       U10=    U11=    U12=JJ  U13=NNS U14=VBD U15=/   
U16=/JJ U17=JJ/NNS      U18=NNS/VBD     U20=//JJ        U21=/JJ/NNS     U22=JJ/N
NS/VBD
I-NP    U00=    U01=London      U02=shares      U03=closed      U04=moderately  
U05=London/shares       U06=shares/closed       U10=    U11=JJ  U12=NNS U13=VBD 
U14=RB  U15=/JJ U16=JJ/NNS      U17=NNS/VBD     U18=VBD/RB      U20=/JJ/NNS     
U21=JJ/NNS/VBD  U22=NNS/VBD/RB
B-VP    U00=London      U01=shares      U02=closed      U03=moderately  U04=lowe
r       U05=shares/closed       U06=closed/moderately   U10=JJ  U11=NNS U12=VBD 
U13=RB  U14=JJR U15=JJ/NNS      U16=NNS/VBD     U17=VBD/RB      U18=RB/JJR      
U20=JJ/NNS/VBD  U21=NNS/VBD/RB  U22=VBD/RB/JJR
B-ADVP  U00=shares      U01=closed      U02=moderately  U03=lower       U04=in  
U05=closed/moderately   U06=moderately/lower    U10=NNS U11=VBD U12=RB  U13=JJR 
U14=IN  U15=NNS/VBD     U16=VBD/RB      U17=RB/JJR      U18=JJR/IN      U20=NNS/
VBD/RB  U21=VBD/RB/JJR  U22=RB/JJR/IN
I-ADVP  U00=closed      U01=moderately  U02=lower       U03=in  U04=thin        
U05=moderately/lower    U06=lower/in    U10=VBD U11=RB  U12=JJR U13=IN  U14=JJ  
U15=VBD/RB      U16=RB/JJR      U17=JJR/IN      U18=IN/JJ       U20=VBD/RB/JJR  
U21=RB/JJR/IN   U22=JJR/IN/JJ
B-PP    U00=moderately  U01=lower       U02=in  U03=thin        U04=trading     
U05=lower/in    U06=in/thin     U10=RB  U11=JJR U12=IN  U13=JJ  U14=NN  U15=RB/J
JR      U16=JJR/IN      U17=IN/JJ       U18=JJ/NN       U20=RB/JJR/IN   U21=JJR/
IN/JJ   U22=IN/JJ/NN
B-NP    U00=lower       U01=in  U02=thin        U03=trading     U04=.   U05=in/t
hin     U06=thin/trading        U10=JJR U11=IN  U12=JJ  U13=NN  U14=.   U15=JJR/
IN      U16=IN/JJ       U17=JJ/NN       U18=NN/.        U20=JJR/IN/JJ   U21=IN/J
J/NN    U22=JJ/NN/.
I-NP    U00=in  U01=thin        U02=trading     U03=.   U04=    U05=thin/trading
        U06=trading/.   U10=IN  U11=JJ  U12=NN  U13=.   U14=    U15=IN/JJ       
U16=JJ/NN       U17=NN/.        U18=./  U20=IN/JJ/NN    U21=JJ/NN/.     U22=NN/.
/
O       U00=thin        U01=trading     U02=.   U03=    U04=    U05=trading/.   
U06=./  U10=JJ  U11=NN  U12=.   U13=    U14=    U15=JJ/NN       U16=NN/.        
U17=./  U18=/   U20=JJ/NN/.     U21=NN/./       U22=.//

B-PP    U00=    U01=    U02=At  U03=Tokyo       U04=,   U05=/At U06=At/Tokyo    
U10=    U11=    U12=IN  U13=NNP U14=,   U15=/   U16=/IN U17=IN/NNP      U18=NNP/
... (snip) ...

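Note that "U00=", "U01=", ... are prefixes that prevent name collisions between different kinds of attributes. For example, U01=London (the previous word is 'London') and U02=London (the current word is 'London') are distinct attributes.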
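
To make the mapping from the 19 attribute templates to the "U00=" ... "U22=" fields concrete, here is a minimal Python sketch that reproduces the lines shown above for the first sentence. It is not the to_crfsuite.py script from the distribution: the function names attributes and to_crfsuite_line are hypothetical, and the sketch assumes that positions outside the sentence yield empty strings and that fields are separated by tab characters, as the sample output suggests.

```python
def attributes(seq, t):
    """Return the 19 attribute strings for position t of seq.

    seq is a list of (token, pos, chunk) tuples.
    """
    def tok(i):
        # Assumption: out-of-range positions give an empty string.
        return seq[i][0] if 0 <= i < len(seq) else ''

    def pos(i):
        return seq[i][1] if 0 <= i < len(seq) else ''

    attrs = []
    # U00..U04: words at offsets -2..+2
    for k, d in enumerate(range(-2, 3)):
        attrs.append('U%02d=%s' % (k, tok(t + d)))
    # U05, U06: word bigrams (previous/current, current/next)
    attrs.append('U05=%s/%s' % (tok(t - 1), tok(t)))
    attrs.append('U06=%s/%s' % (tok(t), tok(t + 1)))
    # U10..U14: part-of-speech tags at offsets -2..+2
    for k, d in enumerate(range(-2, 3)):
        attrs.append('U%02d=%s' % (10 + k, pos(t + d)))
    # U15..U18: part-of-speech bigrams starting at offsets -2..+1
    for k, d in enumerate(range(-2, 2)):
        attrs.append('U%02d=%s/%s' % (15 + k, pos(t + d), pos(t + d + 1)))
    # U20..U22: part-of-speech trigrams starting at offsets -2..0
    for k, d in enumerate(range(-2, 1)):
        attrs.append('U%02d=%s/%s/%s'
                     % (20 + k, pos(t + d), pos(t + d + 1), pos(t + d + 2)))
    return attrs

def to_crfsuite_line(seq, t):
    """Format one item: the chunk label followed by its attributes."""
    return '\t'.join([seq[t][2]] + attributes(seq, t))

sentence = [
    ('London', 'JJ', 'B-NP'), ('shares', 'NNS', 'I-NP'),
    ('closed', 'VBD', 'B-VP'), ('moderately', 'RB', 'B-ADVP'),
    ('lower', 'JJR', 'I-ADVP'), ('in', 'IN', 'B-PP'),
    ('thin', 'JJ', 'B-NP'), ('trading', 'NN', 'I-NP'), ('.', '.', 'O'),
]
for t in range(len(sentence)):
    print(to_crfsuite_line(sentence, t))
print()  # an empty line ends the sequence
```

Adding or removing a template here changes the attribute set that CRFsuite sees, which is the usual way to experiment with features for this task.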

Training

Now we are ready to use CRFsuite for training. Type the following command to train a CRF model from train.crfsuite.txt. CRFsuite reads the training data, generates the necessary state and transition features from the data, maximizes the log-likelihood of the conditional probability distribution, and stores the model in CoNLL2000.model.

$ crfsuite learn -m CoNLL2000.model train.crfsuite.txt
CRFsuite 0.6  Copyright (c) 2007-2009 Naoaki Okazaki

Start time of the training: 2009-03-07T15:50:07Z

Reading the training data
0....1....2....3....4....5....6....7....8....9....10
Number of instances: 8936
Total number of items: 211727
Number of attributes: 338547
Number of labels: 22
Seconds required: 5.410

Training first-order linear-chain CRFs (trainer.crf1m)

Feature generation
feature.minfreq: 0.000000
feature.possible_states: 0
feature.possible_transitions: 0
feature.bos_eos: 1
0....1....2....3....4....5....6....7....8....9....10
Number of features: 456480
Seconds required: 1.900

L-BFGS optimization
regularization: L2
regularization.sigma: 10.000000
lbfgs.num_memories: 6
lbfgs.max_iterations: 2147483647
lbfgs.epsilon: 0.000010
lbfgs.stop: 10
lbfgs.delta: 0.000010
lbfgs.linesearch: MoreThuente
lbfgs.linesearch.max_iterations: 20

***** Iteration #1 *****
Log-likelihood: -264449.110672
Feature norm: 5.000000
Error norm: 42832.056705
Active features: 456480
Line search trials: 2
Line search step: 0.000048
Seconds required for this iteration: 6.310

***** Iteration #2 *****
Log-likelihood: -163057.244350
Feature norm: 8.506562
Error norm: 26117.210073
Active features: 456480
Line search trials: 1
Line search step: 1.000000
Seconds required for this iteration: 2.180

... (snip) ...

***** Iteration #89 *****
Log-likelihood: -704.485807
Feature norm: 331.586428
Error norm: 19.138697
Active features: 456480
Line search trials: 3
Line search step: 0.038164
Seconds required for this iteration: 6.450

L-BFGS terminated with error code (-1002)
Total seconds required for L-BFGS: 274.920

Storing the model
Number of active features: 456480 (456480)
Number of active attributes: 338547 (338547)
Number of active labels: 22 (22)
Writing labels
Writing attributes
Writing feature references for transitions
Writing feature references for attributes
Seconds required: 0.530

End time of the training: 2009-03-07T15:54:51Z

Although the training log ends with the message "L-BFGS terminated with error code (-1002)", you do not have to worry about it: the model is still stored, as the "Storing the model" part of the log shows.

You can also train a CRF model with the -t option to monitor its performance (accuracy, precision, recall, F1 score) on the test data after each iteration. It is exciting to watch the model improve as the training process advances!

$ crfsuite learn -m CoNLL2000.model -t test.crfsuite.txt train.crfsuite.txt
CRFsuite 0.6  Copyright (c) 2007-2009 Naoaki Okazaki

Start time of the training: 2009-03-07T15:58:21Z

Reading the training data
0....1....2....3....4....5....6....7....8....9....10
Number of instances: 8936
Total number of items: 211727
Number of attributes: 338547
Number of labels: 22
Seconds required: 5.370

Reading the evaluation data
0....1....2....3....4....5....6....7....8....9....10
Number of instances: 2012
Number of total items: 47377
Seconds required: 1.260

Training first-order linear-chain CRFs (trainer.crf1m)

Feature generation
feature.minfreq: 0.000000
feature.possible_states: 0
feature.possible_transitions: 0
feature.bos_eos: 1
0....1....2....3....4....5....6....7....8....9....10
Number of features: 456482
Seconds required: 1.920

L-BFGS optimization
regularization: L2
regularization.sigma: 10.000000
lbfgs.num_memories: 6
lbfgs.max_iterations: 2147483647
lbfgs.epsilon: 0.000010
lbfgs.stop: 10
lbfgs.delta: 0.000010
lbfgs.linesearch: MoreThuente
lbfgs.linesearch.max_iterations: 20

***** Iteration #1 *****
Log-likelihood: -268663.973857
Feature norm: 5.000000
Error norm: 43686.795219
Active features: 456482
Line search trials: 2
Line search step: 0.000048
Seconds required for this iteration: 6.580
Performance by label (#match, #model, #ref) (precision, recall, F1):
    B-NP: (8282, 10425, 12422) (0.7944, 0.6667, 0.7250)
    B-PP: (3842, 5775, 4811) (0.6653, 0.7986, 0.7259)
    I-NP: (14133, 27651, 14376) (0.5111, 0.9831, 0.6726)
    B-VP: (0, 0, 4658) (0.0000, 0.0000, 0.0000)
    I-VP: (0, 0, 2646) (0.0000, 0.0000, 0.0000)
    B-SBAR: (0, 0, 535) (0.0000, 0.0000, 0.0000)
    O: (3483, 3526, 6180) (0.9878, 0.5636, 0.7177)
    B-ADJP: (0, 0, 438) (0.0000, 0.0000, 0.0000)
    B-ADVP: (0, 0, 866) (0.0000, 0.0000, 0.0000)
    I-ADVP: (0, 0, 89) (0.0000, 0.0000, 0.0000)
    I-ADJP: (0, 0, 167) (0.0000, 0.0000, 0.0000)
    I-SBAR: (0, 0, 4) (0.0000, 0.0000, 0.0000)
    I-PP: (0, 0, 48) (0.0000, 0.0000, 0.0000)
    B-PRT: (0, 0, 106) (0.0000, 0.0000, 0.0000)
    B-LST: (0, 0, 5) (0.0000, 0.0000, 0.0000)
    B-INTJ: (0, 0, 2) (0.0000, 0.0000, 0.0000)
    I-INTJ: (0, 0, 0) (******, ******, ******)
    B-CONJP: (0, 0, 9) (0.0000, 0.0000, 0.0000)
    I-CONJP: (0, 0, 13) (0.0000, 0.0000, 0.0000)
    I-PRT: (0, 0, 0) (******, ******, ******)
    B-UCP: (0, 0, 0) (******, ******, ******)
    I-UCP: (0, 0, 0) (******, ******, ******)
    I-LST: (0, 0, 2) (0.0000, 0.0000, 0.0000)
Macro-average precision, recall, F1: (0.128637, 0.130956, 0.123527)
Item accuracy: 29740 / 47377 (0.6277)
Instance accuracy: 37 / 2012 (0.0184)

... (snip) ...

***** Iteration #82 *****
Log-likelihood: -826.646530
Feature norm: 366.823234
Error norm: 28.532190
Active features: 456482
Line search trials: 1
Line search step: 1.000000
Seconds required for this iteration: 2.480
Performance by label (#match, #model, #ref) (precision, recall, F1):
    B-NP: (12000, 12403, 12422) (0.9675, 0.9660, 0.9668)
    B-PP: (4699, 4854, 4811) (0.9681, 0.9767, 0.9724)
    I-NP: (13931, 14444, 14376) (0.9645, 0.9690, 0.9668)
    B-VP: (4459, 4668, 4658) (0.9552, 0.9573, 0.9563)
    I-VP: (2526, 2656, 2646) (0.9511, 0.9546, 0.9528)
    B-SBAR: (452, 518, 535) (0.8726, 0.8449, 0.8585)
    O: (5941, 6149, 6180) (0.9662, 0.9613, 0.9637)
    B-ADJP: (313, 403, 438) (0.7767, 0.7146, 0.7444)
    B-ADVP: (702, 861, 866) (0.8153, 0.8106, 0.8130)
    I-ADVP: (49, 75, 89) (0.6533, 0.5506, 0.5976)
    I-ADJP: (110, 156, 167) (0.7051, 0.6587, 0.6811)
    I-SBAR: (2, 15, 4) (0.1333, 0.5000, 0.2105)
    I-PP: (34, 46, 48) (0.7391, 0.7083, 0.7234)
    B-PRT: (79, 103, 106) (0.7670, 0.7453, 0.7560)
    B-LST: (0, 0, 5) (0.0000, 0.0000, 0.0000)
    B-INTJ: (1, 2, 2) (0.5000, 0.5000, 0.5000)
    I-INTJ: (0, 0, 0) (******, ******, ******)
    B-CONJP: (5, 9, 9) (0.5556, 0.5556, 0.5556)
    I-CONJP: (10, 13, 13) (0.7692, 0.7692, 0.7692)
    I-PRT: (0, 0, 0) (******, ******, ******)
    B-UCP: (0, 0, 0) (******, ******, ******)
    I-UCP: (0, 2, 0) (******, ******, ******)
    I-LST: (0, 0, 2) (0.0000, 0.0000, 0.0000)
Macro-average precision, recall, F1: (0.567818, 0.571426, 0.564693)
Item accuracy: 45313 / 47377 (0.9564)
Instance accuracy: 1134 / 2012 (0.5636)

L-BFGS terminated with error code (-1002)
Total seconds required for L-BFGS: 271.120

Storing the model
Number of active features: 456482 (456482)
Number of active attributes: 338547 (390781)
Number of active labels: 23 (23)
Writing labels
Writing attributes
Writing feature references for transitions
Writing feature references for attributes
Seconds required: 0.530

End time of the training: 2009-03-07T16:03:02Z

This log reports that the CRF model obtained from the training data achieved 95.6% item accuracy on the test data (45313 of 47377 items at iteration #82).
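
The per-label figures in the log follow from the three counts printed next to each label: #match (items where the model and the reference agree on the label), #model (items the model assigned the label), and #ref (items carrying the label in the reference). The sketch below recomputes them for the B-NP line of iteration #82. The function name prf is hypothetical, and returning 0.0 for an empty denominator is a simplification that does not reproduce the "******" entries in the log.

```python
def prf(match, model, ref):
    """Return (precision, recall, F1) from the three counts in the log."""
    # Simplification: use 0.0 when a denominator is zero.
    precision = match / model if model else 0.0
    recall = match / ref if ref else 0.0
    total = model + ref
    # F1 is the harmonic mean of precision and recall: 2*match / (model + ref).
    f1 = 2.0 * match / total if total else 0.0
    return precision, recall, f1

# B-NP at iteration #82: (#match, #model, #ref) = (12000, 12403, 12422)
p, r, f = prf(12000, 12403, 12422)
print('%.4f %.4f %.4f' % (p, r, f))

# Item accuracy is the fraction of items with a correct label;
# instance accuracy is the fraction of sentences labeled entirely correctly.
item_accuracy = 45313 / 47377
instance_accuracy = 1134 / 2012
print('%.4f %.4f' % (item_accuracy, instance_accuracy))
```

The large gap between item accuracy and instance accuracy is expected: a sentence counts as correct only if every one of its items is labeled correctly.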

Tagging

You can apply the CRF model to the test data to tag it with chunk labels. Even though the test data distributed by the CoNLL 2000 shared task has chunk labels annotated (for evaluation purposes), CRFsuite ignores the existing labels and outputs the label sequences predicted by the model (one label per line, with sequences delimited by empty lines).

$ cat test.crfsuite.txt
B-NP    U00=    U01=    U02=Rockwell    U03=International       U04=Corp.
U05=/Rockwell   U06=Rockwell/International      U10=    U11=    U12=NNP U13=NNP
U14=NNP U15=/   U16=/NNP        U17=NNP/NNP     U18=NNP/NNP     U20=//NNP
U21=/NNP/NNP    U22=NNP/NNP/NNP
I-NP    U00=    U01=Rockwell    U02=International       U03=Corp.       U04='s
U05=Rockwell/International      U06=International/Corp. U10=    U11=NNP U12=NNP
U13=NNP U14=POS U15=/NNP        U16=NNP/NNP     U17=NNP/NNP     U18=NNP/POS
U20=/NNP/NNP    U21=NNP/NNP/NNP U22=NNP/NNP/POS
I-NP    U00=Rockwell    U01=International       U02=Corp.       U03='s  U04=Tuls
a       U05=International/Corp. U06=Corp./'s    U10=NNP U11=NNP U12=NNP U13=POS
U14=NNP U15=NNP/NNP     U16=NNP/NNP     U17=NNP/POS     U18=POS/NNP     U20=NNP/
NNP/NNP U21=NNP/NNP/POS U22=NNP/POS/NNP
B-NP    U00=International       U01=Corp.       U02='s  U03=Tulsa       U04=unit
        U05=Corp./'s    U06='s/Tulsa    U10=NNP U11=NNP U12=POS U13=NNP U14=NN
U15=NNP/NNP     U16=NNP/POS     U17=POS/NNP     U18=NNP/NN      U20=NNP/NNP/POS
U21=NNP/POS/NNP U22=POS/NNP/NN
I-NP    U00=Corp.       U01='s  U02=Tulsa       U03=unit        U04=said
U05='s/Tulsa    U06=Tulsa/unit  U10=NNP U11=POS U12=NNP U13=NN  U14=VBD U15=NNP/
POS     U16=POS/NNP     U17=NNP/NN      U18=NN/VBD      U20=NNP/POS/NNP U21=POS/
NNP/NN  U22=NNP/NN/VBD
I-NP    U00='s  U01=Tulsa       U02=unit        U03=said        U04=it  U05=Tuls
a/unit  U06=unit/said   U10=POS U11=NNP U12=NN  U13=VBD U14=PRP U15=POS/NNP
U16=NNP/NN      U17=NN/VBD      U18=VBD/PRP     U20=POS/NNP/NN  U21=NNP/NN/VBD
U22=NN/VBD/PRP
B-VP    U00=Tulsa       U01=unit        U02=said        U03=it  U04=signed
U05=unit/said   U06=said/it     U10=NNP U11=NN  U12=VBD U13=PRP U14=VBD U15=NNP/
NN      U16=NN/VBD      U17=VBD/PRP     U18=PRP/VBD     U20=NNP/NN/VBD  U21=NN/V
BD/PRP  U22=VBD/PRP/VBD
... (snip) ...

$ crfsuite tag -m CoNLL2000.model test.crfsuite.txt
B-NP
I-NP
I-NP
B-NP
I-NP
I-NP
B-VP
B-NP
B-VP
B-NP
I-NP
I-NP
B-VP
B-NP
I-NP
B-PP
B-NP
I-NP
B-VP
I-VP
B-NP
I-NP
B-PP
... (snip) ...
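
The tagger output is a flat list of labels, so a small amount of post-processing is needed to recover the chunks themselves. The following Python sketch reads label sequences separated by empty lines and groups B-/I- labels into spans. The function names are hypothetical, and treating an I-X label that does not continue a chunk of type X as the start of a new chunk is an assumption of this sketch, not something CRFsuite prescribes.

```python
def read_label_sequences(lines):
    """Yield one list of labels per sequence; empty lines separate sequences."""
    seq = []
    for line in lines:
        label = line.strip()
        if label:
            seq.append(label)
        elif seq:
            yield seq
            seq = []
    if seq:
        yield seq

def to_chunks(labels):
    """Return (type, start, end) spans with end exclusive."""
    chunks = []
    start, ctype = None, None
    for i, label in enumerate(labels):
        prefix, _, kind = label.partition('-')
        continues = prefix == 'I' and kind == ctype
        if not continues:
            if ctype is not None:
                chunks.append((ctype, start, i))  # close the open chunk
            if prefix in ('B', 'I'):
                start, ctype = i, kind            # open a new chunk
            else:
                start, ctype = None, None         # 'O' is outside any chunk
    if ctype is not None:
        chunks.append((ctype, start, len(labels)))
    return chunks

# The first seven labels of the output above, followed by a made-up
# second sequence to show the empty-line delimiter.
output = """B-NP
I-NP
I-NP
B-NP
I-NP
I-NP
B-VP

B-NP
O
"""
sequences = list(read_label_sequences(output.splitlines()))
print(to_chunks(sequences[0]))
```

For the first sequence this yields three chunks: two noun phrases covering items 0-2 and 3-5, and a verb phrase at item 6.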

CRFsuite can also evaluate the CRF model on labeled test data with the "-qt" options.

$ crfsuite tag -qt -m CoNLL2000.model test.crfsuite.txt
CRFsuite 0.6  Copyright (c) 2007-2009 Naoaki Okazaki

Performance by label (#match, #model, #ref) (precision, recall, F1):
    B-NP: (11997, 12400, 12422) (0.9675, 0.9658, 0.9666)
    B-PP: (4699, 4854, 4811) (0.9681, 0.9767, 0.9724)
    I-NP: (13931, 14444, 14376) (0.9645, 0.9690, 0.9668)
    B-VP: (4459, 4668, 4658) (0.9552, 0.9573, 0.9563)
    I-VP: (2526, 2656, 2646) (0.9511, 0.9546, 0.9528)
    B-SBAR: (452, 518, 535) (0.8726, 0.8449, 0.8585)
    O: (5941, 6149, 6180) (0.9662, 0.9613, 0.9637)
    B-ADJP: (313, 403, 438) (0.7767, 0.7146, 0.7444)
    B-ADVP: (702, 861, 866) (0.8153, 0.8106, 0.8130)
    I-ADVP: (49, 75, 89) (0.6533, 0.5506, 0.5976)
    I-ADJP: (110, 156, 167) (0.7051, 0.6587, 0.6811)
    I-SBAR: (2, 15, 4) (0.1333, 0.5000, 0.2105)
    I-PP: (34, 46, 48) (0.7391, 0.7083, 0.7234)
    B-PRT: (79, 103, 106) (0.7670, 0.7453, 0.7560)
    B-LST: (0, 0, 5) (0.0000, 0.0000, 0.0000)
    B-INTJ: (1, 2, 2) (0.5000, 0.5000, 0.5000)
    I-INTJ: (0, 0, 0) (******, ******, ******)
    B-CONJP: (5, 9, 9) (0.5556, 0.5556, 0.5556)
    I-CONJP: (10, 13, 13) (0.7692, 0.7692, 0.7692)
    I-PRT: (0, 0, 0) (******, ******, ******)
    B-UCP: (0, 1, 0) (******, ******, ******)
    I-UCP: (0, 4, 0) (******, ******, ******)
    I-LST: (0, 0, 2) (0.0000, 0.0000, 0.0000)
Macro-average precision, recall, F1: (0.567817, 0.571415, 0.564688)
Item accuracy: 45310 / 47377 (0.9564)
Instance accuracy: 1131 / 2012 (0.5621)
Elapsed time: 0.840000 [sec] (2395.2 [instance/sec])

Dumping the model file

When improving the accuracy of a CRF model by tweaking the feature set, it is useful to see the feature weights assigned by the trainer. You cannot read the model file directly because CRFsuite stores models in a binary format for efficiency. Instead, use the dump command to print a model in plain-text format.

$ crfsuite dump CoNLL2000.model
FILEHEADER = {
  magic: lCRF
  size: 28242501
  type: FOMC
  version: 100
  num_features: 0
  num_labels: 23
  num_attrs: 338547
  off_features: 0x30
  off_labels: 0x8B4EE4
  off_attrs: 0x8B5A0C
  off_labelrefs: 0x169C145
  off_attrrefs: 0x169C515
}

LABELS = {
      0: B-NP
      1: B-PP
      2: I-NP
      3: B-VP
      4: I-VP
      5: B-SBAR
      6: O
      7: B-ADJP
      8: B-ADVP
      9: I-ADVP
     10: I-ADJP
     11: I-SBAR
     12: I-PP
     13: B-PRT
     14: B-LST
     15: B-INTJ
     16: I-INTJ
     17: B-CONJP
     18: I-CONJP
     19: I-PRT
     20: B-UCP
     21: I-UCP
     22: I-LST
}

ATTRIBUTES = {
      0: U00=
      1: U01=
      2: U02=Confidence
      3: U03=in
      4: U04=the
      5: U05=/Confidence
      6: U06=Confidence/in
      7: U10=
... (snip) ...
}

TRANSITIONS = {
  (1) B-NP --> B-NP: 2.327985
  (1) B-NP --> B-PP: 4.391125
  (1) B-NP --> I-NP: 30.372649
  (1) B-NP --> B-VP: 7.725525
  (1) B-NP --> B-SBAR: 1.821388
  (1) B-NP --> O: 3.805715
  (1) B-NP --> B-ADJP: 4.801651
  (1) B-NP --> B-ADVP: 3.842473
... (snip) ...
}

TRANSITIONS_FROM_BOS = {
  (2) BOS --> B-NP: 17.875605
  (2) BOS --> B-PP: -0.318745
  (2) BOS --> I-NP: -4.387101
  (2) BOS --> B-VP: -0.383031
  (2) BOS --> I-VP: -1.163315
  (2) BOS --> B-SBAR: 1.368176
  (2) BOS --> O: 2.783132
... (snip) ...
}

TRANSITIONS_TO_EOS = {
  (3) B-NP --> EOS: 16.156051
  (3) B-PP --> EOS: -1.045312
  (3) I-NP --> EOS: -2.762051
  (3) B-VP --> EOS: -0.767247
  (3) I-VP --> EOS: -1.113502
  (3) B-SBAR --> EOS: -2.407145
  (3) O --> EOS: 4.131429
... (snip) ...
}

STATE_FEATURES = {
  (0) U00= --> B-NP: -2.622045
  (0) U00= --> B-PP: -1.562976
  (0) U00= --> I-NP: -2.555526
  (0) U00= --> B-VP: -1.329829
  (0) U00= --> I-VP: -1.152970
  (0) U00= --> B-SBAR: -2.590170
  (0) U00= --> O: -1.584688
  (0) U00= --> B-ADJP: -1.526879
... (snip) ...
}
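
Because the dump is plain text, the weights can be inspected with ordinary scripting. The following Python sketch collects the "(n) source --> target: weight" lines of each section so that they can be sorted or filtered. It assumes the dump looks exactly like the excerpt above (section headers of the form "NAME = {", a closing "}", and weights printed as plain decimals); the function name parse_dump is hypothetical.

```python
import re

# Matches lines such as "  (1) B-NP --> I-NP: 30.372649".
WEIGHT_LINE = re.compile(r'^\s*\((\d+)\)\s+(.+?) --> (.+?): (-?\d+\.\d+)\s*$')
SECTION_HEADER = re.compile(r'^(\w+) = \{$')

def parse_dump(lines):
    """Return {section name: [(source, target, weight), ...]}."""
    sections = {}
    current = None
    for line in lines:
        stripped = line.strip()
        header = SECTION_HEADER.match(stripped)
        if header:
            current = header.group(1)
            sections[current] = []
            continue
        if stripped == '}':
            current = None
            continue
        m = WEIGHT_LINE.match(line)
        if m and current is not None:
            sections[current].append((m.group(2), m.group(3), float(m.group(4))))
    return sections

# A short excerpt of the dump shown above.
dump = """TRANSITIONS = {
  (1) B-NP --> B-NP: 2.327985
  (1) B-NP --> B-PP: 4.391125
  (1) B-NP --> I-NP: 30.372649
... (snip) ...
}

STATE_FEATURES = {
  (0) U00= --> B-NP: -2.622045
  (0) U00= --> B-PP: -1.562976
... (snip) ...
}
"""
sections = parse_dump(dump.splitlines())
strongest = max(sections['TRANSITIONS'], key=lambda item: item[2])
print(strongest)
```

In the excerpt, the strongest transition is B-NP followed by I-NP, which matches the intuition that a noun phrase often continues past its first word. To analyze a whole model, the output of "crfsuite dump CoNLL2000.model" could be redirected to a file and passed to parse_dump line by line.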