Parser parameters
A parser is defined by a sequence of stages, and a set of global options:
    stages=[
        stage1,
        stage2,
    ],
    corpusfmt='...',
    traincorpus=dict(...),
    testcorpus=dict(...),
    binarization=dict(...),
    key1=val1,
    key2=val2,
The parameters consist of a Python expression surrounded by an implicit 'dict(' and ')'. Note that the key=value pairs are separated by commas.
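To make the implicit wrapping concrete, the following sketch reads a parameter string the way the text describes: it surrounds the text with 'dict(' and ')' and evaluates the result as a Python expression. The helper name read_params and every value in the example (the stage names, m=10, the empty dictionaries) are illustrative assumptions, not the parser's actual loader or its defaults.

```python
def read_params(text):
    """Evaluate parameter text as the arguments of an implicit dict(...) call.

    Hypothetical helper for illustration; eval should only be used on
    parameter files that you trust.
    """
    return eval('dict(%s)' % text)


# Example parameter text; every value here is a placeholder.
EXAMPLE = """
stages=[
    dict(name='coarse'),
    dict(name='fine', prune='coarse', m=10),
],
traincorpus=dict(),
testcorpus=dict(),
binarization=dict(),
"""

params = read_params(EXAMPLE)
print(sorted(params))                                 # top-level option names
print([stage['name'] for stage in params['stages']])  # stage identifiers
```

Because the text becomes the argument list of a call to dict, a trailing comma after the last pair is accepted, and a missing comma between two pairs is a syntax error.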
Corpora
corpusfmt: The corpus format; choices:
traincorpus: a dictionary with the following keys:
testcorpus: a dictionary with the following keys:
Binarization
binarization: a dictionary with the following keys:
Stages
Through the use of stages it is possible to run multiple parsers on the same test set, or to exploit coarse-to-fine pruning.
A stage has the form:
    dict(
        key1=val1,
        key2=val2,
        ...
    )
Where the keys and values are:
name: identifier, used for filenames
mode: The type of parser to use
prune: specify the name of a previous stage to enable coarse-to-fine pruning.
split: split discontinuous nodes
markorigin: mark origin of split nodes:
k: pruning parameter:
m: number of k-best derivations to enumerate.
dop: enable DOP mode:
estimator: DOP estimator. Choices:
objective: Objective function to choose DOP parse tree. Choices:
sldop_n: When using sl-dop or sl-dop-simple, number of most likely parse trees to consider.
maxdepth: with
maxfrontier: with
collapse: apply a multilevel coarse-to-fine preset. values are of the form
packedgraph: use packed graph encoding for DOP reduction
neverblockre: do not prune nodes whose label matches this regex
estimates: compute, store & use context-summary (outside) estimates
beam_beta: beam pruning factor, between 0 and 1; 1 to disable. If enabled, a new constituent must have a larger probability than the probability of the best constituent in its cell multiplied by this factor; i.e., a smaller value implies less pruning. Suggested value:
beam_delta: if beam pruning is enabled, only apply it to spans up to this length.
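The beam criterion described for beam_beta and beam_delta can be sketched as follows. This is a minimal illustration under stated assumptions, not the parser's implementation: the function name keep_constituent, its arguments, the default beam_delta=40, and the numbers in the usage lines are all hypothetical, and the sketch compares plain probabilities rather than whatever internal representation the parser uses.

```python
def keep_constituent(prob, best_prob, span_len, beam_beta=1.0, beam_delta=40):
    """Return True if a new constituent survives beam pruning.

    prob:      probability of the new constituent.
    best_prob: probability of the best constituent in the same cell.
    span_len:  number of words covered by the cell.
    The default values are placeholders chosen for this sketch.
    """
    if beam_beta == 1.0:
        # A factor of 1 disables beam pruning altogether.
        return True
    if span_len > beam_delta:
        # Pruning is only applied to spans up to beam_delta words long.
        return True
    # The constituent must beat the best one scaled down by the factor.
    return prob > best_prob * beam_beta


# Threshold is 0.5 * 0.01 = 0.005, so 0.001 is pruned.
print(keep_constituent(0.001, 0.5, span_len=3, beam_beta=0.01))
# A smaller factor lowers the threshold, so the same constituent is kept.
print(keep_constituent(0.001, 0.5, span_len=3, beam_beta=0.0001))
# A span longer than beam_delta is never pruned.
print(keep_constituent(0.001, 0.5, span_len=50, beam_beta=0.01))
```

The second call shows why a smaller beam_beta means less pruning: the threshold a constituent has to exceed shrinks with the factor.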
Other options
evalparam: EVALB-style parameter file to use for reporting F-scores
postagging: To disable POS tagging and use the gold POS tags from the test set, set this to
punct: one of …
functions: one of …
morphology: one of …
lemmas: one of …
removeempty:
ensureroot: Ensure every tree has a root node with this label
transformations: Apply specific treebank transforms; available presets:
relationalrealizational: apply RR-transform; see
verbosity: control the amount of output to console; a logfile
numproc: default 1; increase to use multiple CPUs;