discodop.runexp

Run an experiment given a parameter file.

Does grammar extraction, parsing, and evaluation.

Functions

dobinarization(trees, sents, binarization, …) Apply binarization to treebank.
doparsing(**kwds) Parse a set of sentences using worker processes.
getgrammars(trees, sents, stages, …) Read off the requested grammars.
getposmodel(postagging, train_tagged_sents) Apply unknown word model to sentences before extracting grammar.
initworker(params) Set global parameter object.
loadtraincorpus(corpusfmt, traincorpus, …) Load the training corpus.
mpworker(args) Multiprocessing wrapper of worker.
oldeval(results, goldbrackets) Simple evaluation.
parsetepacoc([stages, trainmaxwords, …]) Parse the tepacoc test set.
readtepacoc() Read the tepacoc test set.
startexp(prm[, resultdir, rerun]) Execute an experiment.
worker(args) Parse a sentence using global Parser object, and evaluate incrementally.
writeresults(results, params) Write parsing results to files in same format as the original corpus.
discodop.runexp.initworker(params)

Set global parameter object.

discodop.runexp.startexp(prm, resultdir='results', rerun=False)

Execute an experiment.

discodop.runexp.loadtraincorpus(corpusfmt, traincorpus, binarization, punct, functions, morphology, removeempty, ensureroot, transformations, relationalrealizational, resultdir)

Load the training corpus.

discodop.runexp.getposmodel(postagging, train_tagged_sents)

Apply unknown word model to sentences before extracting grammar.

discodop.runexp.dobinarization(trees, sents, binarization, relationalrealizational, logmsg=True)

Apply binarization to treebank.
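
As a rough illustration of what right-factored binarization does to a single tree, here is a minimal sketch. It is not discodop's implementation: the `(label, children)` tuple representation, the function names, and the `label|<...>` naming of artificial nodes are all assumptions made for this example, and it ignores markovization (the `h` and `v` settings), head rules, and discontinuous constituents.

```python
def _label(node):
    """Return the label of a subtree, or the word itself for a terminal."""
    return node if isinstance(node, str) else node[0]


def _fold(label, children):
    """Split a node with more than two children into a right-branching chain.

    Hypothetical helper; assumes original labels do not contain '|'.
    """
    if len(children) <= 2:
        return (label, children)
    base = label.split("|")[0]  # original label, without any earlier suffix
    rest = children[1:]
    # The artificial node records which siblings it still has to cover.
    newlabel = "%s|<%s>" % (base, "-".join(_label(c) for c in rest))
    return (label, [children[0], _fold(newlabel, rest)])


def binarize_right(tree):
    """Recursively binarize a (label, children) tree; terminals are strings."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    return _fold(label, [binarize_right(child) for child in children])


tree = ("S", [("A", ["a"]), ("B", ["b"]), ("C", ["c"]), ("D", ["d"])])
binarized = binarize_right(tree)
```

In this sketch the four-way branching `S` node becomes a chain `S -> A S|<B-C-D>`, `S|<B-C-D> -> B S|<C-D>`, `S|<C-D> -> C D`, so every node ends up with at most two children.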

discodop.runexp.getgrammars(trees, sents, stages, testmaxwords, resultdir, numproc, lexmodel, top)

Read off the requested grammars.

discodop.runexp.doparsing(**kwds)

Parse a set of sentences using worker processes.
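
The combination of an initializer that sets a global parameter object (`initworker`) and a worker function that reads it (`worker`) matches a common `multiprocessing.Pool` pattern. The sketch below shows that general pattern only; it does not reproduce discodop's code. The names `init_worker`, `parse_one`, and `parse_all`, the `maxwords` key, and the placeholder "parse" step are all hypothetical.

```python
import multiprocessing

_PARAMS = None  # per-process global; set once by the initializer


def init_worker(params):
    """Store the shared parameters in a module-level global."""
    global _PARAMS
    _PARAMS = params


def parse_one(args):
    """Placeholder for parsing one sentence; reads the global parameters."""
    n, sent = args
    # Hypothetical stand-in for real parsing: only check the length limit.
    status = "parsed" if len(sent) <= _PARAMS["maxwords"] else "skipped"
    return n, status, len(sent)


def parse_all(sents, params, numproc=1):
    """Process all sentences, optionally with a pool of worker processes."""
    work = list(enumerate(sents))
    if numproc == 1:
        # Design choice of this sketch: stay in-process when no
        # parallelism is requested, which also simplifies debugging.
        init_worker(params)
        return [parse_one(item) for item in work]
    with multiprocessing.Pool(processes=numproc, initializer=init_worker,
                              initargs=(params,)) as pool:
        return pool.map(parse_one, work)
```

Passing the parameters through `initializer`/`initargs` means they are sent to each worker process once, rather than being pickled along with every sentence.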

discodop.runexp.worker(args)

Parse a sentence using global Parser object, and evaluate incrementally.

Returns: a string with diagnostic information, as well as a list of DictObj instances with the results for each stage.

discodop.runexp.writeresults(results, params)

Write parsing results to files in the same format as the original corpus (or in export format if no writer is implemented for that format).

discodop.runexp.oldeval(results, goldbrackets)

Simple evaluation.
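
The page does not say what `oldeval` computes. As an assumption, a simple evaluation against gold brackets could be labeled bracket precision, recall, and F1, which the sketch below illustrates. The function name `bracket_scores` and the `(label, token_indices)` bracket representation are hypothetical and are not taken from discodop.

```python
from collections import Counter


def bracket_scores(candidate, gold):
    """Return (precision, recall, f1) over labeled brackets.

    Brackets are (label, tuple_of_token_indices) pairs; they are compared
    as multisets so that repeated brackets are counted correctly.
    """
    cand_counts = Counter(candidate)
    gold_counts = Counter(gold)
    # Multiset intersection: brackets present in both analyses.
    matched = sum((cand_counts & gold_counts).values())
    cand_total = sum(cand_counts.values())
    gold_total = sum(gold_counts.values())
    precision = matched / cand_total if cand_total else 0.0
    recall = matched / gold_total if gold_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


gold = [("S", (0, 1, 2)), ("NP", (0, 1)), ("VP", (2,))]
cand = [("S", (0, 1, 2)), ("NP", (0,)), ("VP", (2,))]
precision, recall, f1 = bracket_scores(cand, gold)
```

Representing a span as a tuple of token indices, rather than a start and end offset, lets the same comparison cover discontinuous constituents.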

discodop.runexp.readtepacoc()

Read the tepacoc test set.

discodop.runexp.parsetepacoc(
    stages=(
        {'mode': 'pcfg', 'split': True, 'markorigin': True},
        {'mode': 'plcfrs', 'prune': True, 'k': 10000},
        {'mode': 'plcfrs', 'prune': True, 'k': 5000, 'dop': 'doubledop', 'estimator': 'rfe', 'objective': 'mpp'}),
    trainmaxwords=999, trainnumsents=25005, testmaxwords=999,
    binarization=DictObj(method='default', h=1, v=1, factor='right', tailmarker='', headrules='negra.headrules', leftmostunary=True, rightmostunary=True, markhead=False, fanout_marks_before_bin=False),
    transformations=None, usetagger='stanford', resultdir='tepacoc', numproc=1)

Parse the tepacoc test set.