discodop.runexp

Run an experiment given a parameter file.

Does grammar extraction, parsing, and evaluation.

Functions

`dobinarization`(trees, sents, binarization, …)	Apply binarization to treebank.
`doparsing`(**kwds)	Parse a set of sentences using worker processes.
`getgrammars`(trees, sents, stages, …)	Read off the requested grammars.
`getposmodel`(postagging, train_tagged_sents)	Apply unknown word model to sentences before extracting grammar.
`initworker`(params)	Set global parameter object.
`loadtraincorpus`(corpusfmt, traincorpus, …)	Load the training corpus.
`mpworker`(args)	Multiprocessing wrapper of `worker`.
`oldeval`(results, goldbrackets)	Simple evaluation.
`parsetepacoc`([stages, trainmaxwords, …])	Parse the tepacoc test set.
`readtepacoc`()	Read the tepacoc test set.
`startexp`(prm[, resultdir, rerun])	Execute an experiment.
`worker`(args)	Parse a sentence using global Parser object, and evaluate incrementally.
`writeresults`(results, params)	Write parsing results to files in the same format as the original corpus.

discodop.runexp.initworker(params): Set global parameter object.

discodop.runexp.startexp(prm, resultdir='results', rerun=False): Execute an experiment.

discodop.runexp.loadtraincorpus(corpusfmt, traincorpus, binarization, punct, functions, morphology, removeempty, ensureroot, transformations, relationalrealizational, resultdir): Load the training corpus.

discodop.runexp.getposmodel(postagging, train_tagged_sents): Apply unknown word model to sentences before extracting grammar.

discodop.runexp.dobinarization(trees, sents, binarization, relationalrealizational, logmsg=True): Apply binarization to treebank.
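To illustrate what binarization means here, the following is a minimal sketch of right-factored binarization with horizontal markovization on plain nested tuples. It is not discodop's implementation: the function name `binarize`, the tuple tree encoding, and the `LABEL|<context>` naming of artificial nodes are assumptions made for this example, and discontinuous constituents are not handled. The parameter `h` plays the role of the `h` option that appears in the `binarization` defaults of `parsetepacoc`.

```python
def binarize(tree, h=1):
    """Right-factor an n-ary tree given as (label, child, ...) tuples.

    Terminals are plain strings. Each artificial node is labeled with the
    parent label plus the labels of up to h siblings to its left.
    Hypothetical helper for illustration; not part of discodop.
    """
    if isinstance(tree, str):  # terminal: nothing to do
        return tree
    label = tree[0]
    children = [binarize(child, h) for child in tree[1:]]
    if len(children) <= 2:  # already unary or binary
        return (label,) + tuple(children)
    names = [c if isinstance(c, str) else c[0] for c in children]
    node = None
    # Build from the right: the last two children form the innermost node.
    for i in range(len(children) - 2, 0, -1):
        context = ','.join(names[max(0, i - h):i])
        newlabel = '%s|<%s>' % (label, context)
        if node is None:
            node = (newlabel, children[i], children[i + 1])
        else:
            node = (newlabel, children[i], node)
    return (label, children[0], node)


tree = ('S', ('A', 'a'), ('B', 'b'), ('C', 'c'), ('D', 'd'))
binarized = binarize(tree, h=1)
```

A larger `h` keeps more sibling context in the artificial labels, which gives a more specific but sparser grammar.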

discodop.runexp.getgrammars(trees, sents, stages, testmaxwords, resultdir, numproc, lexmodel, top): Read off the requested grammars.

discodop.runexp.doparsing(**kwds): Parse a set of sentences using worker processes.

discodop.runexp.worker(args)

Parse a sentence using global Parser object, and evaluate incrementally.

Returns:	a string with diagnostic information, as well as a list of DictObj instances with the results for each stage.
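The division of labor between `initworker`, `worker`, and `mpworker` follows a common `multiprocessing` pattern: an initializer stores shared state in a per-process global, and a thin wrapper runs the real worker inside each child process. The sketch below shows that pattern in general terms only. The function bodies, the `runall` driver, the `stages` attribute, and the use of `SimpleNamespace` in place of `DictObj` are assumptions for illustration, not discodop's code.

```python
import multiprocessing
from types import SimpleNamespace

PARAMS = None  # per-process global, filled in by initworker


def initworker(params):
    """Store the shared parameter object in a module-level global."""
    global PARAMS
    PARAMS = params


def worker(args):
    """Handle one (number, sentence) pair.

    Returns a diagnostic string and a list with one result object per stage.
    """
    n, sent = args
    results = [SimpleNamespace(name=stage, numwords=len(sent))
               for stage in PARAMS.stages]
    msg = '%d. %s (%d stages)' % (n, ' '.join(sent), len(results))
    return msg, results


def mpworker(args):
    """Wrapper for child processes: report which input caused a failure."""
    try:
        return worker(args)
    except Exception as err:
        raise RuntimeError('worker failed on %r: %r' % (args, err))


def runall(sents, params, numproc=1):
    """Hypothetical driver: run in-process for numproc=1, else use a pool."""
    work = list(enumerate(sents, 1))
    if numproc == 1:
        initworker(params)
        return [worker(item) for item in work]
    with multiprocessing.Pool(processes=numproc, initializer=initworker,
                              initargs=(params,)) as pool:
        return pool.map(mpworker, work)
```

Calling `runall(sents, params, numproc=4)` from a script guarded by `if __name__ == '__main__':` would distribute the sentences over four processes, each with its own copy of `PARAMS`.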

discodop.runexp.writeresults(results, params): Write parsing results to files in the same format as the original corpus (or in export format if no writer is implemented).

discodop.runexp.oldeval(results, goldbrackets): Simple evaluation.
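The documentation does not say which metric `oldeval` computes. As a point of reference, a simple bracket-based evaluation is often labeled precision, recall, and F1 over multisets of brackets. The sketch below assumes each bracket is a `(label, indices)` pair, where `indices` is a tuple of word positions so that discontinuous spans can be represented. The function name `bracketscores` and this encoding are hypothetical.

```python
from collections import Counter


def bracketscores(candbrackets, goldbrackets):
    """Return (precision, recall, f1) over multisets of brackets."""
    cand, gold = Counter(candbrackets), Counter(goldbrackets)
    matched = sum((cand & gold).values())  # multiset intersection
    precision = matched / sum(cand.values()) if cand else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = [('S', (0, 1, 2)), ('NP', (0,)), ('VP', (1, 2))]
cand = [('S', (0, 1, 2)), ('NP', (0, 1)), ('VP', (1, 2))]
precision, recall, f1 = bracketscores(cand, gold)
```

Here two of the three candidate brackets match the gold brackets, so precision and recall are both 2/3.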

discodop.runexp.parsetepacoc(stages=({'mode': 'pcfg', 'split': True, 'markorigin': True}, {'mode': 'plcfrs', 'prune': True, 'k': 10000}, {'mode': 'plcfrs', 'prune': True, 'k': 5000, 'dop': 'doubledop', 'estimator': 'rfe', 'objective': 'mpp'}), trainmaxwords=999, trainnumsents=25005, testmaxwords=999, binarization=DictObj(method='default', h=1, v=1, factor='right', tailmarker='', headrules='negra.headrules', leftmostunary=True, rightmostunary=True, markhead=False, fanout_marks_before_bin=False), transformations=None, usetagger='stanford', resultdir='tepacoc', numproc=1): Parse the tepacoc test set.