discodop.parser
Parser object that performs coarse-to-fine parsing and postprocessing.
Additionally, this module provides a simple command line interface similar to bitpar.
Functions
doparsing(parser, infile, out, printprob, …) | Parse sentences from file and write results to file, log to stdout.
estimateitems(sent, prune, mode, dop) | Estimate number of chart items needed for a given sentence.
initworker(parser, printprob, usetags, …) | Load parser for a worker process.
main() | Handle command line arguments.
mpworker(args) | Parse a single sentence (multiprocessing wrapper).
probmult(prob1, prob2) | Multiply probabilities (and optionally number of subtrees).
probstr(prob) | Render probability (and optionally number of subtrees) as string.
readgrammars(resultdir, stages[, …]) | Read the grammars from a previous experiment.
readinputbitparstyle(infile) | Yields lists of tokens, where '\n\n' identifies a sentence break.
readparam(filename) | Parse a parameter file.
worker(args) | Parse a single sentence.
Classes
DictObj(*args, **kwds) | Trivial class to wrap a dictionary for reasons of syntactic sugar.
Parser(prm[, funcclassifier, loadtrees]) | A coarse-to-fine parser based on a given set of parameters.
class discodop.parser.DictObj(*args, **kwds)
Trivial class to wrap a dictionary for reasons of syntactic sugar.
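To illustrate the kind of syntactic sugar meant here, the following is a minimal sketch of a dictionary wrapper that exposes keys as attributes. The class name `AttrDict`, its behaviour, and the example keys are assumptions for illustration; this is not discodop's actual implementation of `DictObj`.

```python
class AttrDict(object):
    """Hypothetical stand-in for DictObj: exposes dictionary keys as attributes."""

    def __init__(self, *args, **kwds):
        # Accept the same arguments as dict() and store them as attributes.
        self.__dict__.update(dict(*args, **kwds))

    def __repr__(self):
        items = ', '.join(
            '%s=%r' % (key, value)
            for key, value in sorted(self.__dict__.items()))
        return '%s(%s)' % (self.__class__.__name__, items)


# Example keys chosen for illustration only.
prm = AttrDict(numproc=1, verbosity=2)
numproc = prm.numproc  # attribute access instead of prm['numproc']
```

With a wrapper of this kind, parameters can be read as `prm.numproc` rather than `prm['numproc']`.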
class discodop.parser.Parser(prm, funcclassifier=None, loadtrees=False)
A coarse-to-fine parser based on a given set of parameters.
Parameters:
- prm – A DictObj with parameters as returned by parser.readparam().
- funcclassifier – optionally, a function tag classifier trained by functiontags.trainfunctionclassifier().
parse(sent, tags=None, root=None, goldtree=None, require=(), block=())
Parse a sentence and perform postprocessing.
Yields a dictionary from parse trees to probabilities for each stage.
Parameters:
- sent – a sequence of tokens.
- tags – optionally, a list of POS tags as strings to be given to the parser instead of trying all possible tags.
- root – optionally, specify a non-default root label.
- goldtree – if given, will be used to evaluate pruned parse forests.
- require – optionally, a list of tuples (label, indices); only parse trees containing these labeled spans will be returned. For example, ('NP', [0, 1, 2]).
- block – optionally, a list of tuples (label, indices); these labeled spans will be pruned.
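The format of the require and block arguments can be illustrated with a small sketch. It treats a candidate tree as a set of (label, indices) spans and checks it against both constraint lists. The helper `satisfies` and the example spans are hypothetical. How the parser enforces these constraints internally is not described here, so this post-hoc filter only illustrates the intended effect.

```python
def satisfies(spans, require=(), block=()):
    """Check one tree's labeled spans against require/block constraints.

    spans, require and block are iterables of (label, indices) tuples.
    """
    # Convert index lists to tuples so the spans can be stored in sets.
    present = {(label, tuple(indices)) for label, indices in spans}
    required = {(label, tuple(indices)) for label, indices in require}
    blocked = {(label, tuple(indices)) for label, indices in block}
    # All required spans must occur; no blocked span may occur.
    return required <= present and not (blocked & present)


# Labeled spans of one hypothetical candidate tree for a 4-word sentence.
candidate = [('S', [0, 1, 2, 3]), ('NP', [0, 1, 2]), ('VP', [3])]
ok = satisfies(candidate, require=[('NP', [0, 1, 2])])
rejected = satisfies(candidate, block=[('VP', [3])])
```

Here `ok` is true because the required NP span is present, and `rejected` is false because the candidate contains a blocked VP span.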
discodop.parser.readgrammars(resultdir, stages, postagging=None, transformations=None, top='ROOT', cache=False)
Read the grammars from a previous experiment.
Expects a directory resultdir which contains the relevant grammars and the parameter file params.prm, as produced by runexp.
discodop.parser.probstr(prob)
Render probability (and optionally number of subtrees) as string.
discodop.parser.readparam(filename)
Parse a parameter file.
Parameters: filename – The file should contain a list of comma-separated attribute=value pairs and will be read using eval('dict(%s)' % open(file).read()).
Returns: A DictObj.
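The eval expression quoted above can be tried on a string to see what a parameter file evaluates to. This is a minimal sketch: the helper `read_param_text` and the example keys are hypothetical, and it returns a plain dict rather than a DictObj.

```python
def read_param_text(text):
    """Evaluate comma-separated attribute=value pairs into a dict."""
    # Same expression as quoted in the description. eval executes
    # arbitrary code, so it should only be used on trusted files.
    return eval('dict(%s)' % text)


# Hypothetical parameter file contents, for illustration only.
example = """
corpus='example.export',
numproc=1,
stages=[dict(name='pcfg', mode='pcfg')],
"""
params = read_param_text(example)
```

Because the contents are evaluated as the arguments of a `dict(...)` call, values can be any Python expression, including nested lists and dictionaries.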
discodop.parser.readinputbitparstyle(infile)
Yields lists of tokens, where '\n\n' identifies a sentence break.
Lazy version of infile.read().split('\n\n').
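A lazy reader of this kind could be sketched as a generator. The sketch assumes one token per line with an empty line between sentences. The function name `read_blank_line_separated` is hypothetical, and this is not the library's own code.

```python
import io


def read_blank_line_separated(infile):
    """Lazily yield lists of tokens; an empty line ends a sentence."""
    sent = []
    for line in infile:
        token = line.strip()
        if token:
            sent.append(token)
        elif sent:
            # An empty line closes the current sentence.
            yield sent
            sent = []
    if sent:
        # The last sentence may lack a trailing empty line.
        yield sent


example = io.StringIO('The\ncat\nsleeps\n\nIt\nrains\n')
sentences = list(read_blank_line_separated(example))
```

Unlike `infile.read().split('\n\n')`, this never holds the whole file in memory, and in this sketch consecutive empty lines do not produce empty sentences.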