discodop.coarsetofine

Project selected items from a chart to corresponding items in next grammar.

Functions

bitparkbestitems(Chart chart, int k, ...) Produce ChartItems occurring in a dictionary of derivations.
doctftest(coarse, fine, sent, tree, k, split) Test coarse-to-fine methods on a sentence.
getinside(Chart chart) Compute inside probabilities for a chart given its parse forest.
getoutside(Chart chart) Compute outside probabilities for a chart given its parse forest.
posteriorthreshold(Chart chart, double threshold) Prune labeled spans from chart below given posterior threshold.
prunechart(Chart coarsechart, Grammar fine, ...) Produce a white list of selected chart items.
test()
discodop.coarsetofine.prunechart(Chart coarsechart, Grammar fine, k, bool splitprune, bool markorigin, bool finecfg, bool bitpar)

Produce a white list of selected chart items.

Items are selected if they occur in the k-best derivations of the chart, or if their posterior probability is > k. Labels X in coarse.toid are projected to the labels in the mapping of the fine grammar, e.g., to X and X@n-m for a DOP reduction.
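
The label projection can be pictured with a small sketch. It is not disco-dop's implementation: `project_labels` and the sample labels are hypothetical, and it assumes that the fine labels of a DOP reduction consist of the coarse label followed by an `@` suffix, as in the `X@n-m` example above.

```python
from collections import defaultdict


def project_labels(fine_labels):
    """Group fine labels under the coarse label they refine.

    Hypothetical helper; assumes a fine label is either the coarse
    label itself or the coarse label followed by '@...'.
    """
    mapping = defaultdict(list)
    for label in fine_labels:
        coarse = label.split('@', 1)[0]  # 'NP@3-1' -> 'NP'; 'NP' -> 'NP'
        mapping[coarse].append(label)
    return dict(mapping)


fine_labels = ['NP', 'NP@1-2', 'NP@3-1', 'VP', 'VP@2-1']
projection = project_labels(fine_labels)
print(projection['NP'])
```

A whitelisted coarse item labeled NP would then license fine items with any of the labels in `projection['NP']`.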

Parameters:
  • coarsechart – a Chart object produced by the PCFG or PLCFRS parser, or derivations from bitpar.
  • fine – the grammar to map labels to after pruning. Must have a mapping to the coarse grammar established by fine.getmapping().
  • k – when k >= 1: number of k-best derivations to consider; when k == 0: the chart is not pruned but filtered to contain only items that contribute to a complete derivation; when 0 < k < 1: inside-outside probabilities are computed and items with a posterior probability < k are pruned.
  • splitprune – the coarse stage used a split-PCFG in which discontinuous nodes appear as multiple CFG nodes. Every discontinuous node will result in multiple lookups into the whitelist to see whether it should be allowed on the agenda.
  • markorigin – in combination with splitprune, coarse labels include an integer to distinguish components; e.g., CFG nodes NP*0 and NP*1 map to the discontinuous node NP_2.
  • bitpar – prune from bitpar derivations instead of actual chart
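
The three regimes of k and the component labels used with markorigin can be summarized in a short sketch. Both helpers are hypothetical and not part of disco-dop. Rejecting negative k is an assumption, and `component_labels` assumes that the fan-out of a discontinuous label is the integer after the final underscore, as in the NP_2 example.

```python
def pruning_mode(k):
    """Name the pruning regime selected by k (hypothetical helper)."""
    if k < 0:
        raise ValueError('k must be non-negative')  # assumption: not specified
    if k == 0:
        return 'filter'     # keep items that are part of a complete derivation
    if k < 1:
        return 'posterior'  # prune items with posterior probability < k
    return 'kbest'          # keep items in the k-best derivations


def component_labels(label):
    """Split-PCFG labels for a discontinuous label, with markorigin.

    Assumes the fan-out is encoded after the last underscore,
    e.g. 'NP_2' -> ['NP*0', 'NP*1'].
    """
    base, _, fanout = label.rpartition('_')
    if not base or not fanout.isdigit():
        return [label]  # continuous label: nothing to split
    return ['%s*%d' % (base, n) for n in range(int(fanout))]


print(pruning_mode(50), pruning_mode(0), pruning_mode(1e-5))
print(component_labels('NP_2'))
```

With splitprune, a fine item labeled NP_2 would be allowed only if each of its components passes its own whitelist lookup, which is why one discontinuous node leads to multiple lookups.
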
Returns:

(whitelist, items, msg)

For LCFRS, the white list is indexed as follows:
  whitelisted: whitelist[label][item] == None
  blocked: item not in whitelist[label]
For a CFG, indexing is as follows:
  whitelisted: whitelist[span][label] == None
  blocked: label not in whitelist[span]
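
The two indexing schemes can be mimicked with plain dictionaries. This is only a sketch: the tuples standing in for chart items and spans are hypothetical, and the real white list uses disco-dop's own item and cell representations. Treating a missing label or span as blocked, instead of raising KeyError, is also an assumption.

```python
# LCFRS: whitelist[label][item] is None for allowed items.
# Items are represented here as (label, covered positions) stand-ins.
lcfrs_whitelist = {
    'NP': {('NP', (0, 1)): None},
    'VP_2': {('VP_2', (0, 3)): None},
}

# CFG: whitelist[span][label] is None for allowed labels.
cfg_whitelist = {
    (0, 2): {'NP': None},
    (0, 4): {'S': None},
}


def allowed_lcfrs(whitelist, label, item):
    """True if item is whitelisted under label."""
    return item in whitelist.get(label, {})


def allowed_cfg(whitelist, span, label):
    """True if label is whitelisted for span."""
    return label in whitelist.get(span, {})


print(allowed_lcfrs(lcfrs_whitelist, 'NP', ('NP', (0, 1))))  # True
print(allowed_cfg(cfg_whitelist, (0, 2), 'VP'))              # False
```
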
discodop.coarsetofine.bitparkbestitems(Chart chart, int k, bool finecfg)

Produce ChartItems occurring in a dictionary of derivations.

Parameters: chart – a chart where rankededges is a dictionary of CFG derivations.
Returns: a dictionary of ChartItems (mapping to None) occurring in the derivations.
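
As a rough illustration of collecting items from derivations, the sketch below gathers labeled spans from bracketed tree strings. It makes several assumptions: derivations are given as bracketed strings with words at the leaves, items are simplified to (label, start, end) tuples instead of ChartItems, and `labeled_spans` and `items_from_derivations` are hypothetical names.

```python
import re


def labeled_spans(derivation):
    """Return {(label, start, end): None} for each node of a bracketed tree."""
    tokens = re.findall(r'\(|\)|[^\s()]+', derivation)
    items = {}
    pos = 0   # index of the next token to read
    word = 0  # index of the next terminal in the sentence

    def parse():
        nonlocal pos, word
        assert tokens[pos] == '(', 'expected an opening bracket'
        label = tokens[pos + 1]
        pos += 2
        start = word
        while tokens[pos] != ')':
            if tokens[pos] == '(':
                parse()     # nested constituent
            else:
                word += 1   # terminal: advance the sentence position
                pos += 1
        pos += 1            # consume the closing bracket
        items[(label, start, word)] = None

    parse()
    return items


def items_from_derivations(derivations):
    """Union of the labeled spans of all derivations."""
    result = {}
    for derivation in derivations:
        result.update(labeled_spans(derivation))
    return result


derivations = {
    '(S (NP (DT the) (NN cat)) (VP (VB sleeps)))': 0.5,
    '(S (NP (DT the)) (VP (NN cat) (VB sleeps)))': 0.1,
}
items = items_from_derivations(derivations)
print(sorted(items))
```
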
discodop.coarsetofine.posteriorthreshold(Chart chart, double threshold)

Prune labeled spans from chart below given posterior threshold.

Returns: dictionary of remaining items.
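
Posterior pruning can be sketched as follows, assuming the usual definition of the posterior of an item as inside × outside divided by the probability of the sentence (the inside probability of the root). The function name, the tuple items, and the numbers are made up for illustration. The real function operates on a Chart object.

```python
def prune_by_posterior(inside, outside, root, threshold):
    """Keep items whose posterior probability is at least threshold.

    inside and outside map items to probabilities; root is the item
    spanning the whole sentence. Returns {item: posterior}.
    """
    sentprob = inside[root]
    remaining = {}
    for item, prob in inside.items():
        posterior = prob * outside[item] / sentprob
        if posterior >= threshold:
            remaining[item] = posterior
    return remaining


# Toy values, not taken from a real chart.
inside = {('S', 0, 3): 0.01, ('NP', 0, 2): 0.2, ('NP', 1, 3): 0.1}
outside = {('S', 0, 3): 1.0, ('NP', 0, 2): 0.04, ('NP', 1, 3): 0.0001}
remaining = prune_by_posterior(inside, outside, ('S', 0, 3), threshold=0.01)
print(sorted(remaining))
```

Here ('NP', 1, 3) has a posterior of roughly 0.001 and is dropped, while the other two items survive.
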
discodop.coarsetofine.getinside(Chart chart)

Compute inside probabilities for a chart given its parse forest.
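
A minimal sketch of the inside computation on a toy parse forest, assuming the forest is acyclic and given as a dictionary from items to lists of (rule probability, children) edges. The data structure and `inside_probabilities` are hypothetical stand-ins for the Chart object. Requires Python 3.8+ for `math.prod`.

```python
from math import prod


def inside_probabilities(forest, root):
    """Inside probability of every item reachable from root.

    inside(item) = sum over edges of p * product of inside(child).
    An edge without children (a lexical edge) contributes just p.
    """
    inside = {}

    def visit(item):
        if item not in inside:
            inside[item] = sum(
                p * prod(visit(child) for child in children)
                for p, children in forest[item])
        return inside[item]

    visit(root)
    return inside


forest = {
    'S[0:2]': [(1.0, ('NP[0:1]', 'VP[1:2]'))],
    'NP[0:1]': [(0.5, ())],
    'VP[1:2]': [(0.25, ()), (0.5, ('V[1:2]',))],
    'V[1:2]': [(0.5, ())],
}
inside = inside_probabilities(forest, 'S[0:2]')
print(inside['S[0:2]'])  # 0.25
```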

discodop.coarsetofine.getoutside(Chart chart)

Compute outside probabilities for a chart given its parse forest.
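
Outside probabilities can be sketched on a toy forest in the same spirit. The forest format (a dictionary from items to lists of (rule probability, children) edges), the function name, and the numbers are assumptions for illustration, and the forest is assumed to be acyclic. Inside probabilities are computed first because each outside update needs the inside probabilities of the siblings. Requires Python 3.8+ for `math.prod`.

```python
from math import prod


def inside_outside(forest, root):
    """Return (inside, outside) for every item reachable from root."""
    inside = {}
    order = []  # post-order: children are appended before their parents

    def visit(item):
        if item not in inside:
            inside[item] = sum(
                p * prod(visit(child) for child in children)
                for p, children in forest[item])
            order.append(item)
        return inside[item]

    visit(root)
    outside = dict.fromkeys(inside, 0.0)
    outside[root] = 1.0
    # Reverse post-order visits each parent before any of its children,
    # so outside[parent] is complete when it is propagated downwards.
    for parent in reversed(order):
        for p, children in forest[parent]:
            for n, child in enumerate(children):
                siblings = children[:n] + children[n + 1:]
                outside[child] += (outside[parent] * p
                                   * prod(inside[s] for s in siblings))
    return inside, outside


forest = {
    'S[0:2]': [(1.0, ('NP[0:1]', 'VP[1:2]'))],
    'NP[0:1]': [(0.5, ())],
    'VP[1:2]': [(0.25, ()), (0.5, ('V[1:2]',))],
    'V[1:2]': [(0.5, ())],
}
inside, outside = inside_outside(forest, 'S[0:2]')
print(outside['V[1:2]'])  # 0.25
```

In this example the posterior of V[1:2] is 0.5 × 0.25 / 0.25 = 0.5, reflecting that half of the probability mass of the sentence passes through that item.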