discodop.coarsetofine
Select suitably probable items from a chart and produce whitelist.
Functions
doctftest(coarse, fine, sent, tree, k, split) | Test coarse-to-fine methods on a sentence.
getinside(Chart chart) | Compute inside probabilities for a chart given its parse forest.
getmatchingitems(Chart chart, …) |
getoutside(Chart chart) | Compute outside probabilities for a chart given its parse forest.
posteriorthreshold(Chart chart, double threshold) | Prune labeled spans from chart below given posterior threshold.
prunechart(Chart coarsechart, Grammar fine, …) | Produce a white list of selected chart items.
test() |
-
discodop.coarsetofine.prunechart(Chart coarsechart, Grammar fine, k, bool splitprune, bool markorigin, bool finecfg, set require=None, set block=None)
Produce a white list of selected chart items.
The criterion is that they occur in the k-best derivations of chart, or have posterior probability > k. Labels X in coarse.toid are projected to the labels in the mapping of the fine grammar, e.g., to X and X@n-m for a DOP reduction.
Parameters:
- coarsechart – a Chart object produced by the PCFG or PLCFRS parser.
- fine – the grammar to map labels to after pruning. Must have a mapping to the coarse grammar established by fine.getmapping().
- k – when k >= 1: number of k-best derivations to consider; when k == 0, the chart is not pruned but filtered to contain only items that contribute to a complete derivation; when 0 < k < 1, inside-outside probabilities are computed and items with a posterior probability < k are pruned.
- splitprune – the coarse stage used a split-PCFG in which discontinuous nodes appear as multiple CFG nodes. Every discontinuous node will result in multiple lookups into the whitelist to see whether it should be allowed on the agenda.
- markorigin – in combination with splitprune, coarse labels include an integer to distinguish components; e.g., CFG nodes NP*0 and NP*1 map to the discontinuous node NP_2.
- require – optionally, a list of tuples (label, indices); only k-best derivations containing these labeled spans will be selected. For example, ('NP', [0, 1, 2]); expects k > 1.
- block – optionally, a list of tuples (label, indices); these labeled spans will be pruned.
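The three regimes of k described above can be summarized in a small sketch. The helper pruning_mode and the mode names it returns are hypothetical and exist only for illustration; they are not part of discodop, and the error raised for negative k is a choice of this sketch, not documented behavior.

```python
def pruning_mode(k):
    """Name the selection criterion that a given k implies (illustrative only)."""
    if k == 0:
        # No pruning; keep only items that contribute to a complete derivation.
        return 'filter'
    if 0 < k < 1:
        # Inside-outside: prune items with posterior probability < k.
        return 'posterior'
    if k >= 1:
        # Keep items that occur in the k-best derivations.
        return 'kbest'
    # Negative k is not covered by the documentation; this sketch rejects it.
    raise ValueError('k must be non-negative, got %r' % (k, ))


for k in (0, 1e-5, 0.5, 1, 50):
    print(k, pruning_mode(k))
```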
Returns: (whitelist, msg)
For LCFRS, the white list is indexed as follows:
- whitelisted: item in whitelist[label], where item is a SmallChartItem or FatChartItem depending on sentence length.
- blocked: item not in whitelist[label]
For a CFG, indexing is as follows:
- whitelisted: label in whitelist[span], where span is an integer encoding both begin and end; it differs from a cell in that it does not include the number of nonterminals.
- blocked: label not in whitelist[span]
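The two indexing schemes can be illustrated with plain Python containers. This is a sketch under stated assumptions, not discodop's actual data structures: Item is a hypothetical stand-in for SmallChartItem, encode_span is one possible way to pack begin and end into a single integer (the real encoding is not given here), and the label ids NP and VP are toy values.

```python
from collections import namedtuple

# Hypothetical stand-in for SmallChartItem: a label id plus a bit vector
# of covered sentence positions (bit i set means word i is covered).
Item = namedtuple('Item', 'label vec')

NP, VP = 1, 2  # toy label ids; real ids would come from a grammar
SENTLEN = 4


def encode_span(begin, end, sentlen):
    """Pack (begin, end) into one integer; an assumed encoding."""
    return begin * (sentlen + 1) + end


def allowed_lcfrs(whitelist, item):
    """LCFRS scheme: the whitelist is indexed by label and holds items."""
    return item in whitelist.get(item.label, ())


def allowed_cfg(whitelist, label, begin, end, sentlen):
    """CFG scheme: the whitelist is indexed by span and holds labels."""
    return label in whitelist.get(encode_span(begin, end, sentlen), ())


lcfrs_whitelist = {
    NP: {Item(NP, 0b0011)},
    VP: {Item(VP, 0b1101)},  # discontinuous: covers words 0, 2 and 3
}
cfg_whitelist = {
    encode_span(0, 2, SENTLEN): {NP},
    encode_span(2, 4, SENTLEN): {VP},
}

print(allowed_lcfrs(lcfrs_whitelist, Item(NP, 0b0011)))  # whitelisted item
print(allowed_lcfrs(lcfrs_whitelist, Item(NP, 0b0110)))  # blocked item
print(allowed_cfg(cfg_whitelist, NP, 0, 2, SENTLEN))     # whitelisted label
print(allowed_cfg(cfg_whitelist, VP, 0, 2, SENTLEN))     # blocked label
```

Note the inversion between the two schemes: for LCFRS the label is the key and the item is looked up, whereas for a CFG the span is the key and the label is looked up.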
-
discodop.coarsetofine.posteriorthreshold(Chart chart, double threshold)
Prune labeled spans from chart below given posterior threshold.
Returns: dictionary of remaining items.
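As a rough illustration of posterior pruning, the sketch below computes the posterior of each item as inside × outside divided by the inside probability of the root, and keeps the items at or above the threshold. It works on plain dictionaries with made-up numbers rather than a Chart object. The function name posterior_prune, the (label, begin, end) item representation, and the choice to map remaining items to their posteriors are all assumptions of this sketch.

```python
def posterior_prune(inside, outside, root, threshold):
    """Return {item: posterior} for items whose posterior >= threshold."""
    sentprob = inside[root]  # total probability of the sentence
    remaining = {}
    for item, insideprob in inside.items():
        posterior = insideprob * outside[item] / sentprob
        if posterior >= threshold:
            remaining[item] = posterior
    return remaining


# Toy values; items are (label, begin, end) tuples.
root = ('S', 0, 3)
inside = {
    ('S', 0, 3): 0.1,
    ('NP', 0, 1): 0.5,
    ('VP', 1, 3): 0.2,
    ('NP', 1, 3): 0.01,
}
outside = {
    ('S', 0, 3): 1.0,
    ('NP', 0, 1): 0.2,
    ('VP', 1, 3): 0.5,
    ('NP', 1, 3): 0.05,
}

remaining = posterior_prune(inside, outside, root, threshold=0.01)
for item in sorted(remaining):
    print(item, round(remaining[item], 4))
```

With these numbers the item ('NP', 1, 3) has a posterior of about 0.005 and is pruned, while the other three items remain.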
-
discodop.coarsetofine.getinside(Chart chart)
Compute inside probabilities for a chart given its parse forest.
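The inside computation can be sketched on a toy parse forest. The forest representation here is an assumption made for illustration: a dictionary mapping each item to a list of (rule probability, children) edges, with plain probabilities rather than whatever representation the Chart class uses internally. The inside probability of an item is the sum over its edges of the rule probability times the product of the children's inside probabilities.

```python
from math import prod


def inside_probs(forest, root):
    """Compute inside probabilities for all items reachable from root."""
    inside = {}

    def visit(item):
        if item not in inside:
            # Lexical edges have no children; prod() of nothing is 1.
            inside[item] = sum(
                prob * prod(visit(child) for child in children)
                for prob, children in forest[item])
        return inside[item]

    visit(root)
    return inside


# Toy acyclic forest; VP[1:3] is ambiguous and has two incoming edges.
forest = {
    'S[0:3]': [(1.0, ('NP[0:1]', 'VP[1:3]'))],
    'VP[1:3]': [(0.5, ('V[1:2]', 'NP[2:3]')),
                (0.25, ('V[1:2]', 'PP[2:3]'))],
    'NP[0:1]': [(0.5, ())],
    'V[1:2]': [(1.0, ())],
    'NP[2:3]': [(0.25, ())],
    'PP[2:3]': [(0.5, ())],
}

inside = inside_probs(forest, 'S[0:3]')
for item in sorted(inside):
    print(item, inside[item])
```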
-
discodop.coarsetofine.getoutside(Chart chart)
Compute outside probabilities for a chart given its parse forest.
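Outside probabilities can be sketched in the same spirit. The example below again assumes a toy forest given as a dictionary from items to (rule probability, children) edges, with plain probabilities; none of these names come from discodop. The root gets outside probability 1, and each edge passes to every child the parent's outside probability times the rule probability times the inside probabilities of the child's siblings. Parents must be processed before their children, which the reversed post-order guarantees for an acyclic forest.

```python
from math import prod


def inside_outside(forest, root):
    """Return (inside, outside) dictionaries for items reachable from root."""
    inside, order = {}, []

    def visit(item):
        if item not in inside:
            inside[item] = sum(
                prob * prod(visit(child) for child in children)
                for prob, children in forest[item])
            order.append(item)  # post-order: children before parents
        return inside[item]

    visit(root)
    outside = dict.fromkeys(inside, 0.0)
    outside[root] = 1.0
    for parent in reversed(order):  # parents before children
        for prob, children in forest[parent]:
            for i, child in enumerate(children):
                siblings = prod(
                    inside[c] for j, c in enumerate(children) if j != i)
                outside[child] += outside[parent] * prob * siblings
    return inside, outside


forest = {
    'S[0:3]': [(1.0, ('NP[0:1]', 'VP[1:3]'))],
    'VP[1:3]': [(0.5, ('V[1:2]', 'NP[2:3]')),
                (0.25, ('V[1:2]', 'PP[2:3]'))],
    'NP[0:1]': [(0.5, ())],
    'V[1:2]': [(1.0, ())],
    'NP[2:3]': [(0.25, ())],
    'PP[2:3]': [(0.5, ())],
}

inside, outside = inside_outside(forest, 'S[0:3]')
sentprob = inside['S[0:3]']
for item in sorted(inside):
    # inside * outside / sentence probability is the item's posterior
    print(item, outside[item], inside[item] * outside[item] / sentprob)
```

In this toy forest the two competing analyses of the last word, NP[2:3] and PP[2:3], each end up with a posterior of 0.5, while items shared by all derivations have a posterior of 1.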