discodop.treedist

Tree edit distance implementations.

Functions

geteditstats(forest1, forest2)       Recursively get edit distance.
newtreedist(tree1, tree2[, debug])   Tree edit distance implementation as in Bille (2005).
prepare(tree[, includeterms])        Return a copy of tree prepared for tree edit distance calculation.
strdist(a, b)                        Default categorical distance function.
test()                               Tree edit distance demonstration.
treedist(tree1, tree2[, debug])      Zhang-Shasha tree edit distance.

Classes

AnnotatedTree(root)                          Wrap a tree to add some extra information.
EditStats([distance, matched, editscript])   Collect edit operations on a tree.
Terminal(node)                               Auxiliary class to add indices to terminal nodes of Tree objects.

class discodop.treedist.Terminal(node)

Auxiliary class to add indices to terminal nodes of Tree objects.

discodop.treedist.prepare(tree, includeterms=False)

Return a copy of tree prepared for tree edit distance calculation.

  • sort children to have canonical order
  • merge preterminals and terminals in single nodes
    (unless includeterms=True).
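
The two steps above can be illustrated with a minimal sketch. It does not use discodop's Tree class: trees here are plain (label, children) tuples, a preterminal is a node whose only child is a word string, sorting children by label is an assumed ordering, and the merged "tag/word" label is a hypothetical format chosen only for this example.

```python
def prepare_sketch(tree, includeterms=False):
    """Return a normalized copy of a (label, children) tree.

    Illustrative stand-in only; the name, the tuple representation,
    the sort key, and the merged label format are assumptions.
    """
    label, children = tree
    # A preterminal has exactly one child, and that child is a word.
    if len(children) == 1 and isinstance(children[0], str):
        word = children[0]
        if includeterms:
            # Keep the terminal as a separate leaf node.
            return (label, [(word, [])])
        # Merge preterminal and terminal into a single node.
        return (label + "/" + word, [])
    prepared = [prepare_sketch(child, includeterms) for child in children]
    # Sort children so that equivalent trees get one canonical order.
    return (label, sorted(prepared, key=lambda node: node[0]))


tree = ("S", [("VP", [("V", ["sleeps"])]), ("NP", [("N", ["John"])])])
merged = prepare_sketch(tree)
with_terms = prepare_sketch(tree, includeterms=True)
```

The input is left unmodified because every node is rebuilt, which mirrors the "return a copy" wording of the description.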

class discodop.treedist.AnnotatedTree(root)

Wrap a tree to add some extra information.

discodop.treedist.strdist(a, b)

Default categorical distance function.
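
A categorical distance usually means that two labels are either the same or different, with nothing in between. The sketch below shows that idea under the assumption of unit cost for a mismatch; the function name is hypothetical and the actual body of strdist() is not given on this page.

```python
def categorical_distance(a, b):
    """Return 0 when the two labels are equal and 1 otherwise.

    Assumed behaviour for illustration, not taken from the library.
    """
    return 0 if a == b else 1


same = categorical_distance("NP", "NP")
different = categorical_distance("NP", "VP")
```

A function of this shape can serve as the relabeling cost in a tree edit distance, so that matching nodes cost nothing and renamed nodes cost one edit.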

discodop.treedist.treedist(tree1, tree2, debug=False)

Zhang-Shasha tree edit distance.
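
For readers unfamiliar with the algorithm, the following is a minimal, self-contained sketch of the Zhang-Shasha dynamic program. It is not the code behind treedist(): trees are assumed to be (label, children) tuples, insertions and deletions cost 1, relabeling costs 0 or 1, and the names annotate and zhang_shasha are hypothetical.

```python
def annotate(tree):
    """Return post-order labels, leftmost-leaf indices, and keyroots."""
    labels, leftmost = [], []

    def visit(node):
        label, children = node
        first = None
        for child in children:
            child_leftmost = visit(child)
            if first is None:
                first = child_leftmost
        labels.append(label)
        index = len(labels) - 1
        # A leaf is its own leftmost leaf; otherwise inherit from first child.
        leftmost.append(index if first is None else first)
        return leftmost[index]

    visit(tree)
    # Keyroots: for each leftmost-leaf value, the highest node that has it.
    keyroots = sorted({left: i for i, left in enumerate(leftmost)}.values())
    return labels, leftmost, keyroots


def zhang_shasha(tree1, tree2):
    """Unit-cost ordered tree edit distance between two non-empty trees."""
    labels1, left1, keys1 = annotate(tree1)
    labels2, left2, keys2 = annotate(tree2)
    # td[a][b] holds the distance between the subtrees rooted at a and b.
    td = [[0] * len(labels2) for _ in labels1]
    for i in keys1:
        for j in keys2:
            li, lj = left1[i], left2[j]
            rows, cols = i - li + 2, j - lj + 2
            # fd[x][y]: distance between the first x and y nodes of the
            # two forests spanned by keyroots i and j.
            fd = [[0] * cols for _ in range(rows)]
            for x in range(1, rows):
                fd[x][0] = fd[x - 1][0] + 1
            for y in range(1, cols):
                fd[0][y] = fd[0][y - 1] + 1
            for x in range(1, rows):
                for y in range(1, cols):
                    a, b = li + x - 1, lj + y - 1
                    delete = fd[x - 1][y] + 1
                    insert = fd[x][y - 1] + 1
                    if left1[a] == li and left2[b] == lj:
                        # Both forests are whole trees: relabel a as b.
                        relabel = fd[x - 1][y - 1] + (labels1[a] != labels2[b])
                        fd[x][y] = min(delete, insert, relabel)
                        td[a][b] = fd[x][y]
                    else:
                        # Reuse the stored subtree distance.
                        p, q = left1[a] - li, left2[b] - lj
                        fd[x][y] = min(delete, insert, fd[p][q] + td[a][b])
    return td[-1][-1]


t1 = ("f", [("d", [("a", []), ("c", [("b", [])])]), ("e", [])])
t2 = ("f", [("c", [("d", [("a", []), ("b", [])])]), ("e", [])])
distance = zhang_shasha(t1, t2)
```

In this example the two trees differ by deleting the node c in one place and inserting it in another, so the distance is 2.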

discodop.treedist.newtreedist(tree1, tree2, debug=False)

Tree edit distance implementation as in Bille (2005).

Based on rparse code. Slower than treedist(), but records the edit script. Should be rewritten to use a set of matrices as dynamic programming tables.

class discodop.treedist.EditStats(distance=0, matched=0, editscript=None)

Collect edit operations on a tree.

discodop.treedist.geteditstats(forest1, forest2)

Recursively get edit distance.
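
To give an idea of what a recursive forest edit distance that also records an edit script can look like, here is a minimal memoized sketch. It follows the standard rightmost-root recursion and is not taken from the library: forests are assumed to be tuples of hashable (label, children) tuples, all costs are 0 or 1, and forest_edit together with the operation names in the script are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def forest_edit(forest1, forest2):
    """Return (distance, script) for two forests of (label, children) tuples."""
    if not forest1 and not forest2:
        return 0, ()
    if not forest2:
        # Only deletions remain: remove the rightmost root, keep its children.
        *rest, (label, kids) = forest1
        dist, script = forest_edit(tuple(rest) + kids, ())
        return dist + 1, script + (("delete", label),)
    if not forest1:
        # Only insertions remain.
        *rest, (label, kids) = forest2
        dist, script = forest_edit((), tuple(rest) + kids)
        return dist + 1, script + (("insert", label),)
    *rest1, (label1, kids1) = forest1
    *rest2, (label2, kids2) = forest2
    rest1, rest2 = tuple(rest1), tuple(rest2)
    # Option 1: delete the rightmost root of forest1.
    dist, script = forest_edit(rest1 + kids1, forest2)
    delete = (dist + 1, script + (("delete", label1),))
    # Option 2: insert the rightmost root of forest2.
    dist, script = forest_edit(forest1, rest2 + kids2)
    insert = (dist + 1, script + (("insert", label2),))
    # Option 3: align the two rightmost roots and recurse on both parts.
    inner_dist, inner_script = forest_edit(kids1, kids2)
    outer_dist, outer_script = forest_edit(rest1, rest2)
    if label1 == label2:
        cost, op = 0, ("match", label1, label2)
    else:
        cost, op = 1, ("rename", label1, label2)
    align = (inner_dist + outer_dist + cost,
             outer_script + inner_script + (op,))
    return min(delete, insert, align, key=lambda result: result[0])


t1 = ("S", (("NP", ()), ("VP", ())))
t2 = ("S", (("NP", ()), ("PP", ())))
dist, script = forest_edit((t1,), (t2,))
```

Each single tree is passed as a one-element forest. The script lists matches alongside the actual edits, so the number of non-match entries equals the returned distance.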