discodop.punctuation

Punctuation-related functions.

Functions

applypunct(method, tree, sent) – Apply punctuation strategy to tree (in-place).
balancedpunctraise(tree, sent) – Move balanced punctuation " ' - ( ) [ ] to a common constituent.
ispunct(word, tag) – Test whether a word and/or tag is punctuation.
punctlower(tree, sent) – Find suitable constituent for punctuation marks and add it there.
punctprune(tree, sent) – Remove quotes and period at sentence beginning and end.
punctraise(tree, sent[, rootpreterms]) – Attach punctuation nodes to an appropriate constituent.
punctremove(tree, sent[, rootpreterms]) – Remove any punctuation nodes, and any empty ancestors.
punctroot(tree, sent) – Move punctuation directly under ROOT, as in the Negra annotation.
discodop.punctuation.applypunct(method, tree, sent)

Apply punctuation strategy to tree (in-place).

Parameters: method – one of remove, removeall, move, moveall, prune, or root.
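
The six method names appear to pair up with the functions documented on this page. The sketch below shows one reading of that pairing, inferred only from the signatures: the assumption that rootpreterms distinguishes remove from removeall and move from moveall is not confirmed here. The functions are recording stand-ins, not discodop's implementations, and apply_punct_sketch is a hypothetical name.

```python
calls = []  # records which stand-in was invoked, for demonstration only


def _stub(name):
    """Build a stand-in that logs its name and the rootpreterms flag."""
    def func(tree, sent, rootpreterms=False):
        calls.append((name, rootpreterms))
    return func


# Stand-ins named after the documented functions (NOT the real code).
punctremove = _stub("punctremove")
punctraise = _stub("punctraise")
punctprune = _stub("punctprune")
punctroot = _stub("punctroot")


def apply_punct_sketch(method, tree, sent):
    """Dispatch a method name to a punctuation function (assumed mapping)."""
    if method in ("remove", "removeall"):
        punctremove(tree, sent, rootpreterms=(method == "removeall"))
    elif method in ("move", "moveall"):
        punctraise(tree, sent, rootpreterms=(method == "moveall"))
    elif method == "prune":
        punctprune(tree, sent)
    elif method == "root":
        punctroot(tree, sent)
    else:
        raise ValueError("unknown punctuation method: %r" % method)


# Exercise every documented method name once.
for name in ("remove", "removeall", "move", "moveall", "prune", "root"):
    apply_punct_sketch(name, tree=None, sent=[])
```

Rejecting unknown names with an exception is a design choice of this sketch; the page does not say how the real function treats an unrecognized method.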
discodop.punctuation.punctremove(tree, sent, rootpreterms=False)

Remove any punctuation nodes, and any empty ancestors.
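
To illustrate what removing "empty ancestors" means, here is a minimal sketch on a toy tree encoding. It assumes a phrase is a (label, [children]) tuple and a preterminal is a (tag, index) tuple pointing into sent. The PUNCT set and the name remove_punct_sketch are hypothetical, and unlike the documented function this version returns a new tree rather than modifying one in place and leaves sent untouched.

```python
# Hypothetical punctuation tokens; the real module may recognize others.
PUNCT = set(".,:;!?()\"'")


def remove_punct_sketch(node, sent):
    """Return a copy of node without punctuation, or None if nothing is left."""
    label, body = node
    if isinstance(body, int):
        # Preterminal: drop it when its token is punctuation.
        return None if sent[body] in PUNCT else node
    kept = []
    for child in body:
        pruned = remove_punct_sketch(child, sent)
        if pruned is not None:
            kept.append(pruned)
    # A phrase that lost all of its children is an empty ancestor: drop it too.
    return (label, kept) if kept else None


sent = ["(", ")", "ok", "."]
tree = ("ROOT", [
    ("PRN", [("$(", 0), ("$(", 1)]),  # contains only punctuation
    ("S", [("UH", 2)]),
    ("$.", 3),
])
result = remove_punct_sketch(tree, sent)
```

In this example the PRN node dominates nothing but brackets, so it disappears together with them, leaving ("ROOT", [("S", [("UH", 2)])]).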

discodop.punctuation.punctprune(tree, sent)

Remove quotes and period at sentence beginning and end.

discodop.punctuation.punctroot(tree, sent)

Move punctuation directly under ROOT, as in the Negra annotation.

discodop.punctuation.punctlower(tree, sent)

Find suitable constituent for punctuation marks and add it there.

The initial candidate is the root node. Note that punctraise() performs better. Based on rparse code.

discodop.punctuation.punctraise(tree, sent, rootpreterms=False)

Attach punctuation nodes to an appropriate constituent.

Trees in the Negra corpus have punctuation attached to the root; i.e., it is not part of the phrase structure. This function moves the punctuation to an appropriate level in the tree. A punctuation node is a POS tag with a punctuation terminal. Modifies the tree in place.

Parameters: rootpreterms – if True, move all preterminals under root, instead of only recognized punctuation.
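
The idea of moving root-attached punctuation down into the tree can be sketched as follows. This is a simplified illustration, not discodop's algorithm: the Node class, the PUNCT set, and the attachment heuristic (lowest phrase covering the preceding token, else the following token, else the root) are all assumptions made for the example.

```python
PUNCT = {",", ".", ":", ";", "!", "?"}  # hypothetical punctuation tokens


class Node:
    """Toy tree node; a preterminal has a single int child indexing sent."""

    def __init__(self, label, children):
        self.label = label
        self.children = children

    def is_preterminal(self):
        return len(self.children) == 1 and isinstance(self.children[0], int)

    def leaves(self):
        if self.is_preterminal():
            return [self.children[0]]
        return [i for child in self.children for i in child.leaves()]


def lowest_phrase_covering(node, index):
    """Return the lowest non-preterminal node dominating index, or None."""
    if node.is_preterminal() or index not in node.leaves():
        return None
    for child in node.children:
        found = lowest_phrase_covering(child, index)
        if found is not None:
            return found
    return node


def raise_punct_sketch(root, sent):
    """Move punctuation preterminals from the root to a lower phrase (in place)."""
    punct = [child for child in root.children
             if child.is_preterminal() and sent[child.children[0]] in PUNCT]
    detached = {id(node) for node in punct}
    root.children = [c for c in root.children if id(c) not in detached]
    for node in punct:
        index = node.children[0]
        target = (lowest_phrase_covering(root, index - 1)
                  or lowest_phrase_covering(root, index + 1)
                  or root)
        target.children.append(node)
        # Keep siblings ordered by their leftmost token.
        target.children.sort(key=lambda child: min(child.leaves()))


def bracket(node, sent):
    """Render a tree in bracket notation."""
    if node.is_preterminal():
        return "(%s %s)" % (node.label, sent[node.children[0]])
    return "(%s %s)" % (
        node.label, " ".join(bracket(c, sent) for c in node.children))


sent = ["the", "dog", ",", "barks", "."]
root = Node("ROOT", [
    Node("S", [
        Node("NP", [Node("DT", [0]), Node("NN", [1])]),
        Node("VP", [Node("VB", [3])]),
    ]),
    Node("$,", [2]),  # punctuation attached to the root, Negra style
    Node("$.", [4]),
])
raise_punct_sketch(root, sent)
print(bracket(root, sent))
# prints (ROOT (S (NP (DT the) (NN dog) ($, ,)) (VP (VB barks) ($. .))))
```

The heuristic here attaches each mark as low as possible, which is only one of several reasonable choices; the documented function may pick a different level.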
discodop.punctuation.balancedpunctraise(tree, sent)

Move balanced punctuation " ' - ( ) [ ] to a common constituent.

Based on rparse code.

discodop.punctuation.ispunct(word, tag)

Test whether a word and/or tag is punctuation.
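
A test on "a word and/or tag" can be read as accepting a token when either its tag or its word form is recognized as punctuation. The following is a minimal sketch of that reading; the two sets are small illustrative samples (a few Negra/STTS and Penn Treebank style tags), not the lists the library uses, and is_punct_sketch is a hypothetical name.

```python
# Illustrative samples only; the real module may recognize different values.
PUNCT_TAGS = {"$,", "$.", "$(", ".", ",", ":"}
PUNCT_WORDS = set(".,:;!?()[]\"'-")


def is_punct_sketch(word, tag):
    """Return True if either the tag or the word looks like punctuation."""
    return tag in PUNCT_TAGS or word in PUNCT_WORDS


print(is_punct_sketch(",", "$,"))    # both tag and word match
print(is_punct_sketch("!", "XY"))    # unknown tag, but the word matches
print(is_punct_sketch("dog", "NN"))  # neither matches
```

Checking the tag as well as the word lets a treebank-specific tag mark a token as punctuation even when its surface form is not in the word list.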