discodop.util
Misc code to avoid cyclic imports.
Functions
genericcompressor(cmd, filename[, encoding, …]) | Run command line compressor on file and return file object.
genericdecompressor(cmd, filename[, encoding]) | Run command line decompressor on file and return file object.
graphemecenter(text, width[, fillchar]) | Return text centered in a string of grapheme length width (not len()).
graphemelength(text) | Return number of graphemes in string.
merge(*iterables[, key]) | Generator that performs an n-way merge of sorted iterables.
openread(filename[, encoding]) | Open stdin/file for reading; decompress gz/lz4/zst files on-the-fly.
readbytes(filename) | Read bytes from stdin/file; decompress gz/lz4/zst files on-the-fly.
run(*popenargs, **kwargs) | Run command with arguments and return (returncode, stdout, stderr).
slice_bounds(seq, slice_obj[, allow_step]) | Calculate the effective (start, stop) bounds of a slice.
tokenize(text) | A basic tokenizer following English/French PTB/FTB conventions.
which(program[, exception]) | Return first match for program in search path.
workerfunc(func) | Wrap a multiprocessing worker function to produce a full traceback.
Classes
Entry(key, value, count) | A PyAgenda entry.
OrderedSet([iterable]) | A frozen, ordered set which maintains a regular list/tuple and set.
PyAgenda([iterable]) | Priority Queue implemented with array-based heap.
-
discodop.util.which(program, exception=True) Return first match for program in search path.
Parameters: exception – By default, ValueError is raised when program is not found. Pass False to return None in this case.
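The behaviour of the exception flag can be illustrated with a minimal sketch. The name which_sketch is hypothetical and the lookup is delegated to the standard library's shutil.which; this is not the discodop implementation, only an approximation of the documented contract.

```python
import shutil


def which_sketch(program, exception=True):
    """Hypothetical stand-in: return the first match for program on PATH."""
    path = shutil.which(program)
    if path is None and exception:
        # Documented default: a missing program is an error.
        raise ValueError('%r not found in search path' % program)
    # Either a path string, or None when exception=False and nothing matched.
    return path
```

Passing exception=False is convenient when the program is optional and the caller wants to test for None instead of catching ValueError.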
-
discodop.util.workerfunc(func) Wrap a multiprocessing worker function to produce a full traceback.
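One way such a wrapper could work is sketched below. Traceback objects cannot be pickled, so the sketch formats the traceback inside the worker and re-raises it as the message of a plain Exception. The names workerfunc_sketch and divide are hypothetical, and the actual discodop decorator may differ in its details.

```python
import functools
import traceback


def workerfunc_sketch(func):
    """Hypothetical decorator: re-raise worker errors with the traceback as text."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Only Exception is caught, so KeyboardInterrupt still propagates.
            # The formatted traceback travels to the parent as a plain string.
            raise Exception(traceback.format_exc())
    return wrapper


@workerfunc_sketch
def divide(a, b):
    """Example worker; fails with ZeroDivisionError when b == 0."""
    return a / b
```

A function decorated this way can be passed to a multiprocessing pool as usual; an error raised in the worker then carries the worker-side stack trace in its message.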
-
discodop.util.genericdecompressor(cmd, filename, encoding='utf8') Run command line decompressor on file and return file object.
Parameters: - cmd – executable in path with gzip-like command line interface; e.g., gzip, zstd, lz4, bzip2, lzop
- filename – the file to decompress.
- encoding – if None, mode is binary; otherwise, text.
Raises: ValueError – if command returns an error.
Returns: a file-like object that must be used in a with-statement; supports .read() and iteration, but not seeking.
-
discodop.util.genericcompressor(cmd, filename, encoding='utf8', compresslevel=8) Run command line compressor on file and return file object.
Parameters: - cmd – executable in path with gzip-like command line interface; e.g., gzip, zstd, lz4, bzip2, lzop
- filename – the compressed output file.
- encoding – if None, mode is binary; otherwise, text.
Raises: ValueError – if command returns an error.
Returns: a file-like object that must be used in a with-statement; supports .write() but not seeking.
-
discodop.util.openread(filename, encoding='utf8') Open stdin/file for reading; decompress gz/lz4/zst files on-the-fly.
Parameters: encoding – if None, mode is binary; otherwise, text.
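The general idea of choosing a reader from the file name can be sketched with the standard library alone. The sketch below handles only gzip, treats the file name '-' as stdin (an assumption, not confirmed by this page), and uses the hypothetical name openread_sketch; lz4 and zst are left out.

```python
import gzip
import os
import sys
import tempfile


def openread_sketch(filename, encoding='utf8'):
    """Hypothetical: open stdin ('-') or a file, gunzipping *.gz on the fly."""
    binary = encoding is None
    if filename == '-':  # assumed convention for stdin
        return sys.stdin.buffer if binary else sys.stdin
    mode = 'rb' if binary else 'rt'
    if filename.endswith('.gz'):
        return gzip.open(filename, mode, encoding=encoding)
    return open(filename, mode, encoding=encoding)


# Usage: write a small gzip file, then read it back as text and as bytes.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt.gz')
    with gzip.open(path, 'wb') as out:
        out.write(b'first line\nsecond line\n')
    with openread_sketch(path) as inp:
        lines = [line.rstrip('\n') for line in inp]
    with openread_sketch(path, encoding=None) as inp:
        raw = inp.read()
```

With encoding=None the caller receives bytes, which matches the documented rule that None selects binary mode.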
-
discodop.util.readbytes(filename) Read bytes from stdin/file; decompress gz/lz4/zst files on-the-fly.
-
discodop.util.slice_bounds(seq, slice_obj, allow_step=False) Calculate the effective (start, stop) bounds of a slice.
Takes into account None indices and negative indices.
Parameters: allow_step – If true, then the slice object may have a non-None step. If it does, then return a tuple (start, stop, step).
Returns: tuple (start, stop, 1), s.t. 0 <= start <= stop <= len(seq)
Raises: ValueError – if slice_obj.step is not None.
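The default case (no step) can be approximated with the built-in slice.indices method, which already resolves None and negative indices against a given length. The function name slice_bounds_sketch is hypothetical, and clamping start down to stop for an empty slice is an assumption made here to satisfy the documented invariant.

```python
def slice_bounds_sketch(seq, slice_obj):
    """Hypothetical: effective (start, stop, 1) bounds for a step-less slice."""
    if slice_obj.step is not None:
        raise ValueError('slices with steps are not supported in this sketch')
    # slice.indices() resolves None and negative indices against len(seq).
    start, stop, _ = slice_obj.indices(len(seq))
    # Enforce 0 <= start <= stop <= len(seq) for empty slices such as [4:2].
    start = min(start, stop)
    return start, stop, 1
```

For a sequence of length 5, slice(-2, None) resolves to (3, 5, 1) and the out-of-range slice(-100, 100) to (0, 5, 1).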
-
class discodop.util.OrderedSet(iterable=None) A frozen, ordered set which maintains a regular list/tuple and set.
The set is indexable. Equality is defined _without_ regard for order.
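A minimal sketch of this design pairs a tuple (for order and indexing) with a frozenset (for membership and equality). The class name OrderedSetSketch and its attribute names are hypothetical; the real class may expose more operations.

```python
class OrderedSetSketch:
    """Hypothetical frozen ordered set backed by a tuple and a frozenset."""

    def __init__(self, iterable=None):
        seen = set()
        items = []
        for item in iterable or ():
            if item not in seen:  # keep the first occurrence only
                seen.add(item)
                items.append(item)
        self._seq = tuple(items)     # preserves insertion order, indexable
        self._set = frozenset(seen)  # fast membership, order-free equality

    def __getitem__(self, index):
        return self._seq[index]

    def __contains__(self, item):
        return item in self._set

    def __iter__(self):
        return iter(self._seq)

    def __len__(self):
        return len(self._seq)

    def __eq__(self, other):
        if isinstance(other, OrderedSetSketch):
            return self._set == other._set  # order is ignored
        return NotImplemented

    def __hash__(self):
        return hash(self._set)
```

Under this sketch, OrderedSetSketch('abca') and OrderedSetSketch('cba') compare equal even though indexing position 0 yields 'a' for the first and 'c' for the second.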
-
class discodop.util.PyAgenda(iterable=None) Priority Queue implemented with array-based heap.
Implements decrease-key and remove operations by marking entries as invalid. Provides dictionary-like interface.
Can be initialized with an iterable; equivalent values are preserved in insertion order and the best priorities are retained on duplicate keys.
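The invalidation technique described above can be sketched on top of heapq. The sketch assumes that a lower value means a better priority and that the iterable yields (key, value) pairs; neither is stated on this page. The class name AgendaSketch and the entry layout are hypothetical.

```python
import heapq
import itertools


class AgendaSketch:
    """Hypothetical priority queue with lazy invalidation of stale entries."""

    def __init__(self, iterable=None):
        self._heap = []      # heap of [value, count, key, valid] entries
        self._entries = {}   # key -> its current (valid) entry
        self._counter = itertools.count()  # ties resolve in insertion order
        for key, value in iterable or ():
            # On duplicate keys, retain the best (assumed: lowest) value.
            if key not in self._entries or value < self._entries[key][0]:
                self[key] = value

    def __setitem__(self, key, value):
        if key in self._entries:
            self._entries[key][3] = False  # mark the old entry invalid
        entry = [value, next(self._counter), key, True]
        self._entries[key] = entry
        heapq.heappush(self._heap, entry)

    def __getitem__(self, key):
        return self._entries[key][0]

    def __delitem__(self, key):
        self._entries.pop(key)[3] = False  # removal is also lazy

    def __contains__(self, key):
        return key in self._entries

    def __len__(self):
        return len(self._entries)

    def popitem(self):
        """Return the (key, value) pair with the best priority."""
        while self._heap:
            value, _, key, valid = heapq.heappop(self._heap)
            if valid:  # skip entries invalidated by updates or deletions
                del self._entries[key]
                return key, value
        raise KeyError('popitem(): agenda is empty')
```

Updating or deleting a key never restructures the heap; the stale entry stays in place and is discarded when it reaches the top, which keeps both operations cheap at the cost of some extra memory.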
-
discodop.util.merge(*iterables, key=None) Generator that performs an n-way merge of sorted iterables.
>>> list(merge([0, 1, 2], [0, 1, 2, 3]))
[0, 0, 1, 1, 2, 2, 3]
Similar to heapq.merge, but key can be specified.
NB: while a sort key may be specified, the individual iterables must already be sorted with this key.
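An n-way merge with a key can be sketched with a heap that holds one pending item per input iterable. The name merge_sketch is hypothetical and this is not claimed to be the discodop implementation; it only illustrates why each input must already be sorted by the same key.

```python
import heapq


def merge_sketch(*iterables, key=None):
    """Hypothetical n-way merge of iterables already sorted by key."""
    if key is None:
        key = lambda item: item
    heap = []
    for index, iterable in enumerate(iterables):
        iterator = iter(iterable)
        try:
            first = next(iterator)
        except StopIteration:
            continue  # skip empty inputs
        # index breaks ties, so items and iterators are never compared.
        heap.append((key(first), index, first, iterator))
    heapq.heapify(heap)
    while heap:
        _, index, item, iterator = heap[0]
        yield item
        try:
            following = next(iterator)
        except StopIteration:
            heapq.heappop(heap)  # this input is exhausted
        else:
            heapq.heapreplace(heap, (key(following), index, following, iterator))
```

For example, list(merge_sketch(['a', 'ccc'], ['bb', 'dddd'], key=len)) yields the strings in order of length. On Python 3.5 and later, heapq.merge itself also accepts a key argument.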