string Module

String and binary data functions.

wpull.string.coerce_str_to_ascii(string)

Force the contents of the string to be ASCII.

Anything that is not ASCII will be replaced with a replacement character.

Deprecated since version 0.1002: Use printable_str() instead.
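
As a rough illustration of the replacement behavior described above, the following sketch uses only the standard library. The helper name `coerce_str_to_ascii_sketch` is hypothetical, and the choice of `?` as the replacement character is an assumption based on Python's built-in `'replace'` error handler, not a statement about wpull's internals.

```python
def coerce_str_to_ascii_sketch(text):
    # Hypothetical stand-in, not wpull's implementation.
    # Encoding with the 'replace' handler swaps each non-ASCII
    # character for '?', then we decode back to str.
    return text.encode('ascii', 'replace').decode('ascii')


print(coerce_str_to_ascii_sketch('naïve'))
```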

wpull.string.detect_encoding(data, encoding=None, fallback='latin1', is_html=False)

Detect the character encoding of the data.

Returns:

The name of the codec

Return type:

str

Raises:

ValueError – The codec could not be detected. This error can only occur if fallback is not a “lossless” codec.

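
The fallback behavior can be pictured with a minimal sketch, assuming a simple "try each candidate in order" strategy. The helper `detect_encoding_sketch` is hypothetical; the real function may use a detection library and HTML-specific hints (the `is_html` flag is omitted here). It shows why a codec such as `latin1`, which can decode any byte sequence, prevents the `ValueError`.

```python
def detect_encoding_sketch(data, encoding=None, fallback='latin1'):
    # Hypothetical stand-in: try the suggested encoding, then UTF-8,
    # then the fallback. The candidate order is an assumption.
    for name in (encoding, 'utf-8', fallback):
        if not name:
            continue
        try:
            data.decode(name)
        except (UnicodeError, LookupError):
            continue
        return name
    raise ValueError('Unable to detect the encoding.')


print(detect_encoding_sketch(b'caf\xc3\xa9'))
print(detect_encoding_sketch(b'caf\xe9'))
```

With `fallback='ascii'`, the second call would raise `ValueError` because neither UTF-8 nor ASCII can decode the byte `0xE9`.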
wpull.string.format_size(num, format_str='{num:.1f} {unit}')

Format the file size as human-readable text.

http://stackoverflow.com/a/1094933/1524507
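
A minimal sketch of the technique from the linked answer: divide by 1024 until the value fits the current unit. The helper name `format_size_sketch` and the unit labels (`B`, `KiB`, ...) are assumptions; only the `format_str` default is taken from the signature above.

```python
def format_size_sketch(num, format_str='{num:.1f} {unit}'):
    # Hypothetical stand-in; the unit names are an assumption.
    for unit in ('B', 'KiB', 'MiB', 'GiB'):
        if -1024 < num < 1024:
            return format_str.format(num=num, unit=unit)
        num /= 1024.0
    # Anything larger is reported in the biggest unit.
    return format_str.format(num=num, unit='TiB')


print(format_size_sketch(1536))
print(format_size_sketch(1048576, '{num:.0f}{unit}'))
```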

wpull.string.normalize_codec_name(name)

Return the Python name of the encoder/decoder.

Returns:

str, None

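
One way to obtain a canonical codec name is the standard library's `codecs.lookup()`. The sketch below assumes that an unknown name maps to `None`, which matches the documented return type but is not confirmed as wpull's exact logic; `normalize_codec_name_sketch` is a hypothetical name.

```python
import codecs


def normalize_codec_name_sketch(name):
    # Hypothetical stand-in: ask Python's codec registry for the
    # canonical name, and return None if the codec is unknown.
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


print(normalize_codec_name_sketch('UTF8'))
print(normalize_codec_name_sketch('no-such-codec'))
```
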
wpull.string.printable_bytes(data)

Remove any bytes that are not printable ASCII.

This function is intended for sniffing content types such as UTF-16 encoded text.
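
To see why this helps with sniffing, consider UTF-16 text: ASCII characters are interleaved with null bytes, and stripping the non-printable bytes leaves readable ASCII. The sketch below is a hypothetical stand-in that assumes "printable" means the characters in Python's `string.printable`.

```python
import string

# Assumption: the printable set is the ASCII characters in string.printable.
PRINTABLE = frozenset(string.printable.encode('ascii'))


def printable_bytes_sketch(data):
    # Keep only the bytes that belong to the printable ASCII set.
    return bytes(byte for byte in data if byte in PRINTABLE)


utf16_data = '<html>'.encode('utf-16')  # BOM plus null-padded characters
print(printable_bytes_sketch(utf16_data))
```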

wpull.string.printable_str(text, keep_newlines=False)

Escape any control or non-ASCII characters in the string.

This function is intended for strings from an untrusted source that will be written to a console or to logs. It is designed to prevent things like ANSI escape sequences from being interpreted by the terminal.

Use repr() or ascii() instead for things such as Exception messages.
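
The effect on an ANSI escape sequence can be illustrated with a small sketch. The helper `printable_str_sketch` is hypothetical, and the exact escape format (Python-style `\x1b` via the `unicode_escape` codec) is an assumption; wpull's actual output format may differ.

```python
def printable_str_sketch(text, keep_newlines=False):
    # Hypothetical stand-in, not wpull's implementation.
    parts = []
    for char in text:
        if keep_newlines and char in '\r\n':
            parts.append(char)
        elif ' ' <= char <= '~':
            # Printable ASCII passes through unchanged.
            parts.append(char)
        else:
            # Control and non-ASCII characters become visible escapes.
            parts.append(char.encode('unicode_escape').decode('ascii'))
    return ''.join(parts)


print(printable_str_sketch('\x1b[31mred\n'))
```

The escape character is rendered as literal text, so a terminal would not treat the sequence as a color command.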

wpull.string.to_bytes(instance, encoding='utf-8', error='strict')

Convert an instance recursively to bytes.

wpull.string.to_str(instance, encoding='utf-8')

Convert an instance recursively to string.
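
"Recursively" here presumably means that containers are walked and each element is converted. The sketch below shows one way this could work for both directions; the helper names and the set of supported containers (list, tuple, dict) are assumptions, and other objects are returned unchanged.

```python
def to_bytes_sketch(instance, encoding='utf-8', error='strict'):
    # Hypothetical stand-in: encode str, recurse into containers.
    if isinstance(instance, str):
        return instance.encode(encoding, error)
    if isinstance(instance, list):
        return [to_bytes_sketch(i, encoding, error) for i in instance]
    if isinstance(instance, tuple):
        return tuple(to_bytes_sketch(i, encoding, error) for i in instance)
    if isinstance(instance, dict):
        return {to_bytes_sketch(k, encoding, error):
                to_bytes_sketch(v, encoding, error)
                for k, v in instance.items()}
    return instance  # bytes and other objects pass through


def to_str_sketch(instance, encoding='utf-8'):
    # Mirror image: decode bytes, recurse into containers.
    if isinstance(instance, bytes):
        return instance.decode(encoding)
    if isinstance(instance, list):
        return [to_str_sketch(i, encoding) for i in instance]
    if isinstance(instance, tuple):
        return tuple(to_str_sketch(i, encoding) for i in instance)
    if isinstance(instance, dict):
        return {to_str_sketch(k, encoding): to_str_sketch(v, encoding)
                for k, v in instance.items()}
    return instance


nested = ['a', ('b', {'c': 'd'})]
print(to_bytes_sketch(nested))
```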

wpull.string.try_decoding(data, encoding)

Return whether the Python codec could decode the data.
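
A check like this is typically a strict decode wrapped in a `try`/`except`. The sketch below is a hypothetical stand-in that assumes only decoding errors are caught; in this version an unknown codec name would still raise `LookupError`.

```python
def try_decoding_sketch(data, encoding):
    # Hypothetical stand-in: attempt a strict decode and report success.
    try:
        data.decode(encoding, 'strict')
    except UnicodeError:
        return False
    return True


print(try_decoding_sketch(b'abc', 'ascii'))
print(try_decoding_sketch(b'\xff', 'utf-8'))
```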