Or what to do if your Python programs complain they can’t output a string because of encoding problems.

Like this:

>>> print u'\N{left-pointing double angle quotation mark}'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xab' in position 0: ordinal not in range(128)

It can’t perform the output because sys.stdout.encoding is ‘ascii’, and the ASCII encoding can’t encode that weird Unicode character. I think I approve of this strict approach. It encourages explicitness and The Right Thing.
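To make the failure concrete, here is a minimal sketch of roughly what print is doing on my behalf: encoding the unicode string with whatever codec stdout claims to use. The variable names are mine and purely illustrative; the point is only that the same character is refused by ascii and accepted by UTF-8.

```python
# -*- coding: utf-8 -*-
import unicodedata

# Look the character up by name, much as \N{...} does inside a string literal.
# This is U+00AB, the left-pointing double angle quotation mark.
ch = unicodedata.lookup('LEFT-POINTING DOUBLE ANGLE QUOTATION MARK')

# The ascii codec only covers code points 0..127, so this raises.
try:
    ch.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can encode any code point; U+00AB becomes the two bytes 0xC2 0xAB.
utf8_bytes = ch.encode('utf-8')

print(ascii_ok)
print(repr(utf8_bytes))
```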

Okay. But in this instance I know what I’m doing and I want stdout to be treated as if it uses the UTF-8 encoding.

It turns out that Python picks the value for sys.stdout.encoding based on the value of the environment variable LC_CTYPE, so the following works:

$ LC_CTYPE=en_GB.utf-8 python
>>> print u'\N{left-pointing double angle quotation mark}'
«

It even prints out the right character.

BUT WHERE THE HELL IS THIS DOCUMENTED?
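For the times when I can’t control the environment the program runs in, another way to say “treat this output as UTF-8” is to wrap the byte stream in an encoding writer. What follows is only a sketch under assumptions: utf8_writer is a name I made up, and I use an in-memory io.BytesIO as a stand-in for stdout so the result can be inspected. In a real program you would pass the actual byte stream (sys.stdout in Python 2, sys.stdout.buffer in Python 3).

```python
# -*- coding: utf-8 -*-
import codecs
import io


def utf8_writer(byte_stream):
    """Wrap a byte stream so unicode text written to it is encoded as UTF-8.

    Hypothetical helper for illustration; codecs.getwriter is the standard
    library call doing the real work.
    """
    return codecs.getwriter('utf-8')(byte_stream)


# Stand-in for stdout, so the encoded bytes can be examined afterwards.
buf = io.BytesIO()
out = utf8_writer(buf)

# Writing unicode text now succeeds whatever the locale says.
out.write(u'\N{LEFT-POINTING DOUBLE ANGLE QUOTATION MARK}\n')

encoded = buf.getvalue()
print(repr(encoded))
```

This keeps the explicitness I said I approved of: the encoding decision lives in the program rather than in whatever locale happens to be set.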
