Python locale strange error. What's going on here exactly?

Question:

Today I upgraded to Bazaar 2.0.2 and started receiving this message (I’m on Snow Leopard, by the way):

bzr: warning: unknown locale: UTF-8
  Could not determine what text encoding to use.
  This error usually means your Python interpreter
  doesn't support the locale set by $LANG (en_US.UTF-8)
  Continuing with ascii encoding.

This is very strange, since my LANG is actually empty. A similar thing happens when I try to tinker with the locale module:

Python 2.5.4 (r254:67916, Nov 30 2009, 14:09:22) 
[GCC 4.3.4] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getdefaultlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/sbo/runtimes/lib/python2.5/locale.py", line 443, in getdefaultlocale
    return _parse_localename(localename)
  File "/Users/sbo/runtimes/lib/python2.5/locale.py", line 375, in _parse_localename
    raise ValueError, 'unknown locale: %s' % localename
ValueError: unknown locale: UTF-8

Exporting LANG does not help:

sbo@dhcp-045:~ $ export LANG=en_US.UTF-8
sbo@dhcp-045:~ $ bzr
bzr: warning: unknown locale: UTF-8
  Could not determine what text encoding to use.
  This error usually means your Python interpreter
  doesn't support the locale set by $LANG (en_US.UTF-8)
  Continuing with ascii encoding.

However, this solved the problem:

sbo@dhcp-045:~ $ export LANG=en_US.UTF-8
sbo@dhcp-045:~ $ export LC_ALL=en_US.UTF-8

Python 2.5.4 (r254:67916, Nov 30 2009, 14:09:22) 
[GCC 4.3.4] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getdefaultlocale()
('en_US', 'UTF8')

Could you please explain what’s going on here, for better googlability?

Asked By: Stefano Borini

||

Answers:

It’s a Mac OS X problem. To see your locale settings, run locale in a terminal. locale -a lists all the locales available on your system (any of which you may use as the value of LC_ALL).

Notice that LC_ALL and other LC_* variables take precedence over LANG when defined.
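The precedence rule can be sketched in a few lines of Python. This is only an illustration of the POSIX lookup order (LC_ALL, then the category variable, then LANG); the function name effective_locale is made up for this example and is not part of the locale module.

```python
def effective_locale(category, env):
    """Return (variable, value) that decides `category`, or (None, "C")."""
    # POSIX lookup order: LC_ALL overrides everything, then the
    # category-specific variable (e.g. LC_CTYPE), then LANG as the default.
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:  # an empty value is treated the same as an unset one
            return var, value
    return None, "C"


if __name__ == "__main__":
    import os
    # Show which variable currently decides the character encoding.
    print(effective_locale("LC_CTYPE", os.environ))
```

Under this rule, a leftover LC_CTYPE=UTF-8 would win over any value exported in LANG, while exporting LC_ALL would override both.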

Answered By: u0b34a0f6ae

2016 UPDATE: It turns out that this has been a Python bug since at least 2013, and very probably earlier: Python does not react well to non-GNU locales, like those found in Mac OS X and the BSDs. The bug is still open as of September 2016, and affects every Python version.


If there was no LANG environment variable set, chances are you had either an LC_CTYPE (the key variable) or LC_ALL (which overrides if set) environment variable set to UTF-8, which is not a valid OS X locale. It’s easy enough to reproduce with the Apple-supplied /usr/bin/python or with a custom python, as in your case, that was built with the 10.6 SDK (probably also the 10.5 SDK). You won’t be able to reproduce it that way with a python.org python; they are currently built with the 10.4 SDK where the locale APIs behave differently.

$ unset LANG
$ env | grep LC_
$ export LC_CTYPE="UTF-8"
$ /usr/bin/python  # Apple-supplied python
Python 2.6.1 (r261:67515, Jul  7 2009, 23:51:51) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale ; locale.getdefaultlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/locale.py", line 459, in getdefaultlocale
    return _parse_localename(localename)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/locale.py", line 391, in _parse_localename
    raise ValueError, 'unknown locale: %s' % localename
ValueError: unknown locale: UTF-8
^D
$ /usr/local/bin/python2.6   # python.org python
Python 2.6.4 (r264:75821M, Oct 27 2009, 19:48:32) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale ; locale.getdefaultlocale()
(None, 'mac-roman')
>>> 
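To see why a bare UTF-8 is rejected, here is a deliberately simplified sketch of locale-name parsing. It is not the code in locale.py (the real module also consults an alias table and normalizes the codeset, which is why the question shows 'UTF8'); parse_locale_name and its pattern are assumptions made for illustration, written for a current Python.

```python
import re

# language[_TERRITORY], e.g. "en" or "en_US" (simplified on purpose)
_LANGUAGE_RE = re.compile(r"^[a-z]{2,3}(_[A-Z]{2})?$")


def parse_locale_name(name):
    """Split a locale name into (language, codeset), or raise ValueError."""
    if name in ("C", "POSIX"):
        return None, None
    # "en_US.UTF-8" -> language "en_US", codeset "UTF-8"
    language, _, codeset = name.partition(".")
    if _LANGUAGE_RE.match(language):
        return language, codeset or None
    # A bare codeset such as "UTF-8" has no language part, so it ends up here.
    raise ValueError("unknown locale: %s" % name)
```

With this sketch, "en_US.UTF-8" splits cleanly, while "UTF-8" alone names only an encoding and raises the same kind of ValueError as in the traceback.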

EDIT:

There may be another piece to the puzzle. A quick look at the bzr 2.0.1 I have installed indicates that the message you cite should only show up if locale.getpreferredencoding() raises a locale.Error. One way that can happen is if the python _locale.so C extension can’t be loaded and that can happen if there are permission problems on it. For example, MacPorts currently is known to have problems setting permissions if you have a customized umask; I’ve been burned by that issue myself. Check the permissions of _locale.so in the python lib/python2.5/lib-dynload directory and ensure it is 755. The full path for MacPorts should be:

/opt/local/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/
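The permission check can also be done from Python. This is a minimal sketch for a current Python; check_mode is a hypothetical helper, and the path to _locale.so must be supplied by you, since it depends on your installation.

```python
import os
import stat
import sys


def check_mode(path, expected=0o755):
    """Return (ok, actual_mode) for the permission bits of `path`."""
    # S_IMODE strips the file-type bits and keeps only the permission bits.
    actual = stat.S_IMODE(os.stat(path).st_mode)
    return actual == expected, actual


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_mode.py /path/to/lib-dynload/_locale.so
    ok, mode = check_mode(sys.argv[1])
    print("%s: mode %o %s" % (sys.argv[1], mode, "ok" if ok else "(expected 755)"))
```

If the mode turns out to be something else, chmod 755 on the file would be the corresponding fix.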
Answered By: Ned Deily

I faced the same problem. When I ran locale, I noticed that LANG and LC_ALL were unset. I fixed this by adding the following lines to my .bash_profile file:

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8

Then I simply ran:

source ~/.bash_profile 

And this issue was fixed thereafter on my Mac.
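If you would rather not change your shell profile, the same two variables can be set for a single child process from Python. This is a sketch under assumptions: with_locale is a hypothetical helper, en_US.UTF-8 is assumed to be available on your system, and subprocess.run requires Python 3.5 or later.

```python
import os
import subprocess
import sys


def with_locale(env, name="en_US.UTF-8"):
    """Return a copy of `env` with LANG and LC_ALL set to `name`."""
    patched = dict(env)  # copy, so the caller's environment is not modified
    patched["LANG"] = name
    patched["LC_ALL"] = name
    return patched


if __name__ == "__main__":
    # Demo: the child process sees the patched variables; the parent does not.
    child = [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"]
    subprocess.run(child, env=with_locale(os.environ), check=True)
```

Replacing the demo command with the program you actually want to run gives it the same environment that the two export lines would.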

Answered By: Archit Kapoor