I enabled the compatibility check in my Python IDE and now I realize that the inherited Python 2.7 code has a lot of calls to unicode(), which does not exist in Python 3.x.
I looked at the Python 2 docs and found no hint on how to upgrade.
I don’t want to switch to Python3 now, but maybe in the future.
The code contains about 500 calls to unicode().
How to proceed?
The comment of user vaultah to read the pyporting guide has received several upvotes.
My current solution is this (thanks to Peter Brittain):
from builtins import str
I could not find this hint in the pyporting docs.
You can test whether there is such a function as unicode() in the version of Python that you’re running. If not, you can create a unicode() alias for the str() function, which does in Python 3 what unicode() did in Python 2, as all strings are Unicode in Python 3.
# Python 3 compatibility hack
try:
    unicode('')
except NameError:
    unicode = str
Note that a more complete port is probably a better idea; see the porting guide for details.
As has already been pointed out in the comments, there is already advice on porting from 2 to 3.
Having recently had to port some of my own code from 2 to 3 and maintain compatibility for each for now, I wholeheartedly recommend using python-future, which provides a great tool to help update your code (futurize) as well as clear guidance for how to write cross-compatible code.
In your specific case, I would simply convert all calls to unicode to use str and then import str from builtins. Any IDE worth its salt these days will do that global search and replace in one operation.
Of course, that’s the sort of thing futurize should catch too, if you just want to use automatic conversion (and to look for other potential issues in your code).
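As a rough sketch of what the replace-and-import approach might look like after the conversion (format_price is a made-up function for illustration, not something from the original code base):

```python
# On Python 2 this import is provided by the python-future package;
# on Python 3, builtins is part of the standard library.
from builtins import str


def format_price(amount):
    # Before the port this line read: return unicode(amount)
    return str(amount)


price_label = format_price(12.5)
print(price_label)
```

The call sites keep a single spelling, str(), and the import decides which text type that name refers to on each interpreter.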
First, as a strategy, I would take a small part of your program and try to port it. The number of unicode calls you are describing suggests to me that your application cares about string representations more than most, and each use case is often different.
The important consideration is that all strings are unicode in Python 3. If you are using the
str type to store “bytes” (for example, if they are read from a file), then you should be aware that those will not be bytes in Python3 but will be unicode characters to begin with.
Let’s look at a few cases.
First, if you do not have any non-ASCII characters at all and really are not using the Unicode character set, it is easy. Chances are you can simply change the unicode() function to str(). That will ensure that any object passed as an argument is properly converted. However, it is wishful thinking to assume it’s that easy.
Most likely, you’ll need to look at the argument to
unicode() to see what it is, and determine how to treat it.
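To illustrate why the argument matters, here is a small Python 3 sketch. The to_text helper is hypothetical and only shows one possible way to treat bytes differently from other objects:

```python
def to_text(value, encoding='utf-8'):
    """Hypothetical helper: convert a value to text on Python 3."""
    if isinstance(value, bytes):
        # Bytes must be decoded; str(value) would return "b'...'" instead.
        return value.decode(encoding)
    return str(value)


raw = b'caf\xc3\xa9'      # UTF-8 encoded bytes
naive = str(raw)          # the repr of the bytes object, not the text
decoded = to_text(raw)    # the actual four-character text
number = to_text(42)      # non-bytes objects go through str() as usual
```

A blind unicode() to str() replacement is harmless for numbers and other objects, but silently produces the wrong text wherever the argument is bytes.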
For example, if you are reading UTF-8 characters from a file in Python 2 and converting them to Unicode, your code would look like this:
data = open('somefile', 'r').read()
# Name the encoding explicitly; unicode(data) alone assumes ASCII
udata = unicode(data, 'UTF-8')
However, in Python3, read() returns Unicode data to begin with, and the encoding should be specified when opening the file:
udata = open('somefile', 'r', encoding='UTF-8').read()
As you can see, how to transform unicode() when porting may depend heavily on how and why the application is doing Unicode conversions, where the data has come from, and where it is going to.
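If the same code has to run on both versions for a while, one option is io.open, which accepts an encoding argument on Python 2.7 as well as on Python 3. A minimal sketch follows; read_text and the temporary demo file are illustrative assumptions, not part of the original code:

```python
import io
import os
import tempfile


def read_text(path, encoding='utf-8'):
    """Return the file's contents as text (unicode on 2, str on 3)."""
    with io.open(path, 'r', encoding=encoding) as handle:
        return handle.read()


# Demo: write UTF-8 bytes to a temporary file, then read them back as text.
fd, demo_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'wb') as raw:
        raw.write(u'caf\u00e9'.encode('utf-8'))
    udata = read_text(demo_path)
finally:
    os.remove(demo_path)
```

With this pattern the decoding happens at the file boundary, so the separate unicode() call disappears instead of being renamed.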
Python3 brings greater clarity to string representations, which is welcome, but can make porting daunting. For example, Python3 has a proper
bytes type, and you convert byte-data to unicode like this:
udata = bytedata.decode('UTF-8')
or convert Unicode data to byte form using the opposite transform:
bytedata = udata.encode('UTF-8')
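A short round trip shows the difference between the two forms. The sample string is just an illustration:

```python
text = u'caf\u00e9'                 # four Unicode characters
encoded = text.encode('UTF-8')      # five bytes: the last character needs two
decoded = encoded.decode('UTF-8')   # back to the original text

print(len(text), len(encoded), decoded == text)
```

The character count and the byte count differ as soon as non-ASCII characters are involved, which is why the two types should not be mixed.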
I hope this at least helps determine a strategy.
Short answer: Replace all unicode calls with str calls.
Long answer: In Python 3, the separate unicode type was folded into str, because all strings are Unicode. The following solution should work if you are only using Python 3:
unicode = str
# the rest of your code goes here
If you are using it with both Python 2 and Python 3, use this instead:
import sys
if sys.version_info.major == 3:
    unicode = str
# the rest of your code goes here
Another way: run 2to3 on the command line (the -w flag writes the changes back to the files):
$ 2to3 package -w
You can use the six library, which provides text_type (unicode in Python 2, str in Python 3):
from six import text_type
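A possible way to use it is sketched below. The is_text helper is made up for illustration, and the ImportError fallback is only an assumption so the snippet still runs on Python 3 when six is not installed:

```python
try:
    from six import text_type
except ImportError:
    # Assumed fallback for illustration only; valid on Python 3, not Python 2.
    text_type = str


def is_text(value):
    """Hypothetical helper: True for text, False for bytes and other types."""
    return isinstance(value, text_type)


label = text_type(42)   # replaces unicode(42) in the old code
```

Calls such as unicode(x) and checks such as isinstance(x, unicode) can then both be rewritten in terms of text_type.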