How to write unicode strings into a file?

Question:

I am using Python 2.6.5.
I want to write some Japanese characters to a file.
I am getting the error below and I don't know how to change the encoding.

Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
[GCC 4.3.4 20090804 (release) 1] on cygwin
>>> s = u'\u5E73\u621015'
>>> with open("yop", "wb") as f:
...   f.write( s + "\n" );
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: 
  ordinal not in range(128)
>>> type( s )
<type 'unicode'>
Asked By: Frankie Ribery

Answers:

You're going to have to encode the unicode string to bytes before writing it.

s = u'\u5E73\u621015'
with open("yop", "wb") as f:
   f.write(s.encode("UTF-8"))

For a friendly introduction to Unicode in Python, try this: http://farmdev.com/talks/unicode/

Answered By: Mike Ramirez
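
To see what the explicit encode step does, here is a minimal sketch that writes the encoded bytes and reads them back. The scratch directory from tempfile and the names data, raw and round_tripped are only illustrative, not part of the answer above. It should run unchanged on Python 2.6+ and 3.3+.

```python
import os
import tempfile

s = u'\u5E73\u621015'

# Encoding turns the text into bytes, which a file opened in binary mode accepts.
data = s.encode('utf-8')

# Hypothetical scratch location so the example does not touch the current directory.
path = os.path.join(tempfile.mkdtemp(), 'yop')

with open(path, 'wb') as f:
    f.write(data + b'\n')

# Read the raw bytes back and decode them to recover the original text.
with open(path, 'rb') as f:
    raw = f.read()

round_tripped = raw.decode('utf-8')
```

Each of the two Japanese characters takes three bytes in UTF-8, so the file ends up nine bytes long including the newline.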

As an alternative, you can use the codecs module:

import codecs
s = u'\u5E73\u621015'
with codecs.open("yop", "w", encoding="utf-8") as f:
    f.write(s)
Answered By: Philipp

The codecs.open() function in 2.6 is very similar to the built-in open() function in Python 3.x (which makes sense, since Py3k strings are always Unicode). To future-proof your code in case it is ever run under Py3k, you could do the following.

import sys

if sys.version_info[0] < 3:
    import codecs
    _open_func_bak = open # Make a back up, just in case
    open = codecs.open

with open('myfile', 'w', encoding='utf-8') as f:
    f.write(u'\u5E73\u621015')

Now your code should work the same in both 2.x and 3.3+.

Answered By: eestrada
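
A possible variation on the same idea, as a sketch rather than a drop-in replacement: the io module (available since Python 2.6) provides io.open, which accepts an encoding argument on 2.x and is the same function as the built-in open on 3.x, so the name open does not need to be rebound. The scratch path below is made up for the example.

```python
import io
import os
import tempfile

# Hypothetical scratch path so the example does not touch the current directory.
path = os.path.join(tempfile.mkdtemp(), 'myfile')

# io.open in text mode expects unicode text and encodes it on the way out.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'\u5E73\u621015\n')

# Reading with the same encoding gives the original text back.
with io.open(path, 'r', encoding='utf-8') as f:
    text = f.read()
```

Note that on Python 2 a file opened this way only accepts unicode objects in write(), not plain str.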

Inserting this at the beginning of my script tends to solve unicode problems.

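import sys
reload(sys)  # Python 2 only: restores sys.setdefaultencoding, which site.py removes at startup
sys.setdefaultencoding('utf8')  # implicit str/unicode conversions now use UTF-8 instead of ASCII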
Answered By: petra