Error with urlencode in Python

Question:

I have this:

a = {'album': u'Metamorphine', 'group': 'monoku', 'name': u'Son Of Venus (Danny\xb4s Song)', 'artist': u'Leandra', 'checksum': '2836e33d42baf947e8c8adef48921f2f76fcb37eea9c50b0b59d7651', 'track_number': 8, 'year': '2008', 'genre': 'Darkwave', 'path': u'/media/data/musik/Leandra/2008. Metamorphine/08. Son Of Venus (Danny\xb4s Song).mp3', 'user_email': '[email protected]', 'size': 6624104}
data = urllib.urlencode(mp3_data)

And that raises an exception:

Traceback (most recent call last):
  File "playkud.py", line 44, in <module>
    main()
  File "playkud.py", line 29, in main
    craw(args, options.user_email, options.group)
  File "/home/diegueus9/workspace/playku/src/client/playkud/crawler/crawler.py", line 76, in craw
    index(root, file, data, user_email, group)
  File "/home/diegueus9/workspace/playku/src/client/playkud/crawler/crawler.py", line 58, in index
    done = add_song(data[mp3file])
  File "/home/diegueus9/workspace/playku/src/client/playkud/service.py", line 32, in add_song
    return make_request(URL+'add_song/', data)
  File "/home/diegueus9/workspace/playku/src/client/playkud/service.py", line 14, in make_request
    data = urllib.urlencode(dict([k.encode('utf-8'),v] for k,v in mp3_data.items()))
  File "/usr/lib/python2.5/urllib.py", line 1250, in urlencode
    v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb4' in position 19: ordinal not in range(128)

And with IPython (Python 2.5):

In [7]: urllib.urlencode(a)
UnicodeEncodeError                        Traceback (most recent call last)

/home/diegueus9/<ipython console> in <module>()

/usr/lib/python2.5/urllib.pyc in urlencode(query, doseq)
   1248         for k, v in query:
   1249             k = quote_plus(str(k))
-> 1250             v = quote_plus(str(v))
   1251             l.append(k + '=' + v)
   1252     else:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xb4' in position 19: ordinal not in range(128)

How can I fix it?

Asked By: diegueus9


Answers:

The problem is that urlencode calls str() on every value, and in Python 2 calling str() on a unicode string implicitly encodes it as ASCII, which fails as soon as the string contains a non-ASCII character. So I would advise you to find the values that are unicode strings and encode them explicitly first.

For example, where v is a unicode string, change:

quote_plus(str(v))

to

quote_plus(str(v.encode("utf-8")))

That should help.


If you do not have to use Python 2.x, you could switch to Python 3.x, where all strings are Unicode by default. You would have to convert some of your code for it, though (you could automate this partly or fully with 2to3).
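
For illustration, here is a minimal sketch of what that looks like on Python 3, assuming the request is built with urllib.parse.urlencode. The params dict is a shortened, made-up stand-in for the dict in the question, not the asker's real data.

```python
from urllib.parse import urlencode

# Shortened stand-in for the question's dict; '\xb4' is the acute accent
# that triggered the UnicodeEncodeError on Python 2.
params = {'name': 'Son Of Venus (Danny\xb4s Song)', 'track_number': 8}

# On Python 3, urlencode() encodes text as UTF-8 by default, so no manual
# .encode() step is needed; non-string values such as ints are converted
# with str() first.
query = urlencode(params)
print(query)
```

The accent should show up percent-encoded as %C2%B4, which is its two-byte UTF-8 form.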

Answered By: Joschua

The urlencode function expects data in str format and doesn’t deal well with Unicode data, since it doesn’t provide a way to specify an encoding. Try this instead:

mp3_data = {'album': u'Metamorphine',
     'group': 'monoku',
     'name': u'Son Of Venus (Danny\xb4s Song)',
     'artist': u'Leandra',
     'checksum': '2836e33d42baf947e8c8adef48921f2f76fcb37eea9c50b0b59d7651',
     'track_number': 8,
     'year': '2008', 'genre': 'Darkwave',
     'path': u'/media/data/musik/Leandra/2008. Metamorphine/08. Son Of Venus (Danny\xb4s Song).mp3',
     'user_email': '[email protected]',
     'size': 6624104}

str_mp3_data = {}
for k, v in mp3_data.iteritems():
    str_mp3_data[k] = unicode(v).encode('utf-8')
data = urllib.urlencode(str_mp3_data)

What I did was ensure that all data is encoded into str using UTF-8 before passing the dictionary into the urlencode function.
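
The same idea can be wrapped in a small reusable helper. This is only a sketch: the names to_utf8_params and text_type are made up for illustration, and the try/except import is an assumption added so the snippet runs on both Python 2 and Python 3. Unlike the loop above, it leaves values that are already byte strings untouched instead of passing them through unicode() again.

```python
try:
    from urllib import urlencode        # Python 2
    text_type = unicode                 # noqa: F821 (only defined on Python 2)
except ImportError:
    from urllib.parse import urlencode  # Python 3
    text_type = str


def to_utf8_params(params):
    """Return a copy of params with every key and value as UTF-8 bytes."""
    def to_bytes(value):
        if isinstance(value, bytes):
            return value                 # already a byte string, keep as-is
        if not isinstance(value, text_type):
            value = text_type(value)     # e.g. ints such as track_number
        return value.encode('utf-8')

    return dict((to_bytes(k), to_bytes(v)) for k, v in params.items())


# Made-up sample values modeled on the question's dict.
sample = {'name': u'Danny\xb4s Song', 'group': b'monoku', 'track_number': 8}
data = urlencode(to_utf8_params(sample))
print(data)
```

Because every value reaches urlencode as bytes, the str() call inside it no longer has anything to encode implicitly.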

Answered By: Walter Mundt

The problem is that some of the values in your mp3_data dict are unicode strings that can’t be represented in the default encoding used by urlencode() (while others are ASCII and still others are integers). You can fix this by encoding those values before passing them to urlencode(). On line 14 of /home/diegueus9/workspace/playku/src/client/playkud/service.py, in make_request(), try changing this:

data = urllib.urlencode(dict([k.encode('utf-8'),v] for k,v in mp3_data.items()))

to this:

data = urllib.urlencode(dict([k.encode('utf-8'),unicode(v).encode('utf-8')] for k,v in mp3_data.items()))

Answered By: ʇsәɹoɈ

If you are using Django, take a look at Django’s QueryDict class; it has a urlencode() method.

Or, if you only need the helper function itself, you may use Django’s own urlencode (django.utils.http.urlencode). It basically does what is described in the other answers, as a wrapper around the original urllib.urlencode.
