How to get the Unicode character from a code point variable?

Question:

I have a variable which stores the string "u05e2" (The value is constantly changing because I set it within a loop). I want to print the Hebrew letter with that Unicode value. I tried the following but it didn’t work:

>>> a = 'u05e2'
>>> print(u'{}'.format(a))

I got u05e2 instead of ע (in this case).

I also tried to do:

>>> a = 'u05e2'
>>> b = '\\' + a
>>> print(u'{}'.format(b))

Neither one worked. How can I fix this?

Thanks in advance!

Asked By: Omer Shalev

Answers:

This is happening because you have to add the prefix u outside of the string and write the code point as a \u escape inside it.

a = u'\u05e2'
print(a)
ע
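
Note that the \u escape is only interpreted when it appears in a string literal in the source code. If the text u05e2 arrives in a variable at run time, as in the question, one possible workaround is to prepend the backslash yourself and decode the result with the unicode_escape codec. This is only a sketch, and it assumes the variable always holds u followed by four hex digits:

```python
a = 'u05e2'  # assumed run-time value: 'u' plus four hex digits, no backslash

# Build the six-character text \u05e2, then let the unicode_escape codec
# interpret it the same way the compiler interprets a string literal.
escaped = '\\' + a
ch = escaped.encode('ascii').decode('unicode_escape')

print(ch)
```

The encode('ascii') step is there because unicode_escape decodes bytes, not str, on Python 3.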

Hope this helps you.

Answered By: acesaran

All you need is a \ before u05e2. To print a Unicode character, you must provide a Unicode format string.

a = '\u05e2'
print(u'{}'.format(a))

#Output
ע

When you try the other approach and put the \ inside the format string passed to print(), Python treats that \ as a literal backslash character rather than as the start of a \u escape, so you do not get the desired result.

a = 'u05e2'
print(u'\\{}'.format(a))

#Output
\u05e2

A way to verify the validity of a Unicode string is the built-in ord() function. It returns the Unicode code point (an integer) of the character passed to it. The function expects exactly one character, that is, a string of length 1.

a = '\u05e2'
print(ord(a)) #1506, the Unicode code point for the Unicode string stored in a

To print the Unicode character for the above code point (1506), use the c presentation type in the format specification. This is explained in the Python docs.

print('{0:c}'.format(1506))

#Output
ע

If we pass a normal string literal to ord(), we get an error. This is because this string does not represent a Unicode character.

a = 'u05e2'
print(ord(a))

#Error
TypeError: ord() expected a character, but string of length 5 found
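
To tie this back to the variable from the question: a minimal sketch, assuming the value is always the letter u followed by hex digits, is to drop the u, parse the remaining digits as a base-16 integer, and then apply the same c formatting. The variable names here are only illustrative:

```python
a = 'u05e2'  # plain string from the question, no backslash

# Drop the leading 'u' and parse the rest as a hexadecimal number.
code_point = int(a[1:], 16)

# The 'c' presentation type turns the integer into its character.
ch = '{0:c}'.format(code_point)

# ord() goes the other way, so it can be used as a sanity check.
print(code_point, ch, ord(ch) == code_point)
```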

Answered By: amanb

This seems like an X-Y Problem. If you want the Unicode character for a code point, use an integer variable and the function chr (or unichr on Python 2) instead of trying to format an escape code:

>>> for a in range(0x5e0,0x5eb):
...  print(hex(a),chr(a))
...
0x5e0 נ
0x5e1 ס
0x5e2 ע
0x5e3 ף
0x5e4 פ
0x5e5 ץ
0x5e6 צ
0x5e7 ק
0x5e8 ר
0x5e9 ש
0x5ea ת
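
If the loop really does hand you strings such as 'u05e2' rather than integers, the same idea still applies. The helper below is a hypothetical sketch (the name char_from_code and the sample list are made up for illustration); it assumes each string is an optional backslash, an optional u, and then hex digits:

```python
def char_from_code(text):
    """Hypothetical helper: turn 'u05e2' or '\\u05e2' into its character."""
    digits = text.lstrip('\\')          # tolerate a leading backslash
    if digits[:1] in ('u', 'U'):        # tolerate a leading u or U
        digits = digits[1:]
    return chr(int(digits, 16))         # int() raises ValueError on bad input


codes = ['u05e2', 'u05d0', 'u05ea']     # sample values, assumed for the demo
letters = [char_from_code(c) for c in codes]

for code, letter in zip(codes, letters):
    print(code, letter)
```

On Python 2, unichr would take the place of chr.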

Answered By: Mark Tolonen