Read and print unicode literal string from file in Python 3

Question:

If we want to print symbols of alpha and beta in Python then one way is:

print('\u03b1')
print('\u03b2')

Output:

α
β

What I want to do is write the Unicode code points for these symbols to a file, data.txt, then read the file and print the symbols.

data.txt

03b1
03b2

So, I tried

file = open('data.txt')
for word in file:
    greek_word = '\\u' + word
    print(greek_word)

However, I got the output as:

\u03b1

\u03b2

I cannot figure out how to turn \u03b1 into α. I have read through the Unicode documentation and tried several permutations of encoding and decoding with UTF-8, but without success.

Python reports the type of both variables as str.

Asked By: bhavesh


Answers:

Use int(hex_string, 16) to convert the hex representation into the numeric Unicode code point, and use chr() to turn that code point into the corresponding character:

with open('data.txt') as file:
    for word in file:
        # int() ignores the trailing newline and parses the hex digits in base 16
        greek_word = chr(int(word, 16))
        print(greek_word)

Note that this only handles single characters, not words, since you didn’t specify a format in which complete words should be written in data.txt.
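If whole words are needed, one possible extension is sketched below. It assumes a format the question does not specify: each line of data.txt holds the hex code points of one word, separated by whitespace. The helper name decode_line and the sample lines are hypothetical.

```python
def decode_line(line):
    """Turn a line of whitespace-separated hex code points into a string."""
    # split() also discards the trailing newline
    return ''.join(chr(int(code, 16)) for code in line.split())


# Hypothetical file contents, one word per line (assumed format)
lines = ['03b1 03b2\n', '03b3 03b1 03bc 03bc 03b1\n']
for line in lines:
    print(decode_line(line))
```

With a real file, the list could be replaced by the file object itself, since iterating over it yields the same kind of lines.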

Answered By: Hans-Martin Mosner

The conversion from '\u03b1' to 'α' happens when the string literal is evaluated. In your case, '\\u' and '03b1' are evaluated independently and the results are then simply concatenated, so the escape sequence is never interpreted.
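A small sketch of that difference, using illustrative variable names that are not part of the question: an escape written inside a literal becomes one character, while the same text assembled at run time stays as six ordinary characters.

```python
literal = '\u03b1'        # escape is processed when the literal is compiled
joined = '\\u' + '03b1'   # built at run time: backslash, 'u', and four digits

print(len(literal), literal)
print(len(joined), joined)
```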

What you actually want is to evaluate the concatenated result. This can be done using the built-in eval function. This code should work as expected:

file = open('data.txt')
for word in file:
    # strip the trailing newline, which would otherwise break the quoted literal
    greek_word = '\\u' + word.strip()
    print(eval(f"'{greek_word}'"))

Here, the concatenated value, \u03b1, is first quoted, which results in '\u03b1'. This is then passed to eval, which evaluates it to 'α', just as it would for a literal written in source code.
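Because eval executes whatever expression it is given, it may be worth avoiding when the file contents are not fully trusted. A possible alternative, sketched here as an assumption and not as part of the original answer, is to let the standard unicode_escape codec interpret the escape sequence. The function name decode_escape is hypothetical.

```python
def decode_escape(hex_digits):
    """Interpret four hex digits as a \\uXXXX escape without using eval."""
    # Build the six-character text \uXXXX, dropping any trailing newline
    escaped = '\\u' + hex_digits.strip()
    # unicode_escape decodes escape sequences found in a bytes object
    return escaped.encode('ascii').decode('unicode_escape')


for word in ['03b1\n', '03b2\n']:  # stand-in for the lines of data.txt
    print(decode_escape(word))
```

This sketch expects exactly four hex digits per line, as in the data.txt shown in the question.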

Answered By: Sourav Kannantha B