Convert ANSI escape to UTF-8 in Python

Question:

I may be wrong in assessing whether this string is ANSI or something else, but it comes from RTF documents with the header:

{\rtf1\ansi\ansicpg1252

The string of interest from the document is:

ansi_string = r'3 \u176? \u177? 0.2\u176? (2\u952?)'

When I open the document in Word, it shows: 3° ± 0.2° 2θ

Questions are:
1) What are these escape codes?
2) Is it possible to convert this string to UTF-8 using Python's built-in methods?

Asked By: Rahul

||

Answers:

I don't think this is the best answer, but to illustrate what I want, here is working code.

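# clr is provided by pythonnet (or IronPython); this approach needs .NET with Windows Forms.
import clr
clr.AddReference("System")
clr.AddReference("System.Windows.Forms")
import System.Windows.Forms as WinForms

def rtf_to_text(rtf_str):
    # Wrap the fragment in a minimal RTF document so RichTextBox can parse it.
    rtf = r"{\rtf1\ansi\ansicpg1252" + '\n' + rtf_str + '\n' + '}'
    richTextBox = WinForms.RichTextBox()
    richTextBox.Rtf = rtf
    # RichTextBox decodes the escapes; .Text returns the plain Unicode text.
    return richTextBox.Text

print(rtf_to_text(r'3 \u176? \u177? 0.2\u176? (2\u952?)'))
# --> '3 ° ± 0.2° (2θ)'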
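For comparison, here is a rough pure-Python sketch that needs no .NET. It assumes the escapes are RTF Unicode control words of the form \uN followed by a single "?" fallback character, where N is a decimal code point (negative values are treated as signed 16-bit numbers). The function name rtf_unicode_to_text is made up for this example, and the sketch does not handle other RTF control words, hex escapes such as \'b0, or surrogate pairs for characters above U+FFFF.

```python
import re

# Assumed pattern: backslash, "u", a signed decimal number, then an
# optional "?" that acts as the ANSI fallback character.
_RTF_UNICODE = re.compile(r"\\u(-?\d+)\??")

def rtf_unicode_to_text(fragment):
    """Replace \\uN? escapes in an RTF fragment with the matching characters."""
    def replace(match):
        code = int(match.group(1))
        if code < 0:
            # Negative values are signed 16-bit numbers, so shift them
            # back into the 0..65535 range.
            code += 65536
        return chr(code)
    return _RTF_UNICODE.sub(replace, fragment)

text = rtf_unicode_to_text(r'3 \u176? \u177? 0.2\u176? (2\u952?)')
print(text)                       # 3 ° ± 0.2° (2θ)
utf8_bytes = text.encode("utf-8")  # UTF-8 encoded bytes, if bytes are needed
```

The result is an ordinary Python str, which is already Unicode; encoding to UTF-8 is only needed when writing bytes to a file or socket.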
Answered By: Rahul