Reading strings with special characters in Python

Question:

I have a string with special characters, as follows:

req_str = 'N\x08NA\x08AM\x08ME'
## If I print it I correctly get the word "NAME"
print(req_str)

>>> print(req_str)
NAME

Now I want to extract the word NAME from this string.
I tried:

''.join(c for c in 'N\x08NA\x08AM\x08ME' if c.isprintable())
## this produces
'NNAAMME'

I understand this has to do with some special encoding, but I am not very familiar with string encodings. How can I extract the word 'NAME' as a string in this situation?

Asked By: sayan dasgupta

||

Answers:

According to the ASCII table, \x08 is the backspace character. In Python it can also be written as \b:

req_str1 = "N\x08NA\x08AM\x08ME"
req_str2 = "N\bNA\bAM\bME"
print(req_str1)
print(req_str2)
print(req_str1 == req_str2)

output:

NAME
NAME
True

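Basically, the terminal writes an N, applies the backspace (moving the cursor back one position), and then writes another N over the first one. That's why you see only one N in the final output. The same thing happens for A and M; the final E appears only once in the string.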
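To make the hidden characters visible, here is a small sketch that inspects the string with repr() instead of printing it to the terminal. It uses only built-ins; the variable name is simply the one from the question.

```python
req_str = "N\x08NA\x08AM\x08ME"

# repr() shows escape sequences instead of sending them to the terminal.
print(repr(req_str))              # 'N\x08NA\x08AM\x08ME'

# The string really contains 10 characters: 7 letters and 3 backspaces.
print(len(req_str))               # 10

# Backspace has code point 8 and is not considered printable,
# which is why filtering on isprintable() keeps every letter.
print([ord(c) for c in req_str])  # [78, 8, 78, 65, 8, 65, 77, 8, 77, 69]
print("\b".isprintable())         # False
```

This also explains the 'NNAAMME' result in the question: dropping the backspaces keeps both copies of each doubled letter.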

To extract NAME, you can do what the terminal does with it:

(thanks to @DarkKnight)

def extract(s):
    BS = "\x08"  # backspace character, same as "\b"
    r = []
    for c in s:
        if c == BS:
            # A backspace drops the previously kept character, if any.
            r = r[:-1]
        else:
            r.append(c)
    return ''.join(r)

req_str = 'N\x08NA\x08AM\x08ME'

s = extract(req_str)

print(len(req_str))  # 10
print(s)             # NAME
print(len(s))        # 4
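As an alternative to the explicit loop, the same idea could probably be expressed with a regular expression that repeatedly deletes "one character followed by a backspace". This is only a sketch: the name extract_re is made up for illustration, and it assumes (like the loop above) that a backspace deletes the previous character and that a backspace with nothing before it is simply dropped.

```python
import re

# One non-backspace character immediately followed by a backspace.
_PAIR = re.compile(r"[^\x08]\x08")

def extract_re(s: str) -> str:
    """Hypothetical regex-based variant of extract()."""
    prev = None
    # Repeat until nothing changes, so runs of several backspaces
    # (e.g. "ab\b\b") delete several preceding characters.
    while prev != s:
        prev = s
        s = _PAIR.sub("", s)
    # Any backspace left over had no character to delete; drop it.
    return s.replace("\x08", "")

print(extract_re("N\x08NA\x08AM\x08ME"))  # NAME
```

The loop version is likely easier to read and makes a single pass; the regex version is mainly useful if you already process the text with re.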

Additional information: if you wonder where this comes from, in the old days printers and typewriters used this technique of typing a character twice in the same position to make it bold. It's called overstriking or overtyping.

Answered By: S.B