How to unify all the "-" signs?
Question:
I have a simple program that takes data from the user. Here is an abbreviated version of it:
a = "0-1"
b = "0‑1"
print(a in b) # prints False
Problem:
ord("-") for a is 45
ord("‑") for b is 8209
How can I make sure that the "-" sign is always the same and checking a in b returns True?
Answers:
It's not clear whether your example is part of a more general problem, but for the example provided you can handle this with replace:
a = "0-1"
b = "0‑1"
print(a.replace("‑", "-") in b.replace("‑", "-")) # True
I've called replace on both sides because it's not clear which side is your input and which is not. In principle, though, this comes down to "sanitize your input".
If this is more of a general problem, you might want to look at str.translate, which applies a whole mapping of characters in one go.
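As a minimal sketch of the translate approach: the set of dash-like characters below is an assumption chosen for illustration (it is not exhaustive), and the names DASH_LIKE, DASH_TABLE and normalize_dashes are hypothetical helpers, not part of any library.

```python
# Dash-like code points to fold into the ASCII hyphen-minus (U+002D).
# This selection is an assumption; extend it to match your real input.
DASH_LIKE = (
    "\u2010"  # HYPHEN
    "\u2011"  # NON-BREAKING HYPHEN (the character from the question)
    "\u2012"  # FIGURE DASH
    "\u2013"  # EN DASH
    "\u2014"  # EM DASH
    "\u2015"  # HORIZONTAL BAR
    "\u2212"  # MINUS SIGN
)

# Build the translation table once; translate() then applies it in one pass.
DASH_TABLE = str.maketrans({ch: "-" for ch in DASH_LIKE})


def normalize_dashes(text: str) -> str:
    """Return text with every character in DASH_LIKE replaced by '-'."""
    return text.translate(DASH_TABLE)


a = "0-1"
b = "0\u20111"  # contains U+2011, as in the question
print(normalize_dashes(a) in normalize_dashes(b))  # True
```

Normalizing both strings at the point where user input enters the program keeps the rest of the code free of special cases.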
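The most robust way would be to use the third-party unidecode module to convert all non-ASCII characters to their closest ASCII equivalent automatically. Note that this transliterates every non-ASCII character in the string, not just the dashes.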
import unidecode
print(unidecode.unidecode(a) in unidecode.unidecode(b))