What is regex for currency symbol?
Question:
In Java I can use the regex \p{Sc}
to detect currency symbols in text. What is the equivalent in Python?
Answers:
You can use the Unicode category if you use the regex package:
>>> import regex
>>> regex.findall(r'\p{Sc}', '$99.99 / €77') # Python 3.x
['$', '€']
>>> regex.findall(ur'\p{Sc}', u'$99.99 / €77') # Python 2.x (note: unicode literal)
[u'$', u'\u20ac']
>>> print _[1]
€
UPDATE
Alternative way using unicodedata.category:
>>> import unicodedata
>>> [ch for ch in '$99.99 / €77' if unicodedata.category(ch) == 'Sc']
['$', '€']
If you want to stick with re, supply the characters from Sc manually:
u"[$¢£¤¥֏؋৲৳৻૱௹฿៛\u20a0-\u20bd\ua838\ufdfc\ufe69\uff04\uffe0\uffe1\uffe5\uffe6]"
will do.
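Rather than typing the class by hand, it can probably be generated from unicodedata so that it matches whatever Unicode version your interpreter ships with. The following is a minimal sketch, assuming Python 3; the names currency_chars and currency_re are made up for illustration and are not part of any library.

```python
import re
import sys
import unicodedata

# Collect every code point whose general category is Sc (currency symbol).
# The exact set depends on the Unicode version bundled with the interpreter.
currency_chars = [
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == 'Sc'
]

# Escape each character so none of them is misread inside the character class.
currency_re = re.compile(
    '[' + ''.join(re.escape(ch) for ch in currency_chars) + ']'
)

print(currency_re.findall('$99.99 / €77'))  # ['$', '€']
```

Scanning the whole code point range takes a moment, so build the pattern once at import time and reuse it.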