character-properties

Matching only a unicode letter in Python re

Matching only a unicode letter in Python re Question: I have a string from which i want to extract 3 groups: ’19 janvier 2012′ -> ’19’, ‘janvier’, ‘2012’ Month name could contain non ASCII characters, so [A-Za-z] does not work for me: >>> import re >>> re.search(ur'(d{,2}) ([A-Za-z]+) (d{4})’, u’20 janvier 2012′, re.UNICODE).groups() (u’20’, u’janvier’, …

Total answers: 1

matching unicode characters in python regular expressions

matching unicode characters in python regular expressions Question: I have read thru the other questions at Stackoverflow, but still no closer. Sorry, if this is allready answered, but I didn`t get anything proposed there to work. >>> import re >>> m = re.match(r’^/by_tag/(?P<tag>w+)/(?P<filename>(w|[.,!#%{}()@])+)$’, ‘/by_tag/xmas/xmas1.jpg’) >>> print m.groupdict() {‘tag’: ‘xmas’, ‘filename’: ‘xmas1.jpg’} All is well, then …

Total answers: 3

Python regex matching Unicode properties

Python regex matching Unicode properties Question: Perl and some other current regex engines support Unicode properties, such as the category, in a regex. E.g. in Perl you can use p{Ll} to match an arbitrary lower-case letter, or p{Zs} for any space separator. I don’t see support for this in either the 2.x nor 3.x lines …

Total answers: 6

Python and regular expression with Unicode

Python and regular expression with Unicode Question: I need to delete some Unicode symbols from the string ‘بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ’ I know they exist here for sure. I tried: re.sub(‘([u064B-u0652u06D4u0670u0674u06D5-u06ED]+)’, ”, ‘بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ’) but it doesn’t work. String stays the same. What am I doing wrong? Asked By: bsn || Source Answers: …

Total answers: 2

Regex and unicode

Regex and unicode Question: I have a script that parses the filenames of TV episodes (show.name.s01e02.avi for example), grabs the episode name (from the www.thetvdb.com API) and automatically renames them into something nicer (Show Name – [01×02].avi) The script works fine, that is until you try and use it on files that have Unicode show-names …

Total answers: 4