How to get the first word in a string
Question:
The text is:
WYATT - Ranked # 855 with 0.006 %
XAVIER - Ranked # 587 with 0.013 %
YONG - Ranked # 921 with 0.006 %
YOUNG - Ranked # 807 with 0.007 %
I want to get only:
WYATT
XAVIER
YONG
YOUNG
I tried:
(.*)?[ ]
But it gives me:
WYATT - Ranked
Answers:
Use this regex:
^\w+
\w+
matches one or more word characters.
\w
is similar to [a-zA-Z0-9_]
^
matches the start of the string (or the start of each line when the multiline flag is set)
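As a minimal sketch of how this pattern could be applied to the whole text from the question, assuming Python's standard re module: the re.MULTILINE flag is what lets ^ match at the start of every line. The variable names text and first_words are only illustrative.

```python
import re

text = """WYATT - Ranked # 855 with 0.006 %
XAVIER - Ranked # 587 with 0.013 %
YONG - Ranked # 921 with 0.006 %
YOUNG - Ranked # 807 with 0.007 %"""

# re.MULTILINE makes ^ match at the start of each line,
# not only at the start of the whole text.
first_words = re.findall(r"^\w+", text, flags=re.MULTILINE)
print("\n".join(first_words))
```

Without the flag, findall would return only the first word of the first line.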
About Your Regex
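Your regex (.*)?[ ]
puts the ? after the group, which makes the whole group optional; it does not make .* lazy, so the match still runs past the first space. It should be ^(.*?)[ ]
or ^(.*?)(?=[ ])
if you don’t want the space in the match.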
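To make the difference concrete, here is a small sketch comparing the three patterns with Python's re module on one line from the question. The variable names are illustrative, and other regex engines or tools may report the greedy match differently.

```python
import re

line = "WYATT - Ranked # 855 with 0.006 %"

# Greedy: .* grabs as much as possible, then backtracks to the LAST space.
greedy = re.match(r"(.*)?[ ]", line)

# Lazy: .*? grabs as little as possible, so it stops at the FIRST space.
lazy = re.match(r"^(.*?)[ ]", line)

# Lookahead: the space is required but is not part of the match itself.
lookahead = re.match(r"^(.*?)(?=[ ])", line)

print(repr(greedy.group(1)))
print(repr(lazy.group(0)), repr(lazy.group(1)))
print(repr(lookahead.group(0)))
```

With the lazy pattern the full match still includes the trailing space, so the word has to be read from group 1; with the lookahead the full match is already the bare word.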
Regex is unnecessary for this. Just use some_string.split(' ', 1)[0]
or some_string.partition(' ')[0].
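A short sketch of both calls wrapped in helper functions (the function names are hypothetical, chosen only for this example). Both split on a literal space only, so they return the whole string when there is no space and an empty string when the line starts with one.

```python
def first_word_split(some_string):
    # Split at most once on a literal space and keep the left part.
    return some_string.split(' ', 1)[0]


def first_word_partition(some_string):
    # partition returns a (head, separator, tail) tuple; head is the first word.
    return some_string.partition(' ')[0]


print(first_word_split("XAVIER - Ranked # 587 with 0.013 %"))
print(first_word_partition("XAVIER - Ranked # 587 with 0.013 %"))
```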
You don’t need regex to split a string on whitespace:
In [1]: text = '''WYATT - Ranked # 855 with 0.006 %
...: XAVIER - Ranked # 587 with 0.013 %
...: YONG - Ranked # 921 with 0.006 %
...: YOUNG - Ranked # 807 with 0.007 %'''
In [2]: print('\n'.join(line.split()[0] for line in text.split('\n')))
WYATT
XAVIER
YONG
YOUNG
You don’t need a regex:
string[:string.find(' ')]
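One caveat with the find-based slice: str.find returns -1 when there is no space, and slicing with [:-1] silently drops the last character. A minimal sketch of a guarded version follows; first_word_find is a hypothetical helper name, not something from the answer.

```python
def first_word_find(string):
    end = string.find(' ')
    # find returns -1 when no space exists; return the whole string in that case.
    return string if end == -1 else string[:end]


# The unguarded slice loses a character on a one-word string.
unguarded = "YOUNG"[:"YOUNG".find(' ')]

print(first_word_find("YOUNG - Ranked # 807 with 0.007 %"))
print(first_word_find("YOUNG"))
print(unguarded)
```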
You should do something like:
print(line.split()[0])
If you want to feel especially sly, you can write it like this:
(firstWord, rest) = yourLine.split(maxsplit=1)
This is supposed to bring the best of both worlds:
- an optimization from maxsplit, while still splitting on any whitespace
- improved reliability and readability, as argued by the author of the technique.
I kind of fell in love with this solution and its general unpacking capability, so I had to share it.
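The two-name unpacking raises ValueError when the line contains only one word, because split then returns a single-element list. A possible variation, sketched here with starred unpacking (Python 3), tolerates that case; split_first_word is a hypothetical helper name used only for illustration.

```python
def split_first_word(line):
    # maxsplit=1 with no separator splits once on any run of whitespace.
    # The starred target collects zero or one remaining pieces.
    # An empty or whitespace-only line still raises ValueError here.
    first_word, *rest = line.split(maxsplit=1)
    return first_word, (rest[0] if rest else "")


print(split_first_word("YONG - Ranked # 921 with 0.006 %"))
print(split_first_word("YONG"))
```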