How to find a word that starts with a specific character
Question:
I want to pick out the words that start with 's' in a sentence using Python.
Here is my code:
import re
text = "I was searching my source to make a big desk yesterday."
m = re.findall(r'[s]\w+', text)
print(m)
But the result of the code is:
['searching', 'source', 'sk', 'sterday']
How should I write the regular expression? Or is there another method to pick out these words?
Answers:
>>> import re
>>> text = "I was searching my source to make a big desk yesterday."
>>> re.findall(r'\bs\w+', text)
['searching', 'source']
For both lowercase and uppercase s, use: r'\b[sS]\w+'
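As a small sketch of the case-handling point above, the two patterns below should behave the same way. The sample sentence (with "Sam" added) is an assumption made for illustration, and re.IGNORECASE is offered as an alternative to the character class, not as the answerer's method.

```python
import re

# Sample sentence is hypothetical; "Sam" is added to show the uppercase case.
text = "Sam was searching my source to make a big desk yesterday."

# Explicit character class: words beginning with s or S.
with_class = re.findall(r'\b[sS]\w+', text)

# Same idea using the IGNORECASE flag instead of listing both cases.
with_flag = re.findall(r'\bs\w+', text, flags=re.IGNORECASE)

print(with_class)
print(with_flag)
```

The flag version is probably easier to read when the prefix is longer than one letter, since every letter would otherwise need its own character class.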
-
If you want to match a single character, you don't need to put it in a character class, so s is the same as [s].
-
What you want to find is a word boundary. A word boundary \b is an anchor that matches at a change from a non-word character (\W) to a word character (\w), or vice versa.
The solution is:
\bs\w+
This regex matches an s that is not preceded by a word character (this also works at the start of the string) and requires at least one word character after it. \w+ matches all the word characters it can find, so there is no need for a \b at the end.
See it here on Regexr
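To make the effect of the word boundary concrete, here is a minimal sketch that runs the question's sentence through the pattern with and without \b. The variable names are only illustrative.

```python
import re

text = "I was searching my source to make a big desk yesterday."

# Without a boundary, a match can start in the middle of a word,
# which is where 'sk' (desk) and 'sterday' (yesterday) come from.
no_boundary = re.findall(r's\w+', text)

# With \b, a match may only begin where a word begins.
with_boundary = re.findall(r'\bs\w+', text)

print(no_boundary)    # ['searching', 'source', 'sk', 'sterday']
print(with_boundary)  # ['searching', 'source']
```

Note that "was" is never matched by either pattern, because its s is followed by a space and \w+ needs at least one word character.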
I know it is not a regex solution, but you can use startswith
>>> text="I was searching my source to make a big desk yesterday."
>>> [ t for t in text.split() if t.startswith('s') ]
['searching', 'source']
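One thing to keep in mind with the split-based approach is that punctuation stays attached to the words. The sketch below, using a made-up sentence, strips surrounding punctuation first and passes a tuple to startswith so that both cases are accepted; treat it as one possible variation rather than the answerer's code.

```python
import string

# Hypothetical sentence with punctuation attached to some words.
text = "She said: stop, search the source."

# Remove leading and trailing punctuation from each whitespace-separated token.
words = [w.strip(string.punctuation) for w in text.split()]

# str.startswith accepts a tuple of prefixes, so 's' and 'S' are both allowed.
s_words = [w for w in words if w.startswith(('s', 'S'))]

print(s_words)  # ['She', 'said', 'stop', 'search', 'source']
```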
I would like to add one small thing here.
Let's say you have a line in which you want to find words that start with 's':
line = "someone should show something to [email protected]"
If you write the regular expression like this,
swords = re.findall(r"\b[sS]\w+", line)
the output will be
['someone','should','show','something','some']
But if you modify the regular expression to
# use \S instead of \w
swords = re.findall(r"\b[sS]\S+", line)
the output will be
['someone','should','show','something','[email protected]']
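The difference between \w and \S can be sketched with a runnable example. The address sam@example.com and the trailing ", soon." are assumptions made up for this illustration, not the address from the post. The example also shows a side effect worth knowing about: \S+ keeps trailing punctuation.

```python
import re

# Hypothetical line; the e-mail address is a placeholder for illustration.
line = "someone should show something to sam@example.com, soon."

# \w+ stops at the first non-word character, so the address is cut at '@'.
word_chars = re.findall(r"\b[sS]\w+", line)

# \S+ runs until the next whitespace, so the whole address is kept,
# but so are the trailing comma and full stop.
non_space = re.findall(r"\b[sS]\S+", line)

print(word_chars)
print(non_space)
```

If the trailing punctuation is unwanted, stripping it afterwards (for example with str.rstrip) is probably simpler than making the pattern more complicated.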
I tried this sample of code and I think it does exactly what you want:
import re
text = "I was searching my source to make a big desk yesterday."
m = re.findall(r'\b[s]\w+', text)
print(m)
Lambda style:
text = 'I was searching my source to make a big desk yesterday.'
list(filter(lambda word: word[0]=='s', text.split()))
Output:
['searching', 'source']
How can I apply a regex to something like DE92600501010004508900, when the string contains multiple words that start with DE followed by numbers?
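A possible approach, following the same word-boundary idea as the answers above, is to anchor on DE and require digits after it. This is only a sketch: the sample string and the second code (all zeros) are invented for illustration, and the pattern assumes that the wanted tokens consist of DE followed only by digits.

```python
import re

# Hypothetical input; the all-zero code is a made-up placeholder.
text = "Codes DE92600501010004508900 and DE00000000000000000000 are listed; DEMO is not."

# \bDE  : the token must start with DE at a word boundary
# \d+   : followed by one or more digits
# \b    : and end at a word boundary, so 'DEMO' or 'DE12AB' are rejected
de_codes = re.findall(r'\bDE\d+\b', text)

print(de_codes)  # ['DE92600501010004508900', 'DE00000000000000000000']
```

If the codes have a fixed length, replacing \d+ with a counted form such as \d{20} would make the match stricter.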