Python Split String at First Non-Alpha Character
Question:
Say I have strings such as 'ABC)D.' or 'AB:CD/'. How can I split them at the first non-alphabetic character to end up with ['ABC', 'D.'] and ['AB', 'CD/']? Is there a way to do this without regex?
Answers:
One option would be to find the location of the first non-alphabetic character:
def split_at_non_alpha(s):
    try:
        split_at = next(i for i, x in enumerate(s) if not x.isalpha())
        return s[:split_at], s[split_at+1:]
    except StopIteration:  # if not found
        return (s,)

print(split_at_non_alpha('ABC)D.'))  # ('ABC', 'D.')
print(split_at_non_alpha('AB:CD/'))  # ('AB', 'CD/')
print(split_at_non_alpha('.ABCD'))   # ('', 'ABCD')
print(split_at_non_alpha('ABCD.'))   # ('ABCD', '')
print(split_at_non_alpha('ABCD'))    # ('ABCD',)
You can use a loop:
a = 'AB$FDWRE'
i = 0
while i < len(a) and a[i].isalpha():
    i += 1
>>> a[:i]
'AB'
>>> a[i:]
'$FDWRE'
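Note that the loop above leaves the separator at the start of the second piece, whereas the question wants it dropped. A minimal sketch of the same loop wrapped in a function; the name split_at_first_non_alpha and the keep_separator flag are hypothetical, introduced only for illustration:

```python
def split_at_first_non_alpha(s, keep_separator=False):
    """Split s at its first non-alphabetic character (illustrative helper)."""
    i = 0
    # Advance while the characters are alphabetic.
    while i < len(s) and s[i].isalpha():
        i += 1
    if i == len(s):
        # No non-alphabetic character found: nothing to split.
        return [s]
    # Skip the separator unless the caller asked to keep it.
    start = i if keep_separator else i + 1
    return [s[:i], s[start:]]

print(split_at_first_non_alpha('ABC)D.'))                         # ['ABC', 'D.']
print(split_at_first_non_alpha('AB$FDWRE', keep_separator=True))  # ['AB', '$FDWRE']
```

Returning a one-element list when nothing is found is one possible convention; returning [s, ''] would also be reasonable, depending on what the caller expects.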
Barmar's suggestion worked best for me. The other answers had nearly the same execution time, but I chose this one for readability.
from itertools import takewhile

s = 'ABC)D.'
# takewhile yields characters until the first one that is not alphabetic
alpha_prefix = ''.join(takewhile(lambda x: x.isalpha(), s))
print(alpha_prefix)  # Outputs 'ABC'
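The takewhile snippet only recovers the leading alphabetic part. One way to get both pieces, sketched here under the assumption that the separator should be dropped as in the question, is to slice the original string just past the prefix. The function name split_with_takewhile is hypothetical:

```python
from itertools import takewhile

def split_with_takewhile(s):
    """Illustrative helper: split s at its first non-alphabetic character."""
    head = ''.join(takewhile(lambda c: c.isalpha(), s))
    if len(head) == len(s):
        # The whole string is alphabetic (or empty): nothing to split.
        return [s]
    # Skip one character, the separator, after the alphabetic prefix.
    return [head, s[len(head) + 1:]]

print(split_with_takewhile('ABC)D.'))  # ['ABC', 'D.']
print(split_with_takewhile('AB:CD/'))  # ['AB', 'CD/']
```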
With a for loop, enumerate, and string indexing:
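def first_non_alpha_splitter(word):
    for index, char in enumerate(word):
        if not char.isalpha():
            # Split here and drop the separator itself.
            return [word[:index], word[index+1:]]
    # No non-alphabetic character (or empty input): return the word unsplit.
    return [word]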
The result
first_non_alpha_splitter('ABC)D.')
# Output: ['ABC', 'D.']
first_non_alpha_splitter('AB:CD/')
# Output: ['AB', 'CD/']