Python Split String at First Non-Alpha Character

Question:

Say I have strings such as 'ABC)D.' or 'AB:CD/'. How can I split them at the first non-alphabetic character to end up with ['ABC', 'D.'] and ['AB', 'CD/']? Is there a way to do this without regex?

Asked By: Happy John


Answers:

One option would be to find the location of the first non-alphabetic character:

def split_at_non_alpha(s):
    try:
        split_at = next(i for i, x in enumerate(s) if not x.isalpha())
        return s[:split_at], s[split_at+1:]
    except StopIteration: # if not found
        return (s,)

print(split_at_non_alpha('ABC)D.')) # ('ABC', 'D.')
print(split_at_non_alpha('AB:CD/')) # ('AB', 'CD/')
print(split_at_non_alpha('.ABCD')) # ('', 'ABCD')
print(split_at_non_alpha('ABCD.')) # ('ABCD', '')
print(split_at_non_alpha('ABCD')) # ('ABCD',)
Answered By: j1-lee

You can use a loop to find the index of the first non-alphabetic character:

a = 'AB$FDWRE'
i = 0
while i<len(a) and a[i].isalpha():
    i += 1

>>> a[:i]
'AB'
>>> a[i:]
'$FDWRE'
Answered By: AndrzejO
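
Note that a[i:] in the loop above still starts with the separator character. A minimal sketch of the same loop wrapped in a function that drops the separator, as the question asks, might look like this. The function name split_with_loop and the choice to return a one-element list when no separator is found are assumptions for illustration, not part of the original answer.

```python
def split_with_loop(s):
    """Split s at its first non-alphabetic character, dropping that character."""
    i = 0
    # Advance while we are inside the string and still on a letter
    while i < len(s) and s[i].isalpha():
        i += 1
    if i == len(s):
        # Assumption: no separator found, so return the string unsplit
        return [s]
    return [s[:i], s[i + 1:]]

print(split_with_loop('ABC)D.'))  # ['ABC', 'D.']
print(split_with_loop('AB:CD/'))  # ['AB', 'CD/']
print(split_with_loop('ABCD'))    # ['ABCD']
```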

Barmar's suggestion worked best for me. The other answers had nearly the same execution time, but I chose this one for readability.

from itertools import takewhile

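text = 'ABC)D.'  # avoid naming a variable `str`, which shadows the built-in type
# Collect characters from the start until the first non-alphabetic one
alpha_prefix = ''.join(takewhile(lambda x: x.isalpha(), text))

print(alpha_prefix) # Outputs 'ABC'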
Answered By: Happy John
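
The takewhile snippet above only yields the leading alphabetic part. One possible way to recover the remainder as well is to slice the original string using the length of that prefix. This is a sketch under stated assumptions: the function name split_with_takewhile and the one-element result for strings without a separator are hypothetical choices, not from the original answer.

```python
from itertools import takewhile

def split_with_takewhile(text):
    """Split text at its first non-alphabetic character, dropping that character."""
    # Leading run of alphabetic characters
    head = ''.join(takewhile(str.isalpha, text))
    if len(head) == len(text):
        # Assumption: no separator present, so return the string unsplit
        return [text]
    # Skip the single separator character that follows the prefix
    return [head, text[len(head) + 1:]]

print(split_with_takewhile('ABC)D.'))  # ['ABC', 'D.']
print(split_with_takewhile('AB:CD/'))  # ['AB', 'CD/']
```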

With a for loop, enumerate, and string slicing:

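def first_non_alpha_splitter(word):
    for index, char in enumerate(word):
        if not char.isalpha():
            # Split around the first non-alphabetic character, dropping it
            return [word[:index], word[index+1:]]
    # No non-alphabetic character (or empty string): nothing to split
    return [word]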

The result:

first_non_alpha_splitter('ABC)D.')
# Output: ['ABC', 'D.']

first_non_alpha_splitter('AB:CD/')
# Output: ['AB', 'CD/']
Answered By: Reincoder