Python – How to split a string until an occurrence of an integer?

Question:

In Python, I am trying to split a string at the first occurrence of an integer. The first integer should be included in the result, and everything after it should be dropped.

Example strings that I will have are shown below:

SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---
SOME OTHER STRING (PARANTHESIS AGAIN) 5 --- 3
AND SOME OTHER (AGAIN) 2 1 4

And the outputs that I need for these examples are going to be:

SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2
SOME OTHER STRING (PARANTHESIS AGAIN) 5
AND SOME OTHER (AGAIN) 2

All input strings will follow this format. Any help will be appreciated. Thank you in advance.

I first tried splitting on spaces (" "), which of course did not work. Then I tried splitting on "---", but "---" may not be present in every input, so that failed too.
I also looked at this question: How to split a string into a string and an integer?
However, its answer suggests splitting on spaces, so it didn’t help me.

Asked By: codeine

Answers:

This is an ideal case for a regular expression.

import re

s = "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---"
m = re.search(r".*?[0-9]+", s)
print(m.group(0))

Explanation:

  • .* matches any number of characters
  • ? makes the preceding .* non-greedy (without it, the match would extend to the last integer in the string)
  • [0-9]+ matches one or more digits
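
To see what the ? changes, here is a small comparison of the lazy and greedy versions of the pattern on the first example string. The variable names lazy and greedy are only for illustration.

```python
import re

s = "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---"

# Lazy .*? stops as soon as the first run of digits can match
lazy = re.search(r".*?[0-9]+", s).group(0)

# Greedy .* consumes as much as possible, so the match ends at the last integer
greedy = re.search(r".*[0-9]+", s).group(0)

print(lazy)    # SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2
print(greedy)  # SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3
```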

It can be done without regular expressions too:

result = []
for word in s.split(" "):
    result.append(word)
    if word.isdigit():  # True when every character in the word is a digit
        break
print(" ".join(result))
Answered By: kosciej16

Try the following regular expression:

import re
r = re.compile(r'(\D*\d+).*')
r.match('SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 -').groups()[0]
==> 'SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2'
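
If some inputs might contain no digit at all, match returns None and calling .groups() on it raises an AttributeError. A minimal sketch of a guarded helper follows. The name first_int_prefix is made up for this example, and returning None for such inputs is an assumption, not something the question specifies.

```python
import re

# \D* = any run of non-digit characters, \d+ = the first run of digits
PATTERN = re.compile(r"(\D*\d+)")

def first_int_prefix(text):
    """Return text up to and including its first integer, or None if it has none."""
    m = PATTERN.match(text)
    return m.group(1) if m else None

examples = [
    "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---",
    "SOME OTHER STRING (PARANTHESIS AGAIN) 5 --- 3",
    "AND SOME OTHER (AGAIN) 2 1 4",
]

for line in examples:
    print(first_int_prefix(line))
```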
Answered By: Alberto Garcia

Solution without re:

lst = [
    "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---",
    "SOME OTHER STRING (PARANTHESIS AGAIN) 5 --- 3",
    "AND SOME OTHER (AGAIN) 2 1 4",
]

for item in lst:
    idx = next(idx for idx, ch in enumerate(item) if ch.isdigit())
    print(item[: idx + 1])

Prints:

SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2
SOME OTHER STRING (PARANTHESIS AGAIN) 5
AND SOME OTHER (AGAIN) 2
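
Note that the loop above stops at the first digit character, so a multi-digit integer such as 12 would be cut to 1. The examples in the question only show single digits, but if longer integers can occur, one possible variation keeps advancing while digits continue. The function name cut_at_first_number is hypothetical, and returning the input unchanged when no digit is found is an assumption.

```python
def cut_at_first_number(item):
    # Index of the first digit character, or -1 if the string has none
    start = next((i for i, ch in enumerate(item) if ch.isdigit()), -1)
    if start == -1:
        return item  # assumption: leave strings without digits unchanged

    # Extend past every consecutive digit so the whole integer is kept
    end = start
    while end < len(item) and item[end].isdigit():
        end += 1
    return item[:end]

print(cut_at_first_number("AND SOME OTHER (AGAIN) 12 1 4"))  # AND SOME OTHER (AGAIN) 12
```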
Answered By: Andrej Kesely