Python: Extract integers from string based on position of specific letters
Question:
I'm new to Python.
Scenario
- A string is a combination of letters and digits.
- There is always an integer following a letter, but the number of digits in that integer varies.
- For letter "A", there is a pattern to its position. For example, if I replace every letter in the string with a delimiter such as a comma, the integer following "A" always lands at an index that is a multiple of 3. I have figured out a way to extract the integers that follow "A".
- However, for letter "B", there is no pattern to where in the string it might appear.
My Goal
- Extract all integers following the letters "A" and "B", as two separate groups.
- Sum each group after extraction.
Trial and Error
I handled the "A" pattern with the modulo operator (%):
s = 'A4B2d2A14d3B11A20B10d10'
s_replace = s.replace('A', ',').replace('B', ',').replace('d', ',')[1:]
test_n = [int(n) for n in s_replace.split(',') if n.isdigit()]
test_Asum = sum(test_n[i] for i in range(len(test_n)) if i % 3 == 0)
print(test_Asum)
# 38
Please help with the "B" issue. I can find the indices of "B" in the string, but how can I extract however many digits follow each of those indices?
test_B_index = [Bindx for Bindx, char in enumerate(s) if char == "B"]
print(test_B_index)
# [2, 11, 17]
Answers:
My favorite way to extract things is to use regex.
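Here is an example for your case. The pattern B([0-9]+) matches a "B" followed by one or more digits, and the parentheses make re.findall return only the digits.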
I hope it helps.
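import re
s = 'A4B2d2A14d3B11A20B10d10'
# Capture the run of digits that immediately follows each "B"
m = re.findall(r'B([0-9]+)', s)
print(m)
# ['2', '11', '10']
# Convert the captured strings to integers and add them up
print(sum(int(n) for n in m))
# 23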
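The same idea can be stretched to cover "A" and "B" in one pass, so the modulo trick is not needed either. This is only a sketch, assuming every integer is preceded by exactly one ASCII letter as in the sample string; the names pairs and totals are made up for illustration.

```python
import re
from collections import defaultdict

s = 'A4B2d2A14d3B11A20B10d10'

# Two capture groups make findall return (letter, digits) tuples,
# e.g. ('A', '4'), ('B', '2'), ('d', '2'), ...
pairs = re.findall(r'([A-Za-z])([0-9]+)', s)

# Accumulate one running sum per letter.
totals = defaultdict(int)
for letter, digits in pairs:
    totals[letter] += int(digits)

print(totals['A'])  # 38
print(totals['B'])  # 23
```

Because the sums are keyed by letter, it no longer matters where in the string a given letter appears.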