Unable to split data

Question:

I have data like this:

data = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000"""

Now I want to sum the numbers in each group, where groups are separated by an empty line, and keep track of the largest group sum in max_sum. The first group gives 1000 + 2000 + 3000 = 6000, which is compared with the initial max_sum (e.g. 0). The next group is just 4000, so max_sum stays at max(6000, 4000) = 6000, and so on. The running sum needs to reset whenever I encounter an empty line.

Below is my code:

max_num = 0
sum = 0
for line in data:
    # print(line)
    sum = sum + int(line)
    if line in ['\n', '\r\n']:
        sum = 0
    max_num = max(max_num, sum)

This gives an error:

sum = sum + int(line)
ValueError: invalid literal for int() with base 10: '\n'
Asked By: RushHour

Answers:

You are trying to cast empty lines to int:

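max_num = 0
total = 0  # avoid shadowing the built-in sum()
# Split the string into lines; iterating over data directly yields single characters
for line in data.splitlines():
    if line.strip():
        total = total + int(line)
    else:
        # Blank line: the current group is finished, so reset the running total
        total = 0
    max_num = max(max_num, total)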
Answered By: rtoth

Some lines consist only of '\n', and you are trying to convert those to int.
Move the blank-line test above the int conversion, and continue without calling int() if the line is '\n' or '\r\n'.
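
A minimal sketch of that reordering, assuming data is the multi-line string from the question and is split into lines first (the names total and max_total are illustrative, not from the question):

```python
data = "1000\n2000\n3000\n\n4000\n\n5000\n6000\n\n7000\n8000\n9000\n\n10000"

max_total = 0
total = 0
for line in data.split("\n"):
    # Test for a blank line before converting, so int() never sees one.
    if line.strip() == "":
        total = 0
        continue
    total += int(line)
    max_total = max(max_total, total)

print(max_total)  # 24000
```

Using strip() for the test also covers a line that holds only '\r', which is what a '\r\n' blank line leaves behind after splitting on '\n'.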

Answered By: Loïc Robert

Here’s a quick one-liner:

data = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000"""

max(
    sum(
        int(i) for i in l.split('\n')
    ) for l in data.split('\n\n')
)

which gives 24000

First it splits on '\n\n' (the blank lines) to get the groups, then splits each group on '\n'. It sums the elements of each group and then picks the largest sum.
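
The two splitting steps can be made visible by storing the intermediate results. This is only an illustrative sketch; the names groups and group_sums are not part of the answer above, and it assumes '\n' line endings with no trailing newline (a trailing newline would leave an empty string that int() rejects).

```python
data = "1000\n2000\n3000\n\n4000\n\n5000\n6000\n\n7000\n8000\n9000\n\n10000"

# Step 1: one string per block, split at the blank lines
groups = data.split("\n\n")

# Step 2: split each block into lines, convert to int, and sum
group_sums = [sum(int(i) for i in g.split("\n")) for g in groups]

print(group_sums)       # [6000, 4000, 11000, 24000, 10000]
print(max(group_sums))  # 24000
```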

Answered By: alex

Don’t use built-in names like sum as variable names. Here you need to split the data on '\n' to get a list of lines, then loop over it and remove surrounding whitespace with strip(). If the line contains digits, it is added to the running sum; otherwise the running sum is reset to 0.

max_num = 0
sum_val = 0


for line in data.split("\n"):
    line = line.strip()
    sum_val = int(line) + sum_val if line and line.isdigit() else 0
    max_num = max(max_num, sum_val)
print(max_num)
Answered By: Usman Arshad

You can try:

data = """1000
    2000
    3000
    
    4000
    
    5000
    6000
    
    7000
    8000
    9000
    
    10000
    """

data = data.splitlines()

max_sum = 0
group = []

for single_data in data:
    single_data = single_data.replace(" ","")
    if single_data == "":
        if max_sum < sum(group):
            max_sum = sum(group)
        group = []
    else:
        group.append(int(single_data))

print(max_sum)

Output:

24000
Answered By: Harsha Biyani

Note that int() tolerates leading and trailing whitespace, e.g. int('\n99\n') returns 99 without error. However, a string consisting entirely of whitespace raises ValueError. That’s what is happening here: you’re trying to parse a string that contains only a newline character.
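
A small check of both behaviours (the variable names value and blank_ok are just for illustration):

```python
value = int("\n99\n")   # surrounding whitespace is ignored
print(value)            # 99

try:
    int("\n")           # nothing but whitespace
    blank_ok = True
except ValueError:
    blank_ok = False
print(blank_ok)         # False
```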

You can take advantage of ValueError for these data as follows:

data = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000"""

current_sum = 0
max_sum = float('-inf')

for t in data.splitlines():
    try:
        x = int(t)
        current_sum += x
    except ValueError:
        max_sum = max(max_sum, current_sum)
        current_sum = 0

print(f'Max sum = {max(max_sum, current_sum)}')

Output:

Max sum = 24000
Answered By: Cobra