How to iterate over a list and modify it without raising an error?

Question:

I have two variable-length lists extracted from an Excel file. One holds the wagon numbers and the other the wagon weights, something like this:

wagon_list = [1234567, 2345678, 3456789, 4567890]
weight_list = [1.1, 2.2, 3.3, 4.4]

Sometimes wagon_list contains a duplicate number. In that case I need to sum the weights of the duplicates and remove the duplicate entry from both lists:

wagon_list = [1234567, 2345678, 2345678, 4567890]
weight_list = [1.1, 2.2, 3.3, 4.4]

should become:

wagon_list = [1234567, 2345678, 4567890]
weight_list = [1.1, 5.5, 4.4]

My first option was to pop items and sum them while iterating with a for loop. It didn't work because (after some research) you can't safely change a list you're iterating over.
So I moved to the second option, using an auxiliary list. It doesn't work when it hits the last index. Even after some tweaking of my code, I can't find a solution.

I can also see it would have further problems if the last three elements had to be summed.

counter_3 = 0

for i in wagon_list:

    if i == wagon_list[-1]: #last entry, simply appends to the new list. This comes first because the next option returns error if running the last entry as i
        new_wagon_list.append(wagon_list[counter_3])
        new_weight_list.append(weight_list[counter_3])
        counter_3 +=2

    elif i != wagon_list[(counter_3 + 1)]: #if they are different, appends.
        new_wagon_list.append(wagon_list[counter_3])
        new_weight_list.append(weight_list[counter_3])
        counter_3 += 1

    elif i == wagon_list[(counter_3 + 1)]: #if equal to next item, appends the wagon and sums the weights
        new_wagon_list.append(wagon_list[counter_3])
        new_weight_list.append(weight_list[counter_3] + weight_list[counter_3 + 1])

This should return:

wagon_list = [1234567, 2345678, 4567890]
weight_list = [1.1, 5.5, 4.4]

But it returns:

wagon_list = [1234567, 2345678, 3456789, 3456789, 3456789]
weight_list = [1.1, 2.2, 7.7, 7.7, 3.3]
Asked By: PStavis


Answers:

Here is a simple way, using defaultdict (hence the result is correct even if wagon_list is unordered). You could also use groupby but then you have to sort both lists so that duplicate wagons are consecutive.

This solution requires a single pass through the lists, and doesn’t change the order of the lists. It just removes duplicate wagons and adds their weight.

from collections import defaultdict

def group_weights(wagon_list, weight_list):
    ww = defaultdict(float)
    for wagon, weight in zip(wagon_list, weight_list):
        ww[wagon] += weight

    return list(ww), list(ww.values())

Example

# set up MRE

wagon_list = [1234567, 2345678, 2345678, 4567890]
weight_list = [1.1, 2.2, 3.3, 4.4]

new_wagon_list, new_weight_list = group_weights(wagon_list, weight_list)

>>> new_wagon_list
[1234567, 2345678, 4567890]

>>> new_weight_list
[1.1, 5.5, 4.4]
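
As a small illustration of the remark above that groupby needs sorted input, here is a sketch using hypothetical data in which the duplicate wagon is not adjacent. It only assumes the standard itertools.groupby behaviour of grouping consecutive equal keys:

```python
from itertools import groupby

# Hypothetical data: wagon 2345678 appears twice, but not side by side.
wagons = [2345678, 1234567, 2345678]
weights = [2.0, 1.0, 3.0]

# Without sorting, groupby starts a new group each time the key changes,
# so the two entries for 2345678 are never merged.
unsorted_groups = [
    (wagon, sum(w for _, w in group))
    for wagon, group in groupby(zip(wagons, weights), key=lambda t: t[0])
]
print(unsorted_groups)  # [(2345678, 2.0), (1234567, 1.0), (2345678, 3.0)]

# After sorting, equal wagons are consecutive and are merged,
# at the cost of losing the original order.
sorted_groups = [
    (wagon, sum(w for _, w in group))
    for wagon, group in groupby(sorted(zip(wagons, weights)), key=lambda t: t[0])
]
print(sorted_groups)  # [(1234567, 1.0), (2345678, 5.0)]
```

The dictionary-based function needs no sorting, which is why it can keep the wagons in their original order.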

Addendum

If you’d like to avoid defaultdict altogether, you can also simply do this (same result as above):

ww = {}
for k, v in zip(wagon_list, weight_list):
    ww[k] = ww.get(k, 0) + v
new_wagon_list, new_weight_list = map(list, zip(*ww.items()))

Explanation

A quick review of some of the tools and syntax used above:

  • zip(*iterables) "Make an iterator that aggregates elements from each of the iterables." So e.g.:

    for x, y in zip(wagon_list, weight_list):
        print(f'x={x}, y={y}')
    # prints out
    x=1234567, y=1.1
    x=2345678, y=2.2
    x=2345678, y=3.3
    x=4567890, y=4.4
    
  • dict.get(key[, default]) "Return the value for key if key is in the dictionary, else default." In other words, with ww[k] = ww.get(k, 0) + v, we are saying: add v to ww[k], but if it doesn’t exist yet, then use 0 as a starting point.

  • The last bit (new_wagon_list, new_weight_list = map(list, zip(*ww.items()))) uses the idiom that "zip() in conjunction with the * operator can be used to unzip a list" (or, in this case, an iterator of (key, value) tuples obtained from dict.items()). Without the map(list, ...), we would get tuples in the two variables. I thought you might want to stick with lists, so we apply list() to each tuple before assigning the results to new_wagon_list and new_weight_list.
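
To make the unzip idiom concrete, here is a minimal sketch with a small hypothetical dictionary. It also shows one edge case worth knowing about: with an empty dictionary there is nothing to unpack, so the two-name assignment raises ValueError. The fallback to two empty lists is an assumption about what you would want in that situation.

```python
pairs = {1234567: 1.1, 4567890: 4.4}.items()

# zip(*pairs) regroups the (key, value) pairs into a tuple of keys
# and a tuple of values.
keys, values = zip(*pairs)
print(keys)    # (1234567, 4567890)
print(values)  # (1.1, 4.4)

# map(list, ...) converts each of those tuples into a list.
key_list, value_list = map(list, zip(*pairs))
print(key_list)    # [1234567, 4567890]
print(value_list)  # [1.1, 4.4]

# Edge case: an empty dict yields nothing to unpack into two names.
empty = {}
try:
    a, b = map(list, zip(*empty.items()))
except ValueError:
    # Assumed fallback: treat "no wagons" as two empty lists.
    a, b = [], []
print(a, b)  # [] []
```

The version that returns list(ww), list(ww.values()) does not have this edge case, since it simply returns two empty lists.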

Answered By: Pierre D

Modifying a list that you’re iterating over doesn’t work out well. I’d zip the two lists together and use itertools.groupby:

>>> from itertools import groupby
>>> wagon_list = [1234567, 2345678, 2345678, 4567890]
>>> weight_list = [1.1, 2.2, 3.3, 4.4]
>>> wagon_list, weight_list = map(list, zip(*(
...     (wagon, sum(weight for _, weight in group))
...     for wagon, group in groupby(sorted(
...         zip(wagon_list, weight_list)
...     ), key=lambda t: t[0])
... )))
>>> wagon_list
[1234567, 2345678, 4567890]
>>> weight_list
[1.1, 5.5, 4.4]
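
To illustrate the opening remark about modifying a list during iteration, here is a minimal sketch with hypothetical numbers. Removing an element shifts the later elements left while the loop's internal position still advances, so the element that slides into the freed slot is skipped:

```python
numbers = [1, 2, 2, 3]

# Intent: remove every 2. The loop silently skips the second 2, because
# after the first removal it has moved into the position already visited.
for n in numbers:
    if n == 2:
        numbers.remove(n)
print(numbers)  # [1, 2, 3]

# Building a new list instead leaves the iterated list untouched.
original = [1, 2, 2, 3]
kept = [n for n in original if n != 2]
print(kept)  # [1, 3]
```

No exception is raised here; the result is just quietly wrong, which is why this kind of bug can be hard to spot.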
Answered By: Samwise

Use a dictionary to combine the values:

In [1]: wagon_list = [1234567, 2345678, 2345678, 4567890]
   ...: weight_list = [1.1, 2.2, 3.3, 4.4]

In [2]: together = {}

In [3]: for k, v in zip(wagon_list, weight_list):
   ...:     together[k] = together.setdefault(k, 0) + v
   ...:     

In [4]: together
Out[4]: {1234567: 1.1, 2345678: 5.5, 4567890: 4.4}

In [5]: new_wagon_list = list(together.keys())
   ...: new_wagon_list
Out[5]: [1234567, 2345678, 4567890]

In [6]: new_weight_list = list(together.values())
   ...: new_weight_list
Out[6]: [1.1, 5.5, 4.4]
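
Taking keys() and values() separately relies on dictionaries preserving insertion order, which is guaranteed by the language from Python 3.7 onward. A small sketch with hypothetical wagon numbers, using get instead of setdefault purely for brevity, shows that each wagon keeps the position of its first occurrence and that the two lists stay aligned:

```python
# Hypothetical data: wagon 30 appears first and third.
wagons = [30, 10, 30, 20]
weights = [1.0, 2.0, 3.0, 4.0]

together = {}
for k, v in zip(wagons, weights):
    together[k] = together.get(k, 0) + v

# Keys come back in first-seen order; values follow the same order.
print(list(together.keys()))    # [30, 10, 20]
print(list(together.values()))  # [4.0, 2.0, 4.0]
```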
Answered By: Roland Smith

A version with no fluff, frills, dependencies, or mystery. Either an index for the current wagon is found, which tells us which weight to modify, or no index is found and we append both of the new values.

Your entire problem revolves around "Does this already exist?". For a list, we can answer that question with index. index raises a ValueError if the value is not found, so we wrap the call in try and treat except as an else.

def wagon_filter(wagons:list, weights:list) -> tuple:
    #pre-zip and clear so we can reuse the references
    data   = zip(wagons, weights)
    wagons, weights = [], []
    
    #reassign
    for W, w in data:
        try:      #(W)agon exists? modify its (w)eight index
            i = wagons.index(W)
            weights[i] += w
        except ValueError:   #else append new (W)agon and (w)eight
            wagons.append(W)
            weights.append(w)
     
    return wagons, weights

usage:

#data
wagons  = [1234567, 2345678, 2345678, 4567890]
weights = [1.1, 2.2, 3.3, 4.4]

#print filter results
print(*wagon_filter(wagons, weights), sep='\n')

#[1234567, 2345678, 4567890]
#[1.1, 5.5, 4.4]
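
For reference, a minimal sketch of the list.index behaviour this approach relies on, with hypothetical wagon numbers. The exception raised for a missing value is specifically ValueError, so catching only that type avoids hiding unrelated errors:

```python
wagons = [1234567, 2345678]

# index returns the position of the first matching element.
position = wagons.index(2345678)
print(position)  # 1

# A missing value raises ValueError rather than returning -1 or None.
try:
    wagons.index(9999999)
    error_name = None
except ValueError as exc:
    error_name = type(exc).__name__
print(error_name)  # ValueError
```

Note that index scans the list from the start on every call, so for a large number of wagons the dictionary-based answers are likely to scale better.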
Answered By: OneMadGypsy