Replace None in list with average of last 3 non-None entries when the average is above the same entry in another list

Question:

I have two lists:

dataa = [11, 18, 84, 51, 82, 1, 19, 45, 83, 22]
datab = [1, None, 40, 45, None, None, 23, 24, None, None]

I need to replace each None in datab with the average of the prior 3 non-None entries of datab, but only when that average is greater than the corresponding entry in dataa (see the walk-through example below).
Entries that do not have 3 prior non-None values to average should be left as is.

My first attempt was this:

for i in range(len(dataa)):
        if (datab[i] == None):
                a = (datab[i-3]+datab[i-2]+datab[i-1])/3
                if ((datab[i-3]+datab[i-2]+datab[i-1])/3 > dataa[i]):
                        datab[i] = dataa[i]

It errors when computing the average if any of the prior 3 entries is None. I tried to keep a running total instead, but this fails for some of them:

c = 0;
a = 0;
for i in range(len(dataa)):
        c = c + 1
        if (datab[i] == None):
                if (a > dataa[i]):
                        datab[i] = a
        else:
                if (c > 2):
                        a = (a * 3 + datab[i])/3

This also did not work as expected.

From this sample data, I expected:

  • Entries 1, 2, and 3 have no average yet (fewer than 3 prior non-None values), so leave them as is.
  • Entry 5 is None in datab and 82 in dataa. Since (1+40+45)/3 = 28.66 is not greater than 82, we also leave it as is.
  • Entry 6 is None in datab and 1 in dataa. The average of the 3 prior non-None entries is greater (28.66 > 1), so set it to the 28.66 average.
  • Entry 9 is None, but (28.66+23+24)/3 = 25.22 is not greater than 83, so leave it as is.
  • Entry 10 is None and the average of the 3 prior non-None entries is greater (25.22 > 22), so set it to the 25.22 average.

The correct expected output:

[1, None, 40, 45, None, 28.66, 23, 24, None, 25.22]
Asked By: James Risner


Answers:

Let’s use a collections.deque to keep track of our window of numbers to average, since popping from the left end of a deque is cheaper than popping from the front of a list.

Thanks to @ShadowRanger for pointing out the maxlen feature of deque, which allows us to append an element, and the deque automatically pops the left element if needed.
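
As a quick illustration of the maxlen behaviour described above (a small sketch; the name `window` and the sample values are only for this example):

```python
from collections import deque

# A deque capped at three items; `window` is a name chosen for this illustration.
window = deque(maxlen=3)

for value in [1, 40, 45, 23]:
    # Once the deque is full, appending on the right drops the leftmost (oldest) item.
    window.append(value)

print(list(window))
# [40, 45, 23]
```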

from collections import deque

dataa = [11, 18, 84, 51, 82, 1, 19, 45, 83, 22]
datab = [1, None, 40, 45, None, None, 23, 24, None, None]

result = []

moving_avg = 0
sliding_window = deque(maxlen=3)

# Iterate over the two lists simultaneously
for a, b in zip(dataa, datab):
    # If b already has a value
    # Or the window has fewer than three items
    # Or the average is not greater than the element of dataa
    if b is not None or len(sliding_window) < 3 or moving_avg <= a:
        # Append the element of datab to result
        result.append(b)
    else:
        # Else, append the moving average
        result.append(moving_avg)

    # If the value we just appended to our result is not None
    # Then append it to the sliding window
    if result[-1] is not None:
        sliding_window.append(result[-1])
    
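    # Recalculate the moving average; skip while the window is empty,
    # which would otherwise raise ZeroDivisionError if datab starts with None
    if sliding_window:
        moving_avg = sum(sliding_window) / len(sliding_window)

print(result)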
# [1, None, 40, 45, None, 28.666666666666668, 23, 24, None, 25.222222222222225]

You could save on some computation time by keeping track of the element being popped off the deque and using that to calculate the moving average, but for a deque of size 3 that shouldn’t be such a big deal anyway.
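
A minimal sketch of that running-total idea, for illustration only. The helper name `fill_with_running_average` and its `size` parameter are hypothetical and not part of the original answer. Because the total is updated by adding and subtracting floats, the last digits of the averages may differ slightly from those produced by calling sum() on the window each time.

```python
from collections import deque

def fill_with_running_average(dataa, datab, size=3):
    """Hypothetical helper: fill None entries using a running total of the window."""
    window = deque()  # the last `size` non-None values (including filled-in ones)
    total = 0         # running sum of the values currently in `window`
    result = []

    for a, b in zip(dataa, datab):
        value = b
        # Only fill a None when the window is full and its average exceeds a
        if b is None and len(window) == size:
            avg = total / size
            if avg > a:
                value = avg
        result.append(value)

        if value is not None:
            # Remove the oldest value from the total before adding the new one
            if len(window) == size:
                total -= window.popleft()
            window.append(value)
            total += value

    return result

dataa = [11, 18, 84, 51, 82, 1, 19, 45, 83, 22]
datab = [1, None, 40, 45, None, None, 23, 24, None, None]

result = fill_with_running_average(dataa, datab)
print(result)
```

For a window of three the saving is negligible; the approach only starts to matter for much larger windows.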

Answered By: Pranav Hosangadi