Length of longest subarray of sum less than or equal to k

Question:

In an interview I was asked this question: given some array of positive integers s, find the length of the longest subarray such that the sum of all its values is less than or equal to some positive integer k. Each input will always have at least one solution. The array is not circular.

I began writing a dynamic programming solution that worked by finding the maximum length at increasingly larger values from 0 to k.

Here is my code in python, there is an error inside it that I couldn’t seem to find, my answer is always off by a few digits:

def maxLength(s, k):
    lengths = [0 for x in range(k)]
    for i in range(1,k+1):
        for j in range(len(s)):
            if s[j] <= i and lengths[i - s[j]] + 1 > lengths[i]:
                lengths[i] = lengths[i - s[j]] + 1
        if i + 1 == len(s):
            break
    return lengths[-1]

Input1:s = [1,2,3], k = 4

Output1: 2

Input2: s=[3,1,2,1], k = 4

Output2: 3

Asked By: nonsequiter

||

Answers:

I think this works… (recursively and taking out the contiguous requirement from the question, since that doesn’t seem to match the sample outputs provided in the question) and the OP mentions that the question was:

given some array of positive integers s, find the length of the longest subarray such that the sum of all values is equal to some positive integer k.

def longest_sum(input_list, index, num_used, target_number):
    if target_number == 0:
        return num_used
    if index >= len(input_list):
        return 0

    # Taken
    used_1 = longest_sum(input_list, index + 1, num_used + 1, target_number - input_list[index])
    # Not taken
    used_2 = longest_sum(input_list, index + 1, num_used, target_number)
    return max(used_1, used_2)


if __name__ == "__main__":
    print(longest_sum([2, 1, 8, 3, 4], 0, 0, 6))
    print(longest_sum([1, 2, 3], 0, 0, 4))
    print(longest_sum([3, 1, 2, 1], 0, 0, 4))
    print(longest_sum([1, 2, 7, 8, 11, 12, 14, 15], 0, 0, 10))
    print(longest_sum([1, 2, 3], 0, 0, 999))
    print(longest_sum([1, 1, 1, 1, 1, 1, 4], 0, 0, 6))

Outputs:

3
# BorrajaX's note: 2 + 1 + 3
2
# BorrajaX's note: 3 + 1
3
# BorrajaX's note: 1 + 2 + 1
3
# BorrajaX's note: 1 + 2 + 7
0
# BorrajaX's note: No possible sum
6
# BorrajaX's note: 1 + 1 + 1 + 1 + 1 + 1

EDIT 01:

If you wanted to fetch which is the list that gives you the longest sum, you could always do it like this:

import copy


def longest_sum(input_list, used_list, target_number):
    if target_number == 0:
        return used_list

    if not input_list:
        return []

    # Taken
    used_list_taken = copy.copy(used_list)
    used_list_taken.append(input_list[0])
    used_1 = longest_sum(input_list[1:], used_list_taken, target_number - input_list[0])

    # Not taken
    used_list_not_taken = copy.copy(used_list)
    used_2 = longest_sum(input_list[1:], used_list_not_taken, target_number)
    if len(used_1) > len(used_2):
        return used_1
    else:
        return used_2


if __name__ == "__main__":
    print(longest_sum([2, 1, 8, 3, 4], [], 6))
    print(longest_sum([1, 2, 3], [], 4))
    print(longest_sum([3, 1, 2, 1], [], 4))
    print(longest_sum([1, 2, 7, 8, 11, 12, 14, 15], [], 10))
    print(longest_sum([1, 2, 3], [], 999))
    print(longest_sum([1, 1, 1, 1, 1, 1, 4], [], 6))

You’d see:

[2, 1, 3]
[1, 3]
[1, 2, 1]
[1, 2, 7]
[]
[1, 1, 1, 1, 1, 1]

PS 1: I really don’t know how to do this without the quick backtracking capabilities that recursion provides… Sorry 🙁

PS 2: If this is not what you wanted (I mentioned that I took out the contiguous requirement from the requirements) let me know, and I’ll remove this answer.

Answered By: BorrajaX

Bit shorter and assuming non circular s: Slide windows of decreasing size over s.

def maxlen(s, k):
    for win in range(k, 0, -1):
        for i in range(0, len(s) - win):
            if sum(s[i:i+win]) == k:
                return win
    return None

s = [3,1,2,3]
k = 4
print(maxlen(s, k))
Answered By: hvwaldow

You can do this in linear (O(n)) time:

def max_length(s, k):
    # These two mark the start and end of the subarray that `current` used to be.
    subarray_start = 0
    subarray_end = 0

    subarray_sum = 0
    max_len = -1 # returns -1 if there is no subsequence that adds up to k.
    for i in s:
        subarray_sum += i
        subarray_end += 1
        while subarray_sum > k: # Shrink the array from the left, until the sum is <= k.
            subarray_sum -= s[subarray_start]
            subarray_start += 1

        # After the previous while loop, subarray_sum is guaranteed to be 
        # smaller than or equal to k.
        max_len = max(max_len, subarray_end - subarray_start)

    return max_len

There was some confusion with the original question, where I thought we were looking for a subarray with sum **equal to (but not less than) k*.
My original answer is below. There’s information about the linearity of this solution in there as well, so read on if you’re interested.

Original Answer

Here’s how I would do it:

def max_length(s, k):
    current = []
    max_len = -1 # returns -1 if there is no subsequence that adds up to k.
    for i in s:
        current.append(i)
        while sum(current) > k: # Shrink the array from the left, until the sum is <= k.
           current = current[1:]
        if sum(current) == k:
            max_len = max(max_len, len(current))

    return max_len

This exploits the fact that we’re looking for a contiguous subarray to get a solution with linear (O(n)) time complexity. current is our current attempt at creating a subarray that adds up to k. We loop through s and add every element from s to current. If the total sum of current becomes too large (larger than k), we remove elements from the left of current until the sum is smaller than or equal to k. If at any point, the sum is equal to k, we record the length.


Well… I lied, and Francisco Couzo caught me in the comments. The code above is not really O(n) I’m calling len(current) and sum(current), which take at most n steps, making the algorithm run in quadratic time (O(n^2)). We can fix this by keeping track of the size and sum of current ourselves.

The version below gets us closer to O(n), but I noticed an issue while writing it.

def max_length(s, k):
    current = []
    len_current = 0
    sum_current = 0
    max_len = -1 # returns -1 if there is no subsequence that adds up to k.
    for i in s:
        current.append(i)
        sum_current += i
        len_current += 1
        while sum_current > k: # Shrink the array from the left, until the sum is <= k.
            sum_current -= current[0]
            current = current[1:]
            len_current -= 1
        if sum_current == k:
            max_len = max(max_len, len_current)

    return max_len

This piece of code might look like it’s O(n), and if it were written in Go, it would be. See that current = current[1:]? According to the TimeComplexities article in the Python wiki, taking a slice from a list takes O(n).

I couldn’t find a list operation that removes an element from the start, until I suddenly realized that I didn’t have to. current would always be a contiguous subarray of s, so why not just mark its start and end?

So here’s my final solution:

def max_length(s, k):
    # These two mark the start and end of the subarray that `current` used to be.
    subarray_start = 0
    subarray_end = 0

    subarray_sum = 0
    max_len = -1 # returns -1 if there is no subsequence that adds up to k.
    for i in s:
        subarray_sum += i
        subarray_end += 1
        while subarray_sum > k: # Shrink the array from the left, until the sum is <= k.
            subarray_sum -= s[subarray_start]
            subarray_start += 1
        if subarray_sum == k:
            max_len = max(max_len, subarray_end - subarray_start)

    return max_len

If you consider the array to be circular, which the first example case in the question seems to indicate, you can go through the array twice:

def max_length(s, k):
    s = s + s
    # These two mark the start and end of the subarray that `current` used to be.
    subarray_start = 0
    subarray_end = 0

    subarray_sum = 0
    max_len = -1 # returns -1 if there is no subsequence that adds up to k.
    for i in s:
        subarray_sum += i
        subarray_end += 1
        while subarray_sum > k: # Shrink the array from the left, until the sum is <= k.
            subarray_sum -= s[subarray_start]
            subarray_start += 1
        if subarray_sum == k:
            max_len = max(max_len, subarray_end - subarray_start)

    return max_len

There are probably checks you can make to break out of that second pass sooner, based on the values you’ve encountered in the first pass.

Answered By: bigblind

Original Answer

Initially the question was to find the length of the longest subarray that would sum to k.

You can run through the list indices, take each index as starting point of a window over which you sum. You then run over the indices from your starting index to the end to mark the end of the window. At each step you take the sum, or even better, add it to a sum term. If the sum exceeds the target you break out of the inner loop, moving on.

It will look like this:

def get_longes(a_list, k):
    longest = 0
    length = len(a_list)
    for i in xrange(length):
        s = 0
        for j in xrange(i,length):
            s+=a_list[j]
            if s < k:
                pass
            elif s==k:
                longest = j+1-i
            else:
                break
    return longest

This can be further speed up as you don’t need to reset the window size when you move one step in the outer loop. In fact you just need to keep track of the window size and decrease it by 1 if the outer loop moves on. In this way you can even get rid of the inner loop and write something in O(n):

def get_longest(a_list,k):
    length=len(a_list)
    l_length = 0
    longest = 0
    s = 0
    for i in xrange(length):
        while s<k:  # while the sum is smaller, we increase the window size
            if i+l_length==length: # this could also be handled with a try, except IndexError on the s+=a_list[... line
                return longest
            s+=a_list[i+l_length]
            l_length+=1
        if s == k:  # if the sum is right, keep its length if it is bigger than the last match.
            longest = max(l_length, longest)
        l_length-=1  # keep the window end at the same position (as we will move forward one step)
        s-=a_list[i]  # and subtract the element that will leave the window
    return longest

Answer to Updated Question

The updated question asks for the longest subarray for which the sum is equal or less than k.

For this question, the basic approach is the same and actually the solution becomes even simpler as now we only have two conditions on the sum, that is:

1) the sum is smaller or equal to k.

2) the sum is bigger than k.

The solution looks like:

def get_longest_smaller_or_equal(a_list,k):
    length=len(a_list)
    l_length = 0
    longest = 0
    s = 0
    for i in xrange(length):
        while s<=k:  # while the sum is smaller, we increase the window size
            longest = max(l_length, longest)
            if i+l_length==length: # this could also be handled with a try, except IndexError on the s+=a_list[... line
                return longest
            s+=a_list[i+l_length]
            l_length+=1
        l_length-=1  # keep the window end at the same position (as we will move forward one step)
        s-=a_list[i]  # and subtract the element that will leave the window
    return longest
Answered By: jojo

Here’s a solution that works for any iterable s (even an iterator). It’s essentially the same algorithm as bigblind’s answer, but it will be more efficient if k is large relative to the values in s (so that the lengths of relevant subsequences are long):

import itertools

def max_length(s, k):
    head, tail = itertools.tee(s)
    current_length = current_sum = 0
    max_len = -1 # returns -1 if there is no subsequence that adds up to k.

    for i in head:
        current_length += 1
        current_sum += i

        while current_sum > k:
           current_length -= 1
           current_sum -= next(tail)

        if current_sum == k:
            max_len = max(max_len, current_sum)

    return max_len

Because we don’t keep a list with the subsequence we’re examining as we iterate, this iterator-based approach is only useful if you only need the length of the longest subsequence, not its actual content.

If you want to get a copy of the longest subsequence, you could use a different variation on bigblind’s answer, using a collections.dequeue instead of a list (so that popping from the left is fast) and keeping track of the running sum like my code does (so you don’t need to call sum over and over):

import collections

def max_subsequence(s, k):
    current = collections.dequeue()
    current_sum = 0
    max_len = -1
    max_seq = None # returns None if there is no valid subsequence.

    for i in s:
        current.append(i)
        current_sum += i

        while current_sum > k: # Shrink from the left efficiently!
           current_sum -= current.popleft()

        if current_sum == k:
            if len(current) > max_len:
                max_len = len_current
                max_seq = list(current) # save a copy of the subsequence

    return max_seq

If your question title is misleading and you don’t actually care if the subsequence is contiguous, then I think your current dynamic programming approach can do what you want. I’m just not entirely sure I understand how your loops were intended to work. I think it is most natural with an outer loop over the input items, and a second loop over potential sums that include that value (which are indexes into the lengths list). I’d also suggest using None as the initial value for all lengths other than 0, since otherwise its hard to make the conditions work correctly without special cases.

def max_length(s, k):
    lengths = [None for _ in range(k+1)]
    lengths[0] = 0

    for x in s:
        for i in range(k, x-1, -1): # count down to avoid duplication
            if lengths[i-x] is not None and (lengths[i] is None or
                                             lengths[i-x] >= lengths[i]):
                lengths[i] = lengths[i-x] + 1

    return lengths[k]
Answered By: Blckknght

This article can help you very much.

https://e2718281828459045.wordpress.com/2013/08/19/longest-subarray-whose-sum-k/

It can be solved using
Sum Array + Binary Search.

First observation you get is that if we consider ith element then we must continue with (i+1)th
and so on. That is we need to add all the elements in a order till the last element or we reach the sum. So order is important.

How can we add the numbers. There are n ways present. The first way is that we can start with first element and add till we reach k or we reach the last element. The second way is that we can start with second element and read till we reach k or reaches the last element.

So a brute force algorithm will give you a O(n2)
solution. How can we improve on that? Obviously it is not the expected solution. For each i

we are calculating the sum of element and check that the total exceeds the given value of ‘k’ or not. To avoid this we can create a sum array.

Remember, whenever you encounter a problem with sequence sum(or the sum of continuous elements in a given array) most probably it can solve using sum array technique. Sum array is a newly constructed array using the given array. It can be generated using the following formula,

sum[i] = sum[i−1] + array[i]

for all i>0. and

sum[i]=array[i]

for i=0.

Sum array can be created in O(n) time. Finding the sum between ith and jth becomes easy. It’s the difference,

sum[j]−sum[i], j>i

will give you the answer. But still it O(n2) solution.
The problem is that for each value of i we take O(n) time to find the j.
So how can we reduce that?
Bingo! Here comes binary search. By using modified binary search on the interval i and n for each i we can find the j in O(logn) time. So it takes only O(nlogn) time. We need an additional variable and condition for storing the length of the subarray, that is j−i.

Hope this helps.

Answered By: nikoo28

Approaches to this problem

O(n) – Two pointer approach

Initialise first element to s and e and subArray_sum = arr[0]

Now if subArray_sum < k increment e while subArray_sum <= k

Once subArray_sum become >= k increment s till it becomes <= k

O(nlog n) – Binary Search

Consider all possible subarray lengths i.(1 <= i <= n) .Among all sub arrays of length i find the one with minimum sum.For a given value of i this can be done in O(n). Now for any sub-array of length i if the subarray of length i but with minimum sum has sum <= k,we can find i length sub arrays with sum <= k.Now to find longest i such that there exists a subarray of that length with sub array sum <= k.
Do binary search over range of i with start = 1 and end = n;

O(n*n) – Brute Force

Consider all possible subarrays(n*n in number) and find longest with sum <= k

Variant of above problem

Length of longest subarray with average less than or equal to k

All above approaches applicable here also

Answered By: Vinayak Sangar

CODE IN CPP

#include <bits/stdc++.h>
#define ll long long
using namespace std;

void solve(vector<int>& arr, int k) {
    int n = arr.size();
    int subarray_start = 0;
    int subarray_end = 0;

    int subarray_sum = 0;
    int max_len = -1; 
    for(int i=0;i<n;i++){
        subarray_sum += arr[i];
        subarray_end += 1;
        while(subarray_sum>=k){
            subarray_sum -= arr[subarray_start];
            subarray_start ++;
        }
        max_len = max(max_len,subarray_end - subarray_start);
    }
    cout<<max_len;
        
}

int main()
{
    int n,k;
    cin >> n >> k;
    vector<int> v(n);
    for(int i=0;i<n;i++)
        cin >> v[i];
    solve(v,k);
    return 0;
}

[executed code snap][output snap]

Answered By: Rahul Kumar

I just stumbled over this interesting discussion.

While reading and pondering the different approaches, I wondered why no one had proposed to return early when it is impossible to find a longer subarray.

max_len > array_len - subarray_start

The overhead is one extra integer value stored in terms of memory.
It also adds one subtraction and one comparison to the loop.

Both should not weigh too much and could be beneficial for longer arrays and subarrays.

Integrating this into bigblind’s solution:

def max_length(s, k):
    # These two mark the start and end of the subarray that `current` used to be.
    array_len = len(s)
    subarray_start = 0
    subarray_end = 0

    subarray_sum = 0
    max_len = -1 # returns -1 if there is no subsequence that adds up to k.
    for i in s:
        subarray_sum += i
        subarray_end += 1
        while subarray_sum > k: # Shrink the array from the left, until the sum is <= k.
            subarray_sum -= s[subarray_start]
            subarray_start += 1

        # After the previous while loop, subarray_sum is guaranteed to be 
        # smaller than or equal to k.
        max_len = max(max_len, subarray_end - subarray_start)
        
        # Return early when it is clear no longer subarray can be found
        if max_len > array_len - subarray_end: return max_len

    return max_len

I tested it with the original input plus several longer lists and higher k values (max_length([1,3,1,0,1,4,2,1,4,1], 4) for example) side by side with log output so that I was able to count the loops.

What really struck me: My addition saves maybe a loop or two. If we really massage the test input there might be cases where it saves more, but all in all it is not doing much more than adding more lines of code.

I just left this here if others happen to think of this as well.

Answered By: line-o