Eventually the ids also need to be sorted

Question:

I have to design an algorithm to sort a list of orders by selection time (t selection, finding the good in the warehouse and bringing it to the surface) plus shipping time (t shipping, constant). The customer orders can be retrieved (in the same order as placed) from a server database

The program takes as input a data-set of orders where the id, t selection, and t shipping are of type unsigned int, n is the number of orders and a space character.

id1, t selection1, t shipping1; ...; idn, t selectionn, t shippingn n

The expected output is a space-separated list of the ids, sorted by t selection + t shipping and terminated by a new line n.

Input: 1, 500, 100; 2, 700, 100; 3, 100, 100n

Output: 3 1 2n

I have made this work with mergesort, however, my teacher also wants the ids to be sorted. for example, if there are 2 orders with the same sum of selection + shipping time, the smallest id should come first. see example

 80001, 1000, 10; 70001, 1000, 100

my program still outputs the 80001 first, but 70001 should come first. I have provided my code below. Anyone that can help?

#!/usr/bin/env python3
import sys


class Order:
    def __init__(self, id: int, selection_time: int, shipping_time: int):
        self.id: int = id
        self.selection_time: int = selection_time
        self.shipping_time: int = shipping_time


def merge(lit):
    def combi(order_):
        return order_.selection_time + order_.shipping_time

    if len(lit) > 1:
        mid = len(lit) // 2
        left = lit[:mid]
        right = lit[mid:]
        merge(left)
        merge(right)

        i = j = k = 0
        while i < len(left) and j < len(right):
            if combi(left[i]) < combi(right[j]):
                lit[k] = left[i]
                i += 1
            else:
                lit[k] = right[j]
                j += 1
            k += 1
        for _i in range(i, len(left)):
            lit[k] = left[_i]
            k += 1
        for _j in range(j, len(right)):
            lit[k] = right[_j]
            k += 1
    return lit


if __name__ == '__main__':
    '''
    Retrieves and splits the input
    '''
    data = input()
    data = data.split('; ')
    order_list = []
    for d in data:
        id, selection_t, shipping_t = d.split(', ', 2)
        order: Order = Order(int(id), int(selection_t), int(shipping_t))
        order_list.append(order)
    merge(order_list)
    for order in order_list:
        sys.stdout.write(str(order.id))
        sys.stdout.write(" ")

Asked By: Ama Sma

||

Answers:

In your sorting function you have to handle the case of equal order.selection_time values if you want to sort using more then one value for sorting. The idea with combi(order_) was generally not the best one as it assumes that selection_time value is an order of magnitude higher than the value of order_shipping. To point it out to you is probably the reason for the requirement. In order to sort using more then one value you need to add a test for equality in your merge function. See below for your merge code which sorts using three criteria for sorting:

def merge(lit):
    def combi(order_):
        return ( order_.selection_time, order_.id, order_.shipping_time )

    if len(lit) > 1:
        mid = len(lit) // 2
        left = lit[:mid]
        right = lit[mid:]
        merge(left)
        merge(right)

        i = j = k = 0
        while i < len(left) and j < len(right):
            for cond_indx in range(3): 
                if   combi(left[i])[cond_indx] < combi(right[j])[cond_indx]:
                    lit[k] = left[i]
                    i += 1
                    break
                elif combi(left[i])[cond_indx] > combi(right[j])[cond_indx]:
                    lit[k] = right[j]
                    j += 1
                    break
            k += 1
        for _i in range(i, len(left)):
            lit[k] = left[_i]
            k += 1
        for _j in range(j, len(right)):
            lit[k] = right[_j]
            k += 1
    return 0

Notice that for cond_indx in range(3): is there to eliminate nesting of if: elif: else: in case of same sub-key value for sorting. The else: case for equal values is handled this way by progressing to evaluation of the next value indexed by cond_indx.

For following data:

data = "80001, 1000, 10; 70001, 1000, 100; 60001, 800, 10; 70001, 1000, 50"

The above modified merge gives:

60001 70001 70001 80001

Following the request in the comment here an attempt to explain the shortcut taken with for cond_indx in range(3):. It’s only purpose was to reduce the amount of lines of code in the function. Check out the comments in the below code for hopefully helpful explanations:

for cond_indx in range(3): 
    if   combi(left[i])[cond_indx] < combi(right[j])[cond_indx]:
        lit[k] = left[i]
        i += 1
        break # OK, column values are different, stop looping
    elif combi(left[i])[cond_indx] > combi(right[j])[cond_indx]:
        lit[k] = right[j]
        j += 1
        break # OK, column values are different, stop looping
    # columns are equal so go on to sort on next column
# if all columns are equal no action is necessary 

is equivalent to:

# Compare using order_.selection_time to sort on order_.selection_time: 
#                  [0] means here order_.selection_time 
if   combi(left[i])[0] < combi(right[j])[0]: 
    lit[k] = left[i]
    i += 1
elif combi(left[i])[0] > combi(right[j])[0]:
    lit[k] = right[j]
    j += 1
else: # equal order_.selection_time so continue sorting on order_.id
#                      [1] means here order_.id 
    if   combi(left[i])[1] < combi(right[j])[1]:
        lit[k] = left[i]
        i += 1
    elif combi(left[i])[1] > combi(right[j])[1]:
        lit[k] = right[j]
        j += 1
    else: # order_id(s) are equal so sort on order_.shipping_time
    #                      [2] means here order_.shipping_time 
        if   combi(left[i])[2] < combi(right[j])[2]:
            lit[k] = left[i]
            i += 1
        elif combi(left[i])[2] > combi(right[j])[2]:
            lit[k] = right[j]
            j += 1
        # else: not needed here, all columns are equal, no action required 
Answered By: Claudio

When Python compares two tuples, it compares them element by element until it finds one that is unequal. So compare tuples instead of a single number.

    def combi(order_):
        return (order_.selection_time + order_.shipping_time, order_.id)
Answered By: Mark Ransom