How to create a new list in Python based on two factors?

Question:

I have a list of lists in Python like this:

vehicle_list = [['car', '123464', '4322445'],   ['car', '64346', '643267'], ['bicycle', '1357', '78543'], 
        ['bicycle', '75325', '75425'], ['car', '8652', '652466'], ['taxi', '653367', '63226'], 
        ['taxi', '96544', '22267'], ['taxi', '86542222', '54433'],     
        ['motorcycle', '675422', '56312'], ['motorcycle', '53225', '88885'], ['motorcycle', '773345', '9977'], 
        ['motorcycle', '3466', '987444'], ['truck', '2455554', '5225544'], ['truck', '2455554', '344543'], 
        ['train', '6543355', '6336']]

I want to return the top 3 vehicles that have the highest number at the end, like this:

top_vehicle = [['truck', '2455554', '5225544'], ['car', '123464', '4322445'], ['motorcycle', '3466', '987444']]

I have tried sorting, but the result repeats vehicle types, which I do not want. I want each vehicle type to appear only once in the sorted list. This is the code I tried, followed by its output:

top_vehicle = (sorted(vehicle_list, key=lambda x: int(x[-1]), reverse = True))[:3]
print(top_vehicle)


[['truck', '2455554', '5225544'], ['car', '123464', '4322445'], ['car', '8652', '652466']]
Asked By: ssdn

Answers:

  • Group by vehicle type using a dict;
  • Select the maximum element of each vehicle type;
  • Take the top 3 elements from that list of maximum elements.

To take the top three elements, you can sort like you did, or use heapq.nlargest.

from heapq import nlargest
from operator import itemgetter

vehicle_list = [['car', '123464', '4322445'],   ['car', '64346', '643267'], ['bicycle', '1357', '78543'], 
        ['bicycle', '75325', '75425'], ['car', '8652', '652466'], ['taxi', '653367', '63226'], 
        ['taxi', '96544', '22267'], ['taxi', '86542222', '54433'],     
        ['motorcycle', '675422', '56312'], ['motorcycle', '53225', '88885'], ['motorcycle', '773345', '9977'], 
        ['motorcycle', '3466', '987444'], ['truck', '2455554', '5225544'], ['truck', '2455554', '344543'], 
        ['train', '6543355', '6336']]

grouped = {}
for v, x, y in vehicle_list:
    grouped.setdefault(v, []).append((v, int(x), int(y)))
# {'car': [('car', 123464, 4322445), ('car', 64346, 643267), ('car', 8652, 652466)],
#  'bicycle': [('bicycle', 1357, 78543), ('bicycle', 75325, 75425)],
#  'taxi': [('taxi', 653367, 63226), ('taxi', 96544, 22267), ('taxi', 86542222, 54433)],
#  'motorcycle': [('motorcycle', 675422, 56312), ('motorcycle', 53225, 88885), ('motorcycle', 773345, 9977), ('motorcycle', 3466, 987444)],
#  'truck': [('truck', 2455554, 5225544), ('truck', 2455554, 344543)],
#  'train': [('train', 6543355, 6336)]}


max_vehicles = [max(group, key=itemgetter(2)) for group in grouped.values()]
# [('car', 123464, 4322445), ('bicycle', 1357, 78543), ('taxi', 653367, 63226), ('motorcycle', 3466, 987444), ('truck', 2455554, 5225544), ('train', 6543355, 6336)]

top3 = nlargest(3, max_vehicles, key=itemgetter(2))
# [('truck', 2455554, 5225544), ('car', 123464, 4322445), ('motorcycle', 3466, 987444)]
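
For the last step, sorting and heapq.nlargest should be interchangeable. The sketch below is only an illustration: it hard-codes the per-type maxima from the comment above and compares the two approaches. The names top3_sorted and top3_heap are made up for this example.

```python
from heapq import nlargest
from operator import itemgetter

# Per-type maxima, copied from the comment above (numbers already converted to int).
max_vehicles = [('car', 123464, 4322445), ('bicycle', 1357, 78543),
                ('taxi', 653367, 63226), ('motorcycle', 3466, 987444),
                ('truck', 2455554, 5225544), ('train', 6543355, 6336)]

# Option 1: sort everything in descending order, then slice off the first three.
top3_sorted = sorted(max_vehicles, key=itemgetter(2), reverse=True)[:3]

# Option 2: let heapq pick the three largest without a full sort.
top3_heap = nlargest(3, max_vehicles, key=itemgetter(2))

print(top3_sorted == top3_heap)
# True
```

For a list this small the difference is negligible; nlargest mainly pays off when the list is long and only a few items are needed.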
Answered By: Stef

Once you have the vehicles sorted, you can loop through them and append a vehicle to the top_vehicles list only if its vehicle type has not been added before.

sorted_vehicles = sorted(vehicle_list, key=lambda x: int(x[-1]), reverse=True)

unique = set()
top_vehicles = []
for vehicle_type, first_number, last_number in sorted_vehicles:
    if vehicle_type in unique:  # this vehicle type has already been seen
        continue

    unique.add(vehicle_type)  # record the vehicle type as seen
    top_vehicles.append([vehicle_type, first_number, last_number])  # keep this vehicle

    if len(unique) == 3:  # stop once the top 3 have been collected
        break

print(top_vehicles)
# [['truck', '2455554', '5225544'], ['car', '123464', '4322445'], ['motorcycle', '3466', '987444']]

I noticed that the output of my program does not match yours, but I think the motorcycle one in yours is incorrect because 987444 > 88885.
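
If you need this for a different count, the same idea can be wrapped in a function. This is only a sketch: the name top_unique, its parameter n, and the sample data are made up for illustration and are not part of the original code.

```python
def top_unique(rows, n=3):
    """Return up to n rows with the largest last field, one row per vehicle type."""
    seen = set()
    result = []
    # Walk the rows from the largest last number to the smallest.
    for row in sorted(rows, key=lambda r: int(r[-1]), reverse=True):
        if len(result) == n:  # enough rows collected
            break
        if row[0] in seen:  # this vehicle type is already in the result
            continue
        seen.add(row[0])
        result.append(row)
    return result


# Made-up sample data, shaped like vehicle_list.
sample = [['car', '1', '50'], ['car', '2', '40'], ['taxi', '3', '30'],
          ['bus', '4', '20'], ['taxi', '5', '10']]

print(top_unique(sample, 2))
# [['car', '1', '50'], ['taxi', '3', '30']]
```

Because the rows are visited in descending order, the first row seen for each type is that type's maximum, so no separate grouping step is needed.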

Answered By: Nathan Dai

Method:

  • For each vehicle type, keep the item with the largest number at the end in a dictionary
  • Sort the values of the dictionary in descending order by the number at the end
  • Take the first three items

vehicle_list = [['car', '123464', '4322445'],   ['car', '64346', '643267'], ['bicycle', '1357', '78543'],
        ['bicycle', '75325', '75425'], ['car', '8652', '652466'], ['taxi', '653367', '63226'],
        ['taxi', '96544', '22267'], ['taxi', '86542222', '54433'],
        ['motorcycle', '675422', '56312'], ['motorcycle', '53225', '88885'], ['motorcycle', '773345', '9977'],
        ['motorcycle', '3466', '987444'], ['truck', '2455554', '5225544'], ['truck', '2455554', '344543'],
        ['train', '6543355', '6336']]

# For each vehicle type, keep the item with the largest last number
largest = {}
for item, no1, no2 in vehicle_list:
    if item not in largest or int(no2) > int(largest[item][2]):
        largest[item] = [item, no1, no2]

# Sort in descending order by the 3rd value and take the first three
top_vehicle = sorted(largest.values(), key=lambda x: int(x[2]), reverse=True)[:3]

print(top_vehicle)
# [['truck', '2455554', '5225544'], ['car', '123464', '4322445'], ['motorcycle', '3466', '987444']]
Answered By: Jason Yang