How can I sort a list, according to where its elements appear in another list?

Question:

I have a predefined list which indicates the order of some values – say, ['id','name','age','height','weight'], but it could be much longer.

How can I sort a different list whose values all appear within the first list, according to their position in the first list?

For example, sorting ['height','id'] should produce ['id','height'], because 'id' comes before 'height' in the first list.

Similarly, ['name','weight','height'] should become ['name','height','weight'] after sorting.

Can I do this using the key argument for the built-in sort somehow? Or will I have to write my own sorting routine?


See also: Given parallel lists, how can I sort one while permuting (rearranging) the other in the same way? That is another common way people want to sort one list "based on" another. A key clue for telling the two questions apart: do the lists need to be the same length?

Asked By: YardenST


Answers:

The most efficient way would be to create a map from word to order:

ordering = {word: i for i, word in enumerate(predefined_list)}

then use that mapping in sorting:

somelist.sort(key=ordering.get)

The alternative is to use .index() on the list to scan through the list and find the index for each word while sorting:

somelist.sort(key=predefined_list.index)

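but this is not nearly as efficient as using the ordering dictionary, because .index() rescans predefined_list from the start for every element being sorted, while a dictionary lookup takes constant time on average.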
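To get a feel for the difference, here is a rough timing sketch. The list sizes and the field names are made up for illustration, and the actual numbers will vary by machine, so no output is shown:

```python
import random
import timeit

# Hypothetical data: a long predefined order and a sample of it to sort.
predefined_list = [f"field{i}" for i in range(2000)]
somelist = random.sample(predefined_list, 500)

# Build the lookup table once: O(len(predefined_list)).
ordering = {word: i for i, word in enumerate(predefined_list)}

# Both keys give the same order when predefined_list has no duplicates.
by_dict = sorted(somelist, key=ordering.get)
by_index = sorted(somelist, key=predefined_list.index)
assert by_dict == by_index

# The key function is called once per element, so .index() costs a
# linear scan per element, while ordering.get is a constant-time lookup.
t_dict = timeit.timeit(lambda: sorted(somelist, key=ordering.get), number=10)
t_index = timeit.timeit(lambda: sorted(somelist, key=predefined_list.index), number=10)
print(f"dict lookup: {t_dict:.4f}s   list.index: {t_index:.4f}s")
```

For a handful of items the difference is unlikely to matter; it shows up as the predefined list grows.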

Demo:

>>> predefined_list = ['id','name','age','height','weight',]
>>> ordering = {word: i for i, word in enumerate(predefined_list)}
>>> sorted(['height','id'], key=ordering.get)
['id', 'height']
>>> sorted(['name','weight','height'], key=ordering.get)
['name', 'height', 'weight']

The two methods would result in different sorting orders if any of the values in the predefined list were not unique. The .index() method uses the first occurrence of a value as the sort value, whereas the dictionary method uses the last, because later entries overwrite earlier ones. There are ways around that; for example, you can make the dictionary method process the list and indices in reverse.
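A small sketch of that difference, using a made-up predefined list in which 'name' appears twice. The names last_wins and first_wins are only illustrative:

```python
# Made-up data: 'name' occurs at index 1 and again at index 3.
predefined_list = ['id', 'name', 'age', 'name', 'height']

# Plain comprehension: later duplicates overwrite earlier ones (last wins).
last_wins = {word: i for i, word in enumerate(predefined_list)}

# Walking the (index, word) pairs in reverse means the earliest index is
# written last, so the first occurrence wins, matching .index().
first_wins = {word: i for i, word in reversed(list(enumerate(predefined_list)))}

somelist = ['age', 'name', 'id']
print(sorted(somelist, key=predefined_list.index))  # ['id', 'name', 'age']
print(sorted(somelist, key=last_wins.get))          # ['id', 'age', 'name']
print(sorted(somelist, key=first_wins.get))         # ['id', 'name', 'age']
```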

Answered By: Martijn Pieters

The shortest solution:

lst  = ['id', 'name', 'age', 'height', 'weight',]
test = ['name', 'weight', 'height']

print([word for word in lst if word in test])

Returns:

['name', 'height', 'weight']

This walks through lst in its own order and keeps only the items that also appear in test.
Advantage: no sorting needed.

After comments:

Disadvantages:
– Duplicates in test are not preserved: each matching item appears only as often as it occurs in lst.
– The ‘in’ operator on a list means a traversal, the same as the second approach (.index()) in Martijn's answer, so it might be inefficient if the lists are long. However, Martijn's solution still has to sort, so I cannot easily say which is more efficient.
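Both disadvantages can be seen in a short sketch. It assumes the items are hashable so that test can be turned into a set (the variable names wanted, filtered and by_sort are only illustrative), and it adds a duplicate 'name' to test to show what gets lost:

```python
lst = ['id', 'name', 'age', 'height', 'weight']
test = ['name', 'weight', 'height', 'name']  # note the duplicate 'name'

# A set makes each membership test constant time on average,
# instead of a scan through the whole test list.
wanted = set(test)
filtered = [word for word in lst if word in wanted]
print(filtered)  # ['name', 'height', 'weight'] (the second 'name' is gone)

# Sorting with a position lookup keeps every element of test.
ordering = {word: i for i, word in enumerate(lst)}
by_sort = sorted(test, key=ordering.get)
print(by_sort)   # ['name', 'name', 'height', 'weight']
```

So the filtering approach fits when test has no repeated values; otherwise sorting with a key is the safer choice.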

Answered By: Michel Keijzers