Implement order definition by multiple attributes
Question:
I’d like to know how to efficiently sort objects by multiple attributes. For instance, given
class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __lt__(self, other):
        if self.a == other.a:
            return self.b < other.b
        return self.a < other.a

    def __str__(self):
        return f'{self.__class__.__name__}({self.a}, {self.b})'

A1 = A(1, 5)
A2 = A(2, 5)
A3 = A(1, 6)
A4 = A(4, 0)
A5 = A(-3, -3)
A6 = A(0, 7)
l = [A1, A2, A3, A4, A5, A6]
The expected order of class A instances after sorting is A5, A6, A1, A3, A2, A4, i.e.
A(-3, -3)
A(0, 7)
A(1, 5)
A(1, 6)
A(2, 5)
A(4, 0)
because I want to order by the values of both attributes a and b.
However, as the number of attributes grows, logic that relies on nested conditional statements, e.g.
class A:
    def __init__(self, a, b, c=1):
        self.a = a
        self.b = b
        self.c = c

    def __lt__(self, other):
        if self.a == other.a:
            if self.b == other.b:
                return self.c < other.c
            return self.b < other.b
        return self.a < other.a
seems to be cumbersome.
Is there a better way to implement this kind of ordering for a custom object? I expected short-circuit evaluation to do the trick:
    def __lt__(self, other):
        return self.a < other.a or self.b < other.b
but it produced the following order:
A(-3, -3)
A(0, 7)
A(4, 0)
A(1, 5)
A(1, 6)
A(2, 5)
which is not what I want.
Answers:
In Python, sequences are compared lexicographically, which is exactly the order you want for your objects (see https://docs.python.org/3/tutorial/datastructures.html#comparing-sequences-and-other-types), so you can co-opt that mechanism for your use case. For example:
    def __lt__(self, other):
        return (self.a, self.b, self.c) < (other.a, other.b, other.c)
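As a minimal sketch of the tuple approach applied to the sample data from the question (the `_key` helper and the `broken_lt` function are hypothetical names introduced here for illustration), the following also shows why the short-circuit version misbehaves: it can report both `x < y` and `y < x` as true for the same pair, so it does not define a consistent order.

```python
class A:
    def __init__(self, a, b, c=1):
        self.a = a
        self.b = b
        self.c = c

    def _key(self):
        # Hypothetical helper: attributes listed in order of priority.
        return (self.a, self.b, self.c)

    def __lt__(self, other):
        # Tuples compare element by element, left to right.
        return self._key() < other._key()

    def __repr__(self):
        return f'A({self.a}, {self.b})'


items = [A(1, 5), A(2, 5), A(1, 6), A(4, 0), A(-3, -3), A(0, 7)]
result = [(x.a, x.b) for x in sorted(items)]
print(result)


def broken_lt(x, y):
    # The short-circuit comparison from the question, for contrast.
    return x.a < y.a or x.b < y.b


# Both directions are True, so this is not a valid ordering.
print(broken_lt(A(4, 0), A(1, 5)), broken_lt(A(1, 5), A(4, 0)))
```

The first `print` gives the order requested in the question; the second prints `True True`, which is the inconsistency that scrambles the sort.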
To implement the logic you are describing, I think it’d make sense to define a canonical importance order somewhere (i.e. make it explicit that things are sorted in a, b, c order):
class A:
    def __init__(self, a, b, c=1):
        self.a = a
        self.b = b
        self.c = c
        self._ordered_attrs = ['a', 'b', 'c']

    def __lt__(self, other):
        for attr in self._ordered_attrs:
            if getattr(self, attr) != getattr(other, attr):
                return getattr(self, attr) < getattr(other, attr)
        return False
This implements the same logic you were describing, but as a loop over the attributes.
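If the order is only needed for sorting rather than for general comparisons, the same explicit attribute list can instead be turned into a sort key. This is a different technique from the loop above, sketched here with the standard library's `operator.attrgetter`; the names `ORDERED_ATTRS` and `by_priority` are assumptions made for this example.

```python
from operator import attrgetter


class A:
    def __init__(self, a, b, c=1):
        self.a = a
        self.b = b
        self.c = c

    def __repr__(self):
        return f'A({self.a}, {self.b})'


# Hypothetical module-level constant: attributes in order of priority.
ORDERED_ATTRS = ('a', 'b', 'c')

# With several names, attrgetter returns a tuple of the attribute values.
by_priority = attrgetter(*ORDERED_ATTRS)

items = [A(1, 5), A(2, 5), A(1, 6), A(4, 0), A(-3, -3), A(0, 7)]
ordered = sorted(items, key=by_priority)
print(ordered)
```

Here the class needs no `__lt__` at all, and the key is computed once per element instead of on every comparison.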
You could use a dataclass:
from dataclasses import dataclass

@dataclass(order=True)
class A:
    a: int
    b: int
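With `order=True`, the generated comparison methods compare instances as if they were tuples of their fields, in the order the fields are declared. A short sketch using the question's sample data (the third field `c` with a default of 1 mirrors the question's extended class and is added here for illustration):

```python
from dataclasses import dataclass


@dataclass(order=True)
class A:
    # Field order defines comparison priority: a, then b, then c.
    a: int
    b: int
    c: int = 1


items = [A(1, 5), A(2, 5), A(1, 6), A(4, 0), A(-3, -3), A(0, 7)]
ordered = sorted(items)
for item in ordered:
    print(item)
```

Adding another attribute to the ordering then only requires declaring another field in the right position.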