Why is foo(*arg, x) not allowed in Python?
Question:
Look at the following example
point = (1, 2)
size = (2, 3)
color = 'red'
class Rect(object):
    def __init__(self, x, y, width, height, color):
        pass
It would be very tempting to call:
Rect(*point, *size, color)
Possible workarounds would be:
Rect(point[0], point[1], size[0], size[1], color)
Rect(*(point + size), color=color)
Rect(*(point + size + (color,)))
But why is Rect(*point, *size, color) not allowed? Is there any semantic ambiguity or general disadvantage you can think of?
EDIT: Specific Questions
Why are multiple *arg expansions not allowed in function calls?
Why are positional arguments not allowed after *arg expansions?
Answers:
*point says that you are passing in a whole sequence of items: something like all the elements of a list, but not as a list.
In this case, you cannot limit how many elements are being passed in. Therefore, there is no way for the interpreter to know which elements of the sequence belong to *point and which to *size.
For example, if you passed the following as input: 2, 5, 3, 4, 17, 87, 4, 0, can you tell me which of those numbers are represented by *point and which by *size? The interpreter would face the same problem.
Hope this helps.
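To make the point above concrete, here is a small sketch (the helper name show_args is made up for illustration). Once the tuples have been unpacked, the callee only sees one flat run of positional arguments, with no record of where one tuple ended and the next began.

```python
def show_args(*args):
    # Return the positional arguments exactly as the callee receives them.
    return args

point = (1, 2)
size = (2, 3)

# Concatenate first, then unpack once: the callee gets four plain values.
received = show_args(*(point + size))
print(received)
```

A different split of the same values, for example (1,) and (2, 2, 3), would look identical from inside show_args.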
I’m not going to speak to why multiple tuple unpacking isn’t part of Python, but I will point out that you’re not matching your class to your data in your example.
You have the following code:
point = (1, 2)
size = (2, 3)
color = 'red'
class Rect(object):
    def __init__(self, x, y, width, height, color):
        self.x = x
        self.y = y
        self.width = width
        self.height = height
        self.color = color
but a better way to express your Rect object would be as follows:
class Rect:
    def __init__(self, point, size, color):
        self.point = point
        self.size = size
        self.color = color
r = Rect(point, size, color)
In general, if your data is in tuples, have your constructor take tuples. If your data is in a dict, have your constructor take a dict. If your data is an object, have your constructor take an object, etc.
In general, you want to work with the idioms of the language, rather than try to work around them.
EDIT
Seeing how popular this question is, I’ll give you a decorator that allows you to call the constructor however you like.
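class Pack(object):
    def __init__(self, *template):
        # template mirrors the parameter list; a tuple of types marks a
        # parameter that may be passed either packed or spread out
        self.template = template
    def __call__(self, f):
        def pack(*args):
            args = list(args)
            for i, tup in enumerate(self.template):
                if type(tup) != tuple:
                    continue
                for j, typ in enumerate(tup):
                    if type(args[i+j]) != typ:
                        break
                else:
                    # every type matched, so fold the loose values into one tuple
                    args[i:i+j+1] = [tuple(args[i:i+j+1])]
            return f(*args)
        return pack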
class Rect:
    @Pack(object, (int, int), (int, int), str)
    def __init__(self, point, size, color):
        self.point = point
        self.size = size
        self.color = color
Now you can initialize your object any way you like.
r1 = Rect(point, size, color)
r2 = Rect((1,2), size, color)
r3 = Rect(1, 2, size, color)
r4 = Rect((1, 2), 2, 3, color)
r5 = Rect(1, 2, 2, 3, color)
While I wouldn’t recommend using this in practice (it violates the principle that you should have only one way to do it), it does serve to demonstrate that there’s usually a way to do anything in Python.
Python is full of these subtle glitches. For example you can do:
first, second, last = (1, 2, 3)
And in Python 2 you can’t do:
first, *others = (1, 2, 3)
But in Python 3 you now can.
Your suggestion will probably be proposed in a PEP and accepted or rejected one day.
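For the record, both things did eventually land: extended unpacking in assignments came with Python 3.0 (PEP 3132), and PEP 448 in Python 3.5 made the call from the question legal. A quick sketch, assuming Python 3.5 or later and reusing the Rect from the question (the attributes are stored only so the result can be inspected):

```python
class Rect(object):
    def __init__(self, x, y, width, height, color):
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.color = color

point = (1, 2)
size = (2, 3)
color = 'red'

# Python 3.0+: the starred target collects the remaining items into a list.
first, *others = (1, 2, 3)

# Python 3.5+: several unpackings, followed by a positional argument.
r = Rect(*point, *size, color)
print(first, others, (r.x, r.y, r.width, r.height, r.color))
```

On older interpreters the Rect(*point, *size, color) line is a SyntaxError, so the workarounds from the question still apply there.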
As far as I know, it was a design choice, but there seems to be a logic behind it.
EDIT: the *args notation in a function call was designed so you could pass in a tuple of variables of an arbitrary length that could change between calls. In that case, something like f(*a, *b, c) doesn’t make sense as a call: if a changes length, all the elements of b get assigned to the wrong variables, and c isn’t in the right place either.
Keeping the language simple, powerful, and standardized is a good thing. Keeping it in sync with what actually goes on in processing the arguments is also a very good thing.
Think about how the language unpacks your function call. If multiple *arg expansions were allowed in any order, as in Rect(*point, *size, color), all that would matter for unpacking is that point and size have a total of four elements. So point=(), size=(1,2,2,3), and color='red' would make Rect(*point, *size, color) a proper call. Basically, when the language parses *point and *size, it treats them as one combined *arg tuple, so Rect(*(point + size), color=color) is a more faithful representation.
There never needs to be two tuples of arguments passed in the form *args; you can always represent them as one. Since the assignment of parameters depends only on the order in this combined *arg list, it makes sense to define it as such.
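A rough sketch of that claim: whichever way the four values are split between the two tuples, merging them into a single sequence produces the same call. itertools.chain is used here only so the inputs need not be tuples, and Rect stores its arguments (an addition for illustration) so the results can be compared.

```python
from itertools import chain

class Rect(object):
    def __init__(self, x, y, width, height, color):
        # Keep the bound values so different calls can be compared.
        self.args = (x, y, width, height, color)

color = 'red'

# Two different ways of splitting the same four values.
splits = [((1, 2), (2, 3)), ((), (1, 2, 2, 3))]

# Merge each pair into one sequence, then unpack it once.
results = [Rect(*chain(point, size), color=color).args for point, size in splits]
print(results)
```

Both entries of results come out the same, which is the sense in which the split between point and size carries no meaning for the callee.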
If you can make function calls like f(*a, *b), the language almost begs to allow you to define functions with multiple *args in the parameter list, and those couldn’t be processed. E.g.,
def f(*a, *b):
    return (sum(a), 2*sum(b))
How would f(1,2,3,4) be processed?
I think this is why, for syntactic concreteness, the language forces function calls and definitions into the following specific, order-dependent form: f(a,b,x=1,y=2,*args,**kwargs).
Everything there has a specific meaning in a function definition and function call. a and b are parameters defined without default values; next, x and y are parameters defined with default values (they can be skipped, so they come after the parameters without defaults). Next, *args is populated as a tuple with the remaining arguments from the function call that weren’t keyword arguments. This comes after the others because it can change length, and you don’t want something that can change length between calls to affect the assignment of variables. At the end, **kwargs takes all the keyword arguments that weren’t defined elsewhere. With these concrete definitions you never need to have multiple *args or **kwargs.
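As a small illustration of that ordering, here is a sketch of a definition in exactly this form. The body, which just returns the bindings, is made up so they can be inspected.

```python
def f(a, b, x=1, y=2, *args, **kwargs):
    # Return the bindings so they can be inspected.
    return {'a': a, 'b': b, 'x': x, 'y': y, 'args': args, 'kwargs': kwargs}

# Only the required parameters: the defaults fill in x and y.
minimal = f(10, 20)

# Extra positionals fill x and y first, and only then spill into args;
# the unknown keyword z ends up in kwargs.
extra = f(10, 20, 30, 40, 50, 60, z=70)
print(minimal)
print(extra)
```

Note that the positional values 30 and 40 override the defaults of x and y before anything reaches args, so the variable-length part never disturbs the named parameters before it.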
Well, in Python 2, you can say:
point = 1, 2
size = 2, 3
color = 'red'
class Rect(object):
    def __init__(self, (x, y), (width, height), color):
        pass
Then you can say:
a_rect = Rect(point, size, color)
taking care that the first two arguments are sequences of len == 2.
NB: This capability has been removed from Python 3.
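In Python 3 the closest equivalent is probably to accept the two sequences as ordinary parameters and unpack them in the body. A minimal sketch (the attribute names are my own choice, not part of the original example):

```python
class Rect(object):
    def __init__(self, point, size, color):
        # Each unpacking raises ValueError unless the sequence has exactly two items.
        self.x, self.y = point
        self.width, self.height = size
        self.color = color

point = 1, 2
size = 2, 3
color = 'red'

a_rect = Rect(point, size, color)
print(a_rect.x, a_rect.y, a_rect.width, a_rect.height, a_rect.color)
```

The call site stays the same as in the Python 2 version; only the unpacking moves from the signature into the constructor body.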