Calling __new__ when making a subclass of tuple
Question:
In Python, when subclassing tuple, the __new__ function is called with self as an argument. For example, here is a paraphrased version of PySpark’s Row class:
class Row(tuple):
    def __new__(self, args):
        return tuple.__new__(self, args)
But help(tuple) shows no self argument to __new__:
__new__(*args, **kwargs) from builtins.type
Create and return a new object. See help(type) for accurate signature.
and help(type) just says the same thing:
__new__(*args, **kwargs)
Create and return a new object. See help(type) for accurate signature.
So how does self get passed to __new__ in the Row class definition?
- Is it via *args?
- Does __new__ have some subtlety where its signature can change with context?
- Or, is the documentation mistaken?
Is it possible to view the source of tuple.__new__ so I can see the answer for myself?
My question is not a duplicate of this one because in that question, all discussion refers to __new__ methods that explicitly have self or cls as first argument. I’m trying to understand:
- Why the tuple.__new__ method does not have self or cls as first argument.
- How I might go about examining the source code of the tuple class, to see for myself what’s really going on.
Follow-up: Moderators closed this old question as a duplicate of this one. But it’s not a duplicate. Look at the accepted answer on this question and note how little overlap it has with the answers in the claimed duplicate, in terms of the information provided.
Answers:
The correct signature of tuple.__new__
Functions and types implemented in C often can’t be inspected, and the signatures that help() reports for them often look like that one.
The correct signature of tuple.__new__ is:
__new__(cls[, sequence])
For example:
>>> tuple.__new__(tuple)
()
>>> tuple.__new__(tuple, [1, 2, 3])
(1, 2, 3)
Not surprisingly, this is exactly like calling tuple(), except that you have to write tuple twice.
The first argument of __new__
Note that the first argument of __new__ is always the class, not the instance. In fact, the role of __new__ is to create and return the new instance.
The special method __new__ is a static method.
I’m saying this because in your Row.__new__ I can see self: while the name of the argument is not important (except when using keyword arguments), beware that self will be Row or a subclass of Row, not an instance. The general convention is to name the first argument cls instead of self.
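A minimal sketch to confirm this, assuming Python 3. The seen list and the SubRow class are hypothetical additions for illustration only (they are not part of PySpark’s Row); the sketch records whatever __new__ receives as its first argument:

```python
seen = []  # hypothetical helper: records the first argument __new__ receives


class Row(tuple):
    def __new__(cls, args):
        # No instance exists yet; cls is the class being instantiated.
        seen.append(cls)
        return tuple.__new__(cls, args)


class SubRow(Row):  # hypothetical subclass, for illustration only
    pass


row = Row([1, 2, 3])
sub = SubRow([4, 5])

print(seen[0] is Row)     # the first argument was the Row class itself
print(seen[1] is SubRow)  # for a subclass, it is the subclass
print(type(row), type(sub))
```

In both cases the first argument is a class, and the instance only comes into existence as the return value of tuple.__new__.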
Back to your questions
So how does self get passed to __new__ in the Row class definition?
When you call Row(...), Python automatically calls Row.__new__(Row, ...).
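Roughly, the implicit call can be spelled out by hand. This sketch assumes Python 3, reuses the Row class from the question with the first parameter renamed to cls, and leaves out __init__, which Python also invokes once __new__ has returned an instance of the class:

```python
class Row(tuple):
    def __new__(cls, args):
        return tuple.__new__(cls, args)


implicit = Row([1, 2, 3])               # normal construction
explicit = Row.__new__(Row, [1, 2, 3])  # the call Python makes on your behalf

print(implicit == explicit)
print(type(explicit) is Row)
```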
- Is it via *args?
You can write your Row.__new__ as follows:
class Row(tuple):
    def __new__(*args, **kwargs):
        return tuple.__new__(*args, **kwargs)
This works and there’s nothing wrong with it. It’s very useful if you don’t care about the arguments.
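A quick check of the *args version, as a sketch assuming Python 3: the class arrives as args[0], and everything is forwarded unchanged to tuple.__new__.

```python
class Row(tuple):
    def __new__(*args, **kwargs):
        # args[0] is the class; the remaining items are the caller's arguments.
        return tuple.__new__(*args, **kwargs)


print(Row([1, 2, 3]))      # built from an iterable
print(Row())               # no argument: an empty Row
print(type(Row()) is Row)  # the result is a Row, not a plain tuple
```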
- Does __new__ have some subtlety where its signature can change with context?
No, the only special thing about __new__ is that it is a static method.
- Or, is the documentation mistaken?
I’d say that it is incomplete or ambiguous.
- Why the tuple.__new__ method does not have self or cls as first argument.
It does; it just doesn’t appear in help(tuple.__new__), because that information is often not exposed by functions and methods implemented in C.
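One way to convince yourself, sketched below for CPython 3: call tuple.__new__ without a class, or with a class that is not a subtype of tuple, and the call is rejected with a TypeError. The try_new helper is a hypothetical name introduced for this illustration, and the exact error messages may differ between versions, so the sketch only prints them.

```python
def try_new(*args):
    """Call tuple.__new__ with the given arguments and report the outcome."""
    try:
        return "ok", tuple.__new__(*args)
    except TypeError as exc:
        return "TypeError", str(exc)


print(try_new())             # no class given
print(try_new(int))          # int is not a subtype of tuple
print(try_new(tuple, "ab"))  # valid: the class first, then the iterable
```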
- How I might go about examining the source code of the tuple class, to see for myself what’s really going on.
The file you are looking for is Objects/tupleobject.c. Specifically, you are interested in the tuple_new() function:
static char *kwlist[] = {"sequence", 0};
/* ... */
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|O:tuple", kwlist, &arg))
Here "|O:tuple" means: the function is called “tuple” and it accepts one optional argument (| delimits optional arguments, O stands for a Python object). In this version of the source, the optional argument may be set via the keyword argument sequence.
About help(type)
For reference, you were looking at the documentation of type.__new__, while you should have stopped at the first four lines of help(type):
In the case of __new__() the correct signature is the signature of type():
class type(object)
| type(object_or_name, bases, dict)
| type(object) -> the object's type
| type(name, bases, dict) -> a new type
But this is not relevant, as tuple.__new__ has a different signature.
Remember super()!
Last but not least, try to use super() instead of calling tuple.__new__() directly.
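As a sketch, assuming Python 3 (where super() takes no arguments), the Row class from the question could be written with super() and the conventional cls name. Because __new__ is a static method, cls still has to be passed explicitly:

```python
class Row(tuple):
    def __new__(cls, args):
        # super() resolves to tuple.__new__ here; cls is passed by hand
        # because __new__ is a static method.
        return super().__new__(cls, args)


row = Row([1, 2, 3])
print(row)
print(isinstance(row, Row))
```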