Why is creating a class in Python so much slower than instantiating a class?

Question:

I found that creation of a class is way slower than instantiation of a class.

>>> from timeit import Timer as T
>>> def calc(n):
...     return T("class Haha(object): pass").timeit(n)

<<After several of these 'calc' calls, at least one of them with a big number, e.g. 100000>>

>>> calc(9000)
15.947055101394653
>>> calc(9000)
17.39099097251892
>>> calc(9000)
18.824054956436157
>>> calc(9000)
20.33335590362549

Yes, creating 9000 classes took 16 seconds, and it becomes even slower in subsequent calls.

And this:

>>> T("type('Haha', b, d)", "b = (object, ); d = {}").timeit(9000)

gives similar results.

But instantiation doesn’t suffer:

>>> T("Haha()", "class Haha(object): pass").timeit(5000000)
0.8786070346832275

5000000 instances in less than one sec.

What makes the creation this expensive?

And why does the creation process become slower?

EDIT:

How to reproduce:

Start a fresh Python process. The first several calls to calc(10000) each take about 0.5 seconds on my machine. Then try a bigger value, calc(100000): it doesn’t finish even within 10 seconds. Interrupt it, and calc(10000) now takes about 15 seconds.

EDIT:

Additional fact:

If you call gc.collect() after calc becomes slow, you get the ‘normal’ speed back at first, but the timings increase again in subsequent calls:

>>> from a import calc
>>> calc(10000)
0.4673938751220703
>>> calc(10000)
0.4300072193145752
>>> calc(10000)
0.4270968437194824
>>> calc(10000)
0.42754602432250977
>>> calc(10000)
0.4344758987426758
>>> calc(100000)
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "a.py", line 3, in calc
    return T("class Haha(object): pass").timeit(n)
  File "/usr/lib/python2.7/timeit.py", line 194, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
KeyboardInterrupt
>>> import gc
>>> gc.collect()
234204
>>> calc(10000)
0.4237039089202881
>>> calc(10000)
1.5998330116271973
>>> calc(10000)
4.136359930038452
>>> calc(10000)
6.625348806381226
Asked By: Proton

||

Answers:

It isn’t: Only your contrived tests show slow class creation. In fact, as @Veedrac shows in his answer, this result is an artifact of timeit disabling garbage collection.

Downvoters: Show me a non-contrived example where class creation is slow.

In any case, your timings are affected by the load on your system at the time. They are really only useful for comparisons performed at pretty much the same time. I get about 0.5s for 9000 class creations. In fact, it’s about 0.3s on ideone, even when performed repeatedly: http://ideone.com/Du859. There isn’t even an upward trend.

So, in summary, it is much slower on your computer than others, and there is no upwards trend on other computers for repeated tests (as per your original claim). Testing massive numbers of instantiations does show slowing down, presumably because the process consumes a lot of memory. You have shown that allocating a huge amount of memory slows a process down. Well done.

That ideone code in full:

from timeit import Timer as T

def calc(n):
    return T("class Haha(object): pass").timeit(n)

for i in xrange(30):
    print calc(9000)
Answered By: Marcin

This might give you the intuition:

>>> import sys
>>> class Haha(object): pass
...
>>> sys.getsizeof(Haha)
904
>>> sys.getsizeof(Haha())
64

A class object is a much more complex and expensive structure than an instance of that class.

Answered By: Roman Bodnarchuk

A quick dis of the following functions:

def a():
    class Haha(object):
        pass

def b():
    Haha()

gives:

2           0 LOAD_CONST               1 ('Haha')
            3 LOAD_GLOBAL              0 (object)
            6 BUILD_TUPLE              1
            9 LOAD_CONST               2 (<code object Haha at 0x7ff3e468bab0, file "<stdin>", line 2>)
            12 MAKE_FUNCTION            0
            15 CALL_FUNCTION            0
            18 BUILD_CLASS         
            19 STORE_FAST               0 (Haha)
            22 LOAD_CONST               0 (None)
            25 RETURN_VALUE        

and

2           0 LOAD_GLOBAL              0 (Haha)
            3 CALL_FUNCTION            0
            6 POP_TOP             
            7 LOAD_CONST               0 (None)
            10 RETURN_VALUE        

accordingly.

By the looks of it, the interpreter simply does more work when creating a class. It has to initialize the class, add it to dicts, and so on, while Haha() just calls a function.

As you noticed, running garbage collection when it gets too slow speeds things up again, so Marcin’s right in saying that it’s probably a memory fragmentation issue.
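One way to see why gc.collect() makes a difference here: on CPython, a class object refers to itself (for example through its __mro__ tuple), so dropping the last name bound to it does not free it. Only the cycle collector can. The following is a minimal sketch, assuming CPython; make_class_ref is a made-up helper name used for illustration.

```python
import gc
import weakref


def make_class_ref():
    """Create a throwaway class and return only a weak reference to it."""
    class Haha(object):
        pass
    return weakref.ref(Haha)


gc.disable()  # mimic a run in which the cycle collector is switched off
try:
    ref = make_class_ref()
    # No name refers to the class any more, but it is part of a reference
    # cycle, so reference counting alone has not freed it.
    alive_before_collect = ref() is not None
    gc.collect()  # an explicit collection reclaims the dead class
    alive_after_collect = ref() is not None
finally:
    gc.enable()

print(alive_before_collect, alive_after_collect)  # True False on CPython
```

With the collector off, every class created in a benchmark loop stays alive like this until something calls gc.collect().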

Answered By: soulcheck

Ahahaha! Gotcha!

Was this perchance done on a Python version without this patch? (HINT: IT WAS)

Check the line numbers if you want proof.

Marcin was right: when the results look screwy you’ve probably got a screwy benchmark. Run gc.disable() and the results reproduce themselves. It just shows that when you disable garbage collection you get garbage results!
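For comparison, the timeit documentation notes that the collector can be switched back on from the setup statement. A rough sketch of timing the same statement both ways follows; calc_gc_off and calc_gc_on are illustrative names, and whether the two numbers differ noticeably will depend on the interpreter version and the number of iterations.

```python
import gc
from timeit import Timer

STMT = "class Haha(object): pass"


def calc_gc_off(n):
    """Default timeit behaviour: the collector is off while timing."""
    return Timer(STMT).timeit(n)


def calc_gc_on(n):
    """Re-enable the collector in the setup statement before timing."""
    return Timer(STMT, "import gc; gc.enable()").timeit(n)


# A few small runs; the printed timings are machine-dependent.
for _ in range(3):
    print(calc_gc_off(1000), calc_gc_on(1000))
```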


To be more clear, the reason running the long benchmark broke things is that:

  • timeit disables garbage collection, so overly large benchmarks take much (exponentially) longer

  • timeit wasn’t restoring garbage collection on exceptions

  • You quit the long-running process with an asynchronous exception (the KeyboardInterrupt), which left garbage collection turned off
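The second and third points come down to restoring the collector's state even when the timed code is interrupted. Here is a minimal sketch of that pattern in Python 3, not timeit's actual source; time_with_gc_disabled and interrupted are hypothetical names.

```python
import gc
import time


def time_with_gc_disabled(func):
    """Time func() with the collector off, restoring its state afterwards."""
    gc_was_enabled = gc.isenabled()
    gc.disable()
    try:
        start = time.perf_counter()
        func()
        return time.perf_counter() - start
    finally:
        # Runs on normal return and on exceptions such as KeyboardInterrupt.
        if gc_was_enabled:
            gc.enable()


def interrupted():
    """Stand-in for a benchmark that the user aborts with Ctrl-C."""
    raise KeyboardInterrupt


try:
    time_with_gc_disabled(interrupted)
except KeyboardInterrupt:
    pass

# The collector is back in whatever state it was in before the call.
print(gc.isenabled())
```

Without the try/finally, the interrupted call would leave the collector disabled for the rest of the session, which matches the slow calc(10000) runs seen after the Ctrl-C.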

Answered By: Veedrac