Python effective integer size

Question:

In C#, C++, Java, or JavaScript, for example, the effective int size is 32 bits. If we want to work with some larger number, say 70 bits, we have to fall back on software features (arbitrary-precision arithmetic).

Python has a very tricky, internally unbounded integer representation, and I cannot figure out what the most efficient int size is for integer arithmetic.

In other words, is there some int size, say 64 bits, at which using ints is most efficient?

Or does it not matter whether it is 16, 32, 64, or some arbitrary number of bits, and Python will handle all of these ints with the same efficiency?

In short: does Python always use arbitrary-precision arithmetic, or does it use hardware arithmetic for values that fit into 32/64 bits?

Asked By: No Name QA


Answers:

CPython’s int, in Python 3, is represented as a sign-magnitude array of values, where each element of the array holds 15 or 30 bits of the magnitude, on 32- and 64-bit Python builds respectively. This is an implementation detail, but a long-standing one (the digit size was originally 15 bits on all platforms, but doubling the size and the number of used bits per "digit" in the array turned out to be an easy win on 64-bit systems).

It has optimizations for ints that fit into a single (or sometimes two) such array values: it extracts the raw value from the array and performs a single CPU operation, skipping the loops and algorithms that apply to the arbitrary-length case. On 64-bit builds of CPython, that currently means values with a magnitude of 30 bits or less are generally optimized specially (with 60-bit magnitudes occasionally having fast paths).
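You can inspect the digit size of your own build via `sys.int_info`, which exposes these internal representation details directly:

```python
import sys

# CPython exposes its internal int representation details here.
# On a 64-bit build this typically reports 30 bits per digit, stored
# in 4-byte digits; on a 32-bit build, 15 bits in 2-byte digits.
print(sys.int_info)

# Number of magnitude bits stored per array element ("digit")
print(sys.int_info.bits_per_digit)
```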

That said, there is rarely a reason to take this into account. CPython interpreter overhead is pretty high, and it is hard to imagine a scenario where manually breaking a larger operation down into smaller ones (incurring more interpreter overhead to do many small operations at the Python layer) would outweigh the much smaller cost of Python performing the array-based operations (even without the special fast paths) at the C layer. The exceptions to this rule all rely on non-Python-int solutions, e.g. fixed-size numpy arrays that vectorize the work, which largely follow C rules at that point (since numpy arrays are mostly thin wrappers around raw C arrays).
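A rough, machine-dependent sketch of why this rarely matters: timing an addition of two one-digit ints against an addition of two ~70-bit ints shows both are dominated by interpreter overhead, so the gap is modest even though the larger values go through the multi-digit code path. (The absolute numbers here are illustrative, not claimed from the answer above.)

```python
import timeit

# Compare adding two ints that fit in one internal digit against
# adding two ~70-bit ints. Both timings include the full cost of
# fetching names and dispatching the operator in the interpreter.
small = timeit.timeit('a + b', globals={'a': 12345, 'b': 67890},
                      number=1_000_000)
big = timeit.timeit('a + b', globals={'a': 2**70 + 3, 'b': 2**70 + 7},
                    number=1_000_000)
print(f"one-digit add: {small:.3f}s, ~70-bit add: {big:.3f}s")
```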

Answered By: ShadowRanger

You could also use Python's struct module: pack a simple number, then read back the length of the packed result in bytes:

import struct
size_int = len(struct.pack('i', 0))
print(size_int)

4
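Equivalently, `struct.calcsize` reports the packed size without building a buffer. Note that this measures the platform's C `int` type, not CPython's internal digit size, so it answers a slightly different question than the one above:

```python
import struct

# calcsize returns the number of bytes the format would pack to.
print(struct.calcsize('i'))  # C int, usually 4 bytes
print(struct.calcsize('q'))  # C long long, usually 8 bytes
```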

Answered By: iunknown

You can use sys.getsizeof, which returns the total size of any object, including any garbage-collector overhead; note, though, that this is not the size of the raw data the CPU would use for a hardware operation.

>>> import sys
>>> sys.getsizeof(1)
28
>>> sys.getsizeof(12345678)
28
>>> sys.getsizeof(1234567890)
32
>>> sys.getsizeof(1234567890123456789)
36
>>> sys.getsizeof(12345678901234567890123456789)
40

This may not answer your question directly, but I suspect there is little point in trying to infer this size if your plan is to make your Python code more efficient.

Answered By: calandoa