Variable's memory size in Python

Question:

I am writing Python code to do some big-number calculations, and I am seriously concerned about the memory used in the calculation.

Thus, I want to count every bit of each variable.

For example, I have a variable x, which is a big number, and I want to count the number of bits needed to represent x.

The following code is obviously useless:

x = 2**1000
len(x)  # raises TypeError: integers have no len()

Thus, I turned to the following code:

x=2**1000
len(repr(x))

The variable x (in decimal) is:

10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376

but the above code returns 303.

The long sequence above has 302 digits, so I believe that 303 is related only to the string length.

So, here comes my original question:

How can I know the memory size of variable x?

One more thing; in C/C++ language, if I define

int z=1;

This means that there are 4 bytes= 32 bits allocated for z, and the bits are arranged as 00..001(31 0’s and one 1).

Here, my variable x is huge, and I don’t know whether it follows the same memory allocation rule.

Asked By: user4478


Answers:

Use sys.getsizeof to get the size of an object, in bytes.

>>> from sys import getsizeof
>>> a = 42
>>> getsizeof(a)
12
>>> a = 2**1000
>>> getsizeof(a)
146
>>>

Note that the size and layout of an object are purely implementation-specific. CPython, for example, may use totally different internal data structures than IronPython. So the size of an object may vary from implementation to implementation.
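For the bit count the question asks about, as opposed to the object's total footprint, the built-in int.bit_length() method gives the number of significant bits directly. A minimal sketch, assuming Python 3 (the getsizeof result depends on the interpreter and platform, so no expected value is shown for it):

```python
import sys

x = 2**1000

# Number of bits needed to represent the value itself (sign excluded).
value_bits = x.bit_length()

# Number of decimal digits; this is what len(str(x)) measures.
decimal_digits = len(str(x))

# Total bytes used by the int object, including interpreter overhead.
object_bytes = sys.getsizeof(x)

print(value_bits, decimal_digits, object_bytes)
```

The first two numbers describe the value; only the third describes memory, and it is always larger than value_bits / 8 because of the per-object overhead.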

Answered By: Jonathon Reinhart

Regarding the internal structure of a Python long, check sys.int_info (or sys.long_info for Python 2.7).

>>> import sys
>>> sys.int_info
sys.int_info(bits_per_digit=30, sizeof_digit=4)

Python either stores 30 bits into 4 bytes (most 64-bit systems) or 15 bits into 2 bytes (most 32-bit systems). Comparing the actual memory usage with calculated values, I get

>>> import math, sys
>>> a=0
>>> sys.getsizeof(a)
24
>>> a=2**100
>>> sys.getsizeof(a)
40
>>> a=2**1000
>>> sys.getsizeof(a)
160
>>> 24+4*math.ceil(100/30)
40
>>> 24+4*math.ceil(1000/30)
160

There are 24 bytes of overhead for 0 since no bits are stored. The memory requirements for larger values match the calculated values.
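The comparison above can be wrapped in a small helper. This is only a sketch for CPython: estimated_int_size is a hypothetical name, and the fixed overhead is inferred from sys.getsizeof(1) rather than hard-coded as 24, on the assumption that it can differ between builds and versions. It counts digits from bit_length() rather than from the exponent, since 2**n needs n + 1 bits.

```python
import sys

# Layout parameters reported by the running interpreter.
BITS_PER_DIGIT = sys.int_info.bits_per_digit
BYTES_PER_DIGIT = sys.int_info.sizeof_digit

# Assumption: the int 1 occupies exactly one internal digit, so whatever
# remains after subtracting that digit is the fixed per-object overhead.
OVERHEAD = sys.getsizeof(1) - BYTES_PER_DIGIT


def estimated_int_size(n):
    """Estimate the size in bytes of a nonzero int (hypothetical helper)."""
    # Ceiling division: number of internal digits needed for the magnitude.
    ndigits = -(-abs(n).bit_length() // BITS_PER_DIGIT)
    return OVERHEAD + BYTES_PER_DIGIT * ndigits


for value in (2**100, 2**1000):
    print(estimated_int_size(value), sys.getsizeof(value))
```

The estimate is meant for nonzero values only; how zero is stored may differ between CPython versions, so it is left out of the comparison.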

If your numbers are so large that you are concerned about the 6.25% unused bits (2 of every 32 bits in each 4-byte digit), you should probably look at the gmpy2 library. Its internal representation uses all available bits, and computations are significantly faster for large values (say, greater than 100 digits).

Answered By: casevh