Why does Python's zlib.crc32 need a bitwise AND to be considered "stable"?
Question:
According to the following Python documentation:
Changed in version 3.0: Always returns an unsigned value. To generate the same numeric value across all Python versions and platforms, use crc32(data) & 0xffffffff.
Why is it necessary to perform a bitwise AND with a binary number set to all 1’s? Won’t the result always be the same binary number regardless (since any binary number ANDed with all 1’s will yield the same binary number)? This seemingly arbitrary detail makes me feel unsure of my understanding.
Answers:
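The bitwise & converts a possibly negative (signed) result into the equivalent unsigned 32-bit value. Python integers have arbitrary precision, and a negative integer behaves as if it had an infinite run of leading 1 bits, so ANDing with 0xffffffff is not a no-op: it keeps only the low 32 bits and yields a non-negative number. Consider, for example: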
>>> -2 & 0xffffffff
4294967294
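To make the masking step concrete, here is a minimal sketch. The helper name to_unsigned32 is hypothetical and used only for illustration; it shows that a value already in the unsigned 32-bit range passes through unchanged, while a negative value is mapped to its 32-bit two's-complement bit pattern.

```python
def to_unsigned32(value):
    """Return the low 32 bits of value as a non-negative int.

    to_unsigned32 is a hypothetical helper name, not part of zlib.
    """
    return value & 0xFFFFFFFF


# A value that already fits in 32 unsigned bits is unchanged.
print(to_unsigned32(2445345482))   # 2445345482

# A negative value is equivalent to adding 2**32 to it.
print(to_unsigned32(-2))           # 4294967294
print(to_unsigned32(-1849621814))  # 2445345482
```

For any negative value in the signed 32-bit range, the mask gives the same result as adding 2**32.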
Based on the note you quoted, Python 3.0 changed crc32() to always return an unsigned value, while in earlier versions the function could return a signed (possibly negative) value.
For example, in Python 2.7.11:
>>> from zlib import crc32
>>> crc32('the quick brown fox')
-1849621814
But in Python 3.4.3:
>>> from zlib import crc32
>>> crc32(b'the quick brown fox')
2445345482
So the raw return values differ, but if you apply the bitwise AND in Python 2.7, you get the same value as in Python 3.4:
>>> crc32('the quick brown fox') & 0xffffffff
2445345482
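If you need a checksum that compares equal across versions, one option is to wrap the call once and always apply the mask. This is only a sketch: stable_crc32 is a hypothetical name, and it assumes the input is already a bytes object (which Python 3 requires).

```python
import zlib


def stable_crc32(data):
    """CRC-32 of data as a non-negative int in the range 0..0xFFFFFFFF.

    stable_crc32 is a hypothetical wrapper name. The mask is a no-op on
    Python 3 and normalizes signed results from older versions.
    """
    return zlib.crc32(data) & 0xFFFFFFFF


checksum = stable_crc32(b'the quick brown fox')
print(checksum)
print(0 <= checksum <= 0xFFFFFFFF)  # True
```

Applying the mask unconditionally costs almost nothing and avoids branching on the interpreter version.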