Getting the position of 1-bits in a python Long object

Question:

Let’s say I have a very very large python integer, in python 2.7 (though if I need to, I don’t mind switching to python 3).

Something bigger than say, 2^100000.

What is the fastest way by which I can find the positions of all 1s in it’s binary sequence? (example: 24 would be 11000 —> = [4,5] (or [5,4].. I don’t care about order)

At the moment I am using:

sum = whatever_starting_number

while 1:
    val = sum.bit_length()-1
    sum -= 2**val
    mylist.append(val)
    if sum == 0:
        break

this is alright, but it’s barely even faster than just taking log2 and subtracting that repeatedly. What I would like to actually do is just look at the bits, skip zeros, record positions of 1s, and not even need to modify the original value.

edit: getting multiple answers, really appreciate it. I’ll implement them in a couple timeit tests and this will be updated with results by tomorrow.

Asked By: Travis Black

||

Answers:

You can use bitwise operators.

one_bit_indexes = []
index = 0
while sum: # returns true if sum is non-zero
    if sum & 1: # returns true if right-most bit is 1
        one_bit_indexes.append(index)
    sum >>= 1 # discard the right-most bit
    index += 1

Haven’t tested this, but pretty sure that it will work. Bitwise operations are fast, so this should also be more efficient than calculating and subtracting powers of 2. (Unless your Python interpreter is already doing something smart like transforming your code to replace powers of 2 with bitwise operations).

edit: to make it work for negative numbers, you’ll have to take the absolute value of ‘sum’, first.

Answered By: Pig

Maybe this goes fast enough:

mylist = [i for i in range(sum.bit_length()) if sum & (1<<i)]
Answered By: Robᵩ

Probably not the fastest solution, but fairly simple and seems fast enough (2^1M was instant).

bits = []
for i, c in enumerate(bin(2**1000000)[:1:-1], 1):
    if c == '1':
        bits.append(i)

Just in case the [:1:-1] wasn’t clear, it is called "extended slicing", more info here: https://docs.python.org/2/whatsnew/2.3.html#extended-slices.

Edit: I decided to time the 3 solutions posted in answers, here are the results:

import timeit


def fcn1():
    sum = 3**100000
    one_bit_indexes = []
    index = 0
    while sum: # returns true if sum is non-zero
        if sum & 1: # returns true if right-most bit is 1
            one_bit_indexes.append(index)
        sum >>= 1 # discard the right-most bit
        index += 1
    return one_bit_indexes


def fcn2():
    number = 3**100000
    bits = []
    for i, c in enumerate(bin(number)[:1:-1], 1):
        if c == '1':
            bits.append(i)
    return bits


def fcn3():
    sum = 3**100000
    return [i for i in range(sum.bit_length()) if sum & (1<<i)]


print(timeit.timeit(fcn1, number=1))
print(timeit.timeit(fcn2, number=1))
print(timeit.timeit(fcn3, number=1))

For 3^100k:
fcn1: 0.7462488659657538
fcn2: 0.02108444197801873
fcn3: 0.40482770901871845

For 3^1M:
fcn1: 70.9139410170028
fcn2: 0.20711017202120274
fcn3: 43.36111917096423

Answered By: Joe Samanek