Find the run_length of a series of number digits and output as tuple of (digit, count)

Question

Given an integer, I need to create a list of tuples in which the first entry of each tuple is a digit of the number and the second is the number of times that digit repeats consecutively. The number should be read from the left, and the order of the digits matters. E.g. 11122311 would lead to [('1', 3), ('2', 2), ('3', 1), ('1', 2)].

I do not want to use a built-in function such as groupby.

I am trying to iterate through the characters of the string for as long as they are the same, then cut those digits off and iterate again until the remaining string has length zero. Unfortunately, I cannot get this to work. Any help is appreciated. Thanks.

def compress(n):
    L = []
    while len(str(n)) != 0:
        for i in range(len(str(n))):
            for j in range(len(str(n))):
                if str(n)[i] == str(n)[i+j]:
                    L.append((str(n)[i],j)) 
                    str(n) = str(n)[j:]
    return L            

print(compress(11122))

Asked By: user249018

Answer 1

You can create a simple run-length encoder by remembering the previous character and keeping a count of the current "run", i.e. how many identical characters you've seen before the character changes.

def compress_rle(s):
    compressed = []
    
    if not s:
        return compressed
        
    previous_character = s[0]
    run = 0
    
    for character in s:
        if character != previous_character:
            compressed.append((previous_character, run))
            run = 0

        run += 1
        previous_character = character
        
    compressed.append((previous_character, run))
    return compressed

print(compress_rle('11122311'))

This outputs the same as you gave in your question:

[('1', 3), ('2', 2), ('3', 1), ('1', 2)]
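
The question starts from an integer rather than a string, so the function above needs `str(n)` applied to the input first. A minimal sketch, assuming non-negative integers; `compress_int` and `expand` are hypothetical helper names that do not appear in the answer, and the encoder is restated so the block runs on its own:

```python
def compress_rle(s):
    """Run-length encode a string into (character, count) tuples."""
    compressed = []
    if not s:
        return compressed

    previous_character = s[0]
    run = 0
    for character in s:
        if character != previous_character:
            compressed.append((previous_character, run))
            run = 0
        run += 1
        previous_character = character

    compressed.append((previous_character, run))
    return compressed


def compress_int(n):
    # Hypothetical wrapper: assumes n is a non-negative integer.
    return compress_rle(str(n))


def expand(pairs):
    # Hypothetical inverse: rebuild the digit string from the runs.
    return "".join(digit * count for digit, count in pairs)


print(compress_int(11122311))          # [('1', 3), ('2', 2), ('3', 1), ('1', 2)]
print(expand(compress_int(11122311)))  # 11122311
```

Expanding the runs back into the original digit string is a quick way to check that no digit was lost or miscounted.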

Answered By: MatsLindh

Answer 2

You can convert your input to a string and then count how often each digit occurs. This is probably the cleanest way to get the overall digit frequencies.

def compress(n):
    n = str(n)
    return [(n.count(c), c) for c in sorted(set(n))]
   
print(compress(11122))
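
Note that the snippet above counts how often each digit occurs in the whole number and returns (count, digit) pairs in sorted digit order. For 11122 that happens to line up with the runs, but for the question's example 11122311 it does not. A small sketch contrasting the two, where `digit_totals` and `digit_runs` are hypothetical names chosen for illustration:

```python
def digit_totals(n):
    # Same idea as the snippet above: total occurrences per distinct digit,
    # in sorted digit order, as (count, digit) pairs.
    s = str(n)
    return [(s.count(c), c) for c in sorted(set(s))]


def digit_runs(n):
    # Consecutive runs in order of appearance, as (digit, count) pairs.
    runs = []
    for c in str(n):
        if runs and runs[-1][0] == c:
            runs[-1] = (c, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((c, 1))              # start a new run
    return runs


print(digit_totals(11122311))  # [(5, '1'), (2, '2'), (1, '3')]
print(digit_runs(11122311))    # [('1', 3), ('2', 2), ('3', 1), ('1', 2)]
```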

Answered By: Flavio Adamo

Answer 3

One approach using while to handle the iteration better:

def compress(n):
    res = []
    length, index = len(n), 0

    while index < length:
        last = n[index]

        # count how many consecutive characters equal the first one of this run
        run_length = 0
        while index + run_length < length and n[index + run_length] == last:
            run_length += 1

        # append the run
        res.append((last, run_length))

        # move index forward
        index += run_length

    return res

print(compress("11122311"))

Output

[('1', 3), ('2', 2), ('3', 1), ('1', 2)]

Answered By: Dani Mesejo

Answer 4

Run-length encoding by using tuples to track run count.

Code

def compress(n):
    s, runs = str(n), []
    for c in s:
        if not runs or runs[-1][0] != c:  # different digit -> start a new run
            runs.append((c, 0))           # new (digit, count) tuple with count 0
        runs[-1] = (runs[-1][0], runs[-1][1] + 1)  # increment the current run's count

    return runs

Test

print(compress(11122311))  # Out: [('1', 3), ('2', 2), ('3', 1), ('1', 2)]
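
Because tuples are immutable, the code above builds a new tuple for every digit it reads. One possible variation, assuming the same input and output, keeps each run as a two-item list while counting and converts to tuples only at the end; `compress_with_lists` is a hypothetical name, not part of the answer:

```python
def compress_with_lists(n):
    runs = []  # each entry is a mutable [digit, count] pair
    for c in str(n):
        if runs and runs[-1][0] == c:
            runs[-1][1] += 1     # same digit: extend the current run in place
        else:
            runs.append([c, 1])  # different digit: start a new run
    return [tuple(run) for run in runs]


print(compress_with_lists(11122311))  # [('1', 3), ('2', 2), ('3', 1), ('1', 2)]
```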

Answered By: DarrylG
