Find the run length of each series of repeated digits in a number and output it as tuples of (digit, count)

Question:

Given an integer, I need to create a list of tuples in which the first entry of each tuple is a digit of the number and the second is how many times that digit repeats consecutively. The number should be processed from the left, and the order of the digits matters. E.g. 11122311 would lead to [('1', 3), ('2', 2), ('3', 1), ('1', 2)].

I do not want to use a ready-made library function such as groupby.

I am trying to iterate through the characters of the string for as long as they are the same, then cut those digits off and iterate again until the string has length zero. Unfortunately, I cannot get this to work. Any help is appreciated. Thanks.

def compress(n):
    L = []
    while len(str(n)) != 0:
        for i in range(len(str(n))):
            for j in range(len(str(n))):
                if str(n)[i] == str(n)[i+j]:
                    L.append((str(n)[i],j)) 
                    str(n) = str(n)[j:]
    return L            

print(compress(11122))
Asked By: user249018

||

Answers:

You can create a simple run-length encoder by tracking the previous character and a count of the current "run", i.e. how many times that character has appeared consecutively before it changes.

def compress_rle(s):
    compressed = []
    
    if not s:
        return compressed
        
    previous_character = s[0]
    run = 0
    
    for character in s:
        if character != previous_character:
            compressed.append((previous_character, run))
            run = 0

        run += 1
        previous_character = character
        
    compressed.append((previous_character, run))
    return compressed

print(compress_rle('11122311'))

This outputs the same as you gave in your question:

[('1', 3), ('2', 2), ('3', 1), ('1', 2)]
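
Note that the function above expects a string, while the question starts from an integer. A minimal sketch of the same previous-character technique taking an integer directly could look like this (compress_int is a hypothetical name, and a non-negative integer is assumed, since a minus sign would otherwise be counted as a "digit"):

```python
def compress_int(n):
    """Run-length encode the decimal digits of a non-negative integer."""
    digits = str(n)  # never empty for an integer, so digits[0] is safe
    runs = []
    previous, run = digits[0], 0

    for d in digits:
        if d != previous:
            # The digit changed: store the finished run and start a new one
            runs.append((previous, run))
            previous, run = d, 0
        run += 1

    runs.append((previous, run))  # store the final run
    return runs


print(compress_int(11122311))
```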
Answered By: MatsLindh

You can convert your input to a string and then count how often each digit occurs in the whole number. This is probably the cleanest way.

def compress(n):
    n = str(n)
    return [(n.count(c), c) for c in sorted(set(n))]
   
print(compress(11122))
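
One thing to watch out for: counting per distinct digit only matches the run lengths when no digit comes back later in the number, and the tuples here are (count, digit) rather than (digit, count). A small sketch comparing the two on the question's example (count_digits and run_lengths are illustrative names, not from the answers):

```python
def count_digits(n):
    # Total occurrences of each distinct digit, in sorted digit order
    s = str(n)
    return [(s.count(c), c) for c in sorted(set(s))]


def run_lengths(n):
    # Consecutive runs, left to right, as (digit, count)
    runs = []
    for c in str(n):
        if runs and runs[-1][0] == c:
            runs[-1] = (c, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((c, 1))              # start a new run
    return runs


print(count_digits(11122311))  # [(5, '1'), (2, '2'), (1, '3')]
print(run_lengths(11122311))   # [('1', 3), ('2', 2), ('3', 1), ('1', 2)]
```

The two trailing 1s are merged into the first group by the counting version, so the order information the question asks for is lost.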
Answered By: Flavio Adamo

One approach is to use while loops, which make it easier to control how far the index advances:

def compress(n):
    res = []
    length, index = len(n), 0

    while index < length:
        last = n[index]

        # count how many consecutive values equal the current one
        run_length = 0
        while index + run_length < length and n[index + run_length] == last:
            run_length += 1

        # append the run
        res.append((last, run_length))

        # move index forward
        index += run_length

    return res

print(compress("11122311"))

Output

[('1', 3), ('2', 2), ('3', 1), ('1', 2)]
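
The same idea can also be written in the "cut the digits off" style described in the question: count the leading run, slice it off the string, and repeat until nothing is left. This is only a sketch (compress_by_slicing is a made-up name), and the repeated slicing copies the string each time, which is fine for numbers but slower for very long inputs:

```python
def compress_by_slicing(n):
    s = str(n)  # accepts an integer or a string of digits
    runs = []

    while s:
        first = s[0]
        count = 1
        # Extend the run while the next character matches the first one
        while count < len(s) and s[count] == first:
            count += 1

        runs.append((first, count))
        s = s[count:]  # cut the run off the front and continue

    return runs


print(compress_by_slicing(11122311))
```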
Answered By: Dani Mesejo

Run-length encoding, using (digit, count) tuples to track each run.

Code

def compress(n):
    s, runs = str(n), []
    for c in s:
        if not runs or runs[-1][0] != c:  # first digit, or digit changed -> start a new run
            runs.append((c, 0))           # new (digit, count) tuple, count starts at 0
        runs[-1] = (runs[-1][0], runs[-1][1] + 1)  # increment the current run's count

    return runs

Test

print(compress(11122311))  # Out: [('1', 3), ('2', 2), ('3', 1), ('1', 2)]
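
Even though the question rules out groupby for the solution itself, it is handy as a reference to test a hand-written version against. A rough sketch, where rle_reference is a hypothetical helper and compress is a compact stand-in for the hand-written function:

```python
from itertools import groupby


def rle_reference(n):
    # Reference result: groupby yields one group per run of equal characters
    return [(digit, len(list(group))) for digit, group in groupby(str(n))]


def compress(n):
    # Compact stand-in for a hand-written run-length encoder
    runs = []
    for c in str(n):
        if not runs or runs[-1][0] != c:
            runs.append((c, 0))
        runs[-1] = (c, runs[-1][1] + 1)
    return runs


# Compare both implementations on a few sample values
for value in (0, 7, 11122, 11122311, 1000001):
    assert compress(value) == rle_reference(value), value
```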
Answered By: DarrylG