Find the run_length of a series of number digits and output as tuple of (digit, count)
Question:
Given an integer, I need to create a list of tuples such that in each tuple the first entry is a digit of the number and the second one its frequency. This should be done from the left of the number, and the order of the digits is important. E.g. 11122311 would lead to [('1', 3), ('2', 2), ('3', 1), ('1', 2)].
I do not want to use an inbuilt function such as groupby.
I am trying to iterate through the elements of a string as long as they are the same, then cut these digits and iterate again until the string has length zero. Unfortunately, I cannot implement this successfully. Any help is appreciated. Thanks.
def compress(n):
    L = []
    while len(str(n)) != 0:
        for i in range(len(str(n))):
            for j in range(len(str(n))):
                if str(n)[i] == str(n)[i+j]:
                    L.append((str(n)[i],j))
                    str(n) = str(n)[j:]
    return L

print(compress(11122))
Answers:
You can create a simple run length encoder by keeping the previous character and keeping a count of the "run" – i.e. how many characters you’ve seen before it changes.
def compress_rle(s):
    compressed = []
    if not s:
        return compressed
    previous_character = s[0]
    run = 0
    for character in s:
        if character != previous_character:
            compressed.append((previous_character, run))
            run = 0
        run += 1
        previous_character = character
    compressed.append((previous_character, run))
    return compressed

print(compress_rle('11122311'))
This outputs the same as you gave in your question:
[('1', 3), ('2', 2), ('3', 1), ('1', 2)]
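The question rules out groupby for the solution itself, but it can still serve as an independent check on a hand-written encoder. A minimal sketch, assuming a hypothetical helper name rle_reference that is not part of the original answer:

```python
from itertools import groupby

def rle_reference(n):
    # groupby yields one (key, group) pair per run of equal consecutive items
    return [(digit, len(list(group))) for digit, group in groupby(str(n))]

print(rle_reference(11122311))
```

Comparing its result with the output of a manual implementation on a few inputs is a quick way to catch off-by-one errors in the run count.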
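You can also convert your input to a string and count each digit. This is probably the shortest way, but note that it returns the total frequency of each digit as (count, digit) pairs in sorted order, not consecutive runs: it prints [(3, '1'), (2, '2')] for 11122, and for 11122311 it would give [(5, '1'), (2, '2'), (1, '3')] rather than the list asked for.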
def compress(n):
    n = str(n)
    return [(n.count(c), c) for c in sorted(set(n))]

print(compress(11122))
One approach using while to handle the iteration better:
def compress(n):
    res = []
    length, index = len(n), 0
    while index < length:
        last = n[index]
        # while the current value is equal to the last iterate
        run_length = 0
        while index + run_length < length and n[index + run_length] == last:
            run_length += 1
        # append the run
        res.append((last, run_length))
        # move index forward
        index += run_length
    return res

print(compress("11122311"))
Output
[('1', 3), ('2', 2), ('3', 1), ('1', 2)]
Run-length encoding by using tuples to track run count.
Code
def compress(n):
    s, runs = str(n), []
    for c in s:
        if not runs or runs[-1][0] != c:  # different digit -> start a new run
            runs.append((c, 0))  # new run tuple with a count of zero
        runs[-1] = (runs[-1][0], runs[-1][1] + 1)  # increment run count
    return runs
Test
print(compress(11122311)) # Out: [('1', 3), ('2', 2), ('3', 1), ('1', 2)]
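The strategy described in the question (count the leading run, cut those digits off, repeat until nothing is left) can also be made to work directly. A minimal sketch, assuming the input is an integer or a digit string; the name compress_by_cutting is hypothetical:

```python
def compress_by_cutting(n):
    s = str(n)  # convert once and work on the string copy
    runs = []
    while s:  # stop once every digit has been consumed
        first = s[0]
        j = 1
        # advance j while the following digits still match the first one
        while j < len(s) and s[j] == first:
            j += 1
        runs.append((first, j))
        s = s[j:]  # cut off the run that was just counted
    return runs

print(compress_by_cutting(11122311))  # [('1', 3), ('2', 2), ('3', 1), ('1', 2)]
```

The main differences from the attempt in the question are that the string is stored in a variable (str(n) cannot be assigned to) and that the inner loop stops at the first digit that differs instead of comparing every pair of positions.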