Simplification of fractions of unit symbols: wrong results

Question:

I’m writing a function that simplifies fractions with symbols (like "kg") without numbers (coefficients).

The function takes two arguments: the first list has all the units referred to in the numerator, the second lists the units referred to in the denominator.

I think I’ve already completed most of the function, but I don’t understand why it doesn’t pass some of the tests.

For example, when the denominator and the numerator are equal it is not returning (['1'], ['1']) as expected, and another example is when the function in one for the tests returns (['1', '1', 's'], ['s']) instead of (['1'], ['1']).

My code:

def canonical(unit):
    numerador = unit[0]
    denominador = unit[1]
    numerador.sort()
    denominador.sort()
    
    lstNumerador = numerador
    lstDenominador = denominador
    
    for i in numerador:
        for j in denominador:
            if i == j:
                lstNumerador.remove(i)
                lstDenominador.remove(j)
                
    
    if lstNumerador.count('1') > 0:
        lstNumerador.remove('1')
        
    if lstDenominador.count('1') > 0:
        lstDenominador.remove('1')
        
    if len(lstNumerador) == 0:
        lstNumerador.append('1')
        
    if len(lstDenominador) == 0:
        lstDenominador.append('1')
        
    unit = (lstNumerador, lstDenominador)
    return unit

# What I expect

unit = (['kg', 'm', 'm', 's'], ['m', 's', 's'])
print(canonical(unit))
#(['kg', 'm'], ['s'])
unit = (['kg', 'm', 'm', 's'], ['s', 'm', 'm', 'kg'])
print(canonical(unit))
#(['1'], ['1'])
unit = (['s', 'kg', 'm', '1'], ['1', '1'])
print(canonical(unit))
#(['kg', 'm', 's'], ['1'])
unit = (['1', 'm', 's', '1', '1'], ['m', 's'])
print(canonical(unit))
#(['1'], ['1'])
unit = (['kg', 'm', 'm'], ['kg', 'm', 'm'])
print(canonical(unit))
#(['1'], ['1'])

# What I received instead

(['kg', 'm'], ['s'])
(['m', 's'], ['m', 's'])
(['kg', 'm', 's'], ['1'])
(['1', '1', 's'], ['s'])
(['m'], ['m'])
Asked By: Tumes

||

Answers:

Note: i left denominator empty instead of adding one to it

def canonical(unit):
    numerador = sorted([i for i in unit[0] if i!='1'])
    denominador = sorted([i for i in unit[1] if i!='1'])


    numCount= {i : numerador.count(i)-denominador.count(i) for i in set(numerador +denominador)}
    
    numerador=[]
    denominador= []

    for num,count in numCount.items():
        if count>0:
            numerador+=[num]*count
        elif count<0:
            denominador+= [num]*-count

    if len(numerador) == 0:
        numerador=['1']
    if len(denominador) == 0:
        denominador=['1']
    
    return (numerador, denominador)
Answered By: YellowFlash

See copy below. I created a deepcopy of your unit so it is impregnable to iterate/remove and also corrected the logic which was wrong in a few places. This gives the output you expect.

import copy

def canonical(unit):
    numerador = unit[0]
    denominador = unit[1]
    numerador.sort()
    denominador.sort()
    
    lstNumerador = copy.deepcopy(numerador)
    lstDenominador = copy.deepcopy(denominador)
    
    for i in numerador:
        for j in denominador:
            if (i in lstNumerador) and (i in lstDenominador):
                    lstNumerador.remove(i)
                    lstDenominador.remove(i)
                
    
    while lstNumerador.count('1') > 1:
        lstNumerador.remove('1')
        
    while lstDenominador.count('1') > 1:
        lstDenominador.remove('1')
        
    if len(lstNumerador) == 0:
        lstNumerador.append('1')
        
    if len(lstDenominador) == 0:
        lstDenominador.append('1')
        
    unit = (lstNumerador, lstDenominador)
    return unit

unit1 = (['kg', 'm', 'm', 's'], ['m', 's', 's'])

unit2 = (['kg', 'm', 'm', 's'], ['s', 'm', 'm', 'kg'])

unit3 = (['s', 'kg', 'm', '1'], ['1', '1'])

unit4 = (['1', 'm', 's', '1', '1'], ['m', 's'])

unit5 = (['kg', 'm', 'm'], ['kg', 'm', 'm'])

units = [unit1, unit2, unit3, unit4, unit5]

for entry in units:
    print(canonical(entry))

output:

(['kg', 'm'], ['s'])
(['1'], ['1'])
(['kg', 'm', 's'], ['1'])
(['1'], ['1'])
(['1'], ['1'])
Answered By: user19077881

The main problem is that you call remove on a list that you are iterating. That is rarely a good idea, because it "pulls the rug" under the iterator, which thereby may skip some values.

Secondly, that nested loop is generally not efficient. This can better be done with dictionaries.

This code challenge is actually an ideal candidate for Counter dictionaries. Besides that the constructor creates a dictionary with a count per element, it also allows subtraction of two such Counter objects, and negation of a Counter object with the unary operator. You can easily get the individual (repeated) elements again with the elements method. All this is very handy to achieve the desired result in an efficient way.

Here is how that would look:

from collections import Counter

def canonical(unit):
    diff = Counter(unit[0])
    diff.subtract(unit[1])  # May produce negative frequencies
    del diff["1"]  # NB: Counter objects don't raise "key error"
    return list(diff.elements()) or ["1"], list((-diff).elements()) or ["1"]
Answered By: trincot