Problem using function that uses boolean verification (string) to create a new column in pandas

Question

I have a database (df2) with the following structure to calculate profitability (or gains) for Brazilian fixed income assets:

df2:
enter image description here

For every type of assets (described in the column "Tipo") I need to make a different calculation to calculate their gains. Eg.: if its a CDB% the calculation is one, if CDBIPCA is another, etc.

So I build a function "rentabilidade" which checks what type of fixed income in the column "Tipo" and perform the calculation accordingly.

The function is below:

def rentabilidade(tipo, taxa, dtAplic, dtResg, cnpj):
    if tipo == 'Caixa':
        rentAtivo = 0
    elif tipo.item == "CDB%":
        rentAtivo = rent_cdbper(taxa)
    elif tipo.item == 'CDBPre':
        rentAtivo = rent_pre(taxa)
    elif tipo.item == 'CDBIPCA':
        rentAtivo = rent_cdbipca(taxa)
    elif tipo.item == 'CDB+':
        rentAtivo = rent_cdimais(taxa)
    elif tipo.item == 'LetraPre':
        rentAtivo = rent_letra_pre(taxa, dtAplic, dtResg)
    elif tipo.item == 'LetraIPCA':
        rentAtivo = rent_letra_ipca(taxa, dtAplic, dtResg)
    elif tipo.item == 'Letra%':
        rentAtivo = rent_letra_per(taxa, dtAplic, dtResg)
    elif tipo.item == 'Fundos':
        rentAtivo = rent_fundo(cnpj)
    else:
        rentAtivo = "Error"
        
    return rentAtivo

My goal is to create a new column "rentabilidade" with all gains calculated row by row.

However when I run the following code:

df2["rentabilidade"] = rentabilidade(df2["Tipo"], df2["Taxa"], df2["Aplicação"], df2["Vencimento"], df2["CNPJ_Emissor"])

I get this error:

ValueError: The truth value of a Series is ambiguous. 
Use a.empty, a.bool(), a.item(), a.any() or a.all().

I believe the python code is comparing the entire series to the value in the function instead of doing one by one.

I was expecting to have a column with each value calculated accordingly to the type described in the column "Tipo" (all strings).

Asked By: Dico Gomes

||

Source

Answer 1

In order to apply a function to all rows of a dataframe, you must use a lambda function. In your case:

df2["rentabilidade"] = df2.apply(lambda x: rentabilidade(x["Tipo"], x["Taxa"], x["Aplicação"], x["Vencimento"], x["CNPJ_Emissor"]), axis=1)

Answered By: Galo do Leste

Problem using function that uses boolean verification (string) to create a new column in pandas

Question:

Answers: