How to shorten if statement code in Python

Question:

I have a Pandas DataFrame, didiem, with the columns group_kh_ten, dauvao_overall and giohoc.

I’m attempting to create a new column named Needed using the code below. The rule is:

In case of "KHOÁ NHÓM", for EVERY 25 giohoc, Needed = dauvao_overall + 0.5.

In case of "KHOÁ KÈM", for EVERY 20 giohoc, Needed = dauvao_overall + 0.5.

My idea is to divide giohoc by 25 for "KHOÁ NHÓM" and 20 for "KHOÁ KÈM".

If the result < 1 then Needed = dauvao_overall.

If the result >=1 and <2 then Needed = dauvao_overall + 0.5.

If the result >=2 and <3 then Needed = dauvao_overall + 1.

And so on, all the way up to Needed = dauvao_overall + 7.

Although I succeeded, I believe there is a shorter and cleaner way to achieve the same result. Please tell me what I can do to improve the code. Thank you!

empty =[]
for index, row in didiem.iterrows():
        # KHOÁ NHÓM
        if row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 < 1:
                empty.append(row.dauvao_overall)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 1 and row.giohoc/25 <2:
                empty.append(row.dauvao_overall + 0.5)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 2 and row.giohoc/25 <3:
                empty.append(row.dauvao_overall + 1)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 3 and row.giohoc/25 <4:
                empty.append(row.dauvao_overall + 1.5)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 4 and row.giohoc/25 <5:
                empty.append(row.dauvao_overall + 2)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 5 and row.giohoc/25 <6:
                empty.append(row.dauvao_overall + 2.5)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 6 and row.giohoc/25 <7:
                empty.append(row.dauvao_overall + 3)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 7 and row.giohoc/25 <8:
                empty.append(row.dauvao_overall + 3.5)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 8 and row.giohoc/25 <9:
                empty.append(row.dauvao_overall + 4.0)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 9 and row.giohoc/25 <10:
                empty.append(row.dauvao_overall + 4.5)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 10 and row.giohoc/25 <11:
                empty.append(row.dauvao_overall + 5)
        elif row.group_kh_ten == "KHOÁ NHÓM" and row.giohoc/25 >= 14 and row.giohoc/25 <15:
                empty.append(row.dauvao_overall + 7.0)
        # KHOÁ KÈM
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 < 1:
                empty.append(row.dauvao_overall) 
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 1 and row.giohoc/20 <2:
                empty.append(row.dauvao_overall + 0.5)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 2 and row.giohoc/20 <3:
                empty.append(row.dauvao_overall + 1)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 3 and row.giohoc/20 <4:
                empty.append(row.dauvao_overall + 1.5) 
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 4 and row.giohoc/20 <5:
                empty.append(row.dauvao_overall + 2)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 5 and row.giohoc/20 <6:
                empty.append(row.dauvao_overall + 2.5) 
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 6 and row.giohoc/20 <7:
                empty.append(row.dauvao_overall + 3)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 7 and row.giohoc/20 <8:
                empty.append(row.dauvao_overall + 3.5) 
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 8 and row.giohoc/20 <9:
                empty.append(row.dauvao_overall + 4.0)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 9 and row.giohoc/20 <10:
                empty.append(row.dauvao_overall + 4.5)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 10 and row.giohoc/20 <11:
                empty.append(row.dauvao_overall + 5.0)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 11 and row.giohoc/20 <12:
                empty.append(row.dauvao_overall + 5.5)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 12 and row.giohoc/20 <13:
                empty.append(row.dauvao_overall + 6.0)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 13 and row.giohoc/20 <14:
                empty.append(row.dauvao_overall + 6.5)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 14 and row.giohoc/20 <15:
                empty.append(row.dauvao_overall + 7.0)
        elif row.group_kh_ten == "KHOÁ KÈM" and row.giohoc/20 >= 15 and row.giohoc/20 <16:
                empty.append(row.dauvao_overall + 7.5)
        else:
                empty.append("inspect")
didiem["Needed"] = empty
Asked By: thanh pham

Answers:

I think this will do what you want (I only solved it for one of your cases…)

import numpy
import pandas

num_rows = 1000
# some random values between 2 and 10 for this column
dauvao_overall = numpy.random.uniform(2, 10, num_rows)
# some random integers from 1 to 199 for this column (the upper bound is exclusive)
giohoc = numpy.random.randint(1, 200, num_rows)
# a random course type for each row
group_kh_ten = numpy.random.choice(["KHOA NHOM", "KHOA KEM"], num_rows)

# make a dataframe
df = pandas.DataFrame({"dauvao_overall": dauvao_overall, "giohoc": giohoc, "group_kh_ten": group_kh_ten})
# float placeholder; KHOA NHOM rows are left at 0.0 in this example
df["needed"] = 0.0

# here is how you would solve KHOA KEM: add 0.5 for every full 20 hours
khoa_kem = df["group_kh_ten"] == "KHOA KEM"
df.loc[khoa_kem, "needed"] = df.loc[khoa_kem, "dauvao_overall"] + 0.5 * (df.loc[khoa_kem, "giohoc"] // 20)
print(df)
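
The same mask idea can be extended to cover both course types in one pass. The following is a minimal sketch, assuming the unaccented labels used in the sample data above and a small made-up DataFrame; `numpy.select` picks the increase for each row from the first mask that matches, and rows matching neither mask become NaN.

```python
import numpy as np
import pandas as pd

# Small hand-made sample (hypothetical values, same column names as above).
df = pd.DataFrame({
    "group_kh_ten": ["KHOA NHOM", "KHOA KEM", "KHOA NHOM"],
    "dauvao_overall": [2.0, 3.5, 4.0],
    "giohoc": [70, 80, 24],
})

is_nhom = df["group_kh_ten"] == "KHOA NHOM"
is_kem = df["group_kh_ten"] == "KHOA KEM"

# Increase per row: 0.5 for every full 25 hours (NHOM) or 20 hours (KEM).
increase = np.select(
    [is_nhom, is_kem],
    [0.5 * (df["giohoc"] // 25), 0.5 * (df["giohoc"] // 20)],
    default=np.nan,  # unknown course types end up as NaN
)

df["needed"] = df["dauvao_overall"] + increase
print(df)
```

Using NaN as the default keeps the column numeric, so unexpected course types can be found later with `df["needed"].isna()`.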
Answered By: Joran Beasley

First, define a function that calculates the Needed value. It receives one DataFrame row and does the calculation for that row.

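def fun(row):
    group_kh, overall, giohoc = [row[col_name]
                                 for col_name in ['group_kh_ten', 'dauvao_overall', 'giohoc']]
    # match/case needs Python 3.10 or later
    match group_kh:

        case 'KHOÁ NHÓM':
            # +0.5 for every full 25 hours
            needed = overall + (giohoc // 25) * 0.5

        case 'KHOÁ KÈM':
            # +0.5 for every full 20 hours
            needed = overall + (giohoc // 20) * 0.5
            if giohoc // 20 >= 16: needed = 'inspect'

        case _:
            # unknown course type: flag the row so that needed is always defined
            print("error: wrong group_kh_ten")
            needed = 'inspect'

    return needed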

Apply the function on each row of the dataframe:

df['Needed'] = df.apply(fun, axis=1)

Example:

    group_kh_ten    dauvao_overall  giohoc
0   KHOÁ NHÓM       2.0             70.0
1   KHOÁ KÈM        3.5             80.0

Apply the function fun:

df['Needed'] = df.apply(fun, axis=1)

Output:

    group_kh_ten    dauvao_overall  giohoc  Needed
0   KHOÁ NHÓM       2.0             70.0    3.0
1   KHOÁ KÈM        3.5             80.0    5.5
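
As a quick sanity check that floor division reproduces the ranges listed in the question, the values on either side of a threshold can be compared. This is only a sketch; the helper name `bonus` is made up for illustration and is not part of the function above.

```python
def bonus(giohoc, hours_per_step):
    """Return the increase earned: 0.5 for every full block of hours."""
    return (giohoc // hours_per_step) * 0.5

# "KHOÁ NHÓM" uses blocks of 25 hours, "KHOÁ KÈM" uses blocks of 20 hours.
nhom = [bonus(hours, 25) for hours in (24, 25, 49, 50)]
kem = [bonus(hours, 20) for hours in (19, 20, 39, 40)]

print(nhom)  # [0.0, 0.5, 0.5, 1.0]
print(kem)   # [0.0, 0.5, 0.5, 1.0]
```

The increase changes exactly at multiples of 25 and 20, which matches the ">= 1 and < 2", ">= 2 and < 3" conditions of the original if-chain.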
Answered By: AndrzejO

Thanks to the formula from @AndrzejO and @Joran Beasley:

needed = overall + (giohoc // 25) * 0.5

I figured out an even shorter way.

empty =[]
for index, row in didiem.iterrows():
        # KHOÁ NHÓM
        if row.group_kh_ten == "KHOÁ NHÓM":
                empty.append(row.dauvao_overall + row.giohoc//25 * 0.5)
        # KHOÁ KÈM
        elif row.group_kh_ten == "KHOÁ KÈM":
                empty.append(row.dauvao_overall + row.giohoc//20 * 0.5)
        else:
                empty.append("Null")
didiem["Needed"] = empty
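
The loop can also be replaced by column-wise operations, which keeps Needed numeric instead of mixing numbers with strings such as "Null". The sketch below is one possible way, not the method from the answers above. The `RULES` table, the sample rows and the `inspect` column are hypothetical, and the step limits (10 and 15) are an assumption read off the if-chain in the question.

```python
import numpy as np
import pandas as pd

# Assumed rules: course type -> (hours per +0.5 step, highest step count allowed).
RULES = {"KHOÁ NHÓM": (25, 10), "KHOÁ KÈM": (20, 15)}

# Made-up sample rows, including one too-large giohoc and one unknown type.
didiem = pd.DataFrame({
    "group_kh_ten": ["KHOÁ NHÓM", "KHOÁ KÈM", "KHOÁ KÈM", "OTHER"],
    "dauvao_overall": [2.0, 3.5, 4.0, 5.0],
    "giohoc": [70, 80, 400, 30],
})

# Look up the rule for each row; unknown course types map to NaN.
hours_per_step = didiem["group_kh_ten"].map({k: v[0] for k, v in RULES.items()})
max_steps = didiem["group_kh_ten"].map({k: v[1] for k, v in RULES.items()})

# Number of completed blocks of hours per row.
steps = didiem["giohoc"] // hours_per_step

# Rows with an unknown course type or too many steps need a manual look.
didiem["inspect"] = hours_per_step.isna() | (steps > max_steps)

# Flagged rows get NaN, so the Needed column stays numeric.
didiem["Needed"] = np.where(
    didiem["inspect"], np.nan, didiem["dauvao_overall"] + 0.5 * steps
)
print(didiem)
```

A separate boolean column makes it easy to filter the rows that need checking with `didiem[didiem["inspect"]]`, while arithmetic on Needed keeps working for the rest.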
Answered By: thanh pham