Python pandas: boolean function in Data Frame
Question:
I am trying to select the rows of data frame df_A whose index values end with 1 or 4 and capture them in another data frame, df_s1s4.
I am given the following hint: “pass a boolean function, which checks if an index string ends with 1 or 4, to .loc or .iloc methods.”
I tried the following but couldn’t get it to work.
import numpy as np
import pandas as pd
heights_A=pd.Series([176.2,158.4,167.6,156.2,161.4], index=['s1','s2','s3','s4','s5'])
weights_A=pd.Series([85.1,90.2,76.8,80.4,78.9], index=['s1','s2','s3','s4','s5'])
df_A=pd.DataFrame({'Student_height':heights_A, 'Student_weight':weights_A})
df_s1s4=df_A.loc[:,df_A.columns.str.endswith('1','4')]
print(df_s1s4)
Can anybody suggest how I might use a boolean function to solve this problem?
Answers:
You can use a boolean array with .loc:
df_s1s4 = df_A.loc[(df_A.index.str.endswith('1') | df_A.index.str.endswith('4'))]
Student_height Student_weight
s1 176.2 85.1
s4 156.2 80.4
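To see why this works, it can help to look at the mask on its own. The sketch below rebuilds df_A from the question and prints the boolean array that .loc receives; the names idx and mask are only illustrative and do not come from the original post.

```python
import pandas as pd

idx = ['s1', 's2', 's3', 's4', 's5']
heights_A = pd.Series([176.2, 158.4, 167.6, 156.2, 161.4], index=idx)
weights_A = pd.Series([85.1, 90.2, 76.8, 80.4, 78.9], index=idx)
df_A = pd.DataFrame({'Student_height': heights_A, 'Student_weight': weights_A})

# One boolean per row label: True where the label ends with '1' or '4'.
mask = df_A.index.str.endswith('1') | df_A.index.str.endswith('4')
print(mask.tolist())

# .loc keeps exactly the rows where the mask is True.
df_s1s4 = df_A.loc[mask]
print(df_s1s4)
```

The mask has one entry per row, in index order, so it lines up with the rows of df_A without any extra alignment step.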
With .iloc, build the list of integer positions from the index labels:
df_s1s4 = df_A.iloc[[i for i, label in enumerate(df_A.index) if label[-1] in ('1', '4')]]
Using a boolean function, you can get the result as follows:
bool_func = lambda x: x.index.str.endswith('1') | x.index.str.endswith('4')
df_s1s4 = df_A.loc[bool_func]
print(df_s1s4)
Result:
Student_height Student_weight
s1 176.2 85.1
s4 156.2 80.4
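The same idea can also be written as a named function rather than a lambda, which is closer to the "boolean function" wording of the hint. This is a minimal sketch: ends_with_1_or_4 is a hypothetical name, and it tests the last character of each label with .str[-1] and isin instead of the endswith calls above, which selects the same rows for this data.

```python
import pandas as pd

idx = ['s1', 's2', 's3', 's4', 's5']
heights_A = pd.Series([176.2, 158.4, 167.6, 156.2, 161.4], index=idx)
weights_A = pd.Series([85.1, 90.2, 76.8, 80.4, 78.9], index=idx)
df_A = pd.DataFrame({'Student_height': heights_A, 'Student_weight': weights_A})


def ends_with_1_or_4(df):
    """Return one boolean per row: True if the index label ends with '1' or '4'."""
    # .str[-1] takes the last character of each label; isin tests membership.
    return df.index.str[-1].isin(['1', '4'])


# .loc calls the function with df_A itself and uses the returned mask.
df_s1s4 = df_A.loc[ends_with_1_or_4]
print(df_s1s4)
```

Passing the function itself, without calling it, lets .loc supply the data frame, so the same function can be reused on any frame with string index labels.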
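Use this. The suffixes must be passed as a single tuple, which recent pandas versions accept; a second positional argument is read as the na fill value, not as another suffix:
df_s1s4=df_A.loc[df_A.index.str.endswith(('1','4')),:]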