Tuple unpacking and np.max() giving unexpected values
Question:
I have a dataset that I am trying to filter, applying an adjustment value to the matching rows from a list of tuples.
import numpy as np
import pandas as pd
df = pd.DataFrame({
    'loannum': ['1', '2', '3', '4'],
    'or_dep': [250000, 650000, 1000000, 300000]
})
loan2adj = [('1', 50000), ('3', 250000), ('2', 100000)]
My expected output looks like this.
loannum or_dep
1 200000
2 550000
3 750000
4 300000
This is the logic I’m using to unpack the tuples and apply the adjustment value.
for loan, adj_amt in loan2adj:
    df.loc[df['loannum'] == loan, 'or_dep'] = np.max(df['or_dep'] - adj_amt, 0)
This code produces some unusual values.
loannum or_dep
1 950000
2 550000
3 750000
4 300000
Loans 3 and 4 are being returned correctly. Loan 4 should not have an adjustment and loan 3 is being adjusted correctly. How can I achieve the desired output?
Answers:
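The problem is that np.max(df['or_dep'] - adj_amt, 0) is computed over the whole or_dep column, so you are not selecting the wanted row. Note also that the second argument of np.max is the axis, not a lower bound, so it never floors the result at 0; np.maximum is the element-wise comparison you want.
To fix it, select the row on the right-hand side as well:
for loan, adj_amt in loan2adj:
    df.loc[df['loannum'] == loan, 'or_dep'] = np.maximum(df.loc[df['loannum'] == loan, 'or_dep'] - adj_amt, 0)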
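The difference between np.max and np.maximum is easy to miss, so here is a minimal sketch of it. The Series s and the subtracted amounts are made up for illustration and are not part of the original data flow.

```python
import numpy as np
import pandas as pd

# Illustrative values only (same numbers as the or_dep column).
s = pd.Series([250000, 650000, 1000000, 300000])

# np.max(x, 0): the 0 is interpreted as axis=0, so this reduces the
# whole Series to its single largest value.
reduced = np.max(s - 50000, 0)

# np.maximum(x, 0): compares each element with 0 and keeps the shape,
# which is what "floor at zero" needs.
clamped = np.maximum(s - 400000, 0)

print(reduced)           # a single scalar
print(clamped.tolist())  # one value per row, negatives replaced by 0
```

This is why assigning the result of np.max over the full column writes the column-wide maximum into the selected row.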
You should probably use the loannum column as the index to simplify your loc calls (this assumes loan numbers are unique, so each lookup returns a single value):
df.set_index("loannum", inplace=True)
for loan, adj_amt in loan2adj:
    df.loc[loan, 'or_dep'] = max(df.loc[loan, 'or_dep'] - adj_amt, 0)
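If the loop itself is not required, the same adjustment can probably be done in one vectorized pass. The sketch below is an alternative, not the approach from the answer above. It rebuilds the question's df with loannum as a regular column and assumes each loan number appears at most once in loan2adj; the name adj is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'loannum': ['1', '2', '3', '4'],
    'or_dep': [250000, 650000, 1000000, 300000],
})
loan2adj = [('1', 50000), ('3', 250000), ('2', 100000)]

# Look up each loan's adjustment; loans with no entry get 0.
adj = df['loannum'].map(dict(loan2adj)).fillna(0)

# Subtract and floor at zero, then restore the integer dtype
# (the NaN from the lookup made the intermediate result float).
df['or_dep'] = (df['or_dep'] - adj).clip(lower=0).astype(int)

print(df)
```

Because every row is adjusted from its own original value, the order of the tuples in loan2adj no longer affects the result.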