Tuple unpacking and np.max() giving unexpected values

Question:

I have a dataset that I am trying to filter so that I can apply an adjustment value to specific rows, using a list of tuples.

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'loannum': ['1', '2', '3', '4'],
    'or_dep': [250000, 650000, 1000000, 300000]
})

loan2adj = [('1', 50000), ('3', 250000), ('2', 100000)]

My expected output looks like this.

loannum      or_dep
1            200000
2            550000
3            750000
4            300000

This is the logic I’m using to unpack the tuples and apply the adjustment value.

for loan, adj_amt in loan2adj:
    df.loc[df['loannum'] == loan, 'or_dep'] = np.max(df['or_dep'] - adj_amt, 0)

This code produces some unusual values.

loannum     or_dep
1           950000
2           550000
3           750000
4           300000

Loans 3 and 4 are being returned correctly. Loan 4 should not have an adjustment and loan 3 is being adjusted correctly. How can I achieve the desired output?

Asked By: gernworm

Answers:

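The problem is that np.max(df['or_dep'] - adj_amt, 0) is computed over the whole or_dep column, not just the row you want to update. The second argument of np.max is the axis, not a lower bound, so the call returns the single largest value of the column after the subtraction. In the first iteration that is 1000000 - 50000 = 950000, which is then assigned to loan 1.

To fix it, select the matching row on the right-hand side as well, and use np.maximum, which compares element-wise against 0:

for loan, adj_amt in loan2adj:
    df.loc[df['loannum'] == loan, 'or_dep'] = np.maximum(df.loc[df['loannum'] == loan, 'or_dep'] - adj_amt, 0)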
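
The behaviour of the second argument is easy to check in isolation. Here is a minimal sketch using the or_dep values from the question. The second positional argument of np.max is axis, so the call reduces the whole Series to one number, while np.maximum compares each element against 0. The names shifted, reduced and floored, and the 500000 offset, are made up for illustration.

```python
import numpy as np
import pandas as pd

or_dep = pd.Series([250000, 650000, 1000000, 300000])
shifted = or_dep - 500000  # some entries are now negative

# np.max(x, 0) means "maximum along axis 0": one value for the whole Series.
reduced = np.max(shifted, 0)

# np.maximum(x, 0) is element-wise: every negative entry is floored at 0.
floored = np.maximum(shifted, 0)

print(reduced)           # 500000
print(floored.tolist())  # [0, 150000, 500000, 0]
```

So np.max(..., 0) never floors anything at zero. It only looks correct when the selection happens to contain a single non-negative value.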

You should probably use the loannum column as the index to simplify your .loc calls. With a unique index, df.loc[loan, 'or_dep'] returns a scalar, so the built-in max works:

df.set_index("loannum", inplace=True)
for loan, adj_amt in loan2adj:
    df.loc[loan, 'or_dep'] = max(df.loc[loan, 'or_dep'] - adj_amt, 0)
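
If the loop is not needed, the same result can probably be reached without it. The following is only a sketch of a different technique (Series.map plus clip), not the approach described above. It assumes each loan number appears at most once in loan2adj. The names adj_by_loan and adjustment are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'loannum': ['1', '2', '3', '4'],
    'or_dep': [250000, 650000, 1000000, 300000]
})
loan2adj = [('1', 50000), ('3', 250000), ('2', 100000)]

# Turn the list of tuples into a lookup: loannum -> adjustment amount.
# Assumption: loan numbers are unique in loan2adj (a dict keeps only the last duplicate).
adj_by_loan = dict(loan2adj)

# Loans without an entry get NaN from map(), so treat those as "no adjustment".
adjustment = df['loannum'].map(adj_by_loan).fillna(0).astype('int64')

# Subtract row by row and floor the result at 0.
df['or_dep'] = (df['or_dep'] - adjustment).clip(lower=0)

print(df['or_dep'].tolist())  # [200000, 550000, 750000, 300000]
```

Loan 4 has no entry in loan2adj, so its adjustment is 0 and its value stays at 300000.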

Answered By: Matteo Zanoni