TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

Question:

I ran the following command in Python pandas:

q1_fisher_r[(q1_fisher_r['TP53']==1) & q1_fisher_r[(q1_fisher_r['TumorST'].str.contains(':1:'))]]

I got the following error:

TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

I tried the solution described at this error link and changed the code accordingly:

q1_fisher_r[(q1_fisher_r['TumorST'].str.contains(':1:')) & (q1_fisher_r[(q1_fisher_r['TP53']==1)])]

But I still got the same error: TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

Asked By: svp


Answers:

To filter by multiple conditions, chain them with & and use the combined result for boolean indexing:

q1_fisher_r[(q1_fisher_r['TP53']==1) & q1_fisher_r['TumorST'].str.contains(':1:')]
               ^^^^                        ^^^^
           first condition            second condition

The problem is that this code returns filtered data (a DataFrame) rather than a boolean mask, so it cannot be chained with another condition using &:

q1_fisher_r[(q1_fisher_r['TumorST'].str.contains(':1:'))]

The same problem applies to:

q1_fisher_r[(q1_fisher_r['TP53']==1)]
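
To make the difference concrete, here is a minimal sketch. The small frame and the variable names are made up for illustration; only the column names come from the question. It shows that a comparison gives a boolean Series (a mask), while indexing with that mask gives a DataFrame, which is why the masks should be combined first and used for indexing only once.

```python
import pandas as pd

# Hypothetical sample data, mirroring the column names used in the question.
frame = pd.DataFrame({
    "TP53": [1, 1, 2, 1],
    "TumorST": ["5:1:", "9:1:", "5:1:", "6:1"],
})

# A comparison yields a boolean Series: one True/False per row.
tp53_mask = frame["TP53"] == 1
tumor_mask = frame["TumorST"].str.contains(":1:")

# Indexing with a mask yields a DataFrame (the filtered rows), not a mask.
filtered = frame[tumor_mask]

print(type(tp53_mask).__name__, tp53_mask.dtype)
print(type(filtered).__name__)

# Combine the masks first, then index once.
result = frame[tp53_mask & tumor_mask]
print(result)
```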

Sample:

import pandas as pd

q1_fisher_r = pd.DataFrame({'TP53':[1,1,2,1], 'TumorST':['5:1:','9:1:','5:1:','6:1']})
print (q1_fisher_r)
   TP53 TumorST
0     1    5:1:
1     1    9:1:
2     2    5:1:
3     1     6:1

df = q1_fisher_r[(q1_fisher_r['TP53']==1) & q1_fisher_r['TumorST'].str.contains(':1:')]
print (df)
   TP53 TumorST
0     1    5:1:
1     1    9:1:
Answered By: jezrael

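I had a similar problem with the setup below, which yielded the same error message. The simple solution for me was to put each individual condition in brackets. This is needed because & binds more tightly than ==, so without brackets Python applies & to the operands on either side of it before doing the comparisons. I should have known, but I want to highlight it in case someone else has the same problem.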

Incorrect code:

conditions = [
    (df['A'] == '15min' & df['B'].dt.minute == 15),  # Note brackets only surrounding both conditions together, not each individual condition
    df['A'] == '30min' & df['B'].dt.minute == 30,  # Note no brackets at all 
]

output = [
    df['Time'] + dt.timedelta(minutes = 45),
    df['Time'] + dt.timedelta(minutes = 30),
]

df['TimeAdjusted'] = np.select(conditions, output, default = np.datetime64('NaT'))

Correct code:

conditions = [
    (df['A'] == '15min') & (df['B'].dt.minute == 15),  # Note brackets surrounding each condition
    (df['A'] == '30min') & (df['B'].dt.minute == 30),  # Note brackets surrounding each condition
]

output = [
    df['Time'] + dt.timedelta(minutes = 45),
    df['Time'] + dt.timedelta(minutes = 30),
]

df['TimeAdjusted'] = np.select(conditions, output, default = np.datetime64('NaT'))
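
The effect of the brackets can be checked on a tiny frame. This is only a sketch: the data is invented, and the column 'B_minute' is a hypothetical stand-in for df['B'].dt.minute so that the example needs no datetime column. The exact exception type and message may differ between pandas versions, so the code catches both TypeError and ValueError.

```python
import pandas as pd

# Hypothetical data: 'B_minute' stands in for df['B'].dt.minute.
df = pd.DataFrame({
    "A": ["15min", "30min", "15min"],
    "B_minute": [15, 30, 30],
})

# Without brackets, Python evaluates '15min' & df['B_minute'] first,
# because & has higher precedence than ==.
error_name = None
try:
    df["A"] == "15min" & df["B_minute"] == 15
except (TypeError, ValueError) as exc:
    error_name = type(exc).__name__
print("unbracketed:", error_name)

# With brackets, each comparison produces a boolean Series before & is applied.
mask = (df["A"] == "15min") & (df["B_minute"] == 15)
print(mask.tolist())
```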
Answered By: Hedge92

Simple solution: put each condition inside its own brackets.

This throws an error:

df[df['Customer Id']==999 & df['month_year']=='10/22']

This doesn't:

df[(df['Customer Id']==999) & (df['month_year']=='10/22')]
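
How Python groups the two forms can be inspected with the standard-library ast module, without pandas at all. In this sketch, a and b are placeholder names standing in for the two column expressions.

```python
import ast

# Parse the unbracketed form; a and b are placeholder names for the two columns.
tree = ast.parse("a == 999 & b == '10/22'", mode="eval").body

# The top-level node is a single chained comparison, not an & of two comparisons.
print(type(tree).__name__)

# Its middle operand is the bitwise-and of 999 and b.
middle = tree.comparators[0]
print(type(middle).__name__, type(middle.op).__name__)

# With brackets, the top-level node is the & operation itself.
bracketed = ast.parse("(a == 999) & (b == '10/22')", mode="eval").body
print(type(bracketed).__name__, type(bracketed.op).__name__)
```

In other words, the unbracketed line asks for 999 & df['month_year'] before any comparison happens, which is the operation the error message complains about.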