pandas comparison raises TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool]

Question:

My DataFrame has the following structure:

Index: 1008 entries, Trial1.0 to Trial3.84
Data columns (total 5 columns):
CHUNK_NAME                    1008  non-null values
LAMBDA                        1008  non-null values
BETA                          1008  non-null values
HIT_RATE                      1008  non-null values
AVERAGE_RECIPROCAL_HITRATE    1008  non-null values

chunks=['300_321','322_343','344_365','366_387','388_408','366_408','344_408','322_408','300_408']
lam_beta=[(lambda1,beta1),(lambda1,beta2),(lambda1,beta3),...(lambda1,beta_n),(lambda2,beta1),(lambda2,beta2)...(lambda2,beta_n),........]

my_df.ix[my_df.CHUNK_NAME==chunks[0]&my_df.LAMBDA==lam_beta[0][0]]

I want to get the rows of the DataFrame for a particular chunk, let's say chunks[0], and a particular lambda value. In this case, the output should be all rows in the DataFrame having CHUNK_NAME='300_321' and LAMBDA=lambda1. That would return n rows, one for each beta value. Instead, I get the following error. Any help in solving this problem would be appreciated.

TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool]
Asked By: anonuser0428


Answers:

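& has higher precedence than ==, so Python evaluates chunks[0] & my_df.LAMBDA first, and that operation is what raises the TypeError. Parenthesize each comparison: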

my_df.ix[(my_df.CHUNK_NAME==chunks[0])&(my_df.LAMBDA==lam_beta[0][0])]
         ^                           ^ ^                            ^
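As a runnable sketch of the parenthesized form: the data below is a made-up stand-in for my_df, and the values in chunks and lam_beta are assumptions chosen only for illustration. It uses .loc rather than .ix, since .ix was removed in pandas 1.0.

```python
import pandas as pd

# Hypothetical stand-in for my_df; the values are invented for illustration.
my_df = pd.DataFrame({
    "CHUNK_NAME": ["300_321", "300_321", "322_343", "300_321"],
    "LAMBDA": [0.1, 0.1, 0.1, 0.2],
    "BETA": [1.0, 2.0, 1.0, 1.0],
})
chunks = ["300_321", "322_343"]
lam_beta = [(0.1, 1.0), (0.1, 2.0)]  # assumed (lambda, beta) pairs

# Each comparison is wrapped in parentheses so that & combines two boolean masks.
mask = (my_df.CHUNK_NAME == chunks[0]) & (my_df.LAMBDA == lam_beta[0][0])
result = my_df.loc[mask]
print(result)
```

Building the mask as a separate variable also makes it easy to inspect which rows match before indexing.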
Answered By: ecatmur

One way to avoid trouble with operator precedence is to use the wrapper methods for the comparison operators. For example, use the eq method instead of the == operator.

Other wrappers are:

  • ne: !=
  • le: <=
  • lt: <
  • ge: >=
  • gt: >

So the expression in the question becomes:

my_df.loc[my_df.CHUNK_NAME.eq(chunks[0]) & my_df.LAMBDA.eq(lam_beta[0][0])]

The wrappers can also do more than the comparison operators. They accept an axis argument to choose the axis along which to compare and, if you’re dealing with a MultiIndex object, a level argument to choose the level.
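A minimal sketch of the axis argument, assuming a small frame like the one in the example that follows and a hypothetical Series named target holding one value per row. Only axis is shown here; level is not demonstrated.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5.0, 6.0]})

# Hypothetical reference values, one per row (its index matches df's index).
target = pd.Series([1, 4])

# axis=0 aligns target with the row index, so every column is compared
# against target row by row.
row_wise = df.eq(target, axis=0)
print(row_wise)
```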


Example:

For df:

   a  b    c
0  1  3  5.0
1  2  4  6.0

the following line:

out = df.loc[df['a']<3 & df['c']==5]

results in the following error:

> TypeError: Cannot perform 'rand_' with a dtyped [float64] array and
> scalar of type [bool]

However, if we use the equivalent wrappers:

out = df.loc[df['a'].lt(3) & df['c'].eq(5)]

Output:

   a  b    c
0  1  3  5.0
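The example above can be reproduced end to end with the sketch below. It rebuilds df from the printed table, checks that the unparenthesized expression raises a TypeError, and confirms that the wrapper form and the parenthesized form select the same row. The variable names raised and paren are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5.0, 6.0]})

# Without parentheses, 3 & df['c'] is evaluated first and fails on the float column.
try:
    df.loc[df["a"] < 3 & df["c"] == 5]
    raised = False
except TypeError:
    raised = True

# Wrapper methods avoid the precedence problem entirely.
out = df.loc[df["a"].lt(3) & df["c"].eq(5)]

# Parenthesizing each comparison gives the same selection.
paren = df.loc[(df["a"] < 3) & (df["c"] == 5)]

print(raised)
print(out)
```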
Answered By: user7864386