Tilde sign in pandas DataFrame


I’m new to Python/pandas and came across this code snippet:

df = df[~df['InvoiceNo'].str.contains('C')]

I would be much obliged if someone could explain what the tilde sign does in this context.


It means bitwise NOT. Here it inverts a boolean mask: False values become True and True values become False.


import pandas as pd

df = pd.DataFrame({'InvoiceNo': ['aaC','ff','lC'],
                   'a': [1, 2, 5]})
print (df)
  InvoiceNo  a
0       aaC  1
1        ff  2
2        lC  5

#check if column contains C
print (df['InvoiceNo'].str.contains('C'))
0     True
1    False
2     True
Name: InvoiceNo, dtype: bool

#invert the mask
print (~df['InvoiceNo'].str.contains('C'))
0    False
1     True
2    False
Name: InvoiceNo, dtype: bool

Filter by boolean indexing:

df = df[~df['InvoiceNo'].str.contains('C')]
print (df)
  InvoiceNo  a
1        ff  2

So the output is all rows of the DataFrame that do not contain C in the InvoiceNo column.

Answered By: jezrael

It’s used to invert a boolean Series, see pandas-doc.
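
As a small illustration of inverting a boolean Series, here is a sketch that combines an inverted mask with a second condition. The sample data and the names has_c, big, kept and neither are made up for this example. Note that ~ binds more tightly than & and |, so negating a whole compound condition needs parentheses.

```python
import pandas as pd

# Hypothetical sample data, mirroring the shape used in this thread
df = pd.DataFrame({'InvoiceNo': ['aaC', 'ff', 'lC'], 'a': [1, 2, 5]})

has_c = df['InvoiceNo'].str.contains('C')   # True, False, True
big = df['a'] > 1                           # False, True, True

# Rows that do NOT contain 'C' and also have a > 1
kept = df[~has_c & big]

# Parentheses are needed to negate the whole "has C or a > 1" condition
neither = df[~(has_c | big)]

print(kept)
print(neither)
```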

Answered By: RobinFrcd

The tilde ~ is the bitwise NOT operator: it flips each bit. On a boolean mask this means True (1) becomes False (0) and False (0) becomes True (1). So you get the rows of df whose InvoiceNo values do not contain the string ‘C’.
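
To make the difference between plain Python and pandas concrete, here is a minimal sketch. The variable names are only illustrative. On a plain Python int, ~ does not give 0 or 1, while on a boolean Series it flips each element. The Python keyword not cannot be used on a Series for this purpose.

```python
import pandas as pd

# On plain Python integers, ~x is two's-complement inversion: ~x == -x - 1
int_results = (~1, ~0)
print(int_results)  # (-2, -1)

# On a boolean Series, ~ is applied element-wise and acts as a logical NOT
mask = pd.Series([True, False, True])
inverted = ~mask
print(inverted.tolist())  # [False, True, False]

# The keyword `not` asks for a single truth value, which a Series does not have
not_failed = False
try:
    not mask
except ValueError:
    not_failed = True
print(not_failed)  # True
```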

Answered By: Haz

df = df[~df['InvoiceNo'].str.contains('C')]

The above code removes all rows from the pandas DataFrame whose string values in the InvoiceNo column contain the letter "C".

The tilde (~) sign works as a NOT (!) operator in this scenario.

More generally, the same pattern is often used to remove rows that have null values in a column, by negating a null-check mask instead of a string match.
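
A rough sketch of that null-value case, using made-up data. It assumes isna() as the null check and passes na=False to str.contains so that missing strings count as "no match"; without it, the mask can itself contain missing values, which ~ may not handle depending on the pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing string and a missing number
df = pd.DataFrame({'InvoiceNo': ['aaC', None, 'ff'],
                   'a': [1.0, 2.0, np.nan]})

# Keep rows where 'a' is not null (same result as df[df['a'].notna()])
non_null = df[~df['a'].isna()]

# na=False treats missing InvoiceNo values as not containing 'C',
# so the mask is purely boolean and safe to invert
no_c = df[~df['InvoiceNo'].str.contains('C', na=False)]

print(non_null)
print(no_c)
```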

Answered By: Pasindu Perera