How to find columns where a punctuation mark appears as a single value in Python Pandas?

Question:

I have a DataFrame like the one below:

COL1 | COL2 | COL3
-----|------|--------
abc  | P    | 123
b.bb | ,    | 22
  1  | B    | 2
...  |...   | ...

I need to find the columns that contain a value consisting only of a punctuation mark, such as !"#$%&'()*+,-./:;<=>?@[]^_`{|}~

As a result I need something like the output below (only COL2: COL1 also contains a punctuation mark, but there it appears together with other characters).

COL2 
-------
 P    
 ,    
 B   
... 
Asked By: dingaro


Answers:

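punc = set('''!"#$%&'()*+,-./:;<=>?@[]^_`{|}~''')
# cast to str so non-string cells (e.g. 123) can be iterated; on pandas >= 2.1, DataFrame.map replaces the deprecated applymap
df.loc[:, df.astype(str).applymap(lambda x: set(x).issubset(punc)).any()]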
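
For reference, here is a self-contained sketch of the same subset idea. It assumes the three visible rows from the question as sample data. The helper name is_only_punctuation is hypothetical and not part of the original answer. The sketch also guards against empty strings, because set("") is a subset of any set and would otherwise count as a match.

```python
import pandas as pd

# Sample data reconstructed from the question (only the three visible rows).
df = pd.DataFrame({
    "COL1": ["abc", "b.bb", "1"],
    "COL2": ["P", ",", "B"],
    "COL3": [123, 22, 2],
})

punc = set('''!"#$%&'()*+,-./:;<=>?@[]^_`{|}~''')


def is_only_punctuation(value) -> bool:
    """Hypothetical helper: True if the cell consists solely of punctuation."""
    text = str(value)  # non-string cells such as 123 are converted first
    # Reject empty strings explicitly: set("") is a subset of every set.
    return len(text) > 0 and set(text).issubset(punc)


# Series.map applied column by column works on old and new pandas versions.
mask = df.apply(lambda col: col.map(is_only_punctuation)).any()
result = df.loc[:, mask]
print(result)
```

Note that this check also accepts multi-character cells such as "..."; if only single characters should match, comparing len(text) to 1 would be one way to tighten it.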
Answered By: Chrysophylaxs

Using a regex with str.fullmatch and any:

import re

chars = '''!"#$%&'()*+,-./:;<=>?@[]^_`{|}~'''
pattern = f'[{re.escape(chars)}]'
# character class matching exactly one of the characters above (re.escape backslash-escapes the regex metacharacters)

out = df.loc[:, df.astype(str).apply(lambda s: s.str.fullmatch(pattern).any())]
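
To see why fullmatch matters here, the sketch below compares it with str.contains on the sample rows from the question (the DataFrame is reconstructed from the three visible rows, so treat it as an assumption). With fullmatch the whole cell must be a single punctuation character, whereas contains is satisfied by punctuation anywhere in the cell.

```python
import re

import pandas as pd

# Sample data reconstructed from the question (only the three visible rows).
df = pd.DataFrame({
    "COL1": ["abc", "b.bb", "1"],
    "COL2": ["P", ",", "B"],
    "COL3": [123, 22, 2],
})

chars = '''!"#$%&'()*+,-./:;<=>?@[]^_`{|}~'''
pattern = f"[{re.escape(chars)}]"

as_text = df.astype(str)  # the .str accessor needs string values

# fullmatch: the entire cell must be exactly one punctuation character.
full = as_text.apply(lambda s: s.str.fullmatch(pattern).any())

# contains: a punctuation character anywhere in the cell is enough.
partial = as_text.apply(lambda s: s.str.contains(pattern).any())

print(full)
print(partial)
```

On this sample only COL2 passes the fullmatch test, while contains additionally flags COL1 because of the value b.bb.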

Or with isin:

out = df.loc[:, df.isin(set(chars)).any()]

Output:

  COL2
0    P
1    ,
2    B
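
If typing the character list by hand feels error-prone, the standard library's string.punctuation could be used instead. The sketch below assumes the sample rows from the question. It shows that string.punctuation differs from the question's list only by the backslash, and it collects the matching column names rather than a sub-frame.

```python
import string

import pandas as pd

# Sample data reconstructed from the question (only the three visible rows).
df = pd.DataFrame({
    "COL1": ["abc", "b.bb", "1"],
    "COL2": ["P", ",", "B"],
    "COL3": [123, 22, 2],
})

question_chars = '''!"#$%&'()*+,-./:;<=>?@[]^_`{|}~'''

# Characters present in string.punctuation but missing from the question's list.
extra = set(string.punctuation) - set(question_chars)
print(extra)

# isin compares whole cell values, so only single-character cells can match.
mask = df.isin(list(string.punctuation)).any()
cols = mask[mask].index.tolist()
print(cols)
```

Because isin compares whole values, no cast to str is needed for the numeric column; it simply never matches.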
Answered By: mozway