Get the names of the columns with nonzero values in each row (Python, pandas)

Question:

I have this dataframe:

df0 = pd.DataFrame({'points': [0, 0, -3, 16, 0, 5, -3, 14],
                    'assists': [0, 0, 2, 0, 1, -7, 0, 6],
                    'numbers': [0, 0, 1, 6, 10, 5, 8, 7]})

and my desired dataset looks like this:

points assists numbers colX
0      0       0       0
0      0       0       0
-3     2       1       'points-assists-numbers'
16     0       6       'points-numbers'
0      1       10      'assists-numbers'
5      -7      5       'points-assists-numbers'
-3     0       8       'points-numbers'
14     6       7       'points-assists-numbers'

I need a function that, for each row, creates a string from the names of the columns whose values are not zero.

Any help?

Asked By: rnv86


Answers:

This kind of row-wise operation is well suited to apply with a lambda expression.

Something like this should work:

df0['colX'] = df0.apply(lambda x: '-'.join(c for c in df0.columns if x[c] != 0), axis=1).replace('', 0)

  • first, for each row, it collects the names of the columns whose value is not 0
  • then it joins those names with a "-"
  • finally, it replaces empty strings (rows where every value is 0) with 0
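
As a minimal sketch of the steps above, the lambda can be unrolled into a named function. The names nonzero_names and value_cols are hypothetical, introduced here only for illustration, and returning 0 directly from the function stands in for the final replace('', 0).

```python
import pandas as pd

df0 = pd.DataFrame({'points': [0, 0, -3, 16, 0, 5, -3, 14],
                    'assists': [0, 0, 2, 0, 1, -7, 0, 6],
                    'numbers': [0, 0, 1, 6, 10, 5, 8, 7]})

# Hypothetical helper list: fix the columns to inspect before adding colX
value_cols = ['points', 'assists', 'numbers']

def nonzero_names(row):
    # Step 1: names of the columns whose value in this row is not 0
    names = [c for c in value_cols if row[c] != 0]
    # Steps 2 and 3: join the names with '-', or fall back to 0 if there are none
    return '-'.join(names) if names else 0

df0['colX'] = df0[value_cols].apply(nonzero_names, axis=1)
print(df0)
```

Listing the value columns explicitly keeps the result the same even if the frame already contains a label column from an earlier attempt.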
Answered By: robertoia

You can try with dot (this example uses '_' as the separator; use '-' instead to match the question):

df0['new'] = df0.ne(0).dot(df0.columns+'_').str[:-1]
df0
Out[9]: 
   points  assists  numbers                     new
0       0        0        0                        
1       0        0        0                        
2      -3        2        1  points_assists_numbers
3      16        0        6          points_numbers
4       0        1       10         assists_numbers
5       5       -7        5  points_assists_numbers
6      -3        0        8          points_numbers
7      14        6        7  points_assists_numbers
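
To see why dot produces these strings, here is a rough pure-Python sketch of what happens for a single row. It assumes the dot product boils down to multiplying each boolean by a column name and adding the results: in Python, True * 'abc' is 'abc', False * 'abc' is '', and adding strings concatenates them. The function name dot_label is hypothetical and used only for illustration.

```python
columns = ['points', 'assists', 'numbers']

def dot_label(row):
    # Boolean mask: True where the value is not 0
    mask = [value != 0 for value in row]
    # bool * str repeats the string one or zero times
    pieces = [flag * (name + '_') for flag, name in zip(mask, columns)]
    # "Summing" the pieces concatenates them; drop the trailing '_'
    return ''.join(pieces)[:-1]

print(dot_label([16, 0, 6]))  # points_numbers
print(repr(dot_label([0, 0, 0])))  # ''
```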
Answered By: BENY

Here are a few options:

# keep only the values that are not 0 and stack them; then reset the column level of the index so the column names can be joined with '-' for each row.
df0['new'] = df0.where(df0.ne(0)).stack().reset_index(level=1).groupby(level=0)['level_1'].agg('-'.join)

or

# similar to the solution above, but multiplies the boolean mask by the column names, so the index does not need to be reset
df0['new'] = df0.ne(0).mul(df0.columns).where(lambda x: x.ne('')).stack().groupby(level=0).agg('-'.join)

or

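# takes the dot product of the boolean mask with the column names (each with '-' appended), which concatenates the names of the nonzero columns; the trailing '-' is then stripped.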
df0['new'] = df0.ne(0).dot(df0.columns + '-').str.rstrip('-')
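
One thing to watch when trying these snippets one after another: each of them adds a column to df0, so df0.ne(0) and df0.columns will then include that earlier label column. A minimal sketch that sidesteps this, assuming you restrict the mask to an explicit list of value columns (value_cols is a hypothetical name, not part of the original answers):

```python
import pandas as pd

df0 = pd.DataFrame({'points': [0, 0, -3, 16, 0, 5, -3, 14],
                    'assists': [0, 0, 2, 0, 1, -7, 0, 6],
                    'numbers': [0, 0, 1, 6, 10, 5, 8, 7]})

# Hypothetical helper list: only these columns take part in the labelling
value_cols = ['points', 'assists', 'numbers']

# Boolean mask of the nonzero entries, limited to the value columns
mask = df0[value_cols].ne(0)

# Same dot-product idea as above, applied to the restricted mask
df0['new'] = mask.dot(mask.columns + '-').str.rstrip('-')
print(df0)
```

Rows where every value is 0 end up with an empty string here rather than 0.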
Answered By: rhug123