Get the names of the columns whose values are nonzero in each row (Python)
Question:
I have this dataframe:
import pandas as pd

df0 = pd.DataFrame({'points': [0, 0, -3, 16, 0, 5, -3, 14],
                    'assists': [0, 0, 2, 0, 1, -7, 0, 6],
                    'numbers': [0, 0, 1, 6, 10, 5, 8, 7]})
and my desired dataset looks like this:
points  assists  numbers  colX
     0        0        0  0
     0        0        0  0
    -3        2        1  'points-assists-numbers'
    16        0        6  'points-numbers'
     0        1       10  'assists-numbers'
     5       -7        5  'points-assists-numbers'
    -3        0        8  'points-numbers'
    14        6        7  'points-assists-numbers'
I need a function that builds a string from the names of the columns whose value in that row is not zero.
Any help?
Answers:
This kind of operation is well suited to a lambda expression.
Something like this should work:
df0['colX'] = df0.apply(lambda x: '-'.join(c for c in df0.columns if x[c] != 0), axis=1).replace('', 0)
- first, for each row, it collects the names of the columns whose value is not 0
- then it joins those names with a "-"
- finally, it replaces the empty strings (rows where every value is 0) with 0
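The steps above can be sketched as a self-contained script. The names `value_cols` and `nonzero_names` are hypothetical, introduced only for illustration. This variant also returns 0 directly for all-zero rows instead of calling `.replace('', 0)` afterwards, which is a small swap and not what the answer does.

```python
import pandas as pd

df0 = pd.DataFrame({'points': [0, 0, -3, 16, 0, 5, -3, 14],
                    'assists': [0, 0, 2, 0, 1, -7, 0, 6],
                    'numbers': [0, 0, 1, 6, 10, 5, 8, 7]})

# Fix the list of value columns up front (hypothetical name), so that
# re-running the assignment never picks up 'colX' itself.
value_cols = ['points', 'assists', 'numbers']

def nonzero_names(row, sep='-'):
    """Join the names of the columns whose value in this row is not 0."""
    names = [c for c in value_cols if row[c] != 0]
    # Return 0 for all-zero rows, as in the desired output of the question.
    return sep.join(names) if names else 0

df0['colX'] = df0[value_cols].apply(nonzero_names, axis=1)
print(df0)
```

Because `apply(..., axis=1)` calls a Python function once per row, this is easy to read but may be slow on large frames.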
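You can try with dot: the boolean mask df0.ne(0) is multiplied by the column names (each with a trailing '_'), the results are concatenated per row, and the last character is cut off.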
df0['new'] = df0.ne(0).dot(df0.columns+'_').str[:-1]
df0
Out[9]:
   points  assists  numbers                     new
0       0        0        0
1       0        0        0
2      -3        2        1  points_assists_numbers
3      16        0        6          points_numbers
4       0        1       10         assists_numbers
5       5       -7        5  points_assists_numbers
6      -3        0        8          points_numbers
7      14        6        7  points_assists_numbers
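Why the dot product works on strings can be seen with plain Python, without pandas. This is only a sketch of the arithmetic for a single row (row 3 of the frame); the variable names are made up for illustration.

```python
# Column names with the separator appended, as in df0.columns + '_'.
names = ['points_', 'assists_', 'numbers_']

# Row 3 of df0.ne(0): points=16, assists=0, numbers=6.
mask_row = [True, False, True]

# bool * str repeats the string once (True) or zero times (False).
pieces = [flag * name for flag, name in zip(mask_row, names)]

# The "sum" in a dot product becomes string concatenation.
joined = ''.join(pieces)

# Drop the trailing separator, like .str[:-1].
result = joined[:-1]
print(result)  # points_numbers
```

For a row where every flag is False, `joined` is the empty string and slicing it leaves it empty, which is why rows 0 and 1 show a blank `new` value in the output above.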
Here are a few options:
# Keep only the nonzero values and stack them, then move the column level out of the index so the column names can be joined with '-' for each row.
df0['new'] = df0.where(df0.ne(0)).stack().reset_index(level=1).groupby(level=0)['level_1'].agg('-'.join)
or
# Similar to the solution above, but multiplies the nonzero mask by the column names, so the index does not need to be reset.
df0['new'] = df0.ne(0).mul(df0.columns).where(lambda x: x.ne('')).stack().groupby(level=0).agg('-'.join)
or
# Multiplies the nonzero mask by the column names (each with a trailing '-'), concatenates the results per row, and strips the trailing '-'.
df0['new'] = df0.ne(0).dot(df0.columns + '-').str.rstrip('-')
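The two dot-based answers differ only in how they remove the trailing separator: `.str[:-1]` cuts exactly one character, while `.str.rstrip('-')` removes every trailing '-'. For the column names in the question both give the same result. They could differ if a column name itself ended with the separator, as this plain-string sketch suggests (`net-` is a hypothetical column name, not one from the question).

```python
sep = '-'

# Ordinary case: what the dot product builds for columns 'points' and 'numbers'.
ordinary = 'points' + sep + 'numbers' + sep
assert ordinary[:-1] == ordinary.rstrip(sep) == 'points-numbers'

# Edge case: 'net-' is a hypothetical column name ending with the separator.
tricky = 'points' + sep + 'net-' + sep   # 'points-net--'

by_slice = tricky[:-1]        # removes only the appended separator
by_rstrip = tricky.rstrip(sep)  # also eats the '-' that belongs to the name

print(by_slice)   # points-net-
print(by_rstrip)  # points-net
```

If column names can end with the separator character, the slice is the safer choice; otherwise either works.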