Crate new row depends on 2 columns
Question:
After Year row need a new row as Year period if column 1 is year and column3< 2010 then columns values for year period is Below 2010 same as other rows
Column1 Column2 ColumnX Column3
0 Year 1 A 2009
1 Date 1 A 12
2 Year 2 A 2021
3 Year 3 A 2011
Column1 Column2 ColumnX Column3
0 Year 1 A 2009
1 Year period 1 A Below2010
2 Date 1 A 12
3 Year 2 A 2021
4 Year period 2 A Above2020
5 Year 3 A 2011
6 Year period 3 A Range in 2010/2020
Answers:
Filter rows first in boolean indexing
for Year
columns, replace Column3
in numpy.select
and add substring to Column1
, last join with original by concat
and sort indices by DataFrame.sort_index
:
#necessary default RangeIndex
df = df.reset_index(drop=True)
df2 = df[df['Column1'].eq('Year')].copy()
df2['Column3'] = pd.to_numeric(df2['Column3'], errors='coerce')
df1 = (df2.assign(Column3 = lambda x: np.select([x['Column3']<2010, x['Column3']>2020],
['Below2010','Above2020'],
default='Range in 2010/2020'),
Column1 = lambda x: x['Column1'] + ' period'))
df = pd.concat([df, df1]).sort_index(kind='mergesort', ignore_index=True)
print (df)
Column1 Column2 ColumnX Column3
0 Year 1 A 2009
1 Year period 1 A Below2010
2 Date 1 A 12
3 Year 2 A 2021
4 Year period 2 A Above2020
5 Year 3 A 2011
6 Year period 3 A Range in 2010/2020
After Year row need a new row as Year period if column 1 is year and column3< 2010 then columns values for year period is Below 2010 same as other rows
Column1 Column2 ColumnX Column3
0 Year 1 A 2009
1 Date 1 A 12
2 Year 2 A 2021
3 Year 3 A 2011
Column1 Column2 ColumnX Column3
0 Year 1 A 2009
1 Year period 1 A Below2010
2 Date 1 A 12
3 Year 2 A 2021
4 Year period 2 A Above2020
5 Year 3 A 2011
6 Year period 3 A Range in 2010/2020
Filter rows first in boolean indexing
for Year
columns, replace Column3
in numpy.select
and add substring to Column1
, last join with original by concat
and sort indices by DataFrame.sort_index
:
#necessary default RangeIndex
df = df.reset_index(drop=True)
df2 = df[df['Column1'].eq('Year')].copy()
df2['Column3'] = pd.to_numeric(df2['Column3'], errors='coerce')
df1 = (df2.assign(Column3 = lambda x: np.select([x['Column3']<2010, x['Column3']>2020],
['Below2010','Above2020'],
default='Range in 2010/2020'),
Column1 = lambda x: x['Column1'] + ' period'))
df = pd.concat([df, df1]).sort_index(kind='mergesort', ignore_index=True)
print (df)
Column1 Column2 ColumnX Column3
0 Year 1 A 2009
1 Year period 1 A Below2010
2 Date 1 A 12
3 Year 2 A 2021
4 Year period 2 A Above2020
5 Year 3 A 2011
6 Year period 3 A Range in 2010/2020