For every row, keep only the non-NaNs and then concatenate while ignoring the index (pandas)
Question:
I have the following pandas dataframe:
import numpy as np
import pandas as pd
df = pd.DataFrame({'col1': [1, np.nan, np.nan],
                   'results': ['Sub', 'Sub', 'Sub'],
                   'group': ['a', 'a', 'a'],
                   'seed': [6, 6, 6],
                   'col2': [np.nan, 2, np.nan],
                   'col3': [np.nan, np.nan, 3]})
df
   col1 results group  seed  col2  col3
0   1.0     Sub     a     6   NaN   NaN
1   NaN     Sub     a     6   2.0   NaN
2   NaN     Sub     a     6   NaN   3.0
For every row, I would like to keep only the columns that don't have NaNs and then concatenate the rows back together, ignoring the index.
The end result should look like this:
pd.DataFrame({'col1': [1],
              'results': ['Sub'],
              'group': ['a'],
              'seed': [6],
              'col2': [2],
              'col3': [3]}, index=[0])
   col1 results group  seed  col2  col3
0     1     Sub     a     6     2     3
How could I do that?
Answers:
You can drop the NaNs from each column with dropna(), reset each column's index so the remaining values line up at the top, and then concatenate the columns side by side with concat(axis=1). Filtering the rows with df.notnull().all(axis=1) does not work here: every row contains at least one NaN, so the result would be empty. Here's an example:
# Drop the NaNs from each column and reset its index so the values align at row 0
cols = [df[col].dropna().reset_index(drop=True) for col in df.columns]
# Concatenate the columns side by side; each Series keeps its column name
df_result = pd.concat(cols, axis=1)
# Keep only the rows that are fully populated
df_result = df_result.dropna()
# Print the result
print(df_result)
You can bfill and slice the first row:
out = df.bfill().iloc[[0]]
Output:
   col1 results group  seed  col2  col3
0   1.0     Sub     a     6   2.0   3.0
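As a self-contained sketch of the bfill approach, the snippet below rebuilds the frame from the question and wraps the one-liner in a helper. The name collapse_rows is hypothetical (it is not a pandas function), and the sketch assumes all rows describe the same record, so the first non-NaN value in each column is the one to keep. The groupby(...).first() variant is an alternative that does not appear in the answers above; it may help when the frame holds several records, and it assumes that results, group, and seed together identify a record.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, np.nan, np.nan],
                   'results': ['Sub', 'Sub', 'Sub'],
                   'group': ['a', 'a', 'a'],
                   'seed': [6, 6, 6],
                   'col2': [np.nan, 2, np.nan],
                   'col3': [np.nan, np.nan, 3]})


def collapse_rows(frame):
    """Hypothetical helper: back-fill each column, then keep only the first row."""
    return frame.bfill().iloc[[0]].reset_index(drop=True)


out = collapse_rows(df)

# Alternative for frames with several records (assumption: these three
# columns identify a record). GroupBy.first() returns the first non-null
# value of every remaining column within each group.
keys = ['results', 'group', 'seed']
per_group = df.groupby(keys, as_index=False).first()
# groupby moves the key columns to the front; restore the original order
per_group = per_group[df.columns]

print(out)
print(per_group)
```

Note that the numeric columns stay float64 because they contained NaNs; if integers are needed, something like out.astype({'col1': int, 'col2': int, 'col3': int}) could be applied afterwards.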