If statement to add column to pandas dataframe gives the same values
Question:
I have a pandas dataframe called week5_233C. My Python version is 3.19.13.
I wrote an if-statement to add a new column called Spike to my data set. If the value in Value [pV] is not equal to 0, I want to add a 1 to that row. If Value [pV] is 0, I want the Spike column to contain 0.
The data looks like this:
   TimeStamp [µs]  Value [pV]
0         1906200           0
1         1906300           0
2         1906400           0
3         1906500     -149012
4         1906600     -149012
And I want it to look like this:
   TimeStamp [µs]  Value [pV]  Spike
0         1906200           0      0
1         1906300           0      0
2         1906400           0      0
3         1906500     -149012      1
4         1906600     -149012      1
I tried:
week5_233C.loc[week5_233C[' Value [pV]'] != 0, 'Spike'] = 1
week5_233C.loc[week5_233C[' Value [pV]'] == 0, 'Spike'] = 0
but all rows in column Spike get the same value.
I also tried:
week5_233C['Spike'] = week5_233C[' Value [pV]'].apply(lambda x: 0 if x == 0 else 1)
Again, it adds only 0s or only 1s instead of following the if/else logic. See example data:
   TimeStamp [µs]  Value [pV]  Spike
0         1906200           0      1
1         1906300           0      1
2         1906400           0      1
3         1906500     -149012      1
4         1906600     -149012      1
Doing it like this:
for i in week5_233C[' Value [pV]']:
    if i != 0:
        week5_233C['Spike'] = 1
    elif i == 0:
        week5_233C['Spike'] = 0
does not do anything: does not add a column, does not give an error, and makes Python crash.
However, when I run this if-statement with just a print as such:
for i in week5_233C[' Value [pV]']:
    if i != 0:
        print(1)
    elif i == 0:
        print(0)
then it does print the exact values I want. I cannot figure out how to save these values in a new column.
This:
for i in week5_233C[' Value [pV]']:
    if i != 0:
        week5_233C.concat([1, df.iloc['Spike']])
    elif i == 0:
        week5_233C.concat([0, df.iloc['Spike']])
gives me an error: AttributeError: 'DataFrame' object has no attribute 'concat'
How can I make a new column Spike and add the values 0 and 1 based on the value in column Value [pV]?
Answers:
import pandas as pd
df = pd.DataFrame({'TimeStamp [µs]': [1906200, 1906300, 1906400, 1906500, 1906600],
                   'Value [pV] ': [0, 0, 0, -149012, -149012],
                   })
df['Spike'] = df.agg({'Value [pV] ': lambda v: int(bool(v))})
print(df)
   TimeStamp [µs]  Value [pV]   Spike
0         1906200            0      0
1         1906300            0      0
2         1906400            0      0
3         1906500      -149012      1
4         1906600      -149012      1
I think you should check the dtype of the Value [pV] column. It probably contains strings, which is why every row gets the same value. Try print(df['Value [pV]'].dtype). If it is object, convert the column with astype(float) or pd.to_numeric(df['Value [pV]']).
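To illustrate the dtype point, here is a minimal sketch. It assumes the values were read in as strings (the sample frame below is made up to mimic the question's data, and the column name is written without surrounding spaces). Comparing strings against the integer 0 is true for every row, so Spike ends up all 1s until the column is converted.

```python
import pandas as pd

# Hypothetical data: same values as in the question, but stored as strings.
df = pd.DataFrame({
    'TimeStamp [µs]': [1906200, 1906300, 1906400, 1906500, 1906600],
    'Value [pV]': pd.Series(['0', '0', '0', '-149012', '-149012'], dtype=object),
})

# The string '0' is never equal to the integer 0, so every row is flagged.
before = (df['Value [pV]'] != 0).astype(int).tolist()

# Convert to a numeric dtype, then repeat the same comparison.
df['Value [pV]'] = pd.to_numeric(df['Value [pV]'])
df['Spike'] = (df['Value [pV]'] != 0).astype(int)
after = df['Spike'].tolist()

print(before)
print(after)
```

If the conversion raises an error, some entries are probably not valid numbers at all.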
If the column does hold strings, you can also try:
import numpy as np
df['Spike'] = np.where(df['Value [pV]'] == '0', 0, 1)
Update
To show bad rows and debug your dataframe, use the following code:
df.loc[pd.to_numeric(df['Value [pV]'], errors='coerce').isna(), 'Value [pV]']
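As a small sketch of how that line behaves, the example below uses made-up data with two entries that cannot be parsed as numbers. With errors='coerce' those entries become NaN, and the isna() mask selects the original, unconverted values so they can be inspected.

```python
import pandas as pd

# Hypothetical column with two entries that are not valid numbers.
df = pd.DataFrame({
    'Value [pV]': pd.Series(['0', '-149012', 'abc', '0', 'error'], dtype=object),
})

# Unparseable entries are coerced to NaN instead of raising an error.
converted = pd.to_numeric(df['Value [pV]'], errors='coerce')

# Select the original values at the positions that failed to convert.
bad_rows = df.loc[converted.isna(), 'Value [pV]']
print(bad_rows)
```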
Here is an alternative approach using astype():
df['Spike'] = (df['value [pV]'] != 0).astype(int)
print(df)
   timestamp  value [pV]  Spike
0    1906200           0      0
1    1906300           0      0
2    1906400           0      0
3    1906500     -149012      1
4    1906600     -149012      1
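Note that the column name is spelled differently across the snippets in this thread (' Value [pV]' with a leading space in the question, 'Value [pV] ' with a trailing space in one answer). The sketch below is one way to sidestep that, assuming the stray whitespace is unintended: strip the column names first, then apply the same comparison. The sample frame is made up to match the question's data.

```python
import pandas as pd

# Hypothetical frame whose value column has a leading space in its name.
df = pd.DataFrame({
    'TimeStamp [µs]': [1906200, 1906300, 1906400, 1906500, 1906600],
    ' Value [pV]': [0, 0, 0, -149012, -149012],
})

# Remove leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()

# Boolean comparison per row, cast to 0/1.
df['Spike'] = (df['Value [pV]'] != 0).astype(int)
print(df)
```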