If statement to add column to pandas dataframe gives the same values

Question:

I have a pandas dataframe called week5_233C. My Python version is 3.19.13.

I wrote an if-statement to add a new column, Spike, to my data set. If the value in Value [pV] is not equal to 0, I want Spike to be 1 for that row. If the value in Value [pV] is 0, I want Spike to be 0.

The data looks like this:

  TimeStamp [µs]  Value [pV]  
0        1906200         0   
1        1906300         0  
2        1906400         0     
3        1906500    -149012   
4        1906600    -149012    

And I want it to look like this:

  TimeStamp [µs]  Value [pV]  Spike
0        1906200         0      0
1        1906300         0      0
2        1906400         0      0
3        1906500    -149012     1
4        1906600    -149012     1

I tried:

week5_233C.loc[week5_233C[' Value [pV]'] != 0, 'Spike'] = 1 
week5_233C.loc[week5_233C[' Value [pV]'] == 0, 'Spike'] = 0 

but all rows in column Spike get the same value.

I also tried:

week5_233C['Spike'] = week5_233C[' Value [pV]'].apply(lambda x: 0 if x == 0 else 1)

Again, it adds only 0s or only 1s instead of following the if/else. See example data:

  TimeStamp [µs]  Value [pV]  Spike
0        1906200         0      1
1        1906300         0      1
2        1906400         0      1
3        1906500    -149012     1
4        1906600    -149012     1

Doing it like this:

for i in week5_233C[' Value [pV]']:
    if i != 0: 
        week5_233C['Spike'] = 1
    elif i == 0:
        week5_233C['Spike'] = 0

does not do anything: does not add a column, does not give an error, and makes Python crash.

However, when I run this if-statement with just a print as such:

for i in week5_233C[' Value [pV]']:
    if i != 0: 
        print(1)
    elif i == 0:
        print(0)

then it does print the exact values I want. I cannot figure out how to save these values in a new column.

This:

for i in week5_233C[' Value [pV]']:
    if i != 0:
       week5_233C.concat([1, df.iloc['Spike']]) 
    elif i == 0:
        week5_233C.concat([0, df.iloc['Spike']])

gives me an error: AttributeError: 'DataFrame' object has no attribute 'concat'

How can I make a new column Spike and add the values 0 and 1 based on the value in column Value [pV]?

Asked By: Celine Serry

||

Answers:

import pandas as pd

df = pd.DataFrame({'TimeStamp [µs]':[1906200, 1906300, 1906400, 1906500, 1906600],
                   'Value [pV] ':[0, 0, 0, -149012, -149012],
                   })

# Any non-zero value is truthy, so int(bool(v)) maps 0 to 0 and everything else to 1
df['Spike'] = df.agg({'Value [pV] ': lambda v: int(bool(v))})

print(df)

   TimeStamp [µs]  Value [pV]   Spike
0         1906200            0      0
1         1906300            0      0
2         1906400            0      0
3         1906500      -149012      1
4         1906600      -149012      1
Answered By: Laurent B.

I think you should check the dtype of the Value [pV] column. It probably contains strings, which is why every row gets the same value. Try print(df['Value [pV]'].dtype). If it prints object, convert the column with astype(float) or pd.to_numeric(df['Value [pV]']).
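To illustrate why a string column would flag every row, here is a minimal sketch on made-up data. The frame below is hypothetical, and it assumes the values were read in as clean strings such as '0'. The column is forced to object dtype so the behaviour matches the case described above.

```python
import pandas as pd

# Hypothetical data: the values are stored as strings, not numbers.
df = pd.DataFrame({
    "TimeStamp [µs]": [1906200, 1906300, 1906400, 1906500, 1906600],
    "Value [pV]": pd.Series(["0", "0", "0", "-149012", "-149012"], dtype=object),
})

# The string '0' is not equal to the integer 0, so every row is flagged.
naive = (df["Value [pV]"] != 0).astype(int)
print(naive.tolist())

# Convert to numbers first, then compare.
values = pd.to_numeric(df["Value [pV]"])
df["Spike"] = (values != 0).astype(int)
print(df["Spike"].tolist())
```

If the conversion raises an error, some entries are not valid numbers and need to be inspected first.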

You can also try:

import numpy as np
df['spike'] = np.where(df['Value [pV]'] == '0', 0, 1)  # compares against the string '0'

Update

To show the bad rows and debug your dataframe, use the following code:

df.loc[pd.to_numeric(df['Value [pV]'], errors='coerce').isna(), 'Value [pV]']
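
As a rough sketch of what that line returns, the example below runs it on a small made-up column. The data and the 'n/a' entry are hypothetical, chosen only to show one value that cannot be parsed.

```python
import pandas as pd

# Hypothetical column where one entry is not a valid number.
df = pd.DataFrame({
    "Value [pV]": pd.Series(["0", "-149012", "n/a", "0"], dtype=object),
})

# errors='coerce' turns unparseable entries into NaN instead of raising.
as_numbers = pd.to_numeric(df["Value [pV]"], errors="coerce")

# Select the original entries that failed to parse.
bad_rows = df.loc[as_numbers.isna(), "Value [pV]"]
print(bad_rows)
```

Only the row holding 'n/a' is selected, so the result points directly at the entries that need cleaning before the comparison with 0 can work.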
Answered By: Corralien

Here is an alternative approach using astype():

df['Spike'] = (df['value [pV]'] != 0).astype(int)
print(df)

   timestamp  value [pV]  Spike
0    1906200           0      0
1    1906300           0      0
2    1906400           0      0
3    1906500     -149012      1
4    1906600     -149012      1
Answered By: Jamiu S.