How to select a range of numpy array in a pandas dataframe
Question:
I have a pandas dataframe as below:
Train ID | adc_data1 |
---|---|
101 | [1610,1613,1616,…] |
102 | [1601,1605,1610,…] |
… | … |
In the "adc_data1" column, each cell contains a numpy array. I would like to select a range of data from each cell in this column, put it in a new column, and create a dataframe like the one below. How does one do this?
Train ID | adc_data1 | selected data |
---|---|---|
101 | [1610,1613,1616,…] | [1610,1613,1616] |
102 | [1601,1605,1610,…] | [1601,1605,1610] |
… | … | … |
Using the line below one can select a range of data for a single cell:
selected_range = df["adc_data1"].iloc[1][0:2]
but is there a way to do the same for all rows at the same time without using a for loop?
Answers:
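You can use the .str accessor, which slices each cell element-wise and works on list-like cells, not only strings: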
df["adc_data1"].str[0:100]
For example:
import pandas as pd
df = pd.DataFrame([[list(range(1000))]]*3, columns=['adc_data1'])
print(df["adc_data1"].str[:10])
Output:
0 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Name: adc_data1, dtype: object
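To get the layout asked for in the question, the sliced result can be assigned to a new column. The sketch below is only illustrative: the sample values after the first three in each array are made up, and it assumes every cell really holds a NumPy array. It also shows two alternatives that may be worth considering, an explicit apply and, if all arrays have the same length, stacking them into a 2D array first.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each cell of "adc_data1" holds a NumPy array.
df = pd.DataFrame({
    "Train ID": [101, 102],
    "adc_data1": [
        np.array([1610, 1613, 1616, 1620, 1625]),
        np.array([1601, 1605, 1610, 1612, 1618]),
    ],
})

# Option 1: the .str accessor slices every cell; store the result in a new column.
df["selected data"] = df["adc_data1"].str[0:3]

# Option 2: an explicit per-cell function, equivalent for this data.
selected_apply = df["adc_data1"].apply(lambda arr: arr[0:3])

# Option 3: assumes all arrays share one length; stack once, then slice in NumPy.
stacked = np.stack(df["adc_data1"].tolist())[:, 0:3]

print(df[["Train ID", "selected data"]])
```

Options 1 and 2 still loop over the rows internally, so they are mainly a convenience. Option 3 is the only one that slices in a single vectorized NumPy operation, at the cost of requiring equal-length arrays.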