How to use a string to set iloc in pandas
Question:
I understand the general usage of iloc as follows.
import pandas as pd
df = pd.DataFrame([[1,2,3,4,5],[4,5,6,4,5],[7,8,9,4,5],[10,11,12,4,5]])
df_ = df.iloc[:, 1:4]
However, is it possible to pass the slice to iloc as a string, even if only in limited cases?
Below is pseudocode that does not work, but shows what I would like to do.
import pandas as pd
df = pd.DataFrame([[1,2,3,4,5],[4,5,6,4,5],[7,8,9,4,5],[10,11,12,4,5]])
df.columns = ["money","job","fruits","animals","height"]
tests = ["1:2","2:3", "1:4"]
for i in tests:
    print(df.iloc[:,i])
Is there a better way to split the string into "start_col" and "end_col" using a function?
Answers:
You can just create a converter function:
import pandas as pd
df = pd.DataFrame([[1,2,3,4,5],[4,5,6,4,5],[7,8,9,4,5],[10,11,12,4,5]])
ranges = ["1:2", "2:3", "1:4"]
def as_int_range(ranges):
    return [i for rng in ranges for i in range(*map(int, rng.split(':')))]

df.iloc[as_int_range(ranges),:]
    0   1   2  3  4
1   4   5   6  4  5
2   7   8   9  4  5
1   4   5   6  4  5
2   7   8   9  4  5
3  10  11  12  4  5
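Note that the converter above passes the list as the first argument to iloc, so it selects rows. For the column selection asked about in the question, one option is to turn each string into a Python slice object and pass it in the second position. This is only a minimal sketch, assuming every string contains a colon and has the form "start:stop" with integer or empty parts; the helper name to_slice is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4, 5], [4, 5, 6, 4, 5], [7, 8, 9, 4, 5], [10, 11, 12, 4, 5]])
df.columns = ["money", "job", "fruits", "animals", "height"]


def to_slice(text):
    """Convert a string such as "1:4" into slice(1, 4) (hypothetical helper)."""
    # An empty part (as in "2:" or ":3") becomes None, as in normal slicing.
    parts = [int(p) if p.strip() else None for p in text.split(":")]
    return slice(*parts)


for text in ["1:2", "2:3", "1:4"]:
    # The slice is the second indexer, so it selects columns by position.
    print(df.iloc[:, to_slice(text)])
```

Because a real slice object is passed, the result keeps normal slicing behavior: the stop position is excluded, and open-ended forms such as "2:" work without extra code.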
iloc[] selects by integer position. To select by label, such as a column name given as a string, use loc[] in the same way you have used iloc[] for positions. The official pandas documentation for loc[] is here:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html
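To make the difference concrete, here is a small sketch using the column names from the question. It only illustrates label-based versus position-based slicing; it does not parse strings like "1:4".

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4, 5], [4, 5, 6, 4, 5], [7, 8, 9, 4, 5], [10, 11, 12, 4, 5]])
df.columns = ["money", "job", "fruits", "animals", "height"]

# Label-based slice: with loc, both endpoints are included.
by_label = df.loc[:, "job":"animals"]

# Position-based slice: with iloc, the stop position is excluded.
by_position = df.iloc[:, 1:4]

# Both selections contain the columns job, fruits and animals.
print(by_label.equals(by_position))
```

The inclusive endpoint of loc is the main thing to watch for when switching between the two indexers.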
I didn’t mention it in my original question.
I wrote a program that supports examples like ["1:3, 4"].
import pandas as pd
df = pd.DataFrame([[1,2,3,4,5],[4,5,6,4,5],[7,8,9,4,5],[10,11,12,4,5]])
df.columns = ["a", "b", "c" , "d", "e"]
def args_to_list(string):
    # Split "1:3, 4" into its comma-separated parts.
    strings = string.split(",")
    column_list = []
    for each_string in strings:
        each_string = each_string.strip()
        if ":" in each_string:
            # "start:end" expands to start, ..., end - 1, like a Python slice.
            start_, end_ = each_string.split(":")
            for i in range(int(start_), int(end_)):
                column_list.append(i)
        else:
            # A single number is used as one column position.
            column_list.append(int(each_string))
    return column_list

tests = ["1:2", "1,2,3,4", "1:2,3", "1,2:3,4"]
for i in tests:
    list_ = args_to_list(i)
    print(list_)
    print(df.iloc[:, list_])
    print(list_)
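One thing the parser above does not check is whether the positions exist in the frame; iloc raises an IndexError when a list contains an out-of-range position. A possible variant, shown only as a sketch, validates the positions up front so the error names the offending input. The function name parse_positions and the n_columns parameter are made up for illustration, and this version deliberately rejects negative positions even though iloc itself accepts them.

```python
import pandas as pd


def parse_positions(spec, n_columns):
    """Parse "1:3, 4" into [1, 2, 4] and check each position (hypothetical helper)."""
    positions = []
    for part in spec.split(","):
        part = part.strip()
        if ":" in part:
            # "start:stop" expands like range(start, stop); stop is excluded.
            start, stop = part.split(":")
            positions.extend(range(int(start), int(stop)))
        else:
            positions.append(int(part))
    # Assumption: only positions 0 .. n_columns - 1 are considered valid here.
    bad = [p for p in positions if not 0 <= p < n_columns]
    if bad:
        raise ValueError(f"positions {bad} are outside 0..{n_columns - 1} in {spec!r}")
    return positions


df = pd.DataFrame([[1, 2, 3, 4, 5], [4, 5, 6, 4, 5], [7, 8, 9, 4, 5], [10, 11, 12, 4, 5]])
df.columns = ["a", "b", "c", "d", "e"]

cols = parse_positions("1:3, 4", df.shape[1])
print(cols)
print(df.iloc[:, cols])
```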