How to split the strings in a DataFrame column into 6-character chunks?
Question:
In a DataFrame column, I have strings of digits that I would like to split after every 6 digits, adding a comma ',' between the pieces, so that I get a list of HS codes in the column. I tried the code below, but it needs some correction.
df.loc[df[:, 1] for i in range(0, len(['id'], 6)
Answers:
Assuming you want to split from the left:
df['id'] = df['id'].astype(str).str.replace(r'(.{6})(?=.)', r'\1,', regex=True)
Output:
id
0 280530,284442,284690
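To make the regex approach easier to check, here is a minimal, self-contained sketch. The sample values are assumptions chosen to match the output shown above; they are not the asker's actual data. The pattern captures each run of 6 characters that is followed by at least one more character, and the replacement `\1,` writes that run back with a comma after it, so no trailing comma is added at the end.

```python
import pandas as pd

# Hypothetical sample data (not from the question); ids stored as strings
df = pd.DataFrame({"id": ["280530284442284690", "345642232"]})

# (.{6})  -> capture a run of exactly 6 characters
# (?=.)   -> only match if at least one more character follows
# r"\1,"  -> put the captured run back, followed by a comma
df["id"] = df["id"].astype(str).str.replace(r"(.{6})(?=.)", r"\1,", regex=True)

print(df["id"].tolist())
# ['280530,284442,284690', '345642,232']
```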
Code:
import pandas as pd
data = {'Column 1': ['a', 'b', 'c'],
        'id': [2468938493843983, 345642232, 23343433]}
df = pd.DataFrame(data)
df['id'] = df['id'].astype(str)
# Slice each string into 6-character pieces from the left and join them with commas
df['fromleft'] = df['id'].apply(lambda s: ','.join(s[j:j+6] for j in range(0, len(s), 6)))
print(df)
Output:
  Column 1                id            fromleft
0        a  2468938493843983  246893,849384,3983
1        b         345642232          345642,232
2        c          23343433           233434,33
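Since the question asks for a list of codes, it may help to keep the chunks as actual Python lists rather than one comma-joined string. The sketch below is one possible way to do that with `Series.str.findall`, reusing the sample data from the code above; the column name `codes` is an assumption introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Column 1": ["a", "b", "c"],
                   "id": [2468938493843983, 345642232, 23343433]})

# .{1,6} greedily matches up to 6 characters at a time, scanning from the left,
# so every row becomes a list of chunks and the last chunk may be shorter.
chunks = df["id"].astype(str).str.findall(r".{1,6}")

df["codes"] = chunks                   # hypothetical column holding the lists
df["fromleft"] = chunks.str.join(",")  # same comma-joined string as above

print(df["codes"].tolist())
# [['246893', '849384', '3983'], ['345642', '232'], ['233434', '33']]
```

Keeping the list column makes it possible to call `df.explode("codes")` later if one row per code is needed.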