How to fetch data and store into multiple files based on condition
Question:
test.csv
name,age,n1,n2,n3
a,21,1,2,3
b,22,4,9,0
c,25,4,5,6
d,25,41,5,6
e,25,4,66,6
f,25,4,5,66
g,25,4,55,6
h,25,4,5,56
i,25,41,5,61
j,25,4,51,60
k,20,40,50,60
l,21,40,51,60
My code so far, which reads the file and stores each row in a dict:
import pandas as pd
input_file = pd.read_csv("test.csv")
for i in range(0, len(input_file['name'])):
    dict1 = {}
    dict1["name"] = str(input_file['name'][i])
    dict1["age"] = str(input_file['age'][i])
    dict1["n1"] = str(input_file['n1'][i])
    dict1["n2"] = str(input_file['n2'][i])
    dict1["n3"] = str(input_file['n3'][i])
I want to write the output to multiple files, one file for every 5 rows of data. I need to do this with the writelines function in Python, because I have to do a lot of processing while writing each line. The file names should be generated dynamically, and the input is dynamic too (more rows can arrive).
Example of the expected output (the file name must be dynamic):
out_file = open('File1.xml', 'w')
# processed_row is a placeholder: the dictionary data, processed row by row
out_file.writelines(processed_row)
out_file.writelines("\n")
File1
a,21,1,2,3
b,22,4,9,0
c,25,4,5,6
d,25,41,5,6
e,25,4,66,6
File2
f,25,4,5,66
g,25,4,55,6
h,25,4,5,56
i,25,41,5,61
j,25,4,51,60
File3
k,20,40,50,60
l,21,40,51,60
Answers:
If the DataFrame has the default RangeIndex, you can loop over groupby, using integer division of the index by the number of rows per file (N) as the group key:
import pandas as pd
input_file = pd.read_csv("test.csv")
N = 5
for name, g in input_file.groupby(input_file.index // N):
    # index=False omits the row index; header=False omits the column names
    g.to_csv(f'file_{name}.csv', index=False, header=False)
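To see why integer division by N yields one group per file, here is a small sketch in plain Python (no pandas), assuming 12 data rows as in test.csv. The name group_keys is chosen only for illustration.

```python
N = 5          # rows per output file
n_rows = 12    # number of data rows in test.csv

# Row position // N gives the same key for every run of N consecutive rows,
# so rows 0-4 share key 0, rows 5-9 share key 1, and the rest get key 2.
group_keys = [i // N for i in range(n_rows)]
print(group_keys)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2]
```

The last group is simply shorter when the row count is not a multiple of N.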
If the index is not the default RangeIndex, group by row position instead:
import numpy as np
N = 5
for name, g in input_file.groupby(np.arange(len(input_file)) // N):
    g.to_csv(f'file_{name}.csv', index=False, header=False)
EDIT: If you really need to write line by line, use:
N = 5
for name, g in input_file.groupby(input_file.index // N):
    # name is the 0-based group number, so name+1 gives File1, File2, ...
    with open(f'File{name+1}.xml', 'w') as out_file:
        for data in g.to_numpy():
            out_file.write(','.join(str(x) for x in data))
            out_file.write('\n')
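Since the question builds a dict per row and wants to use writelines, the same idea can also be sketched on a list of row dicts. This is only one possible shape, not the answer's method: write_in_chunks and its parameters (n, out_dir, prefix, ext) are hypothetical names, and the comma join stands in for whatever per-row processing is actually needed.

```python
import os

def write_in_chunks(rows, n=5, out_dir=".", prefix="File", ext=".xml"):
    """Write row dicts to File1.xml, File2.xml, ... with n rows per file."""
    paths = []
    for start in range(0, len(rows), n):
        # start // n is the 0-based chunk number; +1 makes names start at 1.
        path = os.path.join(out_dir, f"{prefix}{start // n + 1}{ext}")
        with open(path, "w") as out_file:
            for row in rows[start:start + n]:
                # Placeholder processing: join the row's values with commas.
                line = ",".join(str(v) for v in row.values())
                out_file.writelines([line, "\n"])
        paths.append(path)
    return paths
```

With pandas, the row dicts could come from input_file.to_dict("records"), for example write_in_chunks(input_file.to_dict("records"), n=5). Because the file number is computed from the row position, more input rows just produce more files.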