Python – re-ordering columns in a csv
Question:
I have a bunch of csv files with the same columns but in a different order. We are trying to upload them with SQL*Plus, but we need the columns in a fixed order.
Example
required order: A B C D E F
csv file: A C D E B (sometimes a column is not in the csv because it is not available)
Is this achievable with Python? We are using Access + macros to do it, but it is too time-consuming.
PS. Sorry if my English upsets anyone.
Answers:
csv_in = open("<filename>.csv", "r")
csv_out = open("<output-filename>.csv", "w")  # must not be the same file as the input
for line in csv_in:
    field_list = line.rstrip("\n").split(',')  # drop the newline, split the line at commas
    output_line = ','.join([field_list[0],     # rejoin with commas, new order
                            field_list[2],
                            field_list[3],
                            field_list[4],
                            field_list[1]
                            ])
    csv_out.write(output_line + "\n")
csv_in.close()
csv_out.close()
You can use something similar to this to change the order. The code splits and joins on ','; if your files use a different delimiter such as ';', change it in both places.
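One caveat with splitting on ',' by hand: it misreads any field that contains a quoted comma. A minimal sketch of the same index-based reordering with csv.reader and csv.writer, which do handle quoting. The function name reorder_by_index and the in-memory sample are made up for illustration only:

```python
import csv
import io

def reorder_by_index(infile, outfile, order):
    """Copy rows from infile to outfile, picking fields by position in `order`."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator="\n")
    for fields in reader:
        writer.writerow([fields[i] for i in order])

# In-memory sample (hypothetical data) so the sketch runs without touching the disk.
sample = io.StringIO('A,C,D,E,B\na1,"c1,x",d1,e1,b1\n')
result = io.StringIO()
# Positions of A, B, C, D, E in the sample's header.
reorder_by_index(sample, result, [0, 4, 1, 2, 3])
print(result.getvalue())
```

This still relies on every file having the same column positions, so it only helps when the input order is known in advance.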
Because you said you need to process multiple .csv files, you could use the glob module to get a list of your files:
import glob
for file_name in glob.glob('<Insert-your-file-filter-here>*.csv'):
    pass  # Do the work here
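To make the "do the work here" step concrete, here is a rough sketch that loops over the files found by glob and writes a reordered copy of each into a separate directory. The names reorder_file, reorder_all and the "out" directory are assumptions for the example, and REQUIRED_ORDER is simply the order given in the question:

```python
import csv
import glob
import os
import tempfile

REQUIRED_ORDER = ["A", "B", "C", "D", "E", "F"]  # order taken from the question

def reorder_file(src_path, dst_path, fieldnames=REQUIRED_ORDER):
    """Rewrite one csv file with its columns in `fieldnames` order."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        # Columns absent from a file are written as empty strings
        # (DictWriter's default restval).
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in csv.DictReader(src):
            writer.writerow(row)

def reorder_all(pattern, out_dir):
    """Apply reorder_file to every file matching the glob pattern."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for src_path in sorted(glob.glob(pattern)):
        dst_path = os.path.join(out_dir, os.path.basename(src_path))
        reorder_file(src_path, dst_path)
        written.append(dst_path)
    return written

# Demo on a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "one.csv"), "w", newline="") as f:
        f.write("A,C,D,E,B\na1,c1,d1,e1,b1\n")
    outputs = reorder_all(os.path.join(tmp, "*.csv"), os.path.join(tmp, "out"))
    with open(outputs[0], newline="") as f:
        demo_result = f.read()
print(demo_result)
```

Writing to a separate directory keeps the originals untouched and avoids reading and truncating the same file at once.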
The csv module allows you to read csv files with their values associated with their column names. This in turn allows you to rearrange columns arbitrarily, without having to explicitly permute lists.
import csv
with open("foo.csv", newline="") as f:
    for row in csv.DictReader(f):
        print(row["b"], row["a"])
Output:
2 1
22 21
Given the file foo.csv:
a,b,d,e,f
1,2,3,4,5
21,22,23,24,25
You can use the csv module to read, reorder, and then write your file.
Sample File:
$ cat file.csv
A,B,C,D,E
a1,b1,c1,d1,e1
a2,b2,c2,d2,e2
Code
import csv
# 'w' rather than 'a', so re-running does not append a second header and copy of the rows
with open('file.csv', 'r', newline='') as infile, open('reordered.csv', 'w', newline='') as outfile:
    # output dict needs a list for new column ordering
    fieldnames = ['A', 'C', 'D', 'E', 'B']
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    # reorder the header first
    writer.writeheader()
    for row in csv.DictReader(infile):
        # writes the reordered rows to the new file
        writer.writerow(row)
Output
$ cat reordered.csv
A,C,D,E,B
a1,c1,d1,e1,b1
a2,c2,d2,e2,b2
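The question also mentions that a column is sometimes missing from a file. As a hedged sketch of how csv.DictWriter can cope with that: restval sets what is written for columns the input lacks, and extrasaction="ignore" drops input columns that are not in fieldnames (the default is to raise ValueError). The function name reorder_with_defaults, the extra column X and the sample data are invented for the example:

```python
import csv
import io

REQUIRED_ORDER = ["A", "B", "C", "D", "E", "F"]  # order taken from the question

def reorder_with_defaults(infile, outfile, fieldnames=REQUIRED_ORDER, fill=""):
    """Write rows in `fieldnames` order, filling missing columns with `fill`."""
    writer = csv.DictWriter(
        outfile,
        fieldnames=fieldnames,
        restval=fill,           # value written for columns the input lacks
        extrasaction="ignore",  # drop input columns that are not in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    for row in csv.DictReader(infile):
        writer.writerow(row)

# Hypothetical input: F is missing, X is an unexpected extra column.
source = io.StringIO("A,C,D,E,B,X\na1,c1,d1,e1,b1,x1\n")
target = io.StringIO()
reorder_with_defaults(source, target)
print(target.getvalue())
```

Whether an empty field is the right placeholder depends on how the file is loaded afterwards, so the fill value is left as a parameter.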
One way to tackle this problem is to use the pandas library, which can easily be installed using pip. Basically, you load the csv file into a pandas DataFrame, re-order the columns, and save it back to a csv file. For example, if your sample.csv looks like below:
A,C,B,E,D
a1,c1,b1,e1,d1
a2,c2,b2,e2,d2
Here is a snippet to solve the problem.
import pandas as pd
df = pd.read_csv('/path/to/sample.csv')
df_reorder = df[['A', 'B', 'C', 'D', 'E']] # rearrange column here
df_reorder.to_csv('/path/to/sample_reorder.csv', index=False)
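Selecting with df[[...]] raises a KeyError if one of the listed columns is not in the file, which matters for the "sometimes a column is not in the csv" case. A possible variation, assuming empty fields are acceptable for missing columns, is DataFrame.reindex. The in-memory csv text below is sample data for illustration, and REQUIRED_ORDER is the order from the question:

```python
import io

import pandas as pd

REQUIRED_ORDER = ["A", "B", "C", "D", "E", "F"]  # order taken from the question

# Sample input without column F; io.StringIO stands in for a real file path.
csv_text = "A,C,D,E,B\na1,c1,d1,e1,b1\na2,c2,d2,e2,b2\n"
df = pd.read_csv(io.StringIO(csv_text))

# reindex puts columns in the requested order and adds missing ones as NaN.
df_reorder = df.reindex(columns=REQUIRED_ORDER)

# With no path given, to_csv returns the csv text; NaN is written as an empty field.
reordered_text = df_reorder.to_csv(index=False)
print(reordered_text)
```

Note that reindex also silently drops any column that is not in the requested list.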