Create multiple files from unique values of a column using inbuilt libraries of python
Question:
I started learning python and was wondering if there was a way to create multiple files from unique values of a column. I know there are 100’s of ways of getting it done through pandas. But I am looking to have it done through inbuilt libraries. I couldn’t find a single example where its done through inbuilt libraries.
Here is the sample csv file data:
uniquevalue|count
a|123
b|345
c|567
d|789
a|123
b|345
c|567
Sample output file:
a.csv
uniquevalue|count
a|123
a|123
b.csv
b|345
b|345
I am struggling with looping on unique values in a column and then print them out. Can someone explain with logic how to do it ? That will be much appreciated. Thanks.
Answers:
import csv
with open('sample.csv', newline='') as csvfile:
reader = csv.reader(csvfile)
for row in reader:
with open(f"{row[0]}.csv", 'a') as inner:
writer = csv.writer(
inner, delimiter='|',
fieldnames=('uniquevalue', 'count')
)
writer.writerow(row)
import csv
from collections import defaultdict
header = []
data = defaultdict(list)
DELIMITER = "|"
with open("inputfile.csv", newline="") as csvfile:
reader = csv.reader(csvfile, delimiter=DELIMITER)
for i, row in enumerate(reader):
if i == 0:
header = row
else:
key = row[0]
data[key].append(row)
for key, value in data.items():
filename = f"{key}.csv"
with open(filename, "w", newline="") as f:
writer = csv.writer(f, delimiter=DELIMITER)
rows = [header] + value
writer.writerows(rows)
the task can also be done without using csv module. the lines of the file are read, and with read_file.read().splitlines()[1:]
the newline characters are stripped off, also skipping the header line of the csv file. with a set a unique collection of inputdata is created, that is used to count number of duplicates and to create the output files.
with open("unique_sample.csv", "r") as read_file:
items = read_file.read().splitlines()[1:]
for line in set(items):
with open(line[:line.index('|')] + '.csv', 'w') as output:
output.write((line + 'n') * items.count(line))
I started learning python and was wondering if there was a way to create multiple files from unique values of a column. I know there are 100’s of ways of getting it done through pandas. But I am looking to have it done through inbuilt libraries. I couldn’t find a single example where its done through inbuilt libraries.
Here is the sample csv file data:
uniquevalue|count
a|123
b|345
c|567
d|789
a|123
b|345
c|567
Sample output file:
a.csv
uniquevalue|count
a|123
a|123
b.csv
b|345
b|345
I am struggling with looping on unique values in a column and then print them out. Can someone explain with logic how to do it ? That will be much appreciated. Thanks.
import csv
with open('sample.csv', newline='') as csvfile:
reader = csv.reader(csvfile)
for row in reader:
with open(f"{row[0]}.csv", 'a') as inner:
writer = csv.writer(
inner, delimiter='|',
fieldnames=('uniquevalue', 'count')
)
writer.writerow(row)
import csv
from collections import defaultdict
header = []
data = defaultdict(list)
DELIMITER = "|"
with open("inputfile.csv", newline="") as csvfile:
reader = csv.reader(csvfile, delimiter=DELIMITER)
for i, row in enumerate(reader):
if i == 0:
header = row
else:
key = row[0]
data[key].append(row)
for key, value in data.items():
filename = f"{key}.csv"
with open(filename, "w", newline="") as f:
writer = csv.writer(f, delimiter=DELIMITER)
rows = [header] + value
writer.writerows(rows)
the task can also be done without using csv module. the lines of the file are read, and with read_file.read().splitlines()[1:]
the newline characters are stripped off, also skipping the header line of the csv file. with a set a unique collection of inputdata is created, that is used to count number of duplicates and to create the output files.
with open("unique_sample.csv", "r") as read_file:
items = read_file.read().splitlines()[1:]
for line in set(items):
with open(line[:line.index('|')] + '.csv', 'w') as output:
output.write((line + 'n') * items.count(line))