Convert CSV style
Question:
I have a CSV file that is formatted in a way that I am unfamiliar with.
The file contains hourly mean power output over a whole year for a couple of generators and the water level in the reservoir of a hydropower plant.
These are the first 17 lines (4 hours) of the file.
CTime,textbox1,Name,textbox2
01-01-2021 00:00,Middel,TURBINEG2 EFF,0.00
01-01-2021 00:00,Middel,TURBINUG3 EFF,0.00
01-01-2021 00:00,Middel,RSVRLEVEL VST,-4.98
01-01-2021 00:00,Middel,TURBINEG1 EFF,0.00
01-01-2021 01:00,Middel,TURBINEG2 EFF,0.00
01-01-2021 01:00,Middel,TURBINUG3 EFF,0.00
01-01-2021 01:00,Middel,RSVRLEVEL VST,-4.98
01-01-2021 01:00,Middel,TURBINEG1 EFF,0.00
01-01-2021 02:00,Middel,TURBINEG2 EFF,0.00
01-01-2021 02:00,Middel,TURBINUG3 EFF,0.00
01-01-2021 02:00,Middel,RSVRLEVEL VST,-4.97
01-01-2021 02:00,Middel,TURBINEG1 EFF,0.00
01-01-2021 03:00,Middel,TURBINEG2 EFF,0.00
01-01-2021 03:00,Middel,TURBINUG3 EFF,0.00
01-01-2021 03:00,Middel,RSVRLEVEL VST,-4.96
01-01-2021 03:00,Middel,TURBINEG1 EFF,0.00
What I want is five columns (CTime, TURBINEG1, TURBINEG2, TURBINEG3, WATERLEVEL) with one row for each hour instead of this (four rows per hour).
I haven’t gotten anywhere by simply iterating over each hour of the year and writing to the columns of a new text file. Unfortunately, I haven’t come up with any code that is worth showing here.
Answers:
In the future, please attach a text file, not a screenshot 🙂
This code should do what you need:
import csv

fieldnames_out = ['CTime', 'TURBINEG1', 'TURBINEG2', 'TURBINEG3', 'WATERLEVEL']

with open('csv_in.csv', newline='') as csv_in, \
        open('csv_out', 'w', newline='') as csv_out:
    reader = csv.DictReader(csv_in)
    writer = csv.DictWriter(csv_out, fieldnames=fieldnames_out)
    writer.writeheader()
    while True:
        # Start each hour with an empty output row.
        row_out = dict.fromkeys(fieldnames_out)
        try:
            # Assumes the four rows for each hour always come in this order:
            # TURBINEG2, TURBINUG3, RSVRLEVEL, TURBINEG1 (as in the sample).
            row_in = next(reader)
            row_out['CTime'] = row_in['CTime']
            row_out['TURBINEG2'] = row_in['textbox2']
            row_in = next(reader)
            row_out['TURBINEG3'] = row_in['textbox2']
            row_in = next(reader)
            row_out['WATERLEVEL'] = row_in['textbox2']
            row_in = next(reader)
            row_out['TURBINEG1'] = row_in['textbox2']
        except StopIteration:
            # End of input; an incomplete final hour is not written.
            break
        print(row_out)
        writer.writerow(row_out)
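The code above relies on the four rows for each hour always appearing in the same order. If that cannot be guaranteed, one possible variation is to look at the Name column instead of the row position. The following is only a sketch under assumptions: the mapping NAME_TO_COLUMN and the function pivot_rows are names made up for this illustration, the mapping is guessed from the sample rows in the question, and all rows are held in memory, which should be fine for one year of hourly data.

```python
import csv
import io

# Hypothetical mapping from the Name values seen in the sample to the
# requested column names; extend it if the real file has other names.
NAME_TO_COLUMN = {
    'TURBINEG1 EFF': 'TURBINEG1',
    'TURBINEG2 EFF': 'TURBINEG2',
    'TURBINUG3 EFF': 'TURBINEG3',
    'RSVRLEVEL VST': 'WATERLEVEL',
}
FIELDNAMES_OUT = ['CTime', 'TURBINEG1', 'TURBINEG2', 'TURBINEG3', 'WATERLEVEL']


def pivot_rows(rows):
    """Collect long-format rows into one dict per CTime, keyed by Name."""
    by_time = {}  # dicts keep insertion order, so hours stay in file order
    for row in rows:
        row_out = by_time.setdefault(row['CTime'], dict.fromkeys(FIELDNAMES_OUT))
        row_out['CTime'] = row['CTime']
        column = NAME_TO_COLUMN.get(row['Name'])
        if column is not None:  # rows with an unmapped Name are skipped
            row_out[column] = row['textbox2']
    return list(by_time.values())


# The first three hours from the question, used here instead of a real file.
sample = """CTime,textbox1,Name,textbox2
01-01-2021 00:00,Middel,TURBINEG2 EFF,0.00
01-01-2021 00:00,Middel,TURBINUG3 EFF,0.00
01-01-2021 00:00,Middel,RSVRLEVEL VST,-4.98
01-01-2021 00:00,Middel,TURBINEG1 EFF,0.00
01-01-2021 01:00,Middel,TURBINEG2 EFF,0.00
01-01-2021 01:00,Middel,TURBINUG3 EFF,0.00
01-01-2021 01:00,Middel,RSVRLEVEL VST,-4.98
01-01-2021 01:00,Middel,TURBINEG1 EFF,0.00
01-01-2021 02:00,Middel,TURBINEG2 EFF,0.00
01-01-2021 02:00,Middel,TURBINUG3 EFF,0.00
01-01-2021 02:00,Middel,RSVRLEVEL VST,-4.97
01-01-2021 02:00,Middel,TURBINEG1 EFF,0.00
"""

wide = pivot_rows(csv.DictReader(io.StringIO(sample)))

# Write the wide rows as CSV; with a real file, pass an open file object
# (opened with newline='') instead of the StringIO buffer.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=FIELDNAMES_OUT)
writer.writeheader()
writer.writerows(wide)
print(out.getvalue())
```

With this approach a missing measurement leaves an empty cell in that hour's row instead of shifting every following value into the wrong column.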