Pandas – Aggregating and Ploting data from multiplefiles

Question:

I have multiple files naming like starting with Quarter’Year prefix like "Q121", "Q222" and all of these files contains data like below, Identifier stays the same but price and power columns are chainging. What would be the best way to aggregate files and be able to plot column like price across data range e.g from Q1’20 to Q2’22.

I was thinking about merging different files based on Identifier and then change column names to reflect Q120 etc…

data = [['A1', 10,50], ['A2', 15,60], ['A3', 14,55]]
df = pd.DataFrame(data, columns=['Identifier', 'Price','Power'])
Asked By: Drac0

||

Answers:

from glob import glob
import os
import pandas as pd

files = glob.glob("target_folder/*.csv")
dfs = []
for file_path in files:
    file_name = os.path.basename(file_path)
    temp = pd.read_csv(file_path, names=["Price","Power"], header=None)
    temp["Identifier"] = file_name.replace(".csv", "")
    dfs.append(temp)
df = pd.concat(dfs)
# change Q122 -> 22Q1, so the plot can order full list nicely 
df["Identifier"] = df["Identifier"].apply(lambda x: "".join([x[2],x[3],x[0],x[1]]))
Answered By: lucky6qi
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.