Python: Read CSV files one by one in a folder and save each output as a PNG file
Question:
I have a folder with 50 CSV files named countrieslist1.csv, countrieslist2.csv, countrieslist3.csv and so on. I have code that reads the values from one CSV file using pandas and plots the required graph from the data. What I want is for the code to take the first CSV file, do the plotting and save the result as a PNG file, then take the second CSV file and do the same, and so on for every CSV file, so that in the end I have 50 PNG files (one for each CSV file).
I tried:
import pandas as pd
import os
import matplotlib.pyplot as plt
folder_path = "C:/Users/xyz/Desktop/Countrieslist"
df=pd.read_csv(folder_path)
X=df['columnname'].value_counts(normalize=True).head(5)
X.plot.barh()
plt.ylabel()
plt.xlabel()
plt.title()
plt.savefig(folder_path[:-3]+'png')
This produces the output, but only for a single CSV file. I want code that takes all the CSV files one by one, does the plotting and saves each plot as a PNG file. How can I do that?
Answers:
First of all, get the .csv files:
import glob, os
csv_files = []
# Work inside the folder so that the relative file names resolve
os.chdir("C:/Users/xyz/Desktop/Countrieslist")
for file in glob.glob("*.csv"):
    csv_files.append(file)
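One thing to keep in mind with this step: glob returns names in no guaranteed order, and a plain alphabetical sort puts countrieslist10.csv before countrieslist2.csv. If the files really should be handled "one by one" in numeric order, a natural-sort key is one option. The helper name natural_key below is hypothetical and this is only a minimal sketch:

```python
import re


def natural_key(filename):
    """Split a name into text and integer chunks so that 2 sorts before 10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", filename)]


names = ["countrieslist10.csv", "countrieslist2.csv", "countrieslist1.csv"]
ordered = sorted(names, key=natural_key)
print(ordered)  # ['countrieslist1.csv', 'countrieslist2.csv', 'countrieslist10.csv']
```

With the list built above, this would be used as csv_files.sort(key=natural_key).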
The next step is to do your processing in a loop:
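# Assumes pandas (pd) and matplotlib.pyplot (plt) are imported as in the question
for file in csv_files:
    df = pd.read_csv(file)
    X = df['columnname'].value_counts(normalize=True).head(5)
    plt.figure()  # start a new figure so bars from earlier files are not redrawn
    X.plot.barh()
    plt.ylabel('columnname')  # placeholder label
    plt.xlabel('Proportion')  # placeholder label
    plt.title(file)  # placeholder title
    plt.savefig(os.path.splitext(file)[0] + '.png')  # countrieslist1.csv -> countrieslist1.png
    plt.close()  # release the figure before the next file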
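The two steps above can also be combined into a single function. The following is a minimal sketch, not a definitive implementation: the function name plot_csv_folder, the top_n parameter, the axis labels and the choice of the non-interactive Agg backend are all assumptions made for illustration. It creates a fresh figure for each file and closes it afterwards, so that one plot never bleeds into the next.

```python
import os

import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption): no window is opened
import matplotlib.pyplot as plt
import pandas as pd


def plot_csv_folder(folder, column, top_n=5):
    """Save one bar chart PNG per CSV file in `folder`; return the PNG paths."""
    saved = []
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(".csv"):
            continue  # skip anything that is not a CSV file
        df = pd.read_csv(os.path.join(folder, name))
        shares = df[column].value_counts(normalize=True).head(top_n)

        fig, ax = plt.subplots()  # fresh figure for every file
        shares.plot.barh(ax=ax)
        ax.set_xlabel("Share of rows")  # hypothetical label
        ax.set_ylabel(column)
        ax.set_title(name)

        # countrieslist1.csv -> countrieslist1.png, next to the CSV file
        png_path = os.path.join(folder, os.path.splitext(name)[0] + ".png")
        fig.savefig(png_path)
        plt.close(fig)  # free the figure before the next file
        saved.append(png_path)
    return saved
```

Calling plot_csv_folder("C:/Users/xyz/Desktop/Countrieslist", "columnname") would then write one PNG next to each CSV file and return the list of paths it wrote.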
Since you already have os imported, you can use its listdir function. The following code iterates over the contents of the directory and skips any file that is not a CSV file:
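folder = "C:/Users/xyz/Desktop/Countrieslist"
for file in os.listdir(folder):
    if not file.endswith('.csv'):
        continue
    # listdir returns bare file names, so join them with the folder
    df = pd.read_csv(os.path.join(folder, file))
    # continue with other code here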
You can use the following code:
import pandas as pd
import pathlib
import matplotlib.pyplot as plt

folder_path = pathlib.Path("C:/Users/xyz/Desktop/Countrieslist")

def create_image(filename, columnname):
    df = pd.read_csv(filename)
    plt.figure()  # new figure per file; otherwise the Series plot reuses the current axes
    # normalize=True yields proportions, so the y-axis shows shares rather than raw counts
    ax = (df[columnname].value_counts(normalize=True).head(5)
            .plot.bar(ylabel='Proportion', xlabel='Country',
                      title='Value counts',
                      legend=False, rot=0))
    plt.savefig(folder_path / f'{filename.stem}.png')
    plt.close()  # release the figure before the next file

for filename in folder_path.glob('*.csv'):
    create_image(filename, 'Country')
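Since the goal is to end up with exactly one PNG per CSV file, it can help to check the folder after the loop has run. The helper below is a hypothetical sketch (the name csvs_without_png is not from the answer above); it assumes each PNG is saved next to its CSV under the same stem, as in the code above.

```python
from pathlib import Path


def csvs_without_png(folder):
    """Return the CSV files in `folder` that have no PNG with the same stem."""
    folder = Path(folder)
    return sorted(p for p in folder.glob("*.csv")
                  if not p.with_suffix(".png").exists())
```

An empty list from csvs_without_png(folder_path) would indicate that every CSV file got its image.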
Sample output images: countrieslist5.png, countrieslist8.png
Input data:
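import os
import numpy as np
import pandas as pd

REGIONS = ['AL', 'AT', 'BE', 'BG', 'CH', 'CZ', 'DE', 'DK',
           'EE', 'ES', 'FI', 'FR', 'GR', 'HR', 'HU', 'IE',
           'IT', 'LT', 'LU', 'LV', 'ME', 'NL', 'NO', 'PL',
           'PT', 'RO', 'RS', 'SE', 'SI', 'SK', 'UK']
os.makedirs('Countrieslist', exist_ok=True)  # to_csv does not create the folder
for i in range(1, 10):  # writes countrieslist1.csv .. countrieslist9.csv
    df = pd.DataFrame({'Country': np.random.choice(REGIONS, 200)})
    df.to_csv(f'Countrieslist/countrieslist{i}.csv', index=False)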