Loop through all CSV files in a folder
Question:
I’m trying to loop through only the CSV files in a folder that contains many kinds of files and many subfolders. I just want to list all of the .csv files in this folder.
Here’s what I mean:
import os, sys
path = "path/to/dir"
dirs = os.listdir(path)
for file in dirs:
    # This is never true: == compares against the literal string '*.csv'
    if file == '*.csv':
        print(file)
I know there is no wildcard variable in Python, but is there a way of doing this?
Answers:
Python provides the glob module, which should do this:
>>> import glob
>>> glob.glob('/path/to/dir/*.csv')
Return a possibly-empty list of path names that match pathname, which
must be a string containing a path specification. pathname can be
either absolute (like /usr/src/Python-1.5/Makefile) or relative (like
../../Tools/*/*.gif), and can contain shell-style wildcards. Broken
symlinks are included in the results (as in the shell).
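As a rough illustration of the behaviour described above, here is a minimal sketch that wraps glob.glob in a helper. The function name list_csv_files is hypothetical and not part of the glob module; the sketch assumes you only want files directly inside the folder, not in subfolders.

```python
import glob
import os


def list_csv_files(folder):
    """Return the sorted paths of the .csv files directly inside folder.

    list_csv_files is a hypothetical helper name used for illustration.
    """
    # Build the pattern with os.path.join so the separator is correct
    # on both Windows and POSIX systems.
    pattern = os.path.join(folder, "*.csv")
    # glob.glob returns matches in arbitrary order, so sort for stable output.
    return sorted(glob.glob(pattern))
```

Calling list_csv_files("path/to/dir") should return an empty list when nothing matches, so the result can be looped over without a separate existence check.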
Use the glob module: http://docs.python.org/2/library/glob.html
import glob
path = "path/to/dir/*.csv"
for fname in glob.glob(path):
    print(fname)
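If you would rather keep the os.listdir loop from the question, one possible fix is to test each name with str.endswith instead of comparing it to a wildcard string. This is only a sketch: csv_names is a hypothetical helper name, and the case-insensitive match and the check that skips directories are assumptions about what you want.

```python
import os


def csv_names(folder):
    """Return the sorted names of the CSV files directly inside folder.

    csv_names is a hypothetical helper name used for illustration.
    """
    names = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Skip subfolders, and accept .csv, .CSV, .Csv and so on.
        if os.path.isfile(full_path) and name.lower().endswith(".csv"):
            names.append(name)
    return names
```

Unlike glob.glob, this returns bare file names rather than paths, so join each name with the folder before opening it.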
I was trying to loop through a folder containing CSV files and print the number and names of the columns in each one.
The following code worked for me:
import pandas as pd
import glob
path = r"C:UsersgumnweOneDrive - BPDesktopPersonaleiLinkSkin ProjectSkin_Project_Data_2020*.csv"
for fname in glob.glob(path):
    df = pd.read_csv(fname)
    my_list = list(df.columns)
    print(len(my_list), my_list)
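If pandas is not installed, or the files are large and you only need the column names, a sketch using the standard csv module can read just the first row of each file. The helper name csv_headers is hypothetical, and the sketch assumes every file has a header row and uses the default comma delimiter.

```python
import csv
import glob
import os


def csv_headers(folder):
    """Map each CSV file name in folder to the list of its header fields.

    csv_headers is a hypothetical helper name used for illustration.
    """
    headers = {}
    for fname in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        # newline="" is what the csv module documentation recommends.
        with open(fname, newline="") as f:
            # Read only the first row; an empty file yields an empty list.
            headers[os.path.basename(fname)] = next(csv.reader(f), [])
    return headers
```

Looping over csv_headers("path/to/dir").items() and printing len(columns) alongside columns would give the same kind of report as the pandas version.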