Load CSV files in a script when the script is run from outside its directory
Question:
I am having trouble loading CSV files in a Python script called main.py when I run it from outside the directory that contains the script and the CSV files. (The CSV files and main.py are in the same directory.) I think this is the same issue as this SO post, which doesn’t appear to contain a solution.
If I run this from terminal:
$ python /home/bbartling/Desktop/Load-Shredder/Load-Shift/main.py
I get a CSV file loading error:
[Errno 2] No such file or directory: 'addresses_jci.csv'
But if I run the script from inside the Load-Shift directory:
$ python main.py
it works just fine.
How do I fix my script to accommodate this? I added this to the top of my script:
script_dir = os.path.abspath( os.path.dirname( __file__ ) )
print("script directory: ",script_dir)
which prints:
script directory: /home/bbartling/Desktop/Load-Shredder/Load-Shift
But still no luck. Any ideas to try?
Edit
CSV file loading function in main.py
dir_path = os.path.dirname(os.path.realpath(__file__))
filename = os.path.join(dir_path, f'log_{dt_string}.log')
script_dir = os.path.abspath( os.path.dirname( __file__ ) )
print("script directory: ",script_dir)
def load_addresses(csv_file):
    try:
        print(f"Loading csv file: {csv_file}!")
        os.path.join(script_dir, f'{csv_file}.csv')
        with open(f'{csv_file}', newline='') as f:
            reader = csv.reader(f)
            data = list(reader)
        # flattens list of lists
        csv_file = sum(data, [])
        print(f"{csv_file} addresses loaded: ",csv_file)
    except Exception as error:
        print(f"CSV file load error: {error}")
        csv_file = [] # errors out
    return csv_file
Answers:
File paths in general are a bit confusing in programming languages, so this is a very common problem for beginners.
I typically have a fixed working directory for my projects and access all files relative to that working directory. If you can arrange it, this is the simplest solution:
$ cd /home/bbartling/Desktop/Load-Shredder/Load-Shift/
$ python main.py
This gives the desired result, but depending on your project it may or may not be feasible. It is by no means a hard-coded workaround or a hot fix, and it is perfectly valid for my projects. However, if you’re making a reusable shell script, you probably don’t want this.
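To see why the working directory matters, here is a minimal sketch showing that a bare file name is resolved against the current working directory, not against the script's location. The helper name resolves_to and the temporary directory are illustrative assumptions, not part of the original script.

```python
import os
import tempfile


def resolves_to(relative_name):
    """Return the absolute path that open(relative_name) would use right now."""
    return os.path.abspath(relative_name)


# Hypothetical demo: the same relative name points at different locations
# depending on the current working directory.
original_cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)
    try:
        from_tmp = resolves_to("addresses_jci.csv")
    finally:
        os.chdir(original_cwd)  # restore so later code is unaffected
from_original = resolves_to("addresses_jci.csv")

print(from_tmp)
print(from_original)
```

The two printed paths differ, which is why open('addresses_jci.csv') only succeeds when the script is started from the directory that holds the file.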
I’ve used Path(__file__).parent before to get the directory the script file is in, but I don’t know how its speed compares to the way you compute it.
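A possible pathlib version of the loader might look like the sketch below. The base_dir parameter is an assumption added for illustration so that the function does not depend on the working directory; in main.py it would be the script's own folder.

```python
import csv
from pathlib import Path


def load_addresses(csv_name, base_dir):
    """Read a CSV file located in base_dir and flatten its rows into one list."""
    csv_path = Path(base_dir) / csv_name
    with csv_path.open(newline="") as f:
        rows = list(csv.reader(f))
    # Flatten the list of rows into a single list of cell values.
    return [cell for row in rows for cell in row]


# In main.py the base directory would be the script's own folder:
#     SCRIPT_DIR = Path(__file__).resolve().parent
#     addresses = load_addresses("addresses_jci.csv", SCRIPT_DIR)
```

Because the path is built from the script's location, the call should behave the same no matter which directory the script is launched from.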
I’ve used this a couple of times before:
import os
HERE = os.path.dirname(os.path.realpath(__file__))
with open(os.path.join(HERE, 'mkdocs.yml')) as fl:
    print(fl.read())
You can change the filename here to the path of your CSV file relative to your main.py directory, and that will work. If you know where the file will be stored relative to your main.py directory, you can hard-code it; otherwise get it from the command line or console, whichever you prefer.
This is working! In the function that loads a file, the result of os.path.join has to be assigned to a variable, because os.path.join returns a new string and does not modify its arguments.
This works, because the joined path is stored and then used:
full_csv_path = os.path.join(dir_path, f'{csv_file}')
This doesn't work, because the joined path is discarded:
os.path.join(dir_path, f'{csv_file}')
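The difference can be checked in isolation with a small sketch. The directory components below are hypothetical placeholders; only the file name comes from the question.

```python
import os

dir_path = os.path.join("home", "user", "project")  # hypothetical directory
csv_file = "addresses_jci.csv"

# The return value is discarded, so csv_file still holds the bare file name.
os.path.join(dir_path, csv_file)
unchanged = csv_file

# The return value is kept, so full_csv_path holds the joined path.
full_csv_path = os.path.join(dir_path, csv_file)

print(unchanged)
print(full_csv_path)
```

Python strings are immutable, so a function such as os.path.join can only hand back a new value; nothing changes unless that value is assigned.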
Example below that also includes logging:
import csv, os, logging
from datetime import datetime
# datetime object containing current date and time
now = datetime.now()
dt_string = now.strftime("%m_%d_%Y %H_%M_%S")
# directory that contains this script, independent of the working directory
dir_path = os.path.dirname(os.path.realpath(__file__))
filename = os.path.join(dir_path, f'log_{dt_string}.log')
# Logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
file_handler = logging.FileHandler(filename)
file_handler.setLevel(logging.INFO)
file_handler.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))
logger.addHandler(file_handler)
print("dir_path is: ",dir_path)
def load_addresses(csv_file):
    try:
        print(f"Loading csv file: {csv_file}!")
        # keep the joined path; os.path.join returns a new string
        full_csv_path = os.path.join(dir_path, f'{csv_file}')
        print("full_csv_path: ",full_csv_path)
        with open(full_csv_path, newline='') as f:
            reader = csv.reader(f)
            data = list(reader)
        # flattens list of lists
        addresses = sum(data, [])
        print(f"{csv_file} addresses loaded: ",addresses)
    except Exception as error:
        print(f"CSV file load error: {error}")
        addresses = [] # errors out
    return addresses
I think the problem in your code is that you did not update the csv_file variable with the result of os.path.join. Update it like this:
csv_file = os.path.join(script_dir, f'{csv_file}.csv')
Though it can be simpler:
import os
script_path = os.path.dirname(__file__)
filename = 'sample'
file_path = os.path.join(script_path, f'{filename}.csv')
print(file_path)
You can use pandas, os and fnmatch for this purpose, as shown below:
import pandas as pd
import os,fnmatch
# change path to the folder from another folder
os.chdir("/path_to_the_folder_having_files_from_another_folder")
# print the current working directory to check if you have arrived
# in the right folder/directory
print(os.getcwd())
# then use fnmatch package to get the CSV files only
# or you can make some other pattern using some prefix or suffix
# like *this.csv or *that.csv
files = fnmatch.filter(os.listdir('.'), '*.csv')
# print the files to check that you have captured the right files
print(files)
# read your files using pandas
# be careful with `sep`: "\t" assumes tab-separated files, so check
# which separator your own files use (e.g. sep="," for commas)
csv_files = [pd.read_csv(f, low_memory=False, sep="\t", header=0) for f in files]
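Note that os.chdir changes the working directory for the whole process, so any other relative paths in the script are affected too. As an alternative sketch, the files can be collected by pattern without changing directory. This version swaps in pathlib and the standard csv module instead of pandas, and the function name read_all_csv and its parameters are illustrative assumptions.

```python
import csv
from pathlib import Path


def read_all_csv(folder, pattern="*.csv", delimiter=","):
    """Return {file name: list of rows} for every file in folder matching pattern."""
    tables = {}
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(newline="") as f:
            tables[path.name] = list(csv.reader(f, delimiter=delimiter))
    return tables


# Hypothetical usage from main.py, reading the CSV files next to the script:
#     tables = read_all_csv(Path(__file__).resolve().parent)
```

The pattern argument plays the same role as the fnmatch filter above, so something like "*this.csv" could be passed to narrow the selection.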