Loading Files and Data from Absolute and Relative Paths in Python


I have the following project structure in a Python project:

> nn-project  
>  - src
>    - models
>      - bird-model
>        - env.py
>        - train_model.py

My .env file contains the following:


In my env.py, I do the following:

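import os
from pathlib import Path

# PROJECT_ROOT is expected to come from the environment (loaded from the .env file).
project_root = os.environ.get('PROJECT_ROOT')
if not project_root:
    raise ValueError("PROJECT_ROOT environment variable is not set.")
# Convert the (possibly relative) root into an absolute path.
absolute_path = os.path.abspath(project_root)
data_dir = Path(os.path.join(absolute_path, 'data/raw/boston_housing_price/'))
models_dir = Path(os.path.join(absolute_path, 'models/boston_housing_price/'))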
print('***************** LOAD ENVIRONMENT ********************+')
print("Project Root DIR", project_root)
print("Project Root DIR abs", absolute_path)
print("Project Data DIR", data_dir)
print("Models Dump DIR", models_dir)
print('***************** LOAD ENVIRONMENT ********************+')

This prints the following:

***************** LOAD ENVIRONMENT ********************+
Project Root DIR ../nn-project/
Project Root DIR abs /home/user/Projects/Private/ml-projects/nn-project/nn-project
Project Data DIR /home/user/Projects/Private/ml-projects/nn-project/nn-project/data/raw/boston_housing_price
Models Dump DIR /home/user/Projects/Private/ml-projects/nn-project/nn-project/models/boston_housing_price
***************** LOAD ENVIRONMENT ********************+

I’m puzzled that nn-project appears twice in the absolute path. Why is that? What am I missing?

I load the .env file in my env.py like this:

from dotenv import find_dotenv
from dotenv import load_dotenv

env_file = find_dotenv(".env")
Asked By: joesan



I like to anchor to a file whose location I know, using __file__, and then walk from there to where I know the target file is located.

# src/models/bird_model/env.py
from pathlib import Path

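def find_dotenv():
    this_file = Path(__file__)
    # Walk up: bird_model -> models -> src -> project root, then point at .env there.
    return this_file.parent.parent.parent.parent.joinpath('.env').resolve()

if __name__ == '__main__':
    # Print the computed location as a quick check.
    print(find_dotenv())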

This works regardless of where the current working directory is when you execute the script.
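
By contrast, os.path.abspath() resolves a relative path against the current working directory, which is presumably why nn-project appears twice in the question's output: the script was likely run from a directory directly inside the project root. Here is a minimal sketch of that effect, using a hypothetical temporary directory tree in place of the real project (the helper name abspath_from is made up for illustration):

```python
import os
import tempfile
from pathlib import Path


def abspath_from(cwd: str, relative: str) -> str:
    """Return os.path.abspath(relative) as computed with `cwd` as the working directory."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return os.path.abspath(relative)
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


# Hypothetical layout standing in for the real project: <tmp>/nn-project/src
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    src = base / "nn-project" / "src"
    src.mkdir(parents=True)

    # From the project root: '..' steps out, 'nn-project' steps back in.
    from_root = abspath_from(str(base / "nn-project"), "../nn-project/")
    # From src: '..' lands in the project root, then 'nn-project' is appended again.
    from_src = abspath_from(str(src), "../nn-project/")

    print(from_root)
    print(from_src)
```

The same relative value gives two different absolute paths depending only on where the process was started, which is the dependency that anchoring to __file__ removes.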

You can also use double dots (..) in the path to walk up the tree, in which case it helps to call .resolve() to get a clean path. In general, it is a good idea to call .resolve() whenever __file__ is involved.
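
As a rough sketch of the double-dot variant: the directory tree below is a temporary stand-in for the project, and dotenv_via_dots is a hypothetical helper name, not part of any library.

```python
import tempfile
from pathlib import Path


def dotenv_via_dots(module_file: Path) -> Path:
    """Walk three levels above the module's directory using '..', then clean up the path."""
    return (module_file.parent / ".." / ".." / ".." / ".env").resolve()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical project tree: <tmp>/nn-project/src/models/bird_model/env.py
    root = Path(tmp).resolve() / "nn-project"
    module_file = root / "src" / "models" / "bird_model" / "env.py"
    module_file.parent.mkdir(parents=True)
    module_file.touch()

    unresolved = module_file.parent / ".." / ".." / ".." / ".env"
    resolved = dotenv_via_dots(module_file)

    print(unresolved)  # still contains the '..' segments
    print(resolved)    # the same location with the dots collapsed
```

In env.py itself you would pass Path(__file__) instead of the temporary path.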


__file__ is a special module-level variable that holds the path of the file it resides in.
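
Applied to the question, the same anchor could replace the PROJECT_ROOT lookup entirely. This is only a sketch: project_paths is a hypothetical helper, and it assumes that the data and models directories sit directly under the project root, as the printed paths in the question suggest.

```python
from pathlib import Path
from typing import Dict


def project_paths(module_file: str, levels_up: int = 3) -> Dict[str, Path]:
    """Derive project directories from a module's own location.

    levels_up=3 assumes the module lives at <root>/src/models/<model>/env.py.
    """
    module_path = Path(module_file).resolve()
    # parents[0] is the module's directory, so parents[3] is three levels above it.
    root = module_path.parents[levels_up]
    return {
        "root": root,
        "data_dir": root / "data" / "raw" / "boston_housing_price",
        "models_dir": root / "models" / "boston_housing_price",
    }


# Illustration with a made-up module location under the current directory.
example_module = Path.cwd() / "nn-project" / "src" / "models" / "bird_model" / "env.py"
paths = project_paths(str(example_module))
for name, value in paths.items():
    print(name, value)
```

Inside env.py the call would be project_paths(__file__), so the result no longer depends on the working directory or on a relative value stored in .env.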

Answered By: DeusXMachina