How do I correctly load data in Python?

Question:

I am trying to replicate @miabrahams's ACM model, which is on GitHub here: https://github.com/miabrahams/PricingTermStructure

I am running into two errors relating to how I load data. I’m a Python novice, so I’m sure there is a simple solution, but I can’t figure out how to fix the problem.

The problem is this:

In their file for loading the data, load_gsw.py, they define a function:

def load_gsw(filename: str, n_maturities: int):
    data = pd.read_excel(filename, parse_dates=[0])
    data = data.set_index('Date')
    data = data.resample('BM').last() # Convert to EOM observations
    data.index = data.index + DateOffset() # Add one day
    plot_dates = pd.DatetimeIndex(data.index).to_pydatetime() # pydatetime is best for matplotlib

Leaving the parameter name filename in place yields an error when I run the script, so I am pretty sure that I need to substitute filename with the path to the data they’ve provided. I therefore added a line like this:

filename = '/Users/SystemError/Desktop/Python/gsw_ns_params.xlsx'

def load_gsw(filename: str, n_maturities: int):
    data = pd.read_excel(filename, parse_dates=[0])
    data = data.set_index('Date')
    data = data.resample('BM').last()  # Convert to EOM observations
    data.index = data.index + DateOffset()  # Add one day
    plot_dates = pd.DatetimeIndex(data.index).to_pydatetime()  # pydatetime is best for matplotlib

However, in their script for running the model, PricingTermStructure.ipynb, they call the function described above with a different file path:

from load_gsw import * 

rawYields, plot_dates = load_gsw('data/gsw_ns_params.xlsx', n_maturities)
t = rawYields.shape[0] - 1  # Number of observations

I have tried not defining filename, and also swapping 'data/gsw_ns_params.xlsx' with '/Users/SystemError/Desktop/Python/gsw_ns_params.xlsx', but I keep getting the same error:

  File "/Users/SystemError/Desktop/Python/acm model.py", line 60, in <module>
    rawYields, plot_dates = load_gsw('data/gsw_ns_params.xlsx', n_maturities)

Any idea what I’m doing wrong? Thanks in advance for whatever assistance you can provide!

Asked By: user20140098


Answers:

You have to use it like this:

filename = '/Users/SystemError/Desktop/Python/gsw_ns_params.xlsx'
rawYields, plot_dates = load_gsw(filename, n_maturities)

In the first code block of your example, filename is a function parameter and is local to the function. It is a placeholder that receives whatever value or variable you pass when you call load_gsw.
You have also defined a global variable named filename, so pass it in the function call as shown above. Alternatively, put the path directly in the call, like this:

rawYields, plot_dates = load_gsw('/Users/SystemError/Desktop/Python/gsw_ns_params.xlsx', n_maturities)
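To illustrate the scoping point, here is a minimal sketch. The names load_stub and resolve_data_path are hypothetical stand-ins and are not part of the repository; load_stub only mimics the signature of load_gsw so the example runs without pandas or the Excel file. Since the traceback in the question is cut off, the path check is only a guess at what might help: a relative path such as 'data/gsw_ns_params.xlsx' is resolved against the current working directory, so it can be worth confirming that the file exists before reading it.

```python
from pathlib import Path

# Module-level (global) variable. The function below never reads it directly;
# it only sees it if the caller passes it in as an argument.
filename = "global_value.xlsx"


def load_stub(filename: str, n_maturities: int) -> str:
    # Hypothetical stand-in for load_gsw. Inside this function, 'filename'
    # is the parameter, bound to whatever the caller passed.
    return f"{filename}:{n_maturities}"


def resolve_data_path(path: str) -> Path:
    # Hypothetical helper: fail early with a clear message if the file is
    # not where the (possibly relative) path points.
    p = Path(path).expanduser()
    if not p.is_file():
        raise FileNotFoundError(f"No file at {p.resolve()}")
    return p


# Passing a literal string: the global 'filename' plays no role here.
result_literal = load_stub("data/gsw_ns_params.xlsx", 10)

# Passing the global variable explicitly: now its value is used.
result_global = load_stub(filename, 10)
```

With the real function, the same pattern would be load_gsw(str(resolve_data_path(filename)), n_maturities), assuming the file is at that location.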
Answered By: KazzyJr