Pandas dataframe sum of row won't let me use result in equation
Question:
Can anybody help me understand why the code below doesn't work?
import yfinance as yf
start_date = '1990-01-01'
ticker_list = ['SPY', 'QQQ', 'IWM', 'GLD']
tickers = yf.download(ticker_list, start=start_date)['Close'].dropna()
ticker_vol_share = (tickers.pct_change().rolling(20).std()) / ((tickers.pct_change().rolling(20).std()).sum(axis=1))
Both (tickers.pct_change().rolling(20).std()) and ((tickers.pct_change().rolling(20).std()).sum(axis=1)) run fine by themselves, but when run together they produce a DataFrame with thousands of columns, all filled with NaN.
Answers:
Why it's not working as expected:
- Your tickers object is a DataFrame, as are tickers.pct_change() and tickers.pct_change().rolling(20).std(). The intermediate tickers.pct_change().rolling(20) is a Rolling window object that produces a DataFrame once std() is called. tickers.pct_change().rolling(20).std().sum(axis=1) is a Series with one value per row (date).
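- You're therefore dividing a DataFrame by a Series, which yields a DataFrame. With the / operator, pandas matches the index of the Series (here, the dates) against the column labels of the DataFrame (the tickers), not against its rows. None of those labels match, so the result gets one column per ticker plus one column per date, and every cell is NaN.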
- Without seeing your source data, it's hard to say for sure why every value in the output DataFrame is NaN. The most suspicious part is that the top and bottom of the division have different shapes, since sum(axis=1) collapses an axis. NaN can also appear where the divisor is 0 (0 divided by 0 gives NaN, while a nonzero value divided by 0 gives inf), or where a series has fewer observations than the rolling window, so the rolling standard deviation is never defined. It is also worth checking that tickers really is a DataFrame rather than a Series, since Series.sum(axis=1) is not valid: a Series has only one axis.
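To make the shape problem concrete, here is a minimal sketch that reproduces it on a tiny, made-up DataFrame. The dates and values are hypothetical stand-ins for the real rolling-volatility data, and pandas may emit a sort-order warning because ticker strings and timestamps cannot be compared.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the rolling-volatility DataFrame (not real market data).
dates = pd.date_range("2020-01-01", periods=5, freq="D")
vol = pd.DataFrame(
    np.arange(1.0, 11.0).reshape(5, 2),
    index=dates,
    columns=["SPY", "QQQ"],
)

# Summing across columns gives a Series indexed by date, one value per row.
row_total = vol.sum(axis=1)

# The / operator matches the Series index (dates) against the DataFrame's
# column labels (tickers). No labels match, so the result has the union of
# both label sets as columns and contains only NaN.
bad = vol / row_total

print(vol.shape)        # (5, 2)
print(row_total.shape)  # (5,)
print(bad.shape)        # (5, 7): 2 ticker columns + 5 date columns
print(bad.isna().all().all())  # True
```

With decades of daily prices, the same mechanism would produce thousands of date-labelled columns, which matches the symptom described in the question.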
It’s not clear to me what your expected output is, so I’ll defer to others or wait for an update before answering that part.
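If the goal is each ticker's share of the total rolling volatility on each date (an assumption, since the question doesn't state the expected output), one possible sketch is to divide row-wise with DataFrame.div and axis=0. Synthetic prices are used here in place of the yf.download call so the example runs offline; the variable vol is introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic prices standing in for yf.download(...)['Close'] (hypothetical data).
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=60, freq="B")
tickers = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=(60, 4)), axis=0)),
    index=dates,
    columns=["SPY", "QQQ", "IWM", "GLD"],
)

# Compute the rolling volatility once and reuse it.
vol = tickers.pct_change().rolling(20).std()

# axis=0 aligns the row totals with the DataFrame's index (the dates),
# so each row is divided by its own sum.
ticker_vol_share = vol.div(vol.sum(axis=1), axis=0)

# Rows where the 20-observation window is not yet full stay NaN;
# every other row sums to 1 across the tickers.
print(ticker_vol_share.dropna().sum(axis=1).head())
```

The result keeps the original four ticker columns, and the leading NaN rows can be removed with dropna() if they are not needed.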