NumPy – How to get an array of the pattern gamma^t for t = 0, 1, 2, ...?

Question:

I am creating a basic gridworld RL problem and I need to calculate the return for some given episode. I currently have the array of rewards, and I would like to element-wise multiply this with a list of the form:

[gamma**0, gamma**1, gamma**2, ....]

In order to get:

[r_0*gamma**0, r_1*gamma**1, r_2*gamma**2, ....]

and then use np.sum() to get the entire return.

How can I complete that first step? I tried using np.logspace, but it isn’t quite what I want (or I’m doing it wrong).

Asked By: Alenna Spiro


Answers:

For example, given a reward array and some value of gamma:

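import numpy as np

n = 20
reward = np.random.randint(0, 10, n)  # example rewards, one per time step
gamma = 2  # any value works here; a discount factor is usually between 0 and 1

# gamma ** np.arange(n) is [gamma**0, gamma**1, ..., gamma**(n-1)]
np.sum(reward * (gamma ** np.arange(n)))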
Answered By: amirhm
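
The same idea can be wrapped in a small helper. This is only a minimal sketch: the function name discounted_return and the sample rewards and gamma are hypothetical, chosen for illustration. It also shows how np.logspace, which the question mentions, can produce the same factors when the exponents run from 0 to n-1 and base is set to gamma.

```python
import numpy as np

def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for one episode (hypothetical helper)."""
    rewards = np.asarray(rewards, dtype=float)
    # Discount factors [gamma**0, gamma**1, ..., gamma**(n-1)].
    discounts = gamma ** np.arange(len(rewards))
    # A dot product is the element-wise product followed by a sum.
    return float(np.dot(rewards, discounts))

rewards = [1.0, 0.0, 2.0, 3.0]  # hypothetical episode rewards
gamma = 0.5                     # hypothetical discount factor

# 1*1 + 0*0.5 + 2*0.25 + 3*0.125 = 1.875
print(discounted_return(rewards, gamma))

# np.logspace(start, stop, num, base) returns base ** linspace(start, stop, num),
# so exponents 0..n-1 with base=gamma give the same discount factors.
n = len(rewards)
same = np.logspace(0, n - 1, num=n, base=gamma)
print(np.allclose(same, gamma ** np.arange(n)))
```

Using np.dot avoids building the intermediate product array, although for typical episode lengths the difference from np.sum(reward * discounts) is negligible.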