Python: slicing a multi-dimensional array

Question:

I know how to slice a 1-dimensional sequence: arr[start:end], and access an element in the array: el = arr[row][col].

Now, I’m trying something like slice = arr[0:2][0:2] (where arr is a numpy array) but it doesn’t give me the first 2 rows and columns, but repeats the first 2 rows. What did I just do, and how do I slice along another dimension?

Asked By: SlightlyCuban


Answers:

If you use numpy, this is easy:

slice = arr[:2,:2]

or if you want the 0’s,

slice = arr[0:2,0:2]

You’ll get the same result.

*Note that slice is actually the name of a built-in type. Generally, I would advise giving your object a different “name”.


Another way, if you’re working with lists of lists*:

slice = [arr[i][0:2] for i in range(0,2)]

(Note that the 0’s here are unnecessary: [arr[i][:2] for i in range(2)] would also work.)

Here I take each desired row one at a time (arr[i]), slice the columns I want out of that row, and add the result to the list that I’m building.

If you naively try arr[0:2], you get the first 2 rows. If you then slice that result again, as in arr[0:2][0:2], you’re just slicing the first two rows over again.

*This actually works for numpy arrays too, but it will be slow compared to the “native” solution I posted above.
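
As a minimal sketch of the list-of-lists case described above (the 4×4 grid and the variable names are hypothetical, chosen only for illustration), the difference between chained slicing and per-row slicing looks like this:

```python
# Hypothetical 4x4 list of lists, used only for illustration.
grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

# Chained slicing: the second [0:2] slices the list of rows again,
# so the columns are left untouched.
chained = grid[0:2][0:2]

# Per-row slicing: take the first two rows, then the first two
# columns out of each of those rows.
top_left = [row[:2] for row in grid[:2]]

print(chained)   # [[1, 2, 3, 4], [5, 6, 7, 8]]
print(top_left)  # [[1, 2], [5, 6]]
```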

Answered By: mgilson

To slice a multi-dimensional array, the dimension (i.e. axis) must be specified. As OP noted, arr[0:2][0:2] gives the same result as arr[0:2]: arr[0:2] slices along the first axis (rows) and has the same number of dimensions as arr (you can confirm with arr[0:2].ndim == arr.ndim), so the second slice is again applied along the first axis rather than the second. To slice along the second dimension, it must be explicitly specified, e.g.:

arr[:2][:, :2]                   # its output is the same as `arr[:2, :2]`

A bare : means slice everything in that axis, so there’s an implicit : for the second axis in the above code (i.e. arr[:2, :][:, :2]). What the above code does is slice the first two rows (or the first two arrays along the first axis) and then slice the first two columns (or the first two arrays along the second axis) from the resulting array.
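
A small check of the behaviour described above, assuming a hypothetical 4×4 example array (the variable names are illustrative only):

```python
import numpy as np

# Hypothetical 4x4 example array holding the values 1..16.
arr = np.arange(1, 17).reshape(4, 4)

rows_only = arr[:2]        # first two rows, still 2-dimensional
chained = arr[:2][:2]      # second slice acts on the rows again
two_step = arr[:2][:, :2]  # rows first, then columns explicitly
one_step = arr[:2, :2]     # rows and columns in one indexing step

print(rows_only.ndim == arr.ndim)          # True
print(np.array_equal(chained, rows_only))  # True
print(np.array_equal(two_step, one_step))  # True
print(one_step.tolist())                   # [[1, 2], [5, 6]]
```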

An ... can be used instead of multiple colons (:), so for a general n-dimensional array, the following produce the same output:

w = arr[i:j, m:n]
x = arr[i:j, m:n, ...]
y = arr[i:j][:, m:n]
z = arr[i:j, ...][:, m:n, ...]
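
To see these four forms agree on something with more than two axes, here is a sketch on a hypothetical 3-dimensional array; the shape and the values of i, j, m, n are arbitrary choices for illustration:

```python
import numpy as np

# Hypothetical 3-D array; shape and bounds are chosen only for illustration.
arr = np.arange(60).reshape(3, 4, 5)
i, j, m, n = 0, 2, 1, 3

w = arr[i:j, m:n]                # trailing axes are taken in full implicitly
x = arr[i:j, m:n, ...]           # same thing with an explicit Ellipsis
y = arr[i:j][:, m:n]             # chained: first axis, then second axis
z = arr[i:j, ...][:, m:n, ...]   # chained, with explicit Ellipsis

print(w.shape)  # (2, 2, 5)
print(all(np.array_equal(w, other) for other in (x, y, z)))  # True
```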

That said, arr[:2, :2] is the canonical way: arr[i:j][:, i:j] first creates a temporary array, arr[i:j], which is then indexed by [:, i:j], so it’s comparatively inefficient.
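
For plain slices the temporary is a view onto the same data rather than a copy, so the extra cost is mainly building one more array object. A rough sketch of how to check this, assuming a hypothetical 4×4 array (names are illustrative):

```python
import numpy as np

# Hypothetical 4x4 example array.
arr = np.arange(1, 17).reshape(4, 4)

intermediate = arr[:2]         # the temporary array created by chaining
chained = intermediate[:, :2]  # second indexing step on the temporary
direct = arr[:2, :2]           # single indexing step, no temporary

print(np.shares_memory(arr, intermediate))  # True
print(np.shares_memory(arr, chained))       # True
print(np.array_equal(chained, direct))      # True
```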

However, there are cases where chained indexing makes sense (or is more readable), e.g., if you want to index a multi-dimensional array using lists of indices. For example, if you want the top-left quarter of a 4×4 array using lists of indices, chained indexing gives the expected result, whereas a single indexing operation gives a different one: because of numpy advanced indexing, the index lists are paired element by element, so each returned value corresponds to one (row, column) index pair.

import numpy as np

arr = np.arange(1, 17).reshape(4, 4)
rows = cols = [0, 1]
arr[rows][:, cols]               # <--- correct output (2x2 block)
arr[rows, cols]                  # <--- wrong output (only arr[0, 0] and arr[1, 1])
arr[[[e] for e in rows], cols]   # <--- correct output
arr[np.ix_(rows, cols)]          # <--- correct output
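
For reference, a sketch that prints what these expressions return on the same 4×4 array (the result variable names are hypothetical):

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)
rows = cols = [0, 1]

chained = arr[rows][:, cols]       # rows first, then columns
paired = arr[rows, cols]           # pairs indices: arr[0, 0] and arr[1, 1]
meshed = arr[np.ix_(rows, cols)]   # open mesh over rows x cols

print(chained.tolist())  # [[1, 2], [5, 6]]
print(paired.tolist())   # [1, 6]
print(meshed.tolist())   # [[1, 2], [5, 6]]
```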
Answered By: cottontail