Why does using an array as an index change the shape of a multidimensional ndarray?

As @hpaulj mentioned in the comments, this behaviour is because of mixing basic slicing and advanced indexing:

import numpy as np
a = np.arange(120).reshape(4,5,3,2)   # a.shape is (4, 5, 3, 2)
b = a[:,[1,2,3,4,0],:,0]              # b.shape is (5, 4, 3)

In the above code snippet, what happens is the following:

  • the integer 0 along the last dimension selects a single element there, so that dimension is gone (i.e. no singleton dimension is left). Because an index list is also present, NumPy treats this integer as an advanced index too and broadcasts it together with the list.
  • [1,2,3,4,0] selects 5 entries from the second dimension. The two advanced indices (the list and the 0) are separated by a slice, so there is no single obvious place for the resulting length-5 dimension: it could go where the first advanced index was or where the last one was. NumPy resolves this by putting it first. This is why you get 5 in the first position of the returned shape tuple, i.e. (5, ...). Jaime explained this in one of the PyCon talks, if I recall correctly.

  • Along the first and third dimensions, since you slice everything using :, the original lengths (4 and 3) are retained.
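
One way to check how the list and the trailing 0 interact is to compare each entry along the leading axis of b with a plain integer-indexed slab of a. This is only a sketch; the names idx and zeros are introduced here for illustration and are not from the original question.

```python
import numpy as np

a = np.arange(120).reshape(4, 5, 3, 2)
idx = [1, 2, 3, 4, 0]  # illustrative name for the list index
b = a[:, idx, :, 0]

# The list and the scalar 0 are broadcast together to shape (5,),
# so b[i] pairs idx[i] on axis 1 with 0 on axis 3.
for i, j in enumerate(idx):
    assert np.array_equal(b[i], a[:, j, :, 0])

# Writing the scalar as an explicit array of zeros gives the same result.
zeros = np.zeros(len(idx), dtype=int)
assert np.array_equal(b, a[:, idx, :, zeros])

print(b.shape)  # (5, 4, 3)
```

Each b[i] is a (4, 3) slab, which matches the two sliced dimensions following the broadcast one.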

Putting all these together, NumPy returns the shape tuple as: (5, 4, 3)
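
For contrast, here is a rough sketch of two variations in which the length-5 axis stays in the second position. The variable names (idx, c, d) are illustrative; the point is only that the axis moves to the front when the list is combined with the integer across a slice.

```python
import numpy as np

a = np.arange(120).reshape(4, 5, 3, 2)
idx = [1, 2, 3, 4, 0]

# List and integer separated by a slice: the broadcast axis moves to the front.
b = a[:, idx, :, 0]
print(b.shape)  # (5, 4, 3)

# Only one advanced index (no integer alongside it): the axis stays in place.
c = a[:, idx, :, :]
print(c.shape)  # (4, 5, 3, 2)

# Index in two steps so the integer is never combined with the list.
d = a[..., 0][:, idx]
print(d.shape)  # (4, 5, 3)

# d holds the same data as b, with the first two axes swapped.
assert np.array_equal(d, b.transpose(1, 0, 2))
```

If the (4, 5, 3) layout is what you want, the two-step form or a transpose of b should both work; which one reads better is a matter of taste.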

You can read more about it at numpy-indexing-ambiguity-in-3d-arrays and arrays.indexing#combining-advanced-and-basic-indexing
