A Bug in Numpy? | Strawberry Fields

When I was working with high-dimensional NumPy array manipulations to process road agent interaction data, I encountered some unexpected behavior: for arrays with more than three dimensions, certain slicing operations seemed to cause axis reordering, resulting in downstream shape mismatch errors. For example:

>>> x = np.random.random((1,24,5,6))

# trying to squeeze first dimension and select 2-6 from last dimension
>>> y = x[0, :, np.arange(5), 2:6]
# but surprisingly it does not return (24, 5, 4)!
>>> y.shape
(5, 24, 4)
>>> x.shape
(1, 24, 5, 6)
# expected shape after transpose the result
>>> y = np.transpose(o[0, :, np.arange(5), 2:6], (1,0,2))
>>> y.shape
(24, 5, 4)

This inadvertent axis flip looks very suspicious, and I would not let it slip without finding a reasonable explanation. As a mature, well-maintained, and widely-used library for basic computation use, NumPy is unlikely to have committed such a simple mistake, so there must be a reason behind this phenomenon, or to put it more professionally, to make this deliberate design choice. Time for some digging in.

As I play around more with even higher dimension of indexing, I found it’s not just the axes that got flipped, but even the expected shape might be different:

>>> x = np.arange(48).reshape(3, 4, 4)  # shape: (3, 4, 4)

# Expecting 3 x 4 x 2
>>> y = x[:, [[0,1],[2,3]], [0,1]]
# but instead, got 3 x 2 x 2
>>> y.shape
(3, 2, 2)

so why does this happen? To investigate further, I tried following:

>>> w = np.arange(1080).reshape((3,12,6,5))
>>> u = w[0, :, :5, :4]
>>> v = w[0, :, range(5), :4]
>>> q = w[0, :, :5, range(4)]
>>> t = w[0, :, range(5), range(4)]

guess what are the shapes?

>>> u.shape
>>> (12, 5, 4)
>>> v.shape
>>> (5, 12, 4)
>>> q.shape
>>> (4, 12, 5)
>>> t.shape # raises IndexError due to (5,), (4,) shape mismatch

Only u returned the expected shape. Both v and q reordered the axes, and t failed due to incompatible broadcasting. Here we can see some common patterns: seems like the way we slice the array makes a difference. Whenever I use range and [], the output shapes are not as expected. It seems like such slicing method will always be placed at the front of all axes!

With a bit of confirmation from NumPy documentation, I found following pattern:

There are two types of indexing: 1. basic indexing, including single element indexing (integer value index), slicing and striding (using start:stop:step syntax), and 2. advanced indexing, including array and range indexing. Using range or indexing array triggers the advanced indexing.
When both basic and advanced indexing are used, NumPy groups all advanced indices, broadcasts them together, and places their resulting shape at the front of the result.

This corresponds to the shapes of u, v, q . t has shape mismatch because NumPy cannot broadcast shape (5,) and (4,) together. So in the case of y = x[:, [[0,1],[2,3]],[0,1]], it works as follows:

the array contains two instances of advanced indexing, [[0,1],[2,3]] with shape (2,2) and [0,1] with shape (2,), which together broadcast to (2,2). The basic axis 0 is then appended in the end, resulting in the shape of (2,2,3). But wait—what does broadcasting index mean? It is much more intuitive than it appears: creating a grid of index tuples, just like np.meshgrid. In this case, we are combining the two advanced indices to construct one array of index pairs which in turn just selects [[x_[0, 0], x_[1, 1]], [x_[2, 0], x_[3, 1]]] and stack them along the original axis 0.

But why? The core motivation is to eliminate ambiguity. Consider an array x with shape (A, B, C, D). Suppose we want to slice it by x[:, ind1, :, ind2], where:

ind1.shape == (M,)  # for axis 1
ind2.shape == (N,)  # for axis 3

Should this return a shape (A, M, C, N)? Seems natural, until:

ind1.shape = (M, 1)
ind2.shape = (1, N)

These now broadcast to shape (M, N). But once broadcasted, the origin of each dimension (which axis it was meant for) is lost. We have a (M, N) block of coordinate pairs, and we can no longer assign one dimension to axis 1 and the other to axis 3. There is no clean rule to do so — especially for non-adjacent axes.

In order to eliminate this ambiguity, NumPy strictly enforces the rule of appending the original basic indexed dimensions after the processed advanced indexing:

When the advanced indices are separated by a slice, Ellipsis or newaxis. For example x[arr1, :, arr2]. The dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that.

When the advanced indices are all next to each other. For example x[…, arr1, arr2, :] but not x[arr1, :, 1]. The dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array (the latter logic is what makes simple advanced indexing behave just like slicing).

Understanding this rule is critical when working with high-dimensional array manipulations. With this knowledge, we can reshape or transpose the result as needed to get the desired layout.

Enjoy Reading This Article?