Reshaping Arrays: How the NumPy Reshape Operation Works

The NumPy reshape operation returns an array that holds the same data in a new, compatible shape. The rules are:

The number of elements stays the same.
The order of the elements stays the same[1].

Basic usage

Here’s a simple example that takes a 4×4 identity matrix and turns it into an array with shape (2, 8):

import numpy as np

x = np.eye(4)
x.reshape(2, 8)

# Expected result
# array([[1., 0., 0., 0., 0., 1., 0., 0.],
#        [0., 0., 1., 0., 0., 0., 0., 1.]])

To understand what’s going on, you can think of reshape as 1) flattening the input array and 2) filling an output array of the new shape with those elements, in order. That means the above operation is equivalent to:

np.reshape(x.ravel(), (2, 8))

# Expected result
# array([[1., 0., 0., 0., 0., 1., 0., 0.],
#        [0., 0., 1., 0., 0., 0., 0., 1.]])

Notice a couple things:

The number of elements doesn’t change.
Elements are selected in row-major order: left to right, top to bottom[2].
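
The flatten-then-fill mental model can be spelled out as a short loop. This is only an illustrative sketch: manual_reshape is a hypothetical helper written for this post, not part of NumPy, and the real implementation does not loop in Python.

```python
import numpy as np

def manual_reshape(a, shape):
    """Hypothetical helper: mimic reshape by flattening, then refilling."""
    flat = np.asarray(a).ravel()              # step 1: flatten in row-major order
    out = np.empty(shape, dtype=flat.dtype)   # step 2: allocate the output shape
    for i, value in enumerate(flat):
        # unravel_index converts flat position i into a row-major multi-index
        out[np.unravel_index(i, shape)] = value
    return out

x = np.eye(4)
result = manual_reshape(x, (2, 8))
print(np.array_equal(result, x.reshape(2, 8)))  # True
```

Because both steps walk the elements in row-major order, the result matches what .reshape() returns.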

Knowing that the order won’t change is important because it means you can go backwards. No matter what shape you convert an array to, you can always recover the original array by reshaping the new array back to the original shape:

np.arange(9).reshape(1, 3, 1, 3, 1).reshape(9)

# Expected result
# array([0, 1, 2, 3, 4, 5, 6, 7, 8])

API

The reshape operation can be used as a top-level function np.reshape() or as a method on an array .reshape(). Let’s start by looking at the function.

You should pay attention to two arguments for np.reshape(): an array-like input, called a, and an integer or tuple of integers specifying the output shape, called newshape (NumPy 2.1 renamed this parameter to shape, but passing it positionally works either way).

Remember, we can’t change the number of elements in the output array, so the product of newshape has to be the same as the total number of elements in a (with one exception, described under “Unknown size” below). Otherwise, you’ll get a ValueError.
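
As a quick check of that rule, here is a minimal sketch that asks for a shape with the wrong number of elements. The variable name raised is just for illustration.

```python
import numpy as np

x = np.eye(4)  # 16 elements

raised = False
try:
    np.reshape(x, (3, 5))  # 3 * 5 = 15, which does not match 16
except ValueError as err:
    raised = True
    print(err)  # NumPy reports the size mismatch

print(raised)  # True
```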

Unknown size

The exception to this rule is that one of the dimensions in newshape can be -1, which roughly means “unknown size”[3]. As long as the rest of the newshape tuple is set, NumPy can figure out the unknown dimension since the total number of elements can’t change.

For example:

np.reshape(x, (8, -1))

# Expected result
# array([[1., 0.],
#        [0., 0.],
#        [0., 1.],
#        [0., 0.],
#        [0., 0.],
#        [1., 0.],
#        [0., 0.],
#        [0., 1.]])

Here, we tell NumPy that the first axis should have size 8 and we want to fill in the next axis with whatever size we need, which in this case is 2.
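
The inference only works when the answer is unambiguous. The sketch below, using an illustrative errors list, shows the inferred size along with two cases that I would expect to fail: known dimensions that don’t divide the element count, and more than one -1.

```python
import numpy as np

x = np.eye(4)  # 16 elements

# NumPy infers the missing dimension as 16 // 8 = 2.
inferred = np.reshape(x, (8, -1)).shape
print(inferred)  # (8, 2)

errors = []

# The known dimensions must divide the element count evenly.
try:
    np.reshape(x, (5, -1))  # 16 is not divisible by 5
except ValueError as err:
    errors.append(str(err))

# Only one dimension may be left unknown.
try:
    np.reshape(x, (-1, -1))
except ValueError as err:
    errors.append(str(err))

print(len(errors))  # 2
```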

Method arguments

When you call .reshape() as a method on an array, the first argument is implicitly self and you only pass in the new shape. One usage quirk to be aware of is that the new shape can be passed in as separate arguments:

x.reshape(-1, 8)

# Expected result
# array([[1., 0., 0., 0., 0., 1., 0., 0.],
#        [0., 0., 1., 0., 0., 0., 0., 1.]])

This is a nice shortcut, but it’s not a requirement. You can pass in newshape as a tuple to the method as well (e.g. .reshape((-1, 8))).
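
To confirm that the calling styles agree, here is a small comparison. The names a, b and c are placeholders for this example only.

```python
import numpy as np

x = np.eye(4)

a = x.reshape(-1, 8)        # method: shape as separate arguments
b = x.reshape((-1, 8))      # method: shape as a tuple
c = np.reshape(x, (-1, 8))  # function: shape must be a single argument

all_equal = np.array_equal(a, b) and np.array_equal(b, c)
print(all_equal)  # True
```

The separate-arguments shortcut applies only to the method. The function form takes the whole shape as one argument.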

Memory

There’s one important gotcha to watch out for when reshaping an array. Whenever possible, NumPy will create a new view of the array with the requested shape instead of copying all of the elements. This is analogous to using .ravel() to flatten an array instead of .flatten().

The benefit of returning a view is that the operation is faster and more memory-efficient. The catch is that modifying the new array in place also updates the original:

x = np.eye(4)
y = x.reshape(-1)
y += 1
x

# Expected result
# array([[2., 1., 1., 1.],
#        [1., 2., 1., 1.],
#        [1., 1., 2., 1.],
#        [1., 1., 1., 2.]])

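If the elements aren’t laid out in memory in a way that is compatible with the requested shape, as with the transposed array below, .reshape() can’t return a view. Instead, it silently makes a copy, so in-place changes don’t reach the original: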

x = np.eye(4)
y = x.T.reshape(-1)
y += 1
x

# Expected result
# array([[1., 0., 0., 0.],
#        [0., 1., 0., 0.],
#        [0., 0., 1., 0.],
#        [0., 0., 0., 1.]])

This silent difference can cause sneaky bugs if you’re not paying attention. So if you need to know whether you have a view or a copy, you can use the np.shares_memory() function:

x = np.eye(4)
y1 = x.reshape(-1)
y2 = x.T.reshape(-1)

# y1 is a view
np.shares_memory(x, y1)

# Expected result
# True

# y2 is a copy
np.shares_memory(x, y2)

# Expected result
# False
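
If you would rather not depend on whether you got a view or a copy, one defensive option is to copy explicitly. This is a sketch of that pattern, not something the reshape API requires.

```python
import numpy as np

x = np.eye(4)

# An explicit copy guarantees that y never aliases x,
# regardless of whether reshape returned a view.
y = x.reshape(-1).copy()
y += 1

print(np.shares_memory(x, y))  # False
print(x.sum())                 # 4.0, so x is unchanged
```

The trade-off is the extra memory and time for the copy, which is exactly what the view was avoiding.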

Example

A common use case for reshaping an array (or a tensor if you’re working in one of the deep learning frameworks) is flattening the output of the last convolutional layer in an image classifier so that you can send it to a linear layer. Fortunately, TensorFlow and PyTorch both have a top-level reshape() function with nearly the same API as NumPy.

Say you have a tensor with shape (batch_size, n_row, n_col, n_features) and you want it to have shape (batch_size, n_row * n_col * n_features). Just use the reshape operation:

import torch

x = torch.randn((4, 10, 10, 3))
torch.reshape(x, (4, -1)).shape

# Expected result
# torch.Size([4, 300])

The code is almost identical in TensorFlow. Keras also provides Reshape and Flatten layers in case you don’t want to mess with the actual operation.
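
For comparison, the same flattening can be sketched in plain NumPy. The names features and flat are placeholders, and the zeros array stands in for real layer output. Reading the batch size from the array avoids hard-coding it.

```python
import numpy as np

# Stand-in for a batch of feature maps:
# (batch_size, n_row, n_col, n_features)
features = np.zeros((4, 10, 10, 3))

# Keep the batch axis and let -1 absorb everything else.
flat = features.reshape(features.shape[0], -1)
print(flat.shape)  # (4, 300)
```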

Notes

[1] Ok, this is not strictly true. The actual rule is: the order will be whatever you specify, which defaults to order='C', which means “C-like” order a.k.a. row-major order. If you have a very good reason to change the array ordering in NumPy, then do so carefully. Otherwise, avoid it. The TensorFlow and PyTorch reshape operations (tf.reshape() and torch.reshape() respectively) don’t expose this parameter. In NumPy, I follow this convention and ignore it.
[2] More generally, the last index changes the fastest. Since a matrix has shape (n_rows, n_cols), the column index changes first. If you have an array with shape (a, b, c), the c axis will change first, followed by b then a.
[3] The use of -1 to mean “unknown size” is a bit of overloading. Typically, in the context of slicing and indexing, -1 means “the last element”.
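
To make the ordering notes concrete, here is a small sketch comparing the default order='C' with order='F' on the same six elements. The variable names are illustrative.

```python
import numpy as np

a = np.arange(6)

c_order = a.reshape((2, 3))             # default order='C': last index changes fastest
f_order = a.reshape((2, 3), order='F')  # Fortran-like: first index changes fastest

print(c_order)
# [[0 1 2]
#  [3 4 5]]
print(f_order)
# [[0 2 4]
#  [1 3 5]]

# With three axes in C order, the last axis still changes first.
last_axis = np.arange(24).reshape(2, 3, 4)[0, 0]
print(last_axis)  # [0 1 2 3]
```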