The meshgrid operation creates multi-dimensional coordinate arrays:
import numpy as np x = np.arange(2) y = np.arange(3) ii, jj = np.meshgrid(x, y, indexing='ij') ii # Expected result # array([[0, 0, 0], # [1, 1, 1]])
np.meshgrid() returns a tuple with the inputs broadcast across rows and columns (and higher dimensions if you pass more vectors in). Here’s what the
jj array looks like:
jj # Expected result # array([[0, 1, 2], # [0, 1, 2]])
I’ll talk about the
indexing='ij' argument below. But first, let’s take a look at an example.
A realistic use case
Ok, let’s look at an example. Say we have a simple linear regression model
y = w0 + w1 * x where
w0 is the intercept and
w1 is the slope. If
x is a 1-dimensional NumPy array of inputs and
y is a 1-dimensional NumPy array of targets, we can calculate the mean-squared error (MSE) loss for a given
w1 with the following:
yhat = w0 + w1 * x loss = np.square(yhat - y).mean()
Notice, if our data is fixed, then the loss becomes a function of
w1. That means we can plot the loss surface to see how it varies with
w1. This is a perfect use case for
First, let’s fix
w1=2.5 and then generate 50 data points using this model plus some random noise:
x = np.random.normal(0, 1, size=50) y = 3 + 2.5 * x + np.random.normal(0, 1, size=50)
A scatter plot of these points points will look something like this. You can see visually that the y-intercept is roughly 3 and the slope is roughly 2.5.
Now we want to create all possible values of
w1 in the range
[-5, 5] so we can compute the loss value for those pairs:
w0, w1 = np.meshgrid( np.linspace(-5, 5), np.linspace(-5, 5), indexing='ij', ) w0.shape # Expected result # (50, 50)
At this point
w1 both have shape
(50, 50). That’s because
np.linspace() creates a 50 evenly spaced values by default. If we flatten
w1 and line up the elements, we get 2500 evenly spaced points on a grid.
Let’s take a look at the first 10 points:
list(zip(w0.ravel(), w1.ravel()))[:10] # Expected result # [(-5.0, -5.0), # (-5.0, -4.795918367346939), # (-5.0, -4.591836734693878), # (-5.0, -4.387755102040816), # (-5.0, -4.183673469387755), # (-5.0, -3.979591836734694), # (-5.0, -3.7755102040816326), # (-5.0, -3.571428571428571), # (-5.0, -3.36734693877551), # (-5.0, -3.163265306122449)]
Now, we can compute the model for each
(w0, w1) pair for all 50 random data points. That allows us to compute MSE for each of the weight values:
yhat = w0[:, None] + w1[:, None] * x loss = np.square(y - yhat).mean(-1)
Here we expand the
w1 arrays so that we can use NumPy broadcasting to compute
yhat for all points at the same time. The result
yhat has shape
(2500, 50): it’s the predicted value for all pairs of
(w0, w1) for all 50 points. After that, we use the mean function to reduce the column dimension and compute MSE for each point on the grid.
Now, we can reshape loss to be
(50, 50) and plot loss for each pair of weights:
loss = loss.reshape(w0.shape) plt.contourf(w0, w1, loss, 30) plt.plot(3, 2.5, 'x', c='white', ms=7) plt.xlabel("w0") plt.ylabel("w1") plt.title("Loss surface");
That’s a nice looking loss surface! I’ve added a mark at the true values for
w1. The loss surface will never have its optimum at exactly the true values since it’s learning the function on a random sample of all data, but this is pretty close!
Ok let’s revisit indexing. The
indexing='ij' argument tells NumPy to use matrix indexing instead of Cartesian indexing. For 2-dimensional meshgrids, matrix indexing produces coordinate arrays that are the transpose of Cartesian coordinate arrays. The question is whether those coordinate arrays should be interpreted as the indices of a matrix (matrix indexing) or points in Euclidean space (Cartesian indexing).
Here’s a diagram that displays the index pairs for
Notice for Cartesian indexing, the x-axis increases as your move “to the right” and the y-axis increases as you move “down”.
Other tensor libraries
In NumPy, you actually don’t need meshgrid() but it’s good to know it because the same function exists in PyTorch and TensorFlow. Although the API is more limited in both cases (and the default indexing for PyTorch is different) the basic usage is roughly same:
import torch x = torch.arange(2) y = torch.arange(2) xx, yy = torch.meshgrid(x, y) xx # Expected result # tensor([[0, 0], # [1, 1]])
import tensorflow as tf x = tf.range(2) y = tf.range(2) xx, yy = tf.meshgrid(x, y, indexing='ij') xx # Expected result # <tf.Tensor: shape=(2, 2), dtype=int32, numpy= # array([[0, 0], # [1, 1]], dtype=int32)>
One thing to watch out for: in NumPy and TensorFlow, the default indexing is Cartesian, whereas in PyTorch, the default indexing is matrix. If you want to avoid confusion, you can plan to set
indexing='ij' whenever you call
meshgrid() in NumPy or TensorFlow. You probably don’t want to force yourself to remember which indexing gets returned by default.