Array shapes and broadcasting

Shapes of numpy arrays

The shape of a numpy array describes the range of its indices in each dimension: how many rows it has, how many columns, etc. It can be obtained using the shape attribute:

[31]:
import numpy as np

a = np.zeros((3, 4))
a.shape
[31]:
(3, 4)

We can change the shape of an array using the reshape() method:

[32]:
b = np.arange(12)
b
[32]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
[33]:
# reshape into 3 rows, 4 columns
c = b.reshape(3, 4)
c
[33]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
[34]:
# reshape into 12 rows, 1 column
d = b.reshape(12, 1)
d
[34]:
array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11]])
[35]:
# reshape into 1 row, 12 columns
e = b.reshape(1, 12)
e
[35]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]])

Note. The array e is different from the original array b, even though both hold the same values. The array b is 1-dimensional, so we need only one index to access its elements:

[36]:
b.shape
[36]:
(12,)
[37]:
b[3]
[37]:
3

On the other hand e is a 2-dimensional array with 1 row and 12 columns, and we need two indices to specify its elements:

[38]:
e.shape
[38]:
(1, 12)
[39]:
e[0, 5]
[39]:
5
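As a further illustration of reshape(), the sketch below shows two details that follow from the examples above: the new shape must contain the same total number of elements as the original array, and one dimension may be given as -1, in which case numpy infers it. The variable names m and reshape_failed are used only for this illustration.

```python
import numpy as np

b = np.arange(12)

# -1 asks numpy to infer this dimension from the total number of elements
m = b.reshape(3, -1)
print(m.shape)   # (3, 4)

# the total number of elements must stay the same: 12 != 5 * 3
try:
    b.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print(f"ValueError: {err}")
```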

The flatten() method returns a copy of an array flattened to one dimension:

[40]:
# create an array with 4 rows and 3 columns
rng = np.random.default_rng(0)
a = rng.integers(0, 100, 12).reshape(4, 3)
a
[40]:
array([[85, 63, 51],
       [26, 30,  4],
       [ 7,  1, 17],
       [81, 64, 91]])
[41]:
# flatten the array
a.flatten()
[41]:
array([85, 63, 51, 26, 30,  4,  7,  1, 17, 81, 64, 91])
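Since flatten() returns a copy, changing the flattened array leaves the original untouched. A minimal check, using a small array built with np.arange for illustration (the name flat is not from the text above):

```python
import numpy as np

a = np.arange(12).reshape(4, 3)

flat = a.flatten()
flat[0] = -1                       # modify the flattened copy only

print(flat.shape)                  # (12,)
print(a[0, 0])                     # 0, the original array is unchanged
print(np.shares_memory(a, flat))   # False, the two arrays do not share data
```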

Broadcasting

Broadcasting is a feature of numpy arrays which lets us perform arithmetic operations involving arrays of different shapes. In its simplest instance it lets us take the sum of an array and a number:

[42]:
a = np.arange(12).reshape(3, 4)
a
[42]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
[43]:
a + 10
[43]:
array([[10, 11, 12, 13],
       [14, 15, 16, 17],
       [18, 19, 20, 21]])

We can also add a 2-dimensional array a and a 1-dimensional array b:

[46]:
b = np.array([100, 200, 300, 400])

print(f"a =\n{a}\n")
print(f"b =\n{b}")
a =
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

b =
[100 200 300 400]
[47]:
a + b
[47]:
array([[100, 201, 302, 403],
       [104, 205, 306, 407],
       [108, 209, 310, 411]])

The above example can be thought of as stretching the array b to the same shape as a by replicating the row of b, and then adding corresponding elements of the resulting 2-dimensional arrays.
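The stretching described above can be made visible with np.broadcast_to(), which returns a read-only view of an array broadcast to a given shape. This is only a sketch of the idea: numpy does not need to build the stretched array in memory when it evaluates a + b. The name b_rows is chosen for this illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.array([100, 200, 300, 400])

# view of b with its single row repeated to match the shape of a
b_rows = np.broadcast_to(b, a.shape)
print(b_rows)

# adding the stretched array gives the same values as a + b
print(np.array_equal(a + b_rows, a + b))   # True
```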

Not all pairs of arrays are compatible for broadcasting. For example, adding a 3x4 array and a 2x4 array fails:

[48]:
c = 10 * np.arange(8).reshape(2, 4)

print(f"a=\n{a}\n")
print(f"c=\n{c}")
a=
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

c=
[[ 0 10 20 30]
 [40 50 60 70]]
[49]:
a + c
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/var/folders/vd/9gpvwb493r52y4sgtl_fvtvm0000gn/T/ipykernel_10806/2107169428.py in <module>
----> 1 a + c

ValueError: operands could not be broadcast together with shapes (3,4) (2,4)

Broadcasting rules

The rules that determine when two arrays are compatible for broadcasting and how broadcasting is performed are as follows:

  • Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.

  • Rule 2: If the shape of the two arrays does not match in some dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.

  • Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.

From: Jake VanderPlas, Python Data Science Handbook.
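The three rules can be written out as a short function that computes the shape resulting from broadcasting two shapes. This is an illustrative sketch, not numpy's actual implementation, and the function name broadcast_shape is hypothetical.

```python
def broadcast_shape(shape1, shape2):
    """Return the shape obtained by broadcasting two shapes (Rules 1-3)."""
    # Rule 1: pad the shorter shape with ones on the left
    ndim = max(len(shape1), len(shape2))
    s1 = (1,) * (ndim - len(shape1)) + tuple(shape1)
    s2 = (1,) * (ndim - len(shape2)) + tuple(shape2)

    result = []
    for n1, n2 in zip(s1, s2):
        if n1 == n2:
            result.append(n1)
        elif n1 == 1:
            # Rule 2: the dimension of size 1 is stretched to the other size
            result.append(n2)
        elif n2 == 1:
            result.append(n1)
        else:
            # Rule 3: sizes disagree and neither is 1
            raise ValueError(
                f"shapes {shape1} and {shape2} are not compatible for broadcasting"
            )
    return tuple(result)


print(broadcast_shape((3, 4), (4,)))     # (3, 4)
print(broadcast_shape((4,), (3, 1)))     # (3, 4)
```

Calling broadcast_shape((3, 4), (2, 4)) raises a ValueError, because the first dimensions are 3 and 2 and neither is equal to 1.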

Example.

Here are again the arrays a, b, and c we used above:

[58]:
print(f"a =\n{a}\n")
print(f"b =\n{b}\n")
print(f"c =\n{c}\n")
print(f"{a.shape=}")
print(f"{b.shape=}")
print(f"{c.shape=}")
a =
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

b =
[100 200 300 400]

c =
[[ 0 10 20 30]
 [40 50 60 70]]

a.shape=(3, 4)
b.shape=(4,)
c.shape=(2, 4)

The shape of a is (3,4) and the shape of b is (4,). Since b has fewer dimensions, Rule 1 pads its shape with a 1 on the left, i.e. we treat b as a 2-dimensional array with a single row. This gives the following shapes:

  • a.shape → (3,4)

  • b.shape → (1,4)

The second dimensions match. Since b has size 1 in the first dimension, Rule 2 stretches it in that dimension to match the size of a when broadcasting is performed. In effect, the operation a + b adds b to each row of a.

In the case of a and c, both arrays are 2-dimensional, but their first dimensions do not match and neither is equal to 1:

  • a.shape → (3,4)

  • c.shape → (2,4)

For this reason broadcasting of these arrays cannot be performed.
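Compatibility can also be checked without performing the arithmetic. The sketch below assumes NumPy 1.20 or later, where np.broadcast_shapes() is available; it returns the broadcast shape or raises a ValueError if the shapes are incompatible. The name compatible is used only for this illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.array([100, 200, 300, 400])
c = 10 * np.arange(8).reshape(2, 4)

# a and b are compatible: the result has the shape of a
print(np.broadcast_shapes(a.shape, b.shape))   # (3, 4)

# a and c are not compatible: first dimensions are 3 and 2
try:
    np.broadcast_shapes(a.shape, c.shape)
    compatible = True
except ValueError:
    compatible = False
print(compatible)   # False
```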

Example.

Consider the following arrays a and d:

[60]:
d = np.array([10, 20, 30]).reshape(3, 1)

print(f"a =\n{a}\n")
print(f"d =\n{d}\n")
print(f"{a.shape=}")
print(f"{d.shape=}")
a =
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

d =
[[10]
 [20]
 [30]]

a.shape=(3, 4)
d.shape=(3, 1)

In this case both arrays are 2-dimensional, and their first dimensions match. Since the array d has 1 in the second dimension, it gets stretched in this dimension, which replicates its single column. The result is that a + d adds d to every column of a.

[61]:
a + d
[61]:
array([[10, 11, 12, 13],
       [24, 25, 26, 27],
       [38, 39, 40, 41]])
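The reshape to (3, 1) matters here. Because Rule 1 pads shapes on the left, a 1-dimensional array of length 3 would be treated as having shape (1, 3), which clashes with the shape (3, 4) of a in the last dimension. A short sketch of the difference (the names v and failed are chosen for this illustration):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
v = np.array([10, 20, 30])        # shape (3,)

# (3,) is padded to (1, 3), which is incompatible with (3, 4)
try:
    a + v
    failed = False
except ValueError:
    failed = True
print(failed)                     # True

# as a column of shape (3, 1) the same values broadcast across the columns of a
d = v.reshape(3, 1)
print((a + d).shape)              # (3, 4)
```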

Example.

Take the following arrays b and d:

[62]:
print(f"b =\n{b}\n")
print(f"d =\n{d}\n")
print(f"{b.shape=}")
print(f"{d.shape=}")
b =
[100 200 300 400]

d =
[[10]
 [20]
 [30]]

b.shape=(4,)
d.shape=(3, 1)

The array b has one dimension, and d has two. According to Rule 1, we pad the shape of b to match the number of dimensions of d:

  • b.shape → (1,4)

  • d.shape → (3,1)

This shows that the arrays are compatible for broadcasting, and that the broadcasting will be performed by stretching b along the first dimension (i.e. replicating its row) and stretching d along the second dimension (replicating its single column):

[63]:
b + d
[63]:
array([[110, 210, 310, 410],
       [120, 220, 320, 420],
       [130, 230, 330, 430]])

More explicitly, the array b + d has the same values as the sum of the following arrays b_stretched and d_stretched:

[65]:
b_stretched, d_stretched = np.broadcast_arrays(b, d)
print(f"b_stretched =\n{b_stretched}\n")
print(f"d_stretched =\n{d_stretched}\n")
b_stretched =
[[100 200 300 400]
 [100 200 300 400]
 [100 200 300 400]]

d_stretched =
[[10 10 10 10]
 [20 20 20 20]
 [30 30 30 30]]

[66]:
b_stretched + d_stretched
[66]:
array([[110, 210, 310, 410],
       [120, 220, 320, 420],
       [130, 230, 330, 430]])