python how to pad numpy array with zeros

NumPy 1.7.0 (when numpy.pad was added) is pretty old now (it was released in 2013) so even though the question asked for a way without using that function I thought it could be useful to know how that could be achieved using numpy.pad.

It’s actually pretty simple:

>>> import numpy as np
>>> a = np.array([[ 1.,  1.,  1.,  1.,  1.],
...               [ 1.,  1.,  1.,  1.,  1.],
...               [ 1.,  1.,  1.,  1.,  1.]])
>>> np.pad(a, [(0, 1), (0, 1)], mode="constant")
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

In this case I used that 0 is the default value for mode="constant". But it could also be specified by passing it in explicitly:

>>> np.pad(a, [(0, 1), (0, 1)], mode="constant", constant_values=0)
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

Just in case the second argument ([(0, 1), (0, 1)]) seems confusing: Each list item (in this case tuple) corresponds to a dimension and item therein represents the padding before (first element) and after (second element). So:

[(0, 1), (0, 1)]
         ^^^^^^------ padding for second dimension
 ^^^^^^-------------- padding for first dimension

  ^------------------ no padding at the beginning of the first axis
     ^--------------- pad with one "value" at the end of the first axis.

In this case the padding for the first and second axis are identical, so one could also just pass in the 2-tuple:

>>> np.pad(a, (0, 1), mode="constant")
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

In case the padding before and after is identical one could even omit the tuple (not applicable in this case though):

>>> np.pad(a, 1, mode="constant")
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.]])

Or if the padding before and after is identical but different for the axis, you could also omit the second argument in the inner tuples:

>>> np.pad(a, [(1, ), (2, )], mode="constant")
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

However I tend to prefer to always use the explicit one, because it’s just to easy to make mistakes (when NumPys expectations differ from your intentions):

>>> np.pad(a, [1, 2], mode="constant")
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

Here NumPy thinks you wanted to pad all axis with 1 element before and 2 elements after each axis! Even if you intended it to pad with 1 element in axis 1 and 2 elements for axis 2.

I used lists of tuples for the padding, note that this is just “my convention”, you could also use lists of lists or tuples of tuples, or even tuples of arrays. NumPy just checks the length of the argument (or if it doesn’t have a length) and the length of each item (or if it has a length)!

More Related Contents:

Leave a Comment Cancel reply