Array operations - slicing, dicing, searching

In [1]:

import numpy as np

In [2]:

arr1 = np.random.randint(10,30, size=8)
arr1

Out[2]:

array([25, 10, 18, 10, 16, 22, 14, 26])

In [3]:

arr2 = np.random.randint(20,200,size=50).reshape(5,10)  # method chaining - 50 random integers in [20, 200), reshaped to 5 rows x 10 columns
arr2

Out[3]:

array([[147, 134,  58,  21,  90, 193, 135, 179, 129, 113],
       [ 85, 161,  31, 123, 191, 166,  52,  25,  94, 184],
       [174, 149, 143, 123, 126, 143,  59, 180, 116, 105],
       [ 78, 198, 161, 152, 167,  84, 104, 128, 173, 140],
       [181,  47, 114, 145, 139, 180, 183, 125,  41,  46]])

Array slicing

Get elements by index, just like in a Python list.

In [4]:

arr1[0]

Out[4]:

25

In [5]:

arr1[3]

Out[5]:

10

In [6]:

arr1[:3] # get the first 3 elements; the lower bound is inclusive, the upper bound is exclusive

Out[6]:

array([25, 10, 18])

In [7]:

arr1[2:] # from index 2 to the end; the lower bound is inclusive

Out[7]:

array([18, 10, 16, 22, 14, 26])

In [8]:

arr1[2:5] #get elements at index 2,3,4

Out[8]:

array([18, 10, 16])
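
Slices also accept negative indices and a step, following the same rules as Python lists. The sketch below uses a hypothetical array `arr` that mirrors the `arr1` values shown above; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical sample values, mirroring the arr1 output shown above
arr = np.array([25, 10, 18, 10, 16, 22, 14, 26])

last = arr[-1]            # last element
last_three = arr[-3:]     # last three elements
every_other = arr[::2]    # every second element, starting at index 0
reversed_arr = arr[::-1]  # all elements in reverse order

print(last, last_three, every_other, reversed_arr)
```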

nD array slicing

In [9]:

arr2

Out[9]:

array([[147, 134,  58,  21,  90, 193, 135, 179, 129, 113],
       [ 85, 161,  31, 123, 191, 166,  52,  25,  94, 184],
       [174, 149, 143, 123, 126, 143,  59, 180, 116, 105],
       [ 78, 198, 161, 152, 167,  84, 104, 128, 173, 140],
       [181,  47, 114, 145, 139, 180, 183, 125,  41,  46]])

In [10]:

arr2[0,0] # style 1 - pass the row and column indices together, separated by a comma

Out[10]:

147

In [11]:

arr2[0][0] # style 2 - index it like a list of lists; less common

Out[11]:

147

In [12]:

arr2[1] # get a full row

Out[12]:

array([ 85, 161,  31, 123, 191, 166,  52,  25,  94, 184])

Array dicing

In [13]:

#get the second column
arr2[:,1]

Out[13]:

array([134, 161, 149, 198,  47])

Thus, you specify : for all rows, followed by 1 for the column index, and you get a 1D array as the result.
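
Whether the result is 1D or 2D depends on how the column is selected. A minimal sketch, using a hypothetical 3x4 array `grid` rather than `arr2`: an integer index drops the axis, while a one-element slice keeps it.

```python
import numpy as np

# Hypothetical 3x4 array used only for illustration
grid = np.arange(12).reshape(3, 4)

col_1d = grid[:, 1]    # integer index drops the column axis
col_2d = grid[:, 1:2]  # a slice keeps the column axis

print(col_1d.shape)  # (3,)
print(col_2d.shape)  # (3, 1)
```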

In [14]:

#get the 3rd row
arr2[2,:] #which is same as arr2[2]

Out[14]:

array([174, 149, 143, 123, 126, 143,  59, 180, 116, 105])

In [15]:

# get the center 3x3 block - rows 1,2,3 and columns 4,5,6
arr2[1:4, 4:7]

Out[15]:

array([[191, 166,  52],
       [126, 143,  59],
       [167,  84, 104]])

Array broadcasting

NumPy allows bulk assignment of values, just like in MATLAB.

In [16]:

arr2

Out[16]:

array([[147, 134,  58,  21,  90, 193, 135, 179, 129, 113],
       [ 85, 161,  31, 123, 191, 166,  52,  25,  94, 184],
       [174, 149, 143, 123, 126, 143,  59, 180, 116, 105],
       [ 78, 198, 161, 152, 167,  84, 104, 128, 173, 140],
       [181,  47, 114, 145, 139, 180, 183, 125,  41,  46]])

In [17]:

arr2_subset = arr2[1:4, 4:7]
arr2_subset

Out[17]:

array([[191, 166,  52],
       [126, 143,  59],
       [167,  84, 104]])

In [18]:

arr2_subset[:,:] = 999 # assign the same value to every element of this array
arr2_subset

Out[18]:

array([[999, 999, 999],
       [999, 999, 999],
       [999, 999, 999]])
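
Bulk assignment is not limited to a single scalar. As a rough sketch, assuming a hypothetical array `block` (not part of the notebook above), a 1D array whose length matches the number of selected columns is broadcast across every selected row.

```python
import numpy as np

# Hypothetical 4x5 array of zeros
block = np.zeros((4, 5), dtype=int)

# A scalar is broadcast to every selected element
block[1:3, 1:4] = 7

# A 1D array of length 3 is broadcast across each selected row
block[1:3, 1:4] = np.array([1, 2, 3])

print(block)
```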

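Deep copy

Slicing a NumPy array returns a view that shares memory with the source, and plain assignment only binds another name to the same object. Hence any modification made through the derivative affects the source, as the 999 values in arr2 below show. Make an independent (deep) copy using the copy() method.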

In [19]:

arr2 #notice the 999 in the middle

Out[19]:

array([[147, 134,  58,  21,  90, 193, 135, 179, 129, 113],
       [ 85, 161,  31, 123, 999, 999, 999,  25,  94, 184],
       [174, 149, 143, 123, 999, 999, 999, 180, 116, 105],
       [ 78, 198, 161, 152, 999, 999, 999, 128, 173, 140],
       [181,  47, 114, 145, 139, 180, 183, 125,  41,  46]])

In [20]:

arr2_subset_a = arr2_subset
arr2_subset_a is arr2_subset

Out[20]:

True

Notice they are the same object in memory.

In [21]:

arr3_subset = arr2_subset.copy()
arr3_subset

Out[21]:

array([[999, 999, 999],
       [999, 999, 999],
       [999, 999, 999]])

In [22]:

arr3_subset is arr2_subset

Out[22]:

False

Notice they are different objects in memory, and copy() has given arr3_subset its own data. Thus changing arr3_subset will not affect its source.

In [23]:

arr3_subset[:,:] = 0.1 # the integer dtype truncates 0.1 to 0
arr2_subset # the source is unchanged

Out[23]:

array([[999, 999, 999],
       [999, 999, 999],
       [999, 999, 999]])
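
One caveat about the `is` checks above: `is` compares Python object identity, so it also returns False for a slice, even though a slice shares data with its source. A minimal sketch of a more direct test, using `np.shares_memory` on hypothetical arrays `source`, `view` and `independent`:

```python
import numpy as np

source = np.arange(12).reshape(3, 4)

view = source[0:2, 0:2]                # basic slicing returns a view
independent = source[0:2, 0:2].copy()  # copy() allocates new memory

print(view is source)                         # False: a different Python object...
print(np.shares_memory(view, source))         # True: ...backed by the same data
print(np.shares_memory(independent, source))  # False

view[:, :] = -1         # writes through to source
independent[:, :] = 99  # leaves source untouched

print(source)
```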

Array searching

Use MATLAB-style boolean indexing to search arrays.

In [24]:

arr1

Out[24]:

array([25, 10, 18, 10, 16, 22, 14, 26])

In [28]:

arr1>15  # gives a boolean (truth) vector

Out[28]:

array([ True, False,  True, False,  True,  True, False,  True])

You can use the truth vector as an index to search. For example, get all numbers greater than 15:

In [29]:

arr1[arr1 > 15]

Out[29]:

array([25, 18, 16, 22, 26])

In [30]:

arr1[arr1 > 20]

Out[30]:

array([25, 22, 26])

The condition on its own returns a boolean array with the same shape as the array being queried.

In [31]:

arr1 > 12

Out[31]:

array([ True, False,  True, False,  True,  True,  True,  True])

In [32]:

arr2[arr2 > 50] # loses the original shape, as it is impossible to keep the 2D shape

Out[32]:

array([147, 134,  58,  90, 193, 135, 179, 129, 113,  85, 161, 123, 999,
       999, 999,  94, 184, 174, 149, 143, 123, 999, 999, 999, 180, 116,
       105,  78, 198, 161, 152, 999, 999, 999, 128, 173, 140, 181, 114,
       145, 139, 180, 183, 125])

In [33]:

arr2[arr2 < 30]

Out[33]:

array([21, 25])
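
Boolean indexing returns the matching values but not where they were found. When the positions matter, or the original shape must be kept, `np.argwhere` and the three-argument form of `np.where` may help. The sketch below assumes a small hypothetical array `values`.

```python
import numpy as np

# Hypothetical 2x3 array
values = np.array([[21, 147, 58],
                   [25, 90, 31]])

mask = values < 30

matches = values[mask]               # 1D array of the matching values
positions = np.argwhere(mask)        # one (row, column) pair per match
clipped = np.where(mask, 0, values)  # same shape; matches replaced by 0

print(matches)
print(positions)
print(clipped)
```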

Compound searching

Find elements within a range for instance:

In [35]:

arr1[(arr1>16) & (arr1<23)]

Out[35]:

array([18, 22])
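
The parentheses around each condition are required, because `&` binds more tightly than the comparison operators. Python's `and` and `or` keywords do not work elementwise on arrays, so use `&`, `|` and `~` instead. A short sketch with a hypothetical array `arr` holding the same values as `arr1`:

```python
import numpy as np

arr = np.array([25, 10, 18, 10, 16, 22, 14, 26])

in_range = arr[(arr > 16) & (arr < 23)]    # AND: 16 < value < 23
outside = arr[(arr <= 16) | (arr >= 23)]   # OR: everything else
negated = arr[~((arr > 16) & (arr < 23))]  # NOT: same result as `outside`

print(in_range, outside, negated)
```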

Math operations - elementwise

NumPy overloads operators such as +, -, / and *, so you can combine two arrays elementwise as if they were scalars.

In [36]:

arr1

Out[36]:

array([25, 10, 18, 10, 16, 22, 14, 26])

In [37]:

arr_sum = arr1 + arr1 # elementwise addition
arr_sum

Out[37]:

array([50, 20, 36, 20, 32, 44, 28, 52])

In [38]:

arr_cubed = arr1 ** 2 # elementwise exponentiation (despite the name, this squares each value)
arr_cubed

Out[38]:

array([625, 100, 324, 100, 256, 484, 196, 676])

Similarly, you can add a scalar to an array and NumPy will broadcast that operation on all the elements.

In [39]:

arr_cubed - 100 # elementwise subtraction of a scalar

Out[39]:

array([525,   0, 224,   0, 156, 384,  96, 576])
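
Broadcasting is not restricted to scalars. As an illustrative sketch with hypothetical arrays `matrix`, `row` and `col`, a 1D array or a single-column array can be combined with a 2D array when their shapes are compatible.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)
col = np.array([[100], [200]])   # shape (2, 1)

plus_row = matrix + row  # row is added to every row of matrix
plus_col = matrix + col  # col is added to every column of matrix

print(plus_row)
print(plus_col)
```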

Math operations - matrix math

Use built-in functions for matrix operations

In [41]:

arr1

Out[41]:

array([25, 10, 18, 10, 16, 22, 14, 26])

In [40]:

np.dot(arr1, arr1)

Out[40]:

2761

For two 1D arrays, np.dot computes the inner product (the sum of the elementwise products). This is equivalent to multiplying a 1xn row vector by an nx1 column vector, and no explicit transpose is needed.
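
A small sketch of related matrix operations, using hypothetical arrays `v` and `m`: the `@` operator performs matrix multiplication, and `np.outer` gives the nx1 by 1xn product.

```python
import numpy as np

v = np.array([1, 2, 3])
m = np.array([[1, 0, 2],
              [0, 1, 3]])  # shape (2, 3)

inner = np.dot(v, v)    # 1*1 + 2*2 + 3*3
mat_vec = m @ v         # matrix-vector product, shape (2,)
outer = np.outer(v, v)  # 3x3 outer product

print(inner)    # 14
print(mat_vec)  # [ 7 11]
print(outer.shape)
```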

Caveats

NumPy does not raise an error for division by zero or for 0/0. Instead, it emits a RuntimeWarning and sets the result to inf or nan, respectively.

In [42]:

arr_cubed[0] = 0
arr_cubed

Out[42]:

array([  0, 100, 324, 100, 256, 484, 196, 676])

In [43]:

arr_cubed / 0

/Users/atma6951/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: divide by zero encountered in true_divide
  """Entry point for launching an IPython kernel.
/Users/atma6951/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in true_divide
  """Entry point for launching an IPython kernel.

Out[43]:

array([nan, inf, inf, inf, inf, inf, inf, inf])

Thus 0/0 = nan and num/0 = inf (for a positive numerator).
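
If the default warnings are not what you want, `np.errstate` can change the behavior for a block of code. The following is a minimal sketch with a hypothetical array `values`: it first silences the warnings, then turns them into exceptions.

```python
import numpy as np

values = np.array([0.0, 1.0, 4.0])

# Silence the warnings locally when inf/nan results are expected
with np.errstate(divide="ignore", invalid="ignore"):
    quiet = values / 0

# Or turn them into exceptions so that bad data fails loudly
try:
    with np.errstate(divide="raise", invalid="raise"):
        values / 0
    raised = False
except FloatingPointError:
    raised = True

print(quiet, raised)
```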

Universal functions

NumPy provides many universal functions (ufuncs) that operate on arrays element by element, so you can write expressions on whole arrays as if they were scalars.

Before writing a loop, check the list of available universal functions in the NumPy documentation.
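
As a brief illustration, the sketch below compares a Python loop with the equivalent ufunc call on a hypothetical array `values`. `np.sqrt` and `np.maximum` are two of the available ufuncs.

```python
import math
import numpy as np

values = np.array([1.0, 4.0, 9.0, 16.0])

# Loop version: one Python-level call per element
looped = np.array([math.sqrt(v) for v in values])

# ufunc version: a single vectorized call over the whole array
vectorized = np.sqrt(values)

# ufuncs can also combine an array with a scalar or another array
largest = np.maximum(values, 5.0)

print(vectorized)
print(largest)
```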