Array operations - slicing, dicing, searching¶
import numpy as np
arr1 = np.random.randint(10,30, size=8)
arr1
array([25, 10, 18, 10, 16, 22, 14, 26])
arr2 = np.random.randint(20,200,size=50).reshape(5,10) #method chaining - 50 random integers in [20, 200), reshaped to 5 rows x 10 columns
arr2
array([[147, 134,  58,  21,  90, 193, 135, 179, 129, 113],
       [ 85, 161,  31, 123, 191, 166,  52,  25,  94, 184],
       [174, 149, 143, 123, 126, 143,  59, 180, 116, 105],
       [ 78, 198, 161, 152, 167,  84, 104, 128, 173, 140],
       [181,  47, 114, 145, 139, 180, 183, 125,  41,  46]])
Array slicing¶
Get elements by index, just as with a Python list.
arr1[0]
25
arr1[3]
10
arr1[:3] #get the first 3 elements: lower bound inclusive, upper bound exclusive
array([25, 10, 18])
arr1[2:] #lower bound inclusive
array([18, 10, 16, 22, 14, 26])
arr1[2:5] #get elements at index 2,3,4
array([18, 10, 16])
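The half-open bounds can be checked on a small fixed array. This is a minimal sketch using a hypothetical array `demo` (not part of the cells above), so the results do not depend on random numbers.

```python
import numpy as np

# hypothetical fixed array, so the output is reproducible
demo = np.array([25, 10, 18, 10, 16, 22, 14, 26])

first_three = demo[:3]    # indices 0, 1, 2
from_two = demo[2:]       # index 2 through the end
middle = demo[2:5]        # indices 2, 3, 4

# the length of a slice is upper bound minus lower bound
print(len(middle))        # 3
print(first_three, from_two, middle)
```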
nD array slicing¶
arr2
array([[147, 134, 58, 21, 90, 193, 135, 179, 129, 113], [ 85, 161, 31, 123, 191, 166, 52, 25, 94, 184], [174, 149, 143, 123, 126, 143, 59, 180, 116, 105], [ 78, 198, 161, 152, 167, 84, 104, 128, 173, 140], [181, 47, 114, 145, 139, 180, 183, 125, 41, 46]])
arr2[0,0] #style 1 - pass the row and column indices together (as a tuple)
147
arr2[0][0] #style 2 - index it like a list of lists; less common, since it first builds the intermediate row arr2[0]
147
arr2[1] # get a full row
array([ 85, 161, 31, 123, 191, 166, 52, 25, 94, 184])
Array dicing¶
#get the second column
arr2[:,1]
array([134, 161, 149, 198, 47])
Thus, you specify : for all rows, followed by 1 for the column. The result is a 1D array.
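Whether the result is 1D or 2D depends on the kind of index used: an integer index drops that axis, while a slice keeps it. A small sketch, assuming a hypothetical 3x4 array `grid`:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # hypothetical 3x4 array holding 0..11

col_1d = grid[:, 1]      # integer index on axis 1 -> 1D result
col_2d = grid[:, 1:2]    # slice on axis 1 -> result stays 2D, with one column

print(col_1d.shape)      # (3,)
print(col_2d.shape)      # (3, 1)
print(col_1d)            # [1 5 9]
```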
#get the 3rd row
arr2[2,:] #which is same as arr2[2]
array([174, 149, 143, 123, 126, 143, 59, 180, 116, 105])
#get the central 3x3 block - rows 1,2,3 and columns 4,5,6
arr2[1:4, 4:7]
array([[191, 166, 52], [126, 143, 59], [167, 84, 104]])
Array broadcasting¶
NumPy allows bulk assignment of values, just like in MATLAB.
arr2
array([[147, 134, 58, 21, 90, 193, 135, 179, 129, 113], [ 85, 161, 31, 123, 191, 166, 52, 25, 94, 184], [174, 149, 143, 123, 126, 143, 59, 180, 116, 105], [ 78, 198, 161, 152, 167, 84, 104, 128, 173, 140], [181, 47, 114, 145, 139, 180, 183, 125, 41, 46]])
arr2_subset = arr2[1:4, 4:7]
arr2_subset
array([[191, 166, 52], [126, 143, 59], [167, 84, 104]])
arr2_subset[:,:] = 999 #assign the same value to every element of this array
arr2_subset
array([[999, 999, 999], [999, 999, 999], [999, 999, 999]])
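Bulk assignment is not limited to a single scalar. As a rough sketch on a hypothetical 3x3 array `block`, a sequence whose length matches the last axis is broadcast across every row, and a slice on the left-hand side restricts which elements are overwritten.

```python
import numpy as np

block = np.zeros((3, 3), dtype=int)   # hypothetical 3x3 array of zeros

block[:, :] = 7            # a scalar is broadcast to every element
print(block)

block[:, :] = [1, 2, 3]    # a length-3 sequence is broadcast across each row
print(block)

block[:, 0] = 0            # only the first column is overwritten
print(block)
```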
Deep copy¶
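Slicing a NumPy array returns a view onto the same data, not a copy, and plain assignment (as with any Python object) only binds another name to the same object. Hence any modification made to the derivative affects the source.
Make independent (deep) copies using the copy() method.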
arr2 #notice the 999 in the middle
array([[147, 134,  58,  21,  90, 193, 135, 179, 129, 113],
       [ 85, 161,  31, 123, 999, 999, 999,  25,  94, 184],
       [174, 149, 143, 123, 999, 999, 999, 180, 116, 105],
       [ 78, 198, 161, 152, 999, 999, 999, 128, 173, 140],
       [181,  47, 114, 145, 139, 180, 183, 125,  41,  46]])
arr2_subset_a = arr2_subset
arr2_subset_a is arr2_subset
True
Notice they are the same object in memory.
arr3_subset = arr2_subset.copy()
arr3_subset
array([[999, 999, 999], [999, 999, 999], [999, 999, 999]])
arr3_subset is arr2_subset
False
Notice they are different objects in memory. Thus changing arr3_subset will not affect its source
arr3_subset[:,:] = 0.1 #arr3_subset holds integers, so 0.1 is stored as 0
arr2_subset
array([[999, 999, 999], [999, 999, 999], [999, 999, 999]])
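The `is` operator only compares object identity, so it cannot tell whether two distinct array objects share the same data. One way to check that, sketched here on a hypothetical 4x4 array `source`, is `np.shares_memory`:

```python
import numpy as np

source = np.arange(16).reshape(4, 4)   # hypothetical 4x4 array holding 0..15

view = source[1:3, 1:3]          # basic slicing returns a view
copy = source[1:3, 1:3].copy()   # copy() allocates new memory

print(view is source)                   # False: a different object ...
print(np.shares_memory(source, view))   # True:  ... backed by the same data
print(np.shares_memory(source, copy))   # False

view[:, :] = -1    # writes through to source
copy[:, :] = 99    # leaves source untouched

print(source)
```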
Array searching¶
Use MATLAB-style array searching.
arr1
array([25, 10, 18, 10, 16, 22, 14, 26])
arr1>15 # gives truth vector
array([ True, False, True, False, True, True, False, True])
You can use the truth vector as an index to search. For example, get all numbers greater than 15:
arr1[arr1 > 15]
array([25, 18, 16, 22, 26])
arr1[arr1 > 20]
array([25, 22, 26])
The condition on its own returns a boolean array of the same shape as the one being queried.
arr1 > 12
array([ True, False, True, False, True, True, True, True])
arr2[arr2 > 50] #loses the original shape, as the matching elements no longer form a rectangular 2D array
array([147, 134, 58, 90, 193, 135, 179, 129, 113, 85, 161, 123, 999, 999, 999, 94, 184, 174, 149, 143, 123, 999, 999, 999, 180, 116, 105, 78, 198, 161, 152, 999, 999, 999, 128, 173, 140, 181, 114, 145, 139, 180, 183, 125])
arr2[arr2 < 30]
array([21, 25])
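If the 2D shape matters, one option (not used in the cells above) is `np.where`, which keeps the shape and substitutes a fill value wherever the condition is false. A sketch on a hypothetical 2x3 array `grid`, with 0 as an arbitrary fill value:

```python
import numpy as np

grid = np.array([[147, 21, 90],
                 [25, 161, 31]])    # hypothetical 2x3 array

mask = grid > 50                    # boolean array with the same shape as grid
selected = grid[mask]               # 1D array of matching values, in row-major order
filled = np.where(mask, grid, 0)    # same shape as grid; non-matching entries become 0

print(selected)   # [147  90 161]
print(filled)
```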
Compound searching¶
Find elements within a range for instance:
arr1[(arr1>16) & (arr1<23)]
array([18, 22])
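Compound conditions use the elementwise operators `&`, `|` and `~` rather than Python's `and`, `or` and `not`. Each comparison needs its own parentheses, because these operators bind more tightly than `>` and `<`. A short sketch on a hypothetical fixed array `demo`:

```python
import numpy as np

demo = np.array([25, 10, 18, 10, 16, 22, 14, 26])   # hypothetical fixed array

in_range = demo[(demo > 16) & (demo < 23)]     # both conditions hold
outside = demo[(demo <= 16) | (demo >= 23)]    # either condition holds
not_ten = demo[~(demo == 10)]                  # negation of a condition

print(in_range)   # [18 22]
print(outside)    # [25 10 10 16 14 26]
print(not_ten)    # [25 18 16 22 14 26]
```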
Math operations - elementwise¶
NumPy overloads operators like +, -, / and * so you can add two arrays as if they were scalars.
arr1
array([25, 10, 18, 10, 16, 22, 14, 26])
arr_sum = arr1 + arr1 # elementwise addition
arr_sum
array([50, 20, 36, 20, 32, 44, 28, 52])
arr_cubed = arr1 ** 2 # elementwise exponentiation (note: ** 2 gives squares, despite the variable name)
arr_cubed
array([625, 100, 324, 100, 256, 484, 196, 676])
Similarly, you can add a scalar to an array and NumPy will broadcast that operation to all the elements.
arr_cubed - 100 # element wise subtraction by a scalar
array([525, 0, 224, 0, 156, 384, 96, 576])
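Broadcasting also applies between arrays of different shapes when their trailing dimensions are compatible. As a small sketch with a hypothetical 2x3 array `grid` and a length-3 array `row`:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])      # hypothetical 2x3 array
row = np.array([10, 20, 30])      # shape (3,), matches the last axis of grid

shifted = grid + 100       # the scalar is applied to every element
row_added = grid + row     # row is added to each row of grid

print(shifted)
print(row_added)
```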
Math operations - matrix math¶
Use built-in functions for matrix operations
arr1
array([25, 10, 18, 10, 16, 22, 14, 26])
np.dot(arr1, arr1)
2761
For two 1D arrays, np.dot computes the inner product (the sum of the elementwise products). No transpose takes place, but the result equals the matrix product of a 1xn row vector and an nx1 column vector.
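The difference between the 1D inner product and an explicit matrix product can be sketched as follows, using a hypothetical array `v`. With 2D operands the orientation matters, and the `@` operator performs matrix multiplication.

```python
import numpy as np

v = np.array([1, 2, 3])        # hypothetical 1D array
inner = np.dot(v, v)           # 1*1 + 2*2 + 3*3 = 14, returned as a scalar

row = v.reshape(1, 3)          # explicit 1x3 row vector
col = v.reshape(3, 1)          # explicit 3x1 column vector
as_matrix = row @ col          # (1x3)(3x1) -> 1x1 array
outer = col @ row              # (3x1)(1x3) -> 3x3 array

print(inner)             # 14
print(as_matrix.shape)   # (1, 1)
print(outer.shape)       # (3, 3)
```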
Caveats¶
By default, NumPy does not raise an error for division by zero or for 0/0. Instead it emits a RuntimeWarning and sets the value to inf or nan respectively.
arr_cubed[0] = 0
arr_cubed
array([ 0, 100, 324, 100, 256, 484, 196, 676])
arr_cubed / 0
/Users/atma6951/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: divide by zero encountered in true_divide """Entry point for launching an IPython kernel. /Users/atma6951/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in true_divide """Entry point for launching an IPython kernel.
array([nan, inf, inf, inf, inf, inf, inf, inf])
Thus 0/0 = nan and num/0 = inf.
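The handling of these floating-point events can be changed for a block of code with the `np.errstate` context manager. The sketch below, on a hypothetical float array `values`, first silences the warnings and inspects the result, then asks NumPy to raise an exception instead.

```python
import numpy as np

values = np.array([0.0, 100.0, 324.0])   # hypothetical float array

# silence the warnings locally and inspect the result afterwards
with np.errstate(divide="ignore", invalid="ignore"):
    result = values / 0

print(np.isnan(result))   # [ True False False]
print(np.isinf(result))   # [False  True  True]

# or ask NumPy to raise instead of warning
try:
    with np.errstate(divide="raise", invalid="raise"):
        values / 0
    raised = False
except FloatingPointError:
    raised = True

print(raised)             # True
```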
Universal functions¶
Numpy has a bunch of universal functions that work on the array elements one at a time and allow arrays to be used or treated as scalars.
Before writing a loop, look up the list of universal functions in the NumPy documentation.
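As an illustration of replacing a loop with a universal function, here is a sketch on a hypothetical array `demo`. `np.sqrt` and `np.maximum` are two of the many ufuncs NumPy provides; the threshold of 10.0 is an arbitrary value chosen for the example.

```python
import math
import numpy as np

demo = np.array([25.0, 16.0, 9.0, 4.0])   # hypothetical fixed array

# explicit Python loop over the elements
looped = np.array([math.sqrt(x) for x in demo])

# equivalent universal function call, applied elementwise
vectorized = np.sqrt(demo)

# ufuncs can also take two inputs, e.g. the elementwise maximum against a scalar
floored = np.maximum(demo, 10.0)

print(vectorized)   # [5. 4. 3. 2.]
print(floored)      # [25. 16. 10. 10.]
```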