Numpy Arrays

Multi-dimensional arrays. Why do we need them?

For example, let's say I have a whole bunch of data points, like

Person    The Matrix      Inception      Force Awakens     Wall-E ...
     A         5              4                5              3
     B         4              ?                4              ?
     C         ?              ?                5              5
     ...

We would like to guess how Person C would rate Inception. Not an easy problem, but how would we store this kind of data in the first place?

If we just wanted to store the matrix, we could use a list of lists, with -1 standing in for a missing rating:

xs = [[5, 4, 5, 3], [4, -1, 4, -1], [-1, -1, 5, 5]]

There are 3 lists (one per person) and each one has 4 entries, so it's a 3x4 matrix. We can do a lot with lists of lists, but it will be slow. Eventually we need a more efficient and flexible way of working with multi-dimensional arrays.

The standard way of working with data sets in Python is to use the Numpy library. A Numpy array is a multi-dimensional array.

import numpy as np
arr = np.array([[5, 4, 5, 3], [4, -1, 4, -1], [-1, -1, 5, 5]])
print(arr)

[[ 5  4  5  3]
 [ 4 -1  4 -1]
 [-1 -1  5  5]]

type(arr)

numpy.ndarray

Every array has a shape:

arr.shape

(3, 4)

arr = np.array([[1,2,3], [4,5,6]])
print(arr)

[[1 2 3]
 [4 5 6]]

arr.shape

(2, 3)

# accessing elements:
arr[0]    # 0th row

array([1, 2, 3])

arr[1]     # 1st row

array([4, 5, 6])

arr[0, 0]    # row 0, column 0; with a list of lists we would have written arr[0][0]

1

arr[0, 1]

2

We can use slicing too:

arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
arr

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

arr[1:3,1:3]

array([[5, 6],
       [8, 9]])

arr[0,:]

array([1, 2, 3])

arr[:,0]

array([1, 4, 7])
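To see why the comma-separated index is convenient, compare pulling a column out of a list of lists with pulling it out of an array. This is only a small sketch; the names `rows` and `grid` are made up for the illustration.

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # plain list of lists
grid = np.array(rows)                      # the same data as an array

# A list of lists has no column slice, so we loop over the rows.
first_col_list = [row[0] for row in rows]

# The array takes one slice per axis: all rows, column 0.
first_col_arr = grid[:, 0]

print(first_col_list)   # [1, 4, 7]
print(first_col_arr)    # [1 4 7]
```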

We can have arrays with 3 or more dimensions too:

ar3 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(ar3)

[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]

ar3.shape

(2, 2, 3)

ar3[0,1,2]

6
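Each index we supply peels off one dimension. The sketch below walks through the lookup `ar3[0,1,2]` one index at a time; the names `block`, `row` and `item` are just for illustration.

```python
import numpy as np

ar3 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

block = ar3[0]        # first 2x3 block
row = ar3[0, 1]       # second row of that block
item = ar3[0, 1, 2]   # third entry of that row

print(block.shape)   # (2, 3)
print(row)           # [4 5 6]
print(item)          # 6
```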

Building Arrays

arr = np.zeros([3,3])   # you put the shape in as a list

arr

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
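The dots in the output mean the entries are floating-point numbers, which is the default for `np.zeros`. A short sketch of two variations: passing the shape as a tuple, and asking for integers with the `dtype` argument. The names `floats` and `ints` are illustrative.

```python
import numpy as np

floats = np.zeros((3, 3))            # a tuple works as well as a list
ints = np.zeros([2, 4], dtype=int)   # ask for integer zeros instead

print(floats.dtype)   # float64
print(ints.shape)     # (2, 4)
print(ints)
```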

The identity matrix

arr = np.identity(4)
arr

array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])

Equally spaced points:

np.linspace(1,2,11)   # 11 points between 1 and 2 inclusive

array([ 1. ,  1.1,  1.2,  1.3,  1.4,  1.5,  1.6,  1.7,  1.8,  1.9,  2. ])

Of course we could have done this with a list comprehension ([1 + 0.1*x for x in range(11)]), but for large arrays the numpy version is usually much faster than a Python loop.
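As a quick check that the two approaches agree, here is a small sketch comparing them. It also uses the `retstep=True` option of `np.linspace`, which returns the spacing between points along with the array. The names `points`, `step` and `by_hand` are illustrative.

```python
import numpy as np

# retstep=True also returns the distance between neighbouring points.
points, step = np.linspace(1, 2, 11, retstep=True)
by_hand = [1 + 0.1 * x for x in range(11)]

# Compare with a tolerance, since floating-point results can differ
# in the last digit.
print(np.allclose(points, by_hand))   # True
print(len(points))                    # 11
print(step)
```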

Vectorization

# guess what will happen?
arr = np.zeros([3,3])
arr = arr + 1
print(arr)

[[ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]]

Similarly

def f(x):
    return x*x + x + 1

f(arr)    # again this would never work for lists

array([[ 3.,  3.,  3.],
       [ 3.,  3.,  3.],
       [ 3.,  3.,  3.]])

# Remark: functions that only accept single numbers (e.g. math.cos) will not
# work on arrays directly; you can wrap them with np.vectorize
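Here is a minimal sketch of that remark, using `math.cos` as an example of a function that only accepts single numbers. The names `vcos` and `failed` are made up for the illustration. Note that `np.vectorize` is a convenience wrapper that still loops in Python, so when numpy already has its own version of the function (here `np.cos`), that is the better choice.

```python
import math
import numpy as np

xs = np.array([0.0, math.pi])

# math.cos wants one number, so handing it a whole array fails.
try:
    math.cos(xs)
    failed = False
except TypeError:
    failed = True
print(failed)   # True

# np.vectorize wraps the function so it is applied entry by entry.
vcos = np.vectorize(math.cos)
print(vcos(xs))     # [ 1. -1.]

# numpy's own cosine works on arrays directly.
print(np.cos(xs))   # [ 1. -1.]
```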

Even more interesting:

np.array([1,2,3]) + np.array([4,5,6])   # if these were lists, it would be concatenation

array([5, 7, 9])

It added the arrays entry by entry, as if they were vectors. Numpy figures out how to apply the operation to the arrays you gave.
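To make the contrast with lists concrete, here is a small side-by-side sketch. The names `a` and `b` are illustrative.

```python
import numpy as np

a = [1, 2, 3]
b = [4, 5, 6]

# Lists: + concatenates and * repeats.
print(a + b)   # [1, 2, 3, 4, 5, 6]
print(a * 2)   # [1, 2, 3, 1, 2, 3]

# Arrays: the same operators work entry by entry.
print(np.array(a) + np.array(b))   # [5 7 9]
print(np.array(a) * 2)             # [2 4 6]
```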

But maybe you want to control it yourself:

arr = np.array(range(9)).reshape(3,3) + 1

arr

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

np.sum(arr)

45

np.apply_along_axis(np.sum, 0, arr)

array([12, 15, 18])

np.apply_along_axis(np.sum, 1, arr)

array([ 6, 15, 24])

There is also: np.apply_over_axes
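For sums in particular there is a shortcut: `np.sum` and the array's own `sum` method accept an `axis` argument. The sketch below checks that this gives the same answers as `np.apply_along_axis`, and shows that `np.apply_over_axes` keeps the summed dimension with length 1. The variable names are illustrative.

```python
import numpy as np

arr = np.array(range(9)).reshape(3, 3) + 1

col_sums = arr.sum(axis=0)   # sum down each column
row_sums = arr.sum(axis=1)   # sum across each row
print(col_sums)   # [12 15 18]
print(row_sums)   # [ 6 15 24]

# Same results as the apply_along_axis calls.
same_cols = np.array_equal(col_sums, np.apply_along_axis(np.sum, 0, arr))
same_rows = np.array_equal(row_sums, np.apply_along_axis(np.sum, 1, arr))
print(same_cols, same_rows)   # True True

# apply_over_axes takes the array second and a list of axes third,
# and the result keeps the same number of dimensions.
kept = np.apply_over_axes(np.sum, arr, [0])
print(kept.shape)   # (1, 3)
```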

Reshape

arr = np.array(range(16))

arr.reshape(4,4)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

arr.reshape(2,2,-1)    # if you put -1, numpy works out that dimension from the others

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])

arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

np.reshape(arr, (2,2,-1))

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])
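Two details from the cells above are worth spelling out in a short sketch: `reshape` hands back a new array object and leaves `arr` with its original shape, and the `-1` can only be filled in when the sizes divide evenly. The names `cube` and `error_raised` are illustrative.

```python
import numpy as np

arr = np.array(range(16))

cube = arr.reshape(2, 2, -1)   # 16 / (2 * 2) = 4, so -1 becomes 4
print(cube.shape)   # (2, 2, 4)
print(arr.shape)    # (16,)  the original keeps its shape

# 16 entries cannot be split into 3 equal rows, so this fails.
try:
    arr.reshape(3, -1)
    error_raised = False
except ValueError:
    error_raised = True
print(error_raised)   # True
```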

Plotting with Matplotlib

This is very similar to MATLAB's plotting functions.

import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

from math import pi

xs = np.linspace(0, 2*pi, 200)
ys = np.cos(xs)
zs = np.sin(xs)

plt.plot(xs, ys)  # first line: blue by default
plt.plot(xs, zs)  # second line: green in older Matplotlib, orange since version 2.0

[<matplotlib.lines.Line2D at 0x10ae05668>]
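Since the default colors depend on the Matplotlib version, it can be safer to label the lines than to rely on color. Here is a minimal sketch of the same plot with labels and a legend; the label text is our own choice, and in a notebook with `%matplotlib inline` the figure shows up under the cell.

```python
import matplotlib.pyplot as plt
import numpy as np

xs = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots()
ax.plot(xs, np.cos(xs), label="cos(x)")
ax.plot(xs, np.sin(xs), label="sin(x)")
ax.set_xlabel("x")
ax.legend()   # shows which line is which, whatever the default colors are
```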