NumPy

Array Creation and Inspection
NumPy Broadcasting and Universal Functions (ufuncs)
NumPy Linear Algebra & Matrix Operations
NumPy Reshaping, Indexing & Random Sampling

Array Creation and Inspection

This notebook covers creating NumPy arrays and inspecting their properties.


import numpy as np
np.random.seed(0)

Array Creation

np.array
np.zeros, np.ones, np.full, np.eye
Random arrays: np.random.rand, np.random.randn


a = np.array([1, 2, 3])
b = np.zeros((2,3))
c = np.ones((3,3))
d = np.full((2,2), 7)
e = np.eye(4)
rand = np.random.rand(2,3)
print(a, b, c, d, e, rand, sep='\n')


[1 2 3]
[[0. 0. 0.]
 [0. 0. 0.]]
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
[[7 7]
 [7 7]]
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
[[0.5488135  0.71518937 0.60276338]
 [0.54488318 0.4236548  0.64589411]]
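
The introduction mentions inspecting array properties. A minimal sketch of the most common inspection attributes follows; the array `m` is an illustrative stand-in with the same shape as `b` above.

```python
import numpy as np

m = np.zeros((2, 3))   # illustrative array, same shape as `b` above

print(m.shape)     # (2, 3): length of each axis
print(m.ndim)      # 2: number of axes
print(m.size)      # 6: total number of elements
print(m.dtype)     # float64: element type
print(m.itemsize)  # 8: bytes per element
print(m.nbytes)    # 48: size * itemsize
```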

Data Types and Conversion


print(a.dtype)  # default integer dtype is platform dependent; int64 on most 64-bit systems
a_float = a.astype(float)
print(a_float.dtype)


int64
float64
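
One thing worth checking when converting in the other direction: as far as we can tell, astype from float to integer truncates toward zero rather than rounding. The sketch below uses made-up sample values to show the difference.

```python
import numpy as np

f = np.array([1.9, -1.9, 2.5])           # illustrative values

truncated = f.astype(np.int64)           # drops the fractional part
rounded = np.rint(f).astype(np.int64)    # round to nearest (ties to even) first

print(truncated.tolist())  # [1, -1, 2]
print(rounded.tolist())    # [2, -2, 2]
print(f.dtype)             # float64: astype returned a new array, f is unchanged
```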

Advanced: Structured Arrays


dt = np.dtype([('name','U10'), ('age','i4'), ('weight','f4')])
people = np.array([('Alice', 25, 55.0), ('Bob', 30, 85.5)], dtype=dt)
print(people['name'], people['age'], people['weight'])


['Alice' 'Bob'] [25 30] [55.  85.5]
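
Fields of a structured array can also drive filtering and sorting. A short sketch, assuming the same dtype as above plus one extra hypothetical record ('Carol') so the ordering is visible:

```python
import numpy as np

dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])
# 'Carol' is a hypothetical extra record added for illustration.
people = np.array(
    [('Alice', 25, 55.0), ('Bob', 30, 85.5), ('Carol', 22, 61.0)], dtype=dt
)

heavy = people[people['weight'] > 60]   # boolean mask built from one field
by_age = np.sort(people, order='age')   # sort whole records by the 'age' field

print(heavy['name'].tolist())   # ['Bob', 'Carol']
print(by_age['name'].tolist())  # ['Carol', 'Alice', 'Bob']
```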

NumPy Broadcasting and Universal Functions (ufuncs)

Learn how NumPy handles operations between arrays of different shapes and how its universal functions (ufuncs) apply fast elementwise operations.


import numpy as np
np.random.seed(1)

# Sample arrays
a = np.arange(6).reshape(2,3)
b = np.array([1, 2, 3])
print('a:\n', a)
print('b:', b)


a:
 [[0 1 2]
 [3 4 5]]
b: [1 2 3]

Broadcasting Basics

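When shapes differ, NumPy ‘stretches’ the smaller array along missing or length-1 axes without copying data.
Rules: shapes are compared from the trailing (rightmost) dimension backwards; each pair of dimensions must be equal or one of them must be 1. Missing leading dimensions are treated as 1.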
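
The rule can be written out as a small function. This is only a sketch: `broadcast_shape` is a hypothetical helper written for illustration, not a NumPy API, and it handles two shapes at a time.

```python
import numpy as np

def broadcast_shape(s1, s2):
    """Hypothetical helper: shape produced by broadcasting s1 with s2, or None."""
    n = max(len(s1), len(s2))
    # Pad the shorter shape with leading 1s so both have n dimensions.
    p1 = (1,) * (n - len(s1)) + tuple(s1)
    p2 = (1,) * (n - len(s2)) + tuple(s2)
    out = []
    for d1, d2 in zip(p1, p2):
        if d1 == d2 or d2 == 1:
            out.append(d1)
        elif d1 == 1:
            out.append(d2)
        else:
            return None   # dimensions differ and neither is 1
    return tuple(out)

print(broadcast_shape((2, 3), (3,)))     # (2, 3)
print(broadcast_shape((2, 3, 1), (3,)))  # (2, 3, 3)
print(broadcast_shape((2, 3), (2,)))     # None

# Cross-check against what NumPy actually does.
print((np.empty((2, 3, 1)) + np.empty((3,))).shape)  # (2, 3, 3)
```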


# Broadcasting example
c = a + b  # b is broadcast over rows
print('a + b =\n', c)


a + b =
 [[1 3 5]
 [4 6 8]]

Broadcasting with Higher Dimensions

Example: adding a (2,3,1) array to a (3,) array


x = np.arange(6).reshape(2,3,1)
y = np.array([10, 20, 30])
print('x shape:', x.shape)
print('y shape:', y.shape)
z = x + y  # y is treated as shape (1, 1, 3): x stretches along its last axis, y along the first two
print('z shape:', z.shape)
print(z)


x shape: (2, 3, 1)
y shape: (3,)
z shape: (2, 3, 3)
[[[10 20 30]
  [11 21 31]
  [12 22 32]]

 [[13 23 33]
  [14 24 34]
  [15 25 35]]]
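
A common stumbling block is an array whose length matches the leading axis rather than the trailing one. The sketch below (with an illustrative `row_offsets` array) shows the resulting error and one way around it by adding a length-1 axis.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
row_offsets = np.array([100, 200])   # illustrative: one value per row, shape (2,)

try:
    a + row_offsets                  # (2, 3) vs (2,): trailing dims 3 and 2 clash
except ValueError as err:
    print('ValueError:', err)

# Turning row_offsets into a column of shape (2, 1) makes the shapes compatible.
shifted = a + row_offsets[:, np.newaxis]
print(shifted.tolist())  # [[100, 101, 102], [203, 204, 205]]
```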

Universal Functions (ufuncs)

Fast elementwise operations implemented in C.
Examples: np.sin, np.exp, np.add, np.multiply.


print('sin(a):\n', np.sin(a))
print('exp(a):\n', np.exp(a))


sin(a):
 [[ 0.          0.84147098  0.90929743]
 [ 0.14112001 -0.7568025  -0.95892427]]
exp(a):
 [[  1.           2.71828183   7.3890561 ]
 [ 20.08553692  54.59815003 148.4131591 ]]
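
Arithmetic operators on arrays are backed by ufuncs, and ufuncs accept an optional out argument. A small sketch, where the buffer name `buf` is just an illustration:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

print(isinstance(np.sin, np.ufunc))  # True

# The + operator dispatches to the np.add ufunc.
same = np.array_equal(a + 10, np.add(a, 10))
print(same)  # True

# out= writes the result into an existing array instead of allocating a new one.
buf = np.empty(a.shape, dtype=float)
np.sqrt(a, out=buf)
print(buf)
```

Reusing an output buffer mainly matters inside loops over large arrays, where repeated allocation can become noticeable.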

ufunc Methods: reduce, accumulate, outer

reduce: apply ufunc to collapse an axis
accumulate: cumulative application
outer: all-pairs operation


print('add.reduce(a):', np.add.reduce(a, axis=1))
print('add.accumulate(a):\n', np.add.accumulate(a, axis=1))
print('multiply.outer([1,2], [3,4]):\n', np.multiply.outer([1,2], [3,4]))


add.reduce(a): [ 3 12]
add.accumulate(a):
 [[ 0  1  3]
 [ 3  7 12]]
multiply.outer([1,2], [3,4]):
 [[3 4]
 [6 8]]
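
These methods exist on other binary ufuncs as well, not only on np.add and np.multiply. A brief sketch with illustrative inputs:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

col_sums = np.add.reduce(a, axis=0)              # same values as a.sum(axis=0)
running_max = np.maximum.accumulate(np.array([3, 1, 4, 1, 5]))
diffs = np.subtract.outer([10, 20], [1, 2, 3])   # every x - y pair

print(col_sums.tolist())     # [3, 5, 7]
print(running_max.tolist())  # [3, 3, 4, 4, 5]
print(diffs.tolist())        # [[9, 8, 7], [19, 18, 17]]
```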

Vectorized String Operations

The np.char module provides vectorized string operations.


s = np.array(['apple', 'banana', 'cherry'])
print('Uppercase:', np.char.upper(s))
print('Replace a->@:', np.char.replace(s, 'a', '@'))


Uppercase: ['APPLE' 'BANANA' 'CHERRY']
Replace a->@: ['@pple' 'b@n@n@' 'cherry']
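
A few more np.char functions, sketched on the same sample array. The boolean result of startswith can be used directly as a mask.

```python
import numpy as np

s = np.array(['apple', 'banana', 'cherry'])

lengths = np.char.str_len(s)            # number of characters per element
starts_b = np.char.startswith(s, 'b')   # boolean array, usable as a mask

print(lengths.tolist())                # [5, 6, 6]
print(s[starts_b].tolist())            # ['banana']
print(np.char.capitalize(s).tolist())  # ['Apple', 'Banana', 'Cherry']
```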

NumPy Linear Algebra & Matrix Operations

Perform matrix products, decompositions, and other linear algebra routines.


import numpy as np
from numpy import linalg as LA
np.random.seed(0)

# Define sample matrices
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])
print('A:')
print(A)
print('\nB:')
print(B)


A:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

B:
[[9 8 7]
 [6 5 4]
 [3 2 1]]

Matrix Multiplication

Use the @ operator (np.matmul) or np.dot; for 2-D arrays both give the same result.


C1 = A @ B
C2 = np.dot(A, B)
print('A @ B =')
print(C1)
print('\nnp.dot(A, B) =')
print(C2)


A @ B =
[[ 30  24  18]
 [ 84  69  54]
 [138 114  90]]

np.dot(A, B) =
[[ 30  24  18]
 [ 84  69  54]
 [138 114  90]]
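
Two properties that are easy to check numerically with the same A and B: the matrix product is not commutative in general, and the transpose of a product reverses the order. The vector `v` is an illustrative addition.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])

print(np.array_equal(A @ B, B @ A))          # False: order matters
print(np.array_equal((A @ B).T, B.T @ A.T))  # True: (AB)^T = B^T A^T

v = np.array([1, 0, -1])     # illustrative 1-D vector
print((A @ v).tolist())      # [-2, -2, -2]: matrix-vector product
```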

Elementwise Operations

*, np.multiply, np.add


print('A * B =')
print(A * B)
print('\nnp.multiply(A, B) =')
print(np.multiply(A, B))


A * B =
[[ 9 16 21]
 [24 25 24]
 [21 16  9]]

np.multiply(A, B) =
[[ 9 16 21]
 [24 25 24]
 [21 16  9]]

Matrix Inverse, Determinant, and Rank

LA.inv, LA.det, LA.matrix_rank


# A is singular (its rank is 2, not 3), so use a small invertible matrix for inv and det
M = np.array([[4, 7], [2, 6]])
print('M:', M)
print('Inverse of M:', LA.inv(M))
print('Determinant of M:', LA.det(M))
print('Rank of A:', LA.matrix_rank(A))


M: [[4 7]
 [2 6]]
Inverse of M: [[ 0.6 -0.7]
 [-0.2  0.4]]
Determinant of M: 10.000000000000002
Rank of A: 2
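
A quick sanity check for an inverse is to multiply it back and compare with the identity. The sketch below also tries a rank-deficient matrix `S`, which is a hypothetical example chosen so that one row is an exact multiple of the other.

```python
import numpy as np
from numpy import linalg as LA

M = np.array([[4, 7], [2, 6]])
M_inv = LA.inv(M)
# Compare with a tolerance: floating-point results are rarely exact.
print(np.allclose(M @ M_inv, np.eye(2)))  # True

S = np.array([[1.0, 2.0], [2.0, 4.0]])    # hypothetical: second row = 2 * first row
print(LA.matrix_rank(S))                  # 1

try:
    LA.inv(S)
except LA.LinAlgError as err:
    # Singular input is reported through LinAlgError when it is detected.
    print('LinAlgError:', err)
```

For matrices that are only nearly singular, inv may return very large values instead of raising, so checking the rank or condition number first is a reasonable precaution.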

Eigenvalues & Eigenvectors

LA.eig


eigvals, eigvecs = LA.eig(M)
print('Eigenvalues:', eigvals)
print('Eigenvectors:')
print(eigvecs)


Eigenvalues: [1.12701665 8.87298335]
Eigenvectors:
[[-0.92511345 -0.82071729]
 [ 0.37969079 -0.57133452]]
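
The eigenvectors are returned as the columns of the second array. A short check of the defining relation M v = λ v, along with the trace and determinant identities:

```python
import numpy as np
from numpy import linalg as LA

M = np.array([[4, 7], [2, 6]])
eigvals, eigvecs = LA.eig(M)

for i in range(len(eigvals)):
    v = eigvecs[:, i]                            # i-th eigenvector is the i-th COLUMN
    print(np.allclose(M @ v, eigvals[i] * v))    # True for each pair

print(np.isclose(eigvals.sum(), np.trace(M)))    # True: sum of eigenvalues = trace
print(np.isclose(eigvals.prod(), LA.det(M)))     # True: product of eigenvalues = determinant
```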

Solving Linear Systems

Solve Mx = b via LA.solve


b = np.array([1, 2])
x = LA.solve(M, b)
print('Solution x for Mx = b:', x)
# Verify Mx
print('M @ x:', M @ x)


Solution x for Mx = b: [-0.8  0.6]
M @ x: [1. 2.]
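
LA.solve gives the same answer as multiplying by the inverse, and it also accepts several right-hand sides at once. A sketch, where `rhs` is an illustrative matrix whose columns are separate right-hand sides:

```python
import numpy as np
from numpy import linalg as LA

M = np.array([[4, 7], [2, 6]])
b = np.array([1, 2])

x_solve = LA.solve(M, b)
x_inv = LA.inv(M) @ b            # same result, but forms the inverse explicitly
print(np.allclose(x_solve, x_inv))  # True

rhs = np.array([[1, 0], [2, 1]])    # illustrative: each column is one right-hand side
X_sol = LA.solve(M, rhs)
print(np.allclose(M @ X_sol, rhs))  # True
```

Calling solve directly is usually preferred over forming the inverse, since it does less work and tends to be more accurate.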

Pseudo-Inverse and Least Squares

LA.pinv, LA.lstsq


# Overdetermined system example
X = np.random.randn(5, 3)
y = np.random.randn(5)
# Least squares solution
coef, residuals, rank, s = LA.lstsq(X, y, rcond=None)
print('Coefficients:', coef)
print('Residuals:', residuals)
# Pseudo-inverse solution
coef_pinv = LA.pinv(X) @ y
print('Coefficients via pinv:', coef_pinv)


Coefficients: [-0.22776487  1.10638653  0.07697564]
Residuals: [0.82230318]
Coefficients via pinv: [-0.22776487  1.10638653  0.07697564]
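
Random data makes the result hard to interpret, so here is a sketch of the same call on a case with a known answer: noise-free points on the line y = 2t + 1. The data and variable names are illustrative.

```python
import numpy as np
from numpy import linalg as LA

t = np.array([0.0, 1.0, 2.0, 3.0])
y_line = 2.0 * t + 1.0                      # points exactly on y = 2t + 1

# Design matrix: one column for the slope, one column of ones for the intercept.
X_line = np.column_stack([t, np.ones_like(t)])

coef, residuals, rank, s = LA.lstsq(X_line, y_line, rcond=None)
slope, intercept = coef
print(np.allclose([slope, intercept], [2.0, 1.0]))  # True
print(rank)                                          # 2
```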

Tensor Dot and Trace

np.trace, np.tensordot


print('Trace of A:', np.trace(A))
print('Tensor dot A and B over axes (1,0):')
print(np.tensordot(A, B, axes=(1, 0)))


Trace of A: 15
Tensor dot A and B over axes (1,0):
[[ 30  24  18]
 [ 84  69  54]
 [138 114  90]]
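
The same contractions can be written with np.einsum, which some readers find easier to follow because the indices are spelled out. A sketch on the same A and B:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])

# Contract index j: equivalent to the matrix product A @ B.
print(np.array_equal(np.einsum('ij,jk->ik', A, B), A @ B))  # True

# Repeated index on one operand sums the diagonal: the trace.
print(int(np.einsum('ii', A)))            # 15

# axes=2 contracts both axes: the sum of all elementwise products.
print(int(np.tensordot(A, B, axes=2)))    # 165
```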

NumPy Reshaping, Indexing & Random Sampling

Master reshaping arrays, indexing techniques, and generating random samples for experiments.


import numpy as np
np.random.seed(2)
arr = np.arange(24)
print('Original arr shape:', arr.shape)


Original arr shape: (24,)

Reshaping Arrays

reshape(new_shape)
ravel(), flatten()
transpose(), swapaxes()


print('reshape to (4,6):')
print(arr.reshape(4,6))
print('\nreshape to (2,3,4):')
print(arr.reshape(2,3,4))


reshape to (4,6):
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]

reshape to (2,3,4):
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
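
transpose() and swapaxes() from the list above are not shown in the cell, so here is a short sketch of how they change the shape. It also uses -1 in reshape, which lets NumPy infer one dimension.

```python
import numpy as np

arr = np.arange(24)
cube = arr.reshape(2, 3, -1)          # -1 is inferred as 24 / (2 * 3) = 4

print(cube.shape)                     # (2, 3, 4)
print(cube.transpose().shape)         # (4, 3, 2): all axes reversed
print(cube.transpose(1, 0, 2).shape)  # (3, 2, 4): explicit axis order
print(cube.swapaxes(0, 2).shape)      # (4, 3, 2): exchange two axes
```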

Flatten vs Ravel

flatten(): always returns a copy
ravel(): returns a view when possible, otherwise a copy


flat = arr.flatten()
rv = arr.ravel()
flat[0] = 100
rv[1] = 200
print('After modifying flat vs ravel:')
print('Original arr[:2]:', arr[:2])


After modifying flat vs ravel:
Original arr[:2]: [  0 200]
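
Whether two arrays share data can be checked directly with np.shares_memory. The sketch below also shows a case where ravel() cannot return a view, assuming the default C order:

```python
import numpy as np

arr = np.arange(24)

print(np.shares_memory(arr, arr.flatten()))  # False: flatten copies
print(np.shares_memory(arr, arr.ravel()))    # True: ravel returns a view here

t = arr.reshape(4, 6).T                      # transposed view, not C-contiguous
print(np.shares_memory(t, t.ravel()))        # False: ravel had to copy
```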

Advanced Indexing Techniques

Integer array indexing
Boolean masks
take() and put()


# Integer indexing
idx = np.array([0, 2, 5])
print('Selected elements:', arr[idx])
# Boolean mask (arr[1] is 200 here because of the assignment through ravel() above)
mask = arr % 2 == 0
print('Even elements:', arr[mask])


Selected elements: [0 2 5]
Even elements: [  0 200   2   4   6   8  10  12  14  16  18  20  22]
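
take() and put() from the list above are not exercised in the cell. A minimal sketch on a fresh illustrative array, followed by assignment through a boolean mask:

```python
import numpy as np

vals = np.arange(10) * 10          # illustrative array: 0, 10, ..., 90
idx = [0, 2, 5]

print(np.take(vals, idx).tolist())  # [0, 20, 50], same as vals[idx]

np.put(vals, idx, [-1, -2, -3])     # writes in place at the given flat indices
print(vals.tolist())                # [-1, 10, -2, 30, 40, -3, 60, 70, 80, 90]

vals[vals < 0] = 0                  # boolean mask on the left-hand side
print(vals.tolist())                # [0, 10, 0, 30, 40, 0, 60, 70, 80, 90]
```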

Adding and Removing Dimensions

newaxis
expand_dims(), squeeze()


a = np.arange(6)
print('a shape:', a.shape)
a2 = a[np.newaxis, :]
print('a with new axis:', a2.shape)
a3 = np.expand_dims(a, axis=1)
print('expand_dims axis=1:', a3.shape)
print('squeeze back:', np.squeeze(a3).shape)


a shape: (6,)
a with new axis: (1, 6)
expand_dims axis=1: (6, 1)
squeeze back: (6,)
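
Adding axes is mostly useful in combination with broadcasting. Two sketches with illustrative arrays: building a table from a column and a row, and centering rows with keepdims=True.

```python
import numpy as np

a = np.arange(3)
col = a[:, np.newaxis]        # shape (3, 1)
row = a[np.newaxis, :]        # shape (1, 3)
table = col * 10 + row        # broadcasts to shape (3, 3)
print(table.tolist())         # [[0, 1, 2], [10, 11, 12], [20, 21, 22]]

m = np.arange(6).reshape(2, 3)
# keepdims=True keeps the reduced axis with length 1, so the shapes (2, 3) and (2, 1) broadcast.
centered = m - m.mean(axis=1, keepdims=True)
print(centered.tolist())      # [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
```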

Random Sampling

rand, randn, randint, random_sample
choice, shuffle


print('rand 2x3:', np.random.rand(2,3))
print('randn 3x3:', np.random.randn(3,3))
print('randint 0-10:', np.random.randint(0, 10, size=5))  # upper bound is exclusive: values fall in 0..9
print('random_sample 5:', np.random.random_sample(5))


rand 2x3: [[0.4359949  0.02592623 0.54966248]
 [0.43532239 0.4203678  0.33033482]]
randn 3x3: [[ 0.50288142 -1.24528809 -1.05795222]
 [-0.90900761  0.55145404  2.29220801]
 [ 0.04153939 -1.11792545  0.53905832]]
randint 0-10: [1 3 5 8 4]
random_sample 5: [0.29701836 0.28786882 0.11619332 0.18172704 0.49428977]
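
The functions above belong to NumPy's older global-state interface. Newer code often uses a Generator object from np.random.default_rng instead. A sketch of the rough equivalents follows; the seed value 2 simply mirrors the seed used above, and the drawn values will differ from those printed by the legacy functions.

```python
import numpy as np

rng = np.random.default_rng(2)       # independent generator with its own state

u = rng.random((2, 3))               # uniform on [0, 1), like rand / random_sample
n = rng.standard_normal((3, 3))      # standard normal, like randn
k = rng.integers(0, 10, size=5)      # integers 0..9, like randint (upper bound exclusive)

print(u.shape, n.shape, k.shape)     # (2, 3) (3, 3) (5,)

# The same seed reproduces the same stream.
rng_again = np.random.default_rng(2)
print(np.array_equal(rng_again.random((2, 3)), u))  # True
```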

Permutations and Choice

shuffle for in-place permuting
choice for sampling with/without replacement


arr2 = np.arange(10)
np.random.shuffle(arr2)
print('Shuffled:', arr2)
print('Choice 5 elements:', np.random.choice(arr2, size=5, replace=False))


Shuffled: [5 3 8 6 7 0 1 9 4 2]
Choice 5 elements: [0 7 6 1 8]
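
A few variations, sketched with the same legacy functions: permutation() returns a shuffled copy instead of working in place, and choice() accepts replacement and per-item probabilities. The labels 'a' and 'b' and the weights are illustrative.

```python
import numpy as np

np.random.seed(2)
arr2 = np.arange(10)

perm = np.random.permutation(arr2)    # shuffled copy; arr2 itself is unchanged
print(arr2.tolist())                  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

with_repl = np.random.choice(arr2, size=20, replace=True)   # repeats allowed
weighted = np.random.choice(['a', 'b'], size=1000, p=[0.9, 0.1])
print((weighted == 'a').mean())       # close to 0.9

try:
    np.random.choice(arr2, size=11, replace=False)  # more samples than items
except ValueError as err:
    print('ValueError:', err)
```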

Practical: Create Train/Test Split

Using random sampling to split data arrays.


X = np.arange(20).reshape(10,2)
y = np.arange(10)
# Shuffle indices
indices = np.arange(10)
np.random.shuffle(indices)
# 80/20 split
split = int(0.8 * len(indices))
train_idx = indices[:split]
test_idx = indices[split:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
print('Train X:', X_train)
print('Test X:', X_test)


Train X: [[ 2  3]
 [12 13]
 [10 11]
 [ 8  9]
 [ 4  5]
 [14 15]
 [16 17]
 [ 0  1]]
Test X: [[ 6  7]
 [18 19]]
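
The steps above can be wrapped in a reusable function. This is only a sketch: `split_arrays`, its parameters, and the use of a seeded Generator are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np

def split_arrays(X, y, test_fraction=0.2, seed=0):
    """Hypothetical helper: shuffle row indices once and split X and y consistently."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))            # shuffled copy of 0..n-1
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_train, X_test, y_train, y_test = split_arrays(X, y)

print(X_train.shape, X_test.shape)                 # (8, 2) (2, 2)
# Rows and labels stay aligned: in this data, X[i, 0] == 2 * y[i].
print(np.array_equal(X_train[:, 0], 2 * y_train))  # True
```

Indexing both arrays with the same index array is what keeps features and labels paired after shuffling.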
