So in this post we will provide a 21 recipe using NumPy and will add more as soon as possible, what you can do is to subscribe to know if there’s any addition to this article or adding new awesome one.

Introduction

NumPy is the foundation of the Python machine learning stack. NumPy allows for efficient operations on the data structures often used in machine learning: vectors, matrices, and tensors. While NumPy is not the focus of this book, it will show up frequently throughout the following chapters. This chapter covers the most common NumPy operations we are likely to run into while working on machine learning workflows.

Subscribe to our newsletter!

Creating a Vector

In mathematics, physics, and engineering, a Euclidean vector (sometimes called a geometric or spatial vector, or—as here—simply a vector) is a geometric object that has magnitude (or length) and direction. Vectors can be added to other vectors according to vector algebra. A Euclidean vector is frequently represented by a line segment with a definite direction, or graphically as an arrow, connecting an initial point A with a terminal point B. wiki

Problem

You need to create a vector.

Solution

Use NumPy to create a one-dimensional array:

# Load library import numpy as np # Create a vector as a row vector_row = np.array([1, 2, 3]) # Create a vector as a column vector_column = np.array([[1], [2], [3]])

Discussion

NumPy’s main data structure is the multidimensional array. To create a vector, we simply create a one-dimensional array. Just like vectors, these arrays can be repre‐ sented horizontally (i.e., rows) or vertically (i.e., columns).

See Also

Creating a Matrix

Problem

You need to create a matrix.

Solution

Use NumPy to create a two-dimensional array:

# Load library import numpy as np # Create a matrix matrix = np.array([[1, 2], [1, 2], [1, 2]])

Discussion

To create a matrix we can use a NumPy two-dimensional array. In our solution, the matrix contains three rows and two columns (a column of 1s and a column of 2s). NumPy actually has a dedicated matrix data structure:

matrix_object = np.mat([[1, 2], [1, 2], [1, 2]]) # Output # matrix([[1, 2], [1, 2], [1, 2]])

However, the matrix data structure is not recommended for two reasons.

First, arrays are the defacto standard data structure of NumPy.

Second, the vast majority of NumPy operations return arrays, not matrix objects.

See Also

• Matrix, Wikipedia • Matrix, Wolfram MathWorld

Creating a Sparse Matrix

Problem

Given data with very few nonzero values, you want to efficiently represent it. Solution Create a sparse matrix:

# Load libraries import numpy as np from scipy import sparse # Create a matrix matrix = np.array([[0, 0], [0, 1], [3, 0]]) # Create compressed sparse row (CSR) matrix matrix_sparse = sparse.csr_matrix(matrix)

Discussion

A frequent situation in machine learning is having a huge amount of data; however, most of the elements in the data are zeros. For example, imagine a matrix where the columns are every movie on Netflix, the rows are every Netflix user, and the values are how many times a user has watched that particular movie. This matrix would have tens of thousands of columns and millions of rows! However, since most users do not watch most movies, the vast majority of elements would be zero. Sparse matrices only store nonzero elements and assume all other values will be zero, leading to significant computational savings. In our solution, we created a NumPy array with two nonzero values, then converted it into a sparse matrix. If we view the sparse matrix we can see that only the nonzero values are stored:

# View sparse matrix print(matrix_sparse) # Output # (1, 1) 1 # (2, 0) 3

There are a number of types of sparse matrices. However, in compressed sparse row (CSR) matrices, (1, 1) and (2, 0) represent the (zero-indexed) indices of the nonzero values 1 and 3, respectively. For example, the element 1 is in the second row and second column. We can see the advantage of sparse matrices if we create a much larger matrix with many more zero elements and then compare this larger matrix with our original sparse matrix:

# Create larger matrix matrix_large = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0, 0, 0], [3, 0, 0, 0, 0, 0, 0, 0, 0, 0]]) # Create compressed sparse row (CSR) matrix matrix_large_sparse = sparse.csr_matrix(matrix_large) # View original sparse matrix print(matrix_sparse) # Output # (1, 1) 1 # (2, 0) 3 # View larger sparse matrix print(matrix_large_sparse) (1, 1) 1 (2, 0) 3 # Output # (1, 1) 1 # (2, 0) 3

As we can see, despite the fact that we added many more zero elements in the larger matrix, its sparse representation is exactly the same as our original sparse matrix. That is, the addition of zero elements did not change the size of the sparse matrix. As mentioned, there are many different types of sparse matrices, such as compressed sparse column, list of lists, and dictionary of keys. While an explanation of the different types and their implications is outside the scope of this book, it is worth noting that while there is no “best” sparse matrix type, there are meaningful differences between them and we should be conscious about why we are choosing one type over another.

See Also

• Sparse matrices, SciPy documentation • 101 Ways to Store a Sparse Matrix

Selecting Elements

Problem

You need to select one or more elements in a vector or matrix.

Solution

NumPy’s arrays make that easy:

# Load library import numpy as np # Create row vector vector = np.array([1, 2, 3, 4, 5, 6]) # Create matrix matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # Select third element of vector vector[2] # Output # 3 # Select second row, second column matrix[1,1] # Output # 5

Discussion

Like most things in Python, NumPy arrays are zero-indexed, meaning that the index of the first element is 0, not 1. With that caveat, NumPy offers a wide variety of meth‐ ods for selecting (i.e., indexing and slicing) elements or groups of elements in arrays:

# Select all elements of a vector vector[:] # Output # array([1, 2, 3, 4, 5, 6]) # Select everything up to and including the third element vector[:3] # Output # array([1, 2, 3]) # Select everything after the third element vector[3:] # Output # array([4, 5, 6]) # Select the last element vector[-1] # Output # 6 # Select the first two rows and all columns of a matrix matrix[:2,:] # Output # array([[1, 2, 3], [4, 5, 6]]) # Select all rows and the second column matrix[:,1:2] # Output # array([[2], [5], [8]])

Describing a Matrix

Problem

You want to describe the shape, size, and dimensions of the matrix.

Solution

Use shape, size, and ndim:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]) # View number of rows and columns matrix.shape # Output # (3, 4) # View number of elements (rows * columns) matrix.size # Output # 12 # View number of dimensions matrix.ndim # Output # 2

Discussion

This might seem basic (and it is); however, time and again it will be valuable to check the shape and size of an array both for further calculations and simply as a gut check after some operation.

Applying Operations to Elements

Problem

You want to apply some function to multiple elements in an array.

Solution

Use NumPy’s vectorize:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # Create function that adds 100 to something add_100 = lambda i: i + 100 # Create vectorized function vectorized_add_100 = np.vectorize(add_100) # Apply function to all elements in matrix vectorized_add_100(matrix) array([[101, 102, 103], [104, 105, 106], [107, 108, 109]])

Discussion

NumPy’s vectorize class converts a function into a function that can apply to all ele‐ ments in an array or slice of an array. It’s worth noting that vectorize is essentially a for loop over the elements and does not increase performance. Furthermore, NumPy arrays allow us to perform operations between arrays even if their dimensions are not the same (a process called broadcasting). For example, we can create a much simpler version of our solution using broadcasting:

# Add 100 to all elements matrix + 100 array([[101, 102, 103], [104, 105, 106], [107, 108, 109]])

Finding the Maximum and Minimum Values

Problem

You need to find the maximum or minimum value in an array.

Solution

Use NumPy’s max and min:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # Return maximum element np.max(matrix) # Output # 9 # Return minimum element np.min(matrix) # Output # 1

Discussion

Often we want to know the maximum and minimum value in an array or subset of an array. This can be accomplished with the max and min methods. Using the axis parameter we can also apply the operation along a certain axis:

# Find maximum element in each column np.max(matrix, axis=0) # Output # array([7, 8, 9]) # Find maximum element in each row np.max(matrix, axis=1) # Output # array([3, 6, 9])

Calculating the Average, Variance, and Standard Deviation

Problem

You want to calculate some descriptive statistics about an array.

Solution

Use NumPy’s mean, var, and std:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # Return mean np.mean(matrix) # Output 5.0 # Return variance np.var(matrix) # Output 6.666666666666667 # Return standard deviation np.std(matrix) # Output 2.5819888974716112

Discussion

Just like with max and min, we can easily get descriptive statistics about the whole matrix or do calculations along a single axis:

# Find the mean value in each column np.mean(matrix, axis=0) # Output # array([ 4., 5., 6.])

Reshaping Arrays

Problem

You want to change the shape (number of rows and columns) of an array without changing the element values.

Solution

Use NumPy’s reshape:

# Load library import numpy as np # Create 4x3 matrix matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]) # Reshape matrix into 2x6 matrix matrix.reshape(2, 6) # Output array([[ 1, 2, 3, 4, 5, 6], [ 7, 8, 9, 10, 11, 12]])

Discussion

reshape allows us to restructure an array so that we maintain the same data but it is organised as a different number of rows and columns. The only requirement is that the shape of the original and new matrix contain the same number of elements (i.e., the same size). We can see the size of a matrix using size: matrix.size 12 One useful argument in reshape is -1, which effectively means “as many as needed,” so reshape(-1, 1) means one row and as many columns as needed:

matrix.reshape(1, -1) # Output # array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]])

Finally, if we provide one integer, reshape will return a 1D array of that length:

matrix.reshape(12) # Output # array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

Transposing a Vector or Matrix

Problem

You need to transpose a vector or matrix.

Solution

Use the T method:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # Transpose matrix matrix.T # Output # array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])

Discussion

Transposing is a common operation in linear algebra where the column and row indices of each element are swapped. One nuanced point that is typically overlooked outside of a linear algebra class is that, technically, a vector cannot be transposed because it is just a collection of values:

# Transpose vector np.array([1, 2, 3, 4, 5, 6]).T # Output # array([1, 2, 3, 4, 5, 6])

However, it is common to refer to transposing a vector as converting a row vector to a column vector (notice the second pair of brackets) or vice versa:

# Tranpose row vector np.array([[1, 2, 3, 4, 5, 6]]).T # Output # array([[1], [2], [3], [4], [5], [6]])

Flattening a Matrix

Problem

You need to transform a matrix into a one-dimensional array.

Solution

Use flatten:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # Flatten matrix matrix.flatten() # Output # array([1, 2, 3, 4, 5, 6, 7, 8, 9])

Discussion

flatten is a simple method to transform a matrix into a one-dimensional array. Alternatively, we can use reshape to create a row vector:

matrix.reshape(1, -1) # Output # array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])

Finding the Rank of a Matrix

Problem

You need to know the rank of a matrix.

Solution

Use NumPy’s linear algebra method matrix_rank:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 1, 1], [1, 1, 10], [1, 1, 15]]) # Return matrix rank np.linalg.matrix_rank(matrix) # Output # 2

Discussion

The rank of a matrix is the dimensions of the vector space spanned by its columns or rows. Finding the rank of a matrix is easy in NumPy thanks to matrix_rank.

See Also

The Rank of a Matrix, CliffsNotes

Calculating the Determinant

Problem

You need to know the determinant of a matrix.

Solution

Use NumPy’s linear algebra method det:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 2, 3], [2, 4, 6], [3, 8, 9]]) # Return determinant of matrix np.linalg.det(matrix) # Output # 0.0

Discussion

It can sometimes be useful to calculate the determinant of a matrix. NumPy makes this easy with det.

See Also

• The determinant | Essence of linear algebra, chapter 5, 3Blue1Brown

• Determinant, Wolfram MathWorld

Getting the Diagonal of a Matrix

Problem

You need to get the diagonal elements of a matrix.

Solution

Use diagonal:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 2, 3], [2, 4, 6], [3, 8, 9]]) # Return diagonal elements matrix.diagonal() # Output # array([1, 4, 9])

Discussion

NumPy makes getting the diagonal elements of a matrix easy with diagonal. It is also possible to get a diagonal off from the main diagonal by using the offset parameter:

# Return diagonal one above the main diagonal matrix.diagonal(offset=1) # Output # array([2, 6]) # Return diagonal one below the main diagonal matrix.diagonal(offset=-1) # Output # array([2, 8])

Calculating the Trace of a Matrix

Problem

You need to calculate the trace of a matrix.

Solution

Use trace:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 2, 3], [2, 4, 6], [3, 8, 9]]) # Return trace matrix.trace() # Output # 14

Discussion

The trace of a matrix is the sum of the diagonal elements and is often used under the hood in machine learning methods. Given a NumPy multidimensional array, we can calculate the trace using trace. We can also return the diagonal of a matrix and calculate its sum:

# Return diagonal and sum elements sum(matrix.diagonal()) 14

See Also

• The Trace of a Square Matrix

Finding Eigenvalues and Eigenvectors

Problem

You need to find the eigenvalues and eigenvectors of a square matrix.

Solution

Use NumPy’s linalg.eig:

# Load library import numpy as np # Create matrix matrix = np.array([[1, -1, 3], [1, 1, 6], [3, 8, 9]]) # Calculate eigenvalues and eigenvectors eigenvalues, eigenvectors = np.linalg.eig(matrix) # View eigenvalues eigenvalues # Output # array([ 13.55075847, 0.74003145, -3.29078992]) # View eigenvectors eigenvectors # Output # array([[-0.17622017, -0.96677403, -0.53373322], [-0.435951 , 0.2053623 , -0.64324848], [-0.88254925, 0.15223105, 0.54896288]])

Discussion

Eigenvectors are widely used in machine learning libraries. Intuitively, given a linear transformation represented by a matrix, A, eigenvectors are vectors that, when that transformation is applied, change only in scale (not direction). More formally: Av = λv where A is a square matrix, λ contains the eigenvalues and v contains the eigenvectors. In NumPy’s linear algebra toolset, eig lets us calculate the eigenvalues, and eigenvectors of any square matrix.

See Also

• Eigenvectors and Eigenvalues Explained Visually, Setosa.io

• Eigenvectors and eigenvalues | Essence of linear algebra, Chapter 10, 3Blue1Brown

Calculating Dot Products

Problem

You need to calculate the dot product of two vectors.

Solution

Use NumPy’s dot:

# Load library import numpy as np # Create two vectors vector_a = np.array([1,2,3]) vector_b = np.array([4,5,6]) # Calculate dot product np.dot(vector_a, vector_b) # Output # 32

Discussion

The dot product of two vectors, a and b, is defined as: ∑ i = 1 n ai bi where ai is the ith element of vector a. We can use NumPy’s dot class to calculate the dot product. Alternatively, in Python 3.5+ we can use the new @ operator:

# Calculate dot product vector_a @ vector_b # Output # 32

See Also

• Vector dot product and vector length, Khan Academy

• Dot Product, Paul’s Online Math Notes

Adding and Subtracting Matrices

Problem

You want to add or subtract two matrices.

Solution

Use NumPy’s add and subtract:

# Load library import numpy as np # Create matrix matrix_a = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 2]]) # Create matrix matrix_b = np.array([[1, 3, 1], [1, 3, 1], [1, 3, 8]]) # Add two matrices np.add(matrix_a, matrix_b) # Output # array([[ 2, 4, 2], [ 2, 4, 2], [ 2, 4, 10]]) # Subtract two matrices np.subtract(matrix_a, matrix_b) # Output # array([[ 0, -2, 0], [ 0, -2, 0], [ 0, -2, -6]])

Discussion

Alternatively, we can simply use the + and – operators:

# Add two matrices matrix_a + matrix_b # Output # array([[ 2, 4, 2], [ 2, 4, 2], [ 2, 4, 10]])

Multiplying Matrices

Problem

You want to multiply two matrices.

Solution

Use NumPy’s dot:

# Load library import numpy as np # Create matrix matrix_a = np.array([[1, 1], [1, 2]]) # Create matrix matrix_b = np.array([[1, 3], [1, 2]]) # Multiply two matrices np.dot(matrix_a, matrix_b) # Output # array([[2, 5], [3, 7]])

Discussion

Alternatively, in Python 3.5+ we can use the @ operator:

# Multiply two matrices matrix_a @ matrix_b array([[2, 5], [3, 7]])

If we want to do element-wise multiplication, we can use the * operator:

# Multiply two matrices element-wise matrix_a * matrix_b # Output # array([[1, 3], [1, 4]])

See Also

Array vs. Matrix Operations, MathWorks

Inverting a Matrix

Problem

You want to calculate the inverse of a square matrix.

Solution

Use NumPy’s linear algebra inv method:

# Load library import numpy as np # Create matrix matrix = np.array([[1, 4], [2, 5]]) # Calculate inverse of matrix np.linalg.inv(matrix) # Output # array([[-1.66666667, 1.33333333], [ 0.66666667, -0.33333333]])

Discussion

The inverse of a square matrix, A, is a second matrix A–1, such that: AA−1 = I where I is the identity matrix. In NumPy we can use linalg.inv to calculate A–1 if it exists. To see this in action, we can multiply a matrix by its inverse and the result is the identity matrix:

# Multiply matrix and its inverse matrix @ np.linalg.inv(matrix) # Output # array([[ 1., 0.], [ 0., 1.]])

See Also

Inverse of a Matrix

Generating Random Values

Problem

You want to generate pseudo random values.

Solution

Use NumPy’s random:

# Load library import numpy as np # Set seed np.random.seed(0) # Generate three random floats between 0.0 and 1.0 np.random.random(3) # Output # array([ 0.5488135 , 0.71518937, 0.60276338])

Discussion

NumPy offers a wide variety of means to generate random numbers, many more than can be covered here. In our solution we generated floats; however, it is also common to generate integers:

# Generate three random integers between 1 and 10 np.random.randint(0, 11, 3) # Output # array([3, 7, 9])

Alternatively, we can generate numbers by drawing them from a distribution:

# Draw three numbers from a normal distribution with mean 0.0 # and standard deviation of 1.0 np.random.normal(0.0, 1.0, 3) # Output # array([-1.42232584, 1.52006949, -0.29139398]) # Draw three numbers from a logistic distribution # with mean 0.0 and scale of 1.0 np.random.logistic(0.0, 1.0, 3) # Output # array([-0.98118713, -0.08939902, 1.46416405]) # Draw three numbers greater than or equal to 1.0 and less than 2.0 np.random.uniform(1.0, 2.0, 3) # Output # array([ 1.47997717, 1.3927848 , 1.83607876])

Finally, it can sometimes be useful to return the same random numbers multiple times to get predictable, repeatable results. We can do this by setting the “seed” (an integer) of the pseudo random generator. Random processes with the same seed will always produce the same output. We will use seeds throughout this book so that the code you see in the book and the code you run on your computer produces the same results.

Summary

Subscribe to our newsletter!

In the end i hope you enjoyed this post, if you enjoyed please support us by joining the mailing list, giving us feedback or a small donation. Stay tuned for the next post.

Do not forget that this post is part of Data Science Study Path and Machine Learning Cookbook.