How does Pokemon relate to Logistic Regression?

In this post you will learn how to use Logistic Regression to successfully predict whether a Pokemon is a fire type.

What is Classification?

Classification is the act of determining what category an object belongs to. Let’s say you have a bouncy, circular object, and you want to know whether or not it is a rock. By observing the features of this object, you can quickly decide. We can assume rocks are not bouncy; unfortunately for you, the object is bouncy, so it is not a rock. The human brain is complex enough to do tasks like this almost instantly, which is why this example seems trivial. In short, classification is determining what category an object belongs to.

How do computers classify?

For now, let’s talk about computers classifying objects in the simplest way possible. Computers operate in binary, meaning every decision a computer makes comes down to choosing between 1 and 0. Let’s say the value of a rock is 1, and anything that is not a rock is 0. If we give a computer data about an object, it can determine whether the object is a rock by checking whether its value is 1.

What is Logistic Regression?

Logistic Regression is a supervised binary classification technique. What does that mean? It means you can give your Logistic Regression model labeled data, and the model decides whether or not each data point belongs to a specific group.

The Math of Logistic Regression

Recall your rock object. Let’s say the features of this object are bounciness level, volume, and buoyancy. We can represent those features mathematically as a feature matrix.

   bounciness level   volume   buoyancy
1              0.45       56         31
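In Python, a feature vector like this is commonly stored as a NumPy array. Here is a minimal sketch using the illustrative values from the feature matrix above:

```python
import numpy as np

# One object's features: bounciness level, volume, buoyancy
features = np.array([0.45, 56, 31])

print(features.shape)  # (3,) -> one row with three features
```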

If you do not understand feature matrices you can click here. Each column within this feature matrix is an independent x value. Great, so how do we calculate a classification? We take a weighted sum of the features, t = θ0 + θ1x1 + θ2x2 + ... + θnxn (written compactly as θ^T x), and pass it through the sigmoid function.

Once again, if you do not understand this notation for the thetas and x’s, click here.

Sigmoid Function Explained

Let’s examine the sigmoid function:

f(t) = 1 / (1 + e^(-t))

The sigmoid function is perfect because it will always output a value between 0 and 1:

0 <= f(t) <= 1

This is perfect for binary classification because, just like computers, we can determine an object’s classification by whether it is a 0 or a 1. However, we seldom get an exact 0 or 1; the majority of the time our output is a decimal. So we usually set a cutoff point, typically at 0.5: if the output is greater than or equal to 0.5, you interpret the value as 1, and if it is below 0.5, you interpret it as 0. Back to our rock example. If our circular object’s value is 0.6, it is classified as a rock; however, if it is 0.3, we classify it as not being a rock.
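The cutoff rule can be sketched in a few lines of Python (the 0.5 threshold and the example outputs 0.6 and 0.3 come from the text above):

```python
def classify(probability, threshold=0.5):
    # Interpret a sigmoid output as a 0/1 label using the cutoff point.
    return 1 if probability >= threshold else 0

print(classify(0.6))  # 1 -> classified as a rock
print(classify(0.3))  # 0 -> classified as not a rock
```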

Sigmoid Explained Graphically

When you graph the sigmoid function, it appears as an S-shaped curve that flattens out toward 0 on the left and toward 1 on the right.

The graph shows how the function’s output can never go below 0 or above 1, and how 0.5 acts as the cutoff point between the two classifications.

How do you know if your model is accurate?

Great question! Later in this post, we will discuss calculating the accuracy of a model. For now, let’s focus on your loss. What is the loss? The loss is the difference between your model’s prediction and the object’s actual value. For example, if the object is a rock, then its value is 1. If your model predicted 0.5, then while you are still in the ballpark of a correct estimate, you did not predict 1, so your loss on that prediction was 0.5. The function we can use to calculate this loss is

L(f(t), y) = -log(f(t)) if y = 1
L(f(t), y) = -log(1 - f(t)) if y = 0

However, this function can be written as a single expression:

L(f(t), y) = -[y log(f(t)) + (1 - y) log(1 - f(t))]

Notice that depending on our y value it will cancel out one of the addends.
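A quick NumPy check makes this cancellation concrete: when y = 1 the second addend is multiplied by zero, and when y = 0 the first one is. The prediction value 0.9 here is just a hypothetical sigmoid output:

```python
import numpy as np

def single_loss(prediction, y):
    # Cross-entropy loss for a single example; one addend always cancels.
    return -(y * np.log(prediction) + (1 - y) * np.log(1 - prediction))

# For y = 1 only -log(prediction) survives; for y = 0 only -log(1 - prediction).
print(single_loss(0.9, 1))  # ~0.105 -> small loss: confident and correct
print(single_loss(0.9, 0))  # ~2.303 -> large loss: confident and wrong
```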

The Almighty Thetas!

In your Logistic Regression model, your thetas are essential. Our end goal is to find the perfect theta values so that, given any x, we can accurately predict its classification. So you essentially want to treat your thetas with love and care. How does one love their thetas? Training! Training is machine learning jargon for putting your thetas through a series of iterations in order to find their optimal values. This can be done using Gradient Descent.

What is Gradient Descent?

Gradient descent is an optimization technique used to find the minimum value of a function. You can imagine Gradient Descent as a ball rolling down a bowl; the goal of the ball is to reach the bottom of the bowl. If you want a more in-depth explanation, click here. In our case, this will result in us finding the optimal thetas. In our quest to find the perfect thetas we will use the update rule

θj := θj - (α/m) Σi (f(ti) - yi) xij

where α is the learning rate, m is the number of training examples, and the sum runs over all training examples i.
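Before applying this to thetas, here is a minimal sketch of the ball-in-a-bowl idea: gradient descent on f(x) = x², whose minimum sits at x = 0. The starting point and learning rate are arbitrary choices for illustration:

```python
def descend(x, learning_rate=0.1, iterations=100):
    # The gradient of f(x) = x^2 is 2x; step downhill each iteration.
    for _ in range(iterations):
        x = x - learning_rate * (2 * x)
    return x

print(descend(5.0))  # very close to 0, the bottom of the "bowl"
```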

Let’s talk about Pokemon

Pokemon come in several types, with some Pokemon having multiple types. For the sake of this project, let’s assume every Pokemon has one type. You can think of a Pokemon’s type as its classification. So you will have a water, fire, grass, fighting, bug, dragon, rock, etc. type Pokemon. A Pokemon’s type will affect its stats, meaning that given a Pokemon’s stats (aka features) we can use Logistic Regression to predict the Pokemon’s type.

Data Set

The dataset we are using is pokemon_alopez247.csv. You can use it by clicking here. If you have the Kaggle API installed on your machine you can download the dataset in your terminal with this command:

kaggle datasets download -d alopez247/pokemon

Let’s Start Coding!

Libraries you will be using are:

import numpy as np

import pandas as pd

from sklearn.model_selection import train_test_split

Now you can use Pandas to setup your dataframe:

data = pd.read_csv('data/pokemon_alopez247.csv')

data = data.drop(['Type_2'], axis=1)

data = data.dropna()  # dropna() returns a new DataFrame, so reassign it



Great, next you will need to clean up your data. Remember that we are predicting values between 0 and 1, so in our Type_1 column, every Pokemon that is a fire type will receive a value of 1 and all other types will receive a 0.

def updateTypeColumn(dataframe, columnName, columnValue):
    for index, row in dataframe.iterrows():
        if row.Type_1 == columnValue:
            dataframe.loc[index, columnName] = 1
        else:
            dataframe.loc[index, columnName] = 0

updateTypeColumn(data, 'Type_1', 'Fire')

data['Type_1'] = data['Type_1'].apply(int)  # Converts column to int

By running:

print(data.corr())

we find that there is a 14% correlation between Special Attack and a Pokemon’s type, and a 17% correlation between the probability of a Pokemon being male and a Pokemon’s type.

Great! We are going to use train_test_split from sklearn.model_selection. This function will

“Split arrays or matrices into random train and test subsets”

So we are going to use 30% of our data as training data and 70% as testing data.

X = [[row[1]['Sp_Atk'], row[1]['Pr_Male']] for row in data.iterrows()]
y = [row[1]['Type_1'] for row in data.iterrows()]

training_features, testing_features, training_output, testing_output = train_test_split(
    X,
    y,
    test_size=0.7,
    train_size=0.3,
    random_state=42)

Now we can initialize our thetas with random values:

theta = np.random.uniform(size=len(training_features[0]))

You can finally implement the sigmoid function!

def sigmoid(z):
    return 1 / (1 + np.exp(-z))  # np.exp(-z) = e^-z

Now you can write your cost function:

def costFunction(x, y, m, theta):
    loss = 0
    for i in range(m):
        z = np.dot(np.transpose(theta), x[i])
        loss += y[i] * np.log(sigmoid(z)) + (1 - y[i]) * np.log(1 - sigmoid(z))
    return -(1/m) * loss

Let’s implement your optimization function. Recall we are using Gradient Descent because it will allow us to find the optimal theta values.

def gradientDescent(x, y, m, theta, alpha, iterations=1500):
    for iteration in range(iterations):
        for j in range(len(theta)):
            gradient = 0
            for i in range(m):
                z = np.dot(np.transpose(theta), x[i])
                gradient += (sigmoid(z) - y[i]) * x[i][j]
            theta[j] = theta[j] - ((alpha/m) * gradient)
        print('Current Error is:', costFunction(x, y, m, theta))
    return theta

Congrats, you can now run your model!

print('Final thetas:',
      gradientDescent(training_features,
                      training_output,
                      len(training_features),  # m is the number of training examples
                      theta,
                      0.001))

How do you test this model?

For classification problems we can measure accuracy by:

Accuracy = (correct predictions)/(total predictions)

Error = 1 - Accuracy
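As a quick sanity check with made-up counts: if a model gets 85 of 100 predictions right, these formulas give an accuracy of 0.85 and an error of 0.15:

```python
correct_predictions = 85   # hypothetical counts for illustration
total_predictions = 100

accuracy = correct_predictions / total_predictions
error = 1 - accuracy

print(accuracy)  # 0.85
print(error)     # ~0.15
```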

Now you can code this:

def test(x, y, m, theta):
    correct = 0
    for i in range(m):
        z = np.dot(np.transpose(theta), x[i])
        predicted_value = sigmoid(z)
        if predicted_value >= 0.5 and y[i] == 1:
            correct += 1
        elif predicted_value < 0.5 and y[i] == 0:
            correct += 1
    return correct/m, (1 - (correct/m))

Check your results:

accuracy_rate, error_rate = test(testing_features,
                                 testing_output,
                                 len(testing_output),
                                 theta)

print('Accuracy: {accuracy}\nError: {error}'.format(accuracy=accuracy_rate,
                                                    error=error_rate))

Your results are:

Accuracy: 0.8514851485148515

Error: 0.14851485148514854

Wow! You were able to predict whether a Pokemon is a fire type with 85% accuracy.


Final Words

Thanks for reading! If you have any questions or thoughts please comment below.