

Building a Logistic Regression Classification Model from Scratch in Python

2025-01-19 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

Suppose applicants provide you with their exam scores, and you must classify them accordingly: label an applicant 1 if they can be admitted to the university, and 0 if they cannot. Can this problem be solved with linear regression? Let's find out.

Note: this article assumes you already understand linear regression!

Contents

What is logistic regression?

Dataset visualization

Hypothesis and cost function

Train the model from scratch

Model evaluation

Scikit-learn implementation

What is logistic regression?

Recall that linear regression is used to predict the value of a continuous dependent variable, whereas logistic regression is typically used for classification. Unlike linear regression, the dependent variable can take only a limited number of values; that is, it is categorical. When there are only two possible outcomes, we call it binary logistic regression.

Let's look at how logistic regression can be used for classification tasks.

In linear regression, the output is the weighted sum of the inputs. Logistic regression is a generalized linear model in the sense that we do not output the weighted sum of the inputs directly; instead, we pass it through a function that maps any real value into the interval between 0 and 1.

If we took the weighted sum of the inputs as the output, as in linear regression, the value could be greater than 1, but we want a value between 0 and 1. This is why linear regression cannot be used for classification tasks.

As the following figure shows, the output of the linear model is passed through an activation function that maps any real value into the interval between 0 and 1.

The activation function used is called the sigmoid function, whose curve is shown in the following figure.

We can see that the value of the sigmoid function always lies between 0 and 1, and equals exactly 0.5 at z = 0. We can therefore use 0.5 as the probability threshold for determining the class: if the predicted probability is greater than 0.5, we classify the instance as Class 1 (y = 1); otherwise as Class 0 (y = 0).
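The thresholding rule above can be sketched in a few lines of Python (the names here are illustrative, not from the article's code):

```python
import numpy as np

def sigmoid(z):
    # Squash any real value into the open interval (0, 1)
    return 1 / (1 + np.exp(-z))

# z = 0 maps exactly to 0.5; positive z gives class 1, negative z class 0
z = np.array([-2.0, 0.0, 3.0])
predicted_class = (sigmoid(z) >= 0.5).astype(int)
print(predicted_class)  # [0 1 1]
```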

Before we build the model, let's look at the assumptions made by logistic regression:

The dependent variable must be categorical

The independent variables (features) must be independent of one another (to avoid multicollinearity)

Data set

The data used in this article come from Andrew Ng's machine learning course on Coursera (https://www.coursera.org/learn/machine-learning), where they can be downloaded. The dataset contains two exam scores for each of 100 applicants, and the target value is binary: 1 means the applicant was admitted to the university, and 0 means they were not. The goal is to build a classifier that can predict whether an applicant will be admitted.

Let's use the read_csv function to load the data into a pandas DataFrame. We also split the data into admitted and not-admitted groups in order to visualize it.
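A minimal sketch of this loading-and-splitting step. A small inline sample stands in for the downloaded Coursera file here so the snippet runs on its own; in practice you would pass the file's path to read_csv (the column names are assumptions, since the raw file has no header row):

```python
import io
import pandas as pd

# Stand-in for pd.read_csv("<path-to-downloaded-file>", header=None, ...)
sample = io.StringIO(
    "34.6,78.0,0\n"
    "60.1,86.3,1\n"
    "79.0,75.3,1\n"
    "45.0,56.3,0\n"
)
data = pd.read_csv(sample, header=None, names=["exam1", "exam2", "admitted"])

# Split into the two groups used for the scatter plot
admitted = data[data["admitted"] == 1]
not_admitted = data[data["admitted"] == 0]
print(len(admitted), len(not_admitted))
```

Plotting each group's two exam scores with a different marker (for example with matplotlib's scatter) reproduces the visualization described above.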

Now that we have a clear understanding of the problem and the data, let's continue building our model.

Hypothesis and cost function

So far, we have seen how logistic regression can classify instances into different classes. In this section, we will define the hypothesis and the cost function.

The linear regression model can be expressed by the equation z = θᵀx.

Then, we apply the sigmoid function to the output of linear regression

The sigmoid function is expressed as g(z) = 1 / (1 + e^(−z)).

The hypothesis of logistic regression is then h(x) = g(θᵀx) = 1 / (1 + e^(−θᵀx)).

If the weighted sum of the inputs is greater than zero, the predicted class is 1, and vice versa. Therefore, the decision boundary separating the two classes can be found by setting the weighted sum of the inputs to 0.

Cost function

Like linear regression, we will define a cost function for the model with the goal of minimizing cost.

The cost of a single training example is given by: cost(h(x), y) = −log(h(x)) if y = 1, and −log(1 − h(x)) if y = 0.

Cost function intuition

If the actual class is 1 and the model predicts 0, we should penalize it, and vice versa. As the following figure shows, for the plot of −log(h(x)), the cost is 0 when h(x) approaches 1 and tends to infinity as h(x) approaches 0 (that is, we penalize the model heavily). Similarly, for the plot of −log(1 − h(x)), the cost is 0 when the actual value is 0 and the model predicts 0, and tends to infinity as h(x) approaches 1.

The two cases can be combined into a single equation: cost(h(x), y) = −[y log(h(x)) + (1 − y) log(1 − h(x))].

The cost over all training samples, denoted J(θ), is the average of the individual costs: J(θ) = −(1/m) Σ [y⁽ⁱ⁾ log(h(x⁽ⁱ⁾)) + (1 − y⁽ⁱ⁾) log(1 − h(x⁽ⁱ⁾))], where m is the number of training samples.

We will use gradient descent to minimize the cost function. The gradient of the cost with respect to any parameter θⱼ is given by: ∂J/∂θⱼ = (1/m) Σ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾) xⱼ⁽ⁱ⁾.

This equation has the same form as the one we obtained for linear regression; only the definition of h(x) differs between the two cases.

Training the model

Now we have everything we need to build the model. Let's implement it in the code.

Let's first prepare the data for our model.
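One way to sketch this preparation step, using stand-in arrays in place of the DataFrame columns loaded earlier: the key move is prepending a column of ones so that the first parameter acts as the intercept term.

```python
import numpy as np

# Stand-in for the two exam-score columns and the admitted label
# (in the article these come from the pandas DataFrame loaded earlier)
scores = np.array([[34.6, 78.0],
                   [60.1, 86.3],
                   [79.0, 75.3]])
labels = np.array([0, 1, 1])

# Prepend a column of ones so theta[0] acts as the intercept term
X = np.c_[np.ones((scores.shape[0], 1)), scores]
y = labels[:, np.newaxis]

# One parameter per column of X, initialised to zero
theta = np.zeros((X.shape[1], 1))
print(X.shape, y.shape, theta.shape)
```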

We will define some functions that will be used to calculate the cost.
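These helpers might look as follows (the function names are illustrative): the sigmoid squashes values into (0, 1), net_input computes the weighted sum θᵀx, and probability composes the two to give h(x).

```python
import numpy as np

def sigmoid(z):
    # Squash any real value into (0, 1)
    return 1 / (1 + np.exp(-z))

def net_input(theta, X):
    # Weighted sum of inputs, theta^T x, for every sample at once
    return X @ theta

def probability(theta, X):
    # h(x): predicted probability that each sample belongs to class 1
    return sigmoid(net_input(theta, X))
```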

Next, we define the cost and gradient function.
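A sketch of the two functions, translating the equations above directly into vectorized NumPy (again, the names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function(theta, X, y):
    # J(theta) = -(1/m) * sum(y*log(h) + (1-y)*log(1-h))
    m = X.shape[0]
    h = sigmoid(X @ theta)
    return -(1 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    # dJ/dtheta_j = (1/m) * sum((h - y) * x_j), vectorized as X^T (h - y)
    m = X.shape[0]
    h = sigmoid(X @ theta)
    return (1 / m) * (X.T @ (h - y))
```

With θ initialised to zeros, every h(x) is 0.5, so the cost starts at log 2 ≈ 0.693 regardless of the labels.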

We also define a fit function that finds the model parameters minimizing the cost function. Instead of writing the gradient descent loop ourselves, we will use the fmin_tnc function from the scipy library, which can compute the minimum of an arbitrary function. It takes the following arguments:

func: the function to minimize

x0: the initial values of the parameters we want to find

fprime: the gradient of the function defined by 'func'

args: extra arguments to be passed to the function
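The fitting step might be sketched as follows; the cost and gradient functions are repeated so the snippet is self-contained, and a tiny synthetic dataset stands in for the exam-score data. (fmin_tnc is scipy's legacy interface; scipy.optimize.minimize with method="TNC" is the modern equivalent.)

```python
import numpy as np
from scipy.optimize import fmin_tnc

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function(theta, X, y):
    m = X.shape[0]
    h = sigmoid(X @ theta)
    return -(1 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    m = X.shape[0]
    return (1 / m) * (X.T @ (sigmoid(X @ theta) - y))

def fit(X, y, theta):
    # fmin_tnc minimizes cost_function starting from theta, using
    # `gradient` as fprime; it returns (best_theta, n_evals, return_code)
    result = fmin_tnc(func=cost_function, x0=theta,
                      fprime=gradient, args=(X, y), messages=0)
    return result[0]

# Tiny synthetic example (not the Coursera data)
X = np.c_[np.ones(4), np.array([1.0, 2.0, 3.0, 4.0])]
y = np.array([0.0, 1.0, 0.0, 1.0])
best = fit(X, y, np.zeros(X.shape[1]))
print(best, cost_function(best, X, y))
```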

The model parameters are [-25.16131856, 0.20623159, 0.20147149].

To understand how good our model is, we will draw the decision boundary.

Drawing the decision boundary

Because there are two features in our dataset, the linear equation can be expressed as θ₀ + θ₁x₁ + θ₂x₂.

As mentioned earlier, the decision boundary can be found by setting the weighted sum of the inputs to 0. Setting θ₀ + θ₁x₁ + θ₂x₂ = 0 and solving for x₂ gives x₂ = −(θ₀ + θ₁x₁) / θ₂.

We will draw the decision boundary above the graph we used to visualize the dataset.
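Computing the boundary line from the fitted parameters reported above might look like this; the axis range is an assumption based on typical exam scores:

```python
import numpy as np

# Fitted parameters reported in the article
theta = np.array([-25.16131856, 0.20623159, 0.20147149])

# Setting theta0 + theta1*x1 + theta2*x2 = 0 and solving for x2
# gives the boundary line x2 = -(theta0 + theta1*x1) / theta2
x1 = np.linspace(30, 100, 50)
x2 = -(theta[0] + theta[1] * x1) / theta[2]

# Overlaying (x1, x2) on the scatter plot of the two groups,
# e.g. with matplotlib's plt.plot(x1, x2), draws the boundary.
print(x2[0], x2[-1])
```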

It seems that our model does a good job of predicting the classes. But how accurate is it? Let's see.

Accuracy of the model
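The evaluation step can be sketched as follows: predict the class of each sample with the 0.5 threshold, then compare against the actual labels. A tiny synthetic dataset is used here so the snippet runs on its own (the article reports the accuracy on the exam data below).

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(theta, X, threshold=0.5):
    # Class 1 when the predicted probability is at least the threshold
    return (sigmoid(X @ theta) >= threshold).astype(int)

def accuracy(theta, X, y):
    # Percentage of samples whose predicted class matches the label
    return np.mean(predict(theta, X) == y) * 100

# Tiny illustrative example, not the Coursera data
X = np.c_[np.ones(4), np.array([1.0, 2.0, 3.0, 4.0])]
y = np.array([0, 0, 1, 1])
theta = np.array([-2.5, 1.0])
print(accuracy(theta, X, y))
```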

The accuracy of the model is 89%.

Let's implement our classifier using scikit-learn and compare it with the model we built from scratch.

Scikit-learn implementation
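A minimal sketch of the scikit-learn version, again with a small inline stand-in for the exam-score data; LogisticRegression applies L2 regularization by default (C=1.0), which is the point of comparison discussed below:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny stand-in for the exam-score data loaded earlier
X = np.array([[34.6, 78.0], [60.1, 86.3], [79.0, 75.3], [45.0, 56.3]])
y = np.array([0, 1, 1, 0])

# Regularized by default (C=1.0); liblinear suits small binary problems
model = LogisticRegression(solver="liblinear")
model.fit(X, y)

print(model.intercept_, model.coef_)  # learned parameters
print(model.score(X, y) * 100)        # accuracy in percent
```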

The model parameters are [-2.8583143, 0.0521473, 0.04531467], and the accuracy is 91%.

Why are the model parameters so different from those of the model we implemented from scratch? If you look at the documentation of scikit-learn's logistic regression implementation, you will see that it applies regularization by default. Regularization is used to prevent the model from overfitting the data; I won't delve into its details in this article.

The complete code used in this article can be found in this GitHub repository: https://github.com/animesh-agarwal/Machine-Learning/tree/master/LogisticRegression
