How to implement ridge regression in Python

2025-04-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article introduces how to implement ridge regression in Python. It walks through the procedure with a practical case; the method is simple, fast and practical. I hope it helps you solve the problem.

1 Overview

1.1 Linear regression

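For the general linear regression problem, the parameters are solved by the least squares method, with the following objective function:

$$J(w) = \sum_{i=1}^{m} \left( y_i - x_i^T w \right)^2 = \lVert Xw - y \rVert_2^2$$

Here $X$ is the matrix of training samples (one row $x_i^T$ per sample), $y$ is the vector of target values, and $w$ is the parameter vector.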

The parameter w can also be solved directly in matrix form:

$$w = (X^T X)^{-1} X^T y$$

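This formula looks scary, but it is a paper tiger; the derivation is simple. Setting the gradient of $\lVert Xw - y \rVert_2^2$ with respect to $w$ to zero gives $2 X^T (Xw - y) = 0$, that is, $X^T X w = X^T y$, so $w = (X^T X)^{-1} X^T y$ whenever $X^T X$ is invertible.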

For matrix X, if some columns are strongly linearly correlated (that is, some attributes in the training samples are linearly related), $X^T X$ becomes nearly singular: its determinant is close to 0, and computing $(X^T X)^{-1}$ becomes numerically unstable.

Conclusion: traditional linear regression based on least squares lacks stability.
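The instability can be seen numerically. The following is a minimal sketch with NumPy, assuming made-up data in which one column is almost a copy of another (the names `X_good` and `X_bad` are illustrative and not part of the traffic dataset). It compares the condition number of $X^T X$ in the two situations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent columns (hypothetical data for illustration only)
a = rng.normal(size=100)
b = rng.normal(size=100)
X_good = np.column_stack([a, b])

# Second column is almost a copy of the first: nearly collinear
X_bad = np.column_stack([a, a + 1e-6 * b])

# A large condition number of X^T X means its inverse amplifies small errors
cond_good = np.linalg.cond(X_good.T @ X_good)
cond_bad = np.linalg.cond(X_bad.T @ X_bad)

print("independent columns:     ", cond_good)
print("nearly collinear columns:", cond_bad)
```

The second condition number should come out many orders of magnitude larger than the first, which is the numerical face of "the value is close to 0".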

1.2 Ridge regression

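Optimization objective of ridge regression:

$$\min_w \; \lVert Xw - y \rVert_2^2 + \alpha \lVert w \rVert_2^2$$

The corresponding matrix solution is:

$$w = (X^T X + \alpha I)^{-1} X^T y$$

Here $\alpha > 0$ is the regularization factor and $I$ is the identity matrix. Adding $\alpha I$ keeps the matrix away from singular even when columns of $X$ are collinear.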

Ridge regression is a biased-estimation regression method designed for the analysis of collinear data.

It is an improved least squares estimation method, and on some data it fits better than ordinary least squares.
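As a sketch of the matrix solution, the ridge coefficients can be computed directly with NumPy and compared with ordinary least squares. The function name `ridge_closed_form` and the synthetic data are assumptions for illustration; no intercept is fitted, and the regularization factor is simply set to 1.0.

```python
import numpy as np

def ridge_closed_form(X, y, alpha):
    """Solve (X^T X + alpha * I) w = X^T y for w (no intercept term)."""
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    # Solving the linear system avoids forming the inverse explicitly
    return np.linalg.solve(A, X.T @ y)

rng = np.random.default_rng(0)
a = rng.normal(size=200)

# Hypothetical data: the two columns are nearly collinear
X = np.column_stack([a, a + 1e-6 * rng.normal(size=200)])
y = 3.0 * a + 0.1 * rng.normal(size=200)

w_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # ordinary least squares
w_ridge = ridge_closed_form(X, y, alpha=1.0)   # ridge regression

print("OLS coefficients:  ", w_ols)
print("ridge coefficients:", w_ridge)
```

With nearly collinear columns, the least-squares coefficients are free to become very large with opposite signs, while the ridge coefficients stay moderate and split the weight between the two columns.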

1.3 Overfitting

Figure 2 shows a normal fit, which follows the trend of the data. Figure 3 fits the training set very well, but on unseen data, for example when Size is very large, the current fit may predict a very small value, so the error will be very large.
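The effect can be sketched on a toy problem. The example below is an illustration under assumed data (a noisy straight line, not the data in the figures): it fits a degree-8 polynomial with ordinary least squares and with a ridge penalty, then compares training error and coefficient size. The value of `alpha` is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-dimensional data: a linear trend plus noise
x = np.linspace(0.0, 1.0, 12)
y = 2.0 * x + 0.1 * rng.normal(size=x.size)

# Degree-8 polynomial features: 1, x, x^2, ..., x^8
P = np.vander(x, 9, increasing=True)

# Ordinary least squares fit
w_ols = np.linalg.lstsq(P, y, rcond=None)[0]

# Ridge fit via the closed form (the penalty also covers the constant term here)
alpha = 1e-3
w_ridge = np.linalg.solve(P.T @ P + alpha * np.eye(P.shape[1]), P.T @ y)

# Training error and coefficient size for both fits
mse_ols = np.mean((P @ w_ols - y) ** 2)
mse_ridge = np.mean((P @ w_ridge - y) ** 2)
print("training MSE  OLS:", mse_ols, " ridge:", mse_ridge)
print("coef norm     OLS:", np.linalg.norm(w_ols), " ridge:", np.linalg.norm(w_ridge))
```

Least squares always reaches the lower training error, but it pays for it with larger coefficients, which is what makes the curve swing away from the trend outside the training points. The ridge penalty trades a little training error for smaller coefficients.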

2 Ridge regression in sklearn

In the sklearn library, the ridge regression model is available as sklearn.linear_model.Ridge. Its main parameters are:

alpha: the regularization factor, corresponding to α in the loss function.

fit_intercept: whether to calculate the intercept.

solver: the method used to compute the parameters. Options include 'auto', 'svd', 'sag', etc.
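A minimal usage sketch of these parameters follows. The data are randomly generated for illustration only, and the parameter values are just one reasonable choice.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical features and targets (not the traffic data)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 4.0 + 0.1 * rng.normal(size=50)

# alpha: regularization factor; fit_intercept: also estimate the intercept;
# solver='auto' lets sklearn choose the computation method
model = Ridge(alpha=1.0, fit_intercept=True, solver='auto')
model.fit(X, y)

print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
print("R^2 on the training data:", model.score(X, y))
```

Larger values of alpha shrink the coefficients more strongly; alpha is usually tuned on held-out data.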

3 Case study

Example: traffic flow forecasting

3.1 Data introduction

The data are traffic flow monitoring records from one intersection, giving the hourly traffic flow throughout the year.

3.2 Experimental purpose

Create polynomial features from the existing data, and use a ridge regression model instead of an ordinary linear model to perform polynomial regression on the traffic flow.
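To make "polynomial features" concrete, here is a small sketch using sklearn's PolynomialFeatures on one made-up sample with two attributes a and b. It also counts how many columns the expansion produces for four attributes at degree 6, the setting used in the implementation in section 4.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two attributes a=2, b=3 (made-up values)
X = np.array([[2.0, 3.0]])

poly = PolynomialFeatures(2)          # all terms up to degree 2
X_poly = poly.fit_transform(X)

# Columns are ordered as 1, a, b, a^2, a*b, b^2
print(X_poly)   # [[1. 2. 3. 4. 6. 9.]]

# Number of columns for 4 attributes expanded to degree 6
n_cols = PolynomialFeatures(6).fit_transform(np.zeros((1, 4))).shape[1]
print(n_cols)   # 210
```

Many of these columns are strongly correlated with each other, which is one reason a ridge penalty is a natural fit for polynomial regression.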

3.3 Data features

HR: the hour of the day (0-23)

WEEK_DAY: the day of the week (0-6)

DAY_OF_YEAR: the day of the year (1-365)

WEEK_OF_YEAR: the week of the year (1-53)

TRAFFIC_COUNT: traffic flow

The full dataset contains more than 20,000 records (21,626).

4 Python implementation code

    # ========= 1. Set up the project and import the sklearn tools =========
    import numpy as np
    import matplotlib.pyplot as plt                       # load the matplotlib module
    from sklearn.linear_model import Ridge                # load the ridge regression method
    from sklearn.model_selection import train_test_split  # load the train/test splitting tool
    from sklearn.preprocessing import PolynomialFeatures  # create polynomial features such as a*b, a^2, b^2

    # ========= 2. Load the data =========
    data = np.genfromtxt('Ridge regression.csv', delimiter=',')  # load the data from the csv file with numpy
    print(data)
    print(data.shape)
    plt.plot(data[:, 4])   # use plt to display the traffic flow
    # plt.show()

    # ========= 3. Data processing =========
    X = data[:, :4]   # X holds columns 0-3, that is, the attributes
    y = data[:, 4]    # y holds column 4, that is, the traffic flow
    poly = PolynomialFeatures(6)   # polynomial features up to degree 6; degree 6 was chosen after many experiments
    X = poly.fit_transform(X)      # X now holds the polynomial features

    # ========= 4. Split into training set and test set =========
    # test_size is the proportion of the test set and random_state is the random seed.
    # The source omits the arguments; test_size=0.3 and random_state=0 are assumed values.
    train_set_x, test_set_x, train_set_y, test_set_y = train_test_split(
        X, y, test_size=0.3, random_state=0)

    # ========= 5. Create the regressor and train it =========
    clf = Ridge(alpha=1.0, fit_intercept=True)   # create a ridge regression instance
    clf.fit(train_set_x, train_set_y)            # train the regressor on the training set
    clf.score(test_set_x, test_set_y)            # goodness of fit on the test set
    # clf.score returned 0.737 in the original run. The goodness of fit evaluates
    # how well the curve fits: the maximum is 1 and there is no minimum. A model that
    # outputs the same value (the mean of y) for every input has a goodness of fit of 0.

    # ========= 6. Draw the fitted curve =========
    start = 100   # plot samples 100 to 200
    end = 200
    y_pre = clf.predict(X)   # fitted values from the predict function
    time = np.arange(start, end)
    plt.plot(time, y[start:end], 'b', label="real")
    plt.plot(time, y_pre[start:end], 'r', label='predict')   # real data (blue) and fitted curve (red)
    plt.legend(loc='upper left')   # set the position of the legend
    plt.show()

4.2 Results

Analysis conclusion: the predicted values follow roughly the same trend as the actual values.
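The goodness of fit returned by clf.score is the coefficient of determination R^2. The sketch below computes it by hand on made-up numbers to show the three cases described in the code comments: a perfect fit, a constant prediction, and a fit worse than the constant one. The helper name `r2_score_manual` is hypothetical.

```python
import numpy as np

def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)            # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])   # made-up values

perfect = r2_score_manual(y_true, y_true)                        # exact predictions
constant = r2_score_manual(y_true, np.full(4, y_true.mean()))    # always predict the mean
bad = r2_score_manual(y_true, y_true[::-1])                      # predictions in reverse order

print(perfect, constant, bad)   # 1.0 0.0 -3.0
```

A negative value means the model predicts worse than simply outputting the mean of the targets, which is why the score has no lower bound.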

This concludes the introduction to "how to implement ridge regression in Python". Thank you for reading.
