2025-02-27 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how to implement ridge regression in Python and walks through the process with a practical case. The method is simple, fast, and practical, and I hope the article helps you solve this problem.
1 Overview
1.1 Linear regression
For a general linear regression problem, the parameters are solved with the least squares method, and the objective function is as follows:
$$\min_{w} \; \lVert Xw - y \rVert_2^2$$
The parameter w can also be solved in closed form with the following matrix formula:
$$w = (X^{T}X)^{-1}X^{T}y$$
This formula looks scary, but it is a paper tiger: the derivation is actually simple.
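The derivation is not reproduced in the text, so the following is a sketch of the standard argument, assuming the matrix formula meant here is the usual normal-equation solution and that $X^{T}X$ is invertible. The symbol $J(w)$ for the objective is introduced only for this illustration.

```latex
\begin{aligned}
J(w) &= \lVert Xw - y \rVert_2^2 = (Xw - y)^{T}(Xw - y) \\
\nabla_w J(w) &= 2X^{T}(Xw - y) = 0 \\
X^{T}X\,w &= X^{T}y \\
w &= (X^{T}X)^{-1}X^{T}y
\end{aligned}
```

Setting the gradient to zero gives a linear system in w, which is why everything hinges on whether $X^{T}X$ can be inverted reliably.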
For matrix X, if some columns are strongly linearly correlated (that is, some attributes in the training samples are linearly related), the determinant of X^T X will be close to 0, and computing its inverse becomes numerically unstable.
Conclusion: traditional linear regression based on least squares lacks stability.
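The instability can be made concrete with a small experiment. The sketch below uses made-up data (the column names x1, x2, x3 and the noise scale are assumptions for illustration only) and compares the matrix X^T X for nearly collinear columns with the same matrix for unrelated columns.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical attributes: x2 is almost an exact copy of x1, x3 is unrelated.
x1 = rng.normal(size=n)
x2 = x1 + 1e-6 * rng.normal(size=n)
x3 = rng.normal(size=n)

X_collinear = np.column_stack([x1, x2])
X_independent = np.column_stack([x1, x3])

gram_collinear = X_collinear.T @ X_collinear
gram_independent = X_independent.T @ X_independent

# A determinant near 0 and a huge condition number both signal that
# inverting the matrix will amplify small errors in the data.
det_collinear = np.linalg.det(gram_collinear)
cond_collinear = np.linalg.cond(gram_collinear)
cond_independent = np.linalg.cond(gram_independent)

print("det  (collinear):  ", det_collinear)
print("cond (collinear):  ", cond_collinear)
print("cond (independent):", cond_independent)
```

The condition number of the collinear case should come out many orders of magnitude larger than that of the independent case, which is the numerical face of the "value close to 0" problem described above.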
1.2 Ridge regression
Optimization objective of ridge regression:
$$\min_{w} \; \lVert Xw - y \rVert_2^2 + \alpha \lVert w \rVert_2^2$$
The corresponding matrix solution is:
$$w = (X^{T}X + \alpha I)^{-1}X^{T}y$$
Ridge regression is a biased-estimation regression method designed for analyzing collinear data.
It is an improved least squares estimation method, and for some data it fits better than ordinary least squares.
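As a rough illustration of the matrix solution, the sketch below solves the regularized system directly with NumPy, assuming the usual closed form w = (X^T X + alpha I)^(-1) X^T y with no intercept term. The helper name ridge_closed_form and the synthetic data are assumptions made for this example, not part of the article.

```python
import numpy as np

def ridge_closed_form(X, y, alpha):
    """Solve (X^T X + alpha * I) w = X^T y for w (no intercept term)."""
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 1e-6 * rng.normal(size=200)      # nearly collinear with x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.1, size=200)

w_ols = ridge_closed_form(X, y, alpha=0.0)    # plain least squares
w_ridge = ridge_closed_form(X, y, alpha=1.0)  # ridge regression

print("least squares weights:", w_ols)
print("ridge weights:        ", w_ridge)
```

With alpha = 0 the weights on the two near-identical columns are free to take large values of opposite sign, while a positive alpha keeps them small and splits the effect between the two columns.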
1.3 Over-fitting
Figure 2 shows a normal fit that follows the trend of the data. Figure 3 fits the training set very well, but on unseen data, for example when Size is very large, the current fit may predict a very small value, so the error will be very large.
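Since the figures are not available here, the same effect can be reproduced with a small sketch. The data below are invented for illustration (a hypothetical Size/price relationship), and the polynomial degrees 2 and 9 are arbitrary choices, not values from the article.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Hypothetical training data: price grows slowly with size, plus noise.
size = np.linspace(1.0, 10.0, 12)
price = 2.0 * np.sqrt(size) + rng.normal(scale=0.2, size=size.size)

simple = Polynomial.fit(size, price, deg=2)   # follows the overall trend
wiggly = Polynomial.fit(size, price, deg=9)   # chases the noise

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

train_mse_simple = mse(simple, size, price)
train_mse_wiggly = mse(wiggly, size, price)

# Evaluate both fits at a Size well outside the training range.
big_size = 15.0
print("training MSE, degree 2:", train_mse_simple)
print("training MSE, degree 9:", train_mse_wiggly)
print("prediction at Size=15, degree 2:", simple(big_size))
print("prediction at Size=15, degree 9:", wiggly(big_size))
print("noise-free value at Size=15:    ", 2.0 * np.sqrt(big_size))
```

The higher-degree fit always has a training error at least as low as the simpler one, yet its prediction far outside the training range can be wildly off, which is the behaviour the over-fitted figure describes.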
2 Ridge regression in sklearn
In the sklearn library, the ridge regression model is available as sklearn.linear_model.Ridge. Its main parameters are:
alpha: the regularization factor, corresponding to α in the loss function.
fit_intercept: whether to calculate the intercept.
solver: the method used to compute the parameters; options include 'auto', 'svd', and 'sag'.
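A minimal usage sketch of these three parameters follows. The data are synthetic and the true coefficients (1.5, -2.0, 0.5) and intercept (4.0) are invented for this example; only the Ridge parameters themselves come from the list above.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic data with a known linear relationship plus a little noise.
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=100)

# alpha sets the regularization strength, fit_intercept asks for an
# intercept term, and solver='auto' lets sklearn pick the method.
model = Ridge(alpha=1.0, fit_intercept=True, solver='auto')
model.fit(X, y)

print("coefficients:", model.coef_)
print("intercept:   ", model.intercept_)
print("R^2 on the training data:", model.score(X, y))
```

Larger values of alpha shrink the coefficients more strongly, so it is usually worth trying several values rather than relying on the default.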
3 Case study
An example of traffic flow forecasting:
3.1 Data introduction
The data are traffic flow monitoring records from one intersection, giving the hourly traffic flow over a year.
3.2 Experimental purpose
Create polynomial features from the existing data, then use a ridge regression model instead of an ordinary linear model to perform polynomial regression on the traffic flow.
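To show what creating polynomial features means in practice, here is a small sketch using sklearn's PolynomialFeatures. The two-column sample [2, 3] is made up for illustration; the degree-6 expansion of four input columns mirrors the setting used in the article's code.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two attributes, a = 2 and b = 3.
sample = np.array([[2.0, 3.0]])

# Degree 2 produces the columns 1, a, b, a^2, ab, b^2.
poly2 = PolynomialFeatures(degree=2)
expanded = poly2.fit_transform(sample)
print(expanded)

# With four input attributes and degree 6, the number of generated
# columns grows quickly; count them on a dummy row.
four_inputs = np.zeros((1, 4))
n_terms = PolynomialFeatures(degree=6).fit_transform(four_inputs).shape[1]
print("columns for 4 inputs at degree 6:", n_terms)
```

So many generated columns are often strongly correlated with each other, which is one reason a regularized model such as ridge regression is a natural fit here.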
3.3 Data features
HR: the hour of the day (0-23)
WEEK_DAY: the day of the week (0-6)
DAY_OF_YEAR: the day of the year (1-365)
WEEK_OF_YEAR: the week of the year (1-53)
TRAFFIC_COUNT: traffic flow
The full dataset contains more than 20,000 records (21,626).
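The article does not say how the four time columns were produced. As one plausible sketch, they could be derived from a timestamp with the standard library as shown below. The helper name time_features is hypothetical, and the conventions used (Monday = 0, ISO week numbers) are assumptions that may differ from the original dataset.

```python
from datetime import datetime

def time_features(ts):
    """Return (HR, WEEK_DAY, DAY_OF_YEAR, WEEK_OF_YEAR) for a timestamp."""
    hr = ts.hour                           # 0-23
    week_day = ts.weekday()                # 0-6, Monday = 0 (assumed convention)
    day_of_year = ts.timetuple().tm_yday   # 1-365 (366 in leap years)
    week_of_year = ts.isocalendar()[1]     # 1-53, ISO week number (assumed convention)
    return hr, week_day, day_of_year, week_of_year

example = time_features(datetime(2023, 3, 15, 8))
print(example)
```

All four values are functions of the timestamp alone, so the model has to infer the traffic pattern purely from the position in the day, week, and year.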
4 Python implementation code
# ========= 1. Set up the project and import the sklearn toolkits =========
import numpy as np
import matplotlib.pyplot as plt  # load the matplotlib plotting module
from sklearn import model_selection  # load the module used to split training and test sets
from sklearn.linear_model import Ridge  # load the ridge regression method from sklearn.linear_model
from sklearn.preprocessing import PolynomialFeatures  # used to create polynomial features such as ab, a^2, b^2

# ========= 2. Load the data =========
data = np.genfromtxt('Ridge regression .csv', delimiter=',')  # load the data from the CSV file with numpy
print(data)
print(data.shape)
plt.plot(data[:, 4])  # use plt to display the traffic flow
# plt.show()

# ========= 3. Process the data =========
X = data[:, :4]  # X holds columns 0-3, the attributes
y = data[:, 4]  # y holds column 4, the traffic flow
poly = PolynomialFeatures(6)  # polynomial features up to degree 6; degree 6 was chosen after repeated experiments
X = poly.fit_transform(X)  # X now holds the polynomial features

# ========= 4. Split the training set and the test set =========
# Divide all the data into a training set and a test set. test_size is the
# proportion of the test set, and random_state is the random seed.
# The arguments are missing from the source text; X, y, test_size=0.3 and
# random_state=0 are assumed here.
train_set_x, test_set_x, train_set_y, test_set_y = model_selection.train_test_split(
X, y, test_size=0.3, random_state=0)

# ========= 5. Create the regressor and train it =========
clf = Ridge(alpha=1.0, fit_intercept=True)  # create a ridge regression instance
clf.fit(train_set_x, train_set_y)  # call fit to train the regressor on the training set
score = clf.score(test_set_x, test_set_y)  # goodness of fit on the test set
print(score)
# The article reports a clf.score return value of 0.737. The goodness of fit
# evaluates how well the curve fits: the maximum is 1 and there is no minimum.
# A model that outputs the same value (the mean of y) for every input scores 0.

# ========= 6. Draw the fitted curve =========
start = 100  # plot a segment of the fitted curve
end = 200
y_pre = clf.predict(X)  # call predict to get the fitted values
time = np.arange(start, end)
plt.plot(time, y[start:end], 'b', label="real")
plt.plot(time, y_pre[start:end], 'r', label='predict')  # real data in blue, fitted curve in red
plt.legend(loc='upper left')  # set the position of the legend
plt.show()
4.2 Results
Analysis conclusion: the predicted values follow roughly the same trend as the actual values.
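The goodness of fit returned by clf.score in the code is the coefficient of determination, R^2. A minimal sketch of how it can be computed by hand is given below; the helper name r2_score_manual and the four-point example are made up for illustration.

```python
import numpy as np

def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r2_score_manual(y, y)                            # exact predictions
constant = r2_score_manual(y, np.full_like(y, y.mean()))   # always predict the mean
reversed_fit = r2_score_manual(y, y[::-1])                 # worse than the mean

print(perfect, constant, reversed_fit)
```

This shows why the maximum is 1 and why there is no minimum: predictions worse than simply guessing the mean push the score below 0.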