Today we will look at how to implement polynomial regression in Python. Many readers may not know much about it, so the following content has been put together to make it easier to understand; I hope you get something useful out of this article.
1 Overview
1.1 Supervised learning
1.2 Polynomial regression
Last time we talked about linear regression; this time we focus on polynomial regression.
Polynomial regression (Polynomial Regression) is a regression analysis method that studies the polynomial relationship between a dependent variable and one or more independent variables. If there is only one independent variable, it is called univariate polynomial regression; if there are multiple independent variables, it is called multivariate polynomial regression.
(1) In univariate regression analysis, if the relationship between the dependent variable y and the independent variable x is nonlinear but there is no suitable standard function curve to fit it, univariate polynomial regression can be used.
(2) The biggest advantage of polynomial regression is that the observed points can be approximated ever more closely by adding higher-order terms of x until the fit is satisfactory.
(3) In fact, polynomial regression can handle a fairly broad class of nonlinear problems and plays an important role in regression analysis, because any function can be approximated piecewise by polynomials.
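To make point (2) concrete, here is a minimal sketch of mine (synthetic data, not the housing data used later): as the polynomial degree is increased, the fitted curve can follow a nonlinear relationship more and more closely.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 0.5 * x ** 2 - 2 * x + 3 + rng.normal(scale=2.0, size=x.shape)  # noisy quadratic

for degree in (1, 2, 3):
    coefs = np.polyfit(x, y, deg=degree)                # least-squares polynomial fit
    residual = np.sum((np.polyval(coefs, x) - y) ** 2)  # sum of squared residuals
    print(f"degree={degree}, residual={residual:.1f}")  # residual should drop sharply at degree 2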
2 Concept
In the linear regression example discussed earlier, a straight line is used to fit the linear relationship between the data's input and output. Unlike linear regression, polynomial regression uses a curve to fit the mapping between the data's input and output.
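As a small illustration of that curve-fitting idea (a sketch of mine, not part of the original example): PolynomialFeatures expands a single input x into the columns [1, x, x^2], so an ordinary linear model trained on those columns becomes a quadratic curve in x.

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([[1.0], [2.0], [3.0]])
print(PolynomialFeatures(degree=2).fit_transform(x))
# [[1. 1. 1.]
#  [1. 2. 4.]
#  [1. 3. 9.]]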
3 Case implementation: method 1
3.1 Case study
Application background: from the known house transaction prices and house sizes we previously built a linear regression, which can then predict the transaction price of a house whose size is known but whose price is not. In practice, however, such a linear fit is often not good enough, so here we apply polynomial regression to the same data set.
Objective: to establish a polynomial regression equation for the housing transaction information, and to predict the house price according to the regression equation.
Transaction information includes the area of the house and the corresponding transaction price:
(1) The floor area is given in square feet (ft²).
(2) The house transaction price is given in units of ten thousand.
3.2 Code implementation
import matplotlib.pyplot as plt
import numpy as np
from sklearn import linear_model
from sklearn.preprocessing import PolynomialFeatures

# Read the dataset
datasets_X = []
datasets_Y = []
fr = open('polynomial linear regression.csv', 'r')
lines = fr.readlines()
for line in lines:
    items = line.strip().split(',')
    datasets_X.append(int(items[0]))
    datasets_Y.append(int(items[1]))

length = len(datasets_X)
datasets_X = np.array(datasets_X).reshape([length, 1])
datasets_Y = np.array(datasets_Y)

minX = min(datasets_X)
maxX = max(datasets_X)
X = np.arange(minX, maxX).reshape([-1, 1])

# degree=2 means quadratic polynomial features are built from datasets_X
poly_reg = PolynomialFeatures(degree=2)
# Use PolynomialFeatures to construct the quadratic polynomial features X_poly
X_poly = poly_reg.fit_transform(datasets_X)
# Then create a linear regression and use the linear model (linear_model) to learn
# the mapping relationship between X_poly and y
lin_reg_2 = linear_model.LinearRegression()
lin_reg_2.fit(X_poly, datasets_Y)

print(X_poly)
print(lin_reg_2.predict(poly_reg.fit_transform(X)))
print('Coefficients:', lin_reg_2.coef_)    # view the regression coefficients (k)
print('Intercept:', lin_reg_2.intercept_)  # view the intercept of the regression equation (b)
print('the model is y = {0} + ({1} * x) + ({2} * x ^ 2)'.format(
    lin_reg_2.intercept_, lin_reg_2.coef_[0], lin_reg_2.coef_[1]))

# The scatter function draws the data points, here in red
plt.scatter(datasets_X, datasets_Y, color='red')
# The plot function draws the regression curve; X must first be transformed into polynomial features
plt.plot(X, lin_reg_2.predict(poly_reg.fit_transform(X)), color='blue')
plt.xlabel('Area')
plt.ylabel('Price')
plt.show()

3.3 Result
[[1.0000000e+00 1.0000000e+03 1.0000000e+06]
[1.0000000e+00 7.9200000e+02 6.2726400e+05]
[1.0000000e+00 1.2600000e+03 1.5876000e+06]
[1.0000000e+00 1.2620000e+03 1.5926440e+06]
[1.0000000e+00 1.2400000e+03 1.5376000e+06]
[1.0000000e+00 1.1700000e+03 1.3689000e+06]
[1.0000000e+00 1.2300000e+03 1.5129000e+06]
[1.0000000e+00 1.2550000e+03 1.5750250e+06]
[1.0000000e+00 1.1940000e+03 1.4256360e+06]
[1.0000000e+00 1.4500000e+03 2.1025000e+06]
[1.0000000e+00 1.4810000e+03 2.1933610e+06]
[1.0000000e+00 1.4750000e+03 2.1756250e+06]
[1.0000000e+00 1.4820000e+03 2.1963240e+06]
[1.0000000e+00 1.4840000e+03 2.2022560e+06]
[1.0000000e+00 1.5120000e+03 2.2861440e+06]
[1.0000000e+00 1.6800000e+03 2.8224000e+06]
[1.0000000e+00 1.6200000e+03 2.6244000e+06]
[1.0000000e+00 1.7200000e+03 2.9584000e+06]
[1.0000000e+00 1.8000000e+03 3.2400000e+06]
[1.0000000e+00 4.4000000e+03 1.9360000e+07]
[1.0000000e+00 4.2120000e+03 1.7740944e+07]
[1.0000000e+00 3.9200000e+03 1.5366400e+07]
[1.0000000e+00 3.2120000e+03 1.0316944e+07]
[1.0000000e+00 3.1510000e+03 9.9288010e+06]
[1.0000000e+00 3.1000000e+03 9.6100000e+06]
[1.0000000e+00 2.7000000e+03 7.2900000e+06]
[1.0000000e+00 2.6120000e+03 6.8225440e+06]
[1.0000000e+00 2.7050000e+03 7.3170250e+06]
[1.0000000e+00 2.5700000e+03 6.6049000e+06]
[1.0000000e+00 2.4420000e+03 5.9633640e+06]
[1.0000000e+00 2.3870000e+03 5.6977690e+06]
[1.0000000e+00 2.2920000e+03 5.2532640e+06]
[1.0000000e+00 2.3080000e+03 5.3268640e+06]
[1.0000000e+00 2.2520000e+03 5.0715040e+06]
[1.0000000e+00 2.2020000e+03 4.8488040e+06]
[1.0000000e+00 2.1570000e+03 4.6526490e+06]
[1.0000000e+00 2.1400000e+03 4.5796000e+06]
[1.0000000e+00 4.0000000e+03 1.6000000e+07]
[1.0000000e+00 4.2000000e+03 1.7640000e+07]
[1.0000000e+00 3.9000000e+03 1.5210000e+07]
[1.0000000e+00 3.5440000e+03 1.2559936e+07]
[1.0000000e+00 2.9800000e+03 8.8804000e+06]
[1.0000000e+00 4.3550000e+03 1.8966025e+07]
[1.0000000e+00 3.1500000e+03 9.9225000e+06]
[1.0000000e+00 3.0250000e+03 9.1506250e+06]
[1.0000000e+00 3.4500000e+03 1.1902500e+07]
[1.0000000e+00 4.4020000e+03 1.9377604e+07]
[1.0000000e+00 3.4540000e+03 1.1930116e+07]
[1.0000000e+00 8.9000000e+02 7.9210000e+05]]
[231.16788093 231.19868474 231.22954958... 739.2018995 739.45285011
739.70386176]
Coefficients: [ 0.00000000e+00 -1.75650177e-02  3.05166076e-05]
Intercept: 225.93740561055927
the model is y = 225.93740561055927 + (0.0 * x) + (-0.017565017675036532 * x ^ 2)
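With the model fitted, a single new prediction can also be made. The sketch below assumes lin_reg_2 and poly_reg from the code in 3.2 are still in scope; the floor area of 2500 ft² is just an illustrative input, not a value from the dataset.

# Illustrative query: predicted price (in units of ten thousand) for a 2500 ft^2 house
area = [[2500]]
print(lin_reg_2.predict(poly_reg.fit_transform(area)))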
3.4 Visualization
4 Case implementation: method 2
4.1 Code
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np
import pandas as pd
import warnings

warnings.filterwarnings(action="ignore", module="sklearn")

dataset = pd.read_csv('polynomial linear regression.csv')
X = np.asarray(dataset.get('x'))
y = np.asarray(dataset.get('y'))

# Split into training set and test set
X_train = X[:-2]
X_test = X[-2:]
y_train = y[:-2]
y_test = y[-2:]

# fit_intercept is True
model1 = Pipeline([('poly', PolynomialFeatures(degree=2)),
                   ('linear', LinearRegression(fit_intercept=True))])
model1 = model1.fit(X_train[:, np.newaxis], y_train)
y_test_pred1 = model1.named_steps['linear'].intercept_ + \
    model1.named_steps['linear'].coef_[1] * X_test
print('while fit_intercept is True:....')
print('Coefficients:', model1.named_steps['linear'].coef_)
print('Intercept:', model1.named_steps['linear'].intercept_)
print('the model is: y =', model1.named_steps['linear'].intercept_, '+',
      model1.named_steps['linear'].coef_[1], '* X')
# Mean squared error
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_test_pred1))
# R2 score: between 0 and 1, the closer to 1 the better the model, the closer to 0 the worse
print('Variance score: %.2f' % r2_score(y_test, y_test_pred1), '\n')

# fit_intercept is False
model2 = Pipeline([('poly', PolynomialFeatures(degree=2)),
                   ('linear', LinearRegression(fit_intercept=False))])
model2 = model2.fit(X_train[:, np.newaxis], y_train)
y_test_pred2 = model2.named_steps['linear'].coef_[0] + \
    model2.named_steps['linear'].coef_[1] * X_test + \
    model2.named_steps['linear'].coef_[2] * X_test * X_test
print('while fit_intercept is False:....')
print('Coefficients:', model2.named_steps['linear'].coef_)
print('Intercept:', model2.named_steps['linear'].intercept_)
print('the model is: y =', model2.named_steps['linear'].coef_[0], '+',
      model2.named_steps['linear'].coef_[1], '* X +',
      model2.named_steps['linear'].coef_[2], '* X ^ 2')
# Mean squared error
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_test_pred2))
# R2 score
print('Variance score: %.2f' % r2_score(y_test, y_test_pred2), '\n')

plt.xlabel('x')
plt.ylabel('y')
# Draw the scatter plot of the training set
plt.scatter(X_train, y_train, alpha=0.8, color='black')
# Draw the fitted model
plt.plot(X_train,
         model2.named_steps['linear'].coef_[0]
         + model2.named_steps['linear'].coef_[1] * X_train
         + model2.named_steps['linear'].coef_[2] * X_train * X_train,
         color='red', linewidth=1)
plt.show()

4.2 Results
If you do not use a framework, you have to add the higher-order terms to the data by hand; with a framework it is much more convenient. scikit-learn's Pipeline simplifies this part of the preprocessing, as the sketch after this paragraph shows.
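For comparison, here is a rough sketch of mine (made-up numbers, not the housing data) of that manual approach: without Pipeline, the quadratic column has to be built by hand before fitting LinearRegression.

import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([1.0, 2.0, 3.0, 4.0])       # illustrative inputs
y = np.array([2.0, 5.0, 10.0, 17.0])     # illustrative targets (y = x^2 + 1)

X_manual = np.column_stack([x, x ** 2])  # hand-built [x, x^2] design matrix
model = LinearRegression().fit(X_manual, y)
print(model.intercept_, model.coef_)     # roughly 1.0 and [0. 1.]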
When degree=1 is used in PolynomialFeatures, the effect is the same as plain LinearRegression: the result is a linear model. With degree=2 the model is a quadratic equation; with a single variable it is a parabola, with two variables a paraboloid, and so on.
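As a quick check of that equivalence (again a sketch with made-up numbers): fitting a degree-1 pipeline and a plain LinearRegression on the same data should give the same line.

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])        # illustrative targets (y = 2x + 1)

plain = LinearRegression().fit(x, y)
poly1 = Pipeline([('poly', PolynomialFeatures(degree=1)),
                  ('linear', LinearRegression())]).fit(x, y)
print(plain.coef_, plain.intercept_)       # roughly [2.] and 1.0
# degree=1 only adds a constant column, so the fitted line is the same
print(poly1.named_steps['linear'].coef_, poly1.named_steps['linear'].intercept_)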
There is also a fit_intercept parameter on LinearRegression; let's use an example to see what it does.
When fit_intercept is True, the first value in coef_ is 0, and the value in intercept_ is the actual intercept.
When fit_intercept is False, the first value in coef_ is the intercept and the value in intercept_ is 0.
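A related side note (not from the original article): if the redundant constant column bothers you, PolynomialFeatures accepts include_bias=False, so that with fit_intercept=True the constant term lives only in intercept_ and coef_ holds just the x and x^2 coefficients. A minimal sketch with made-up data:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 5.0, 10.0, 17.0])      # illustrative targets (y = x^2 + 1)

model = Pipeline([('poly', PolynomialFeatures(degree=2, include_bias=False)),
                  ('linear', LinearRegression(fit_intercept=True))])
model.fit(x, y)
print(model.named_steps['linear'].intercept_)  # roughly 1.0
print(model.named_steps['linear'].coef_)       # roughly [0. 1.]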
In the printed results below, the first part is the output when fit_intercept is True and the second part is the output when fit_intercept is False.
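One remark before the results: in the listing above, y_test_pred1 is computed from the intercept and the x coefficient only (the x^2 term is left out), which explains why its mean squared error comes out much worse than that of y_test_pred2 even though both pipelines fit essentially the same curve. A more direct way to get test predictions, assuming model1 and X_test from 4.1 are in scope, is to let the pipeline apply its own polynomial transform:

# Let the pipeline do the transform and the prediction in one call
y_test_pred1_full = model1.predict(X_test[:, np.newaxis])
print(y_test_pred1_full)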
while fit_intercept is True:....
Coefficients: [ 0.00000000e+00 -3.70858180e-04  2.78609637e-05]
Intercept: 204.25470490804574
the model is: y = 204.25470490804574 + -0.00037085818009180454 * X
Mean squared error: 26964.95
Variance score: -3.61

while fit_intercept is False:....
Intercept: 0.0
the model is: y = 204.2547049080572 + -0.0003708581801012066 * X + 2.7860963722809286e-05 * X ^ 2
Mean squared error: 7147.78
Variance score: -0.22

4.3 Visualization
After reading the above, do you have a better understanding of how to implement polynomial regression in Python? Thank you for reading.