How to Write Python Code to Implement Linear Regression

This article mainly introduces how to write Python code to implement linear regression. Many people have questions about this in daily practice, so the editor has consulted various materials and put together simple, easy-to-follow methods. I hope this helps answer your doubts about implementing linear regression in Python. Please follow along with the study below!
1 Linear regression

1.1 Simple linear regression
In simple linear regression, a linear relationship from x to y is fitted by adjusting the parameter values a and b. The optimization objective of the fit is the MSE (Mean Squared Error), except that the averaging part (division by m) is omitted:

$$ \sum_{i=1}^{m} \left( y_i - (a\,x_i + b) \right)^2 $$
For simple linear regression there are only these two parameters a and b, and the optimal a and b can be obtained in closed form by finding the extreme value of the MSE objective (the least squares method). So when training a simple linear regression model, we only need to solve for these two parameter values from the data, as shown below.
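For reference, here is the standard least-squares closed form, obtained by setting the partial derivatives of the objective with respect to a and b to zero; this is exactly what the fit method in the code below implements:

$$ a = \frac{\sum_{i=1}^{m}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m}(x_i - \bar{x})^2}, \qquad b = \bar{y} - a\,\bar{x} $$

where \(\bar{x}\) and \(\bar{y}\) are the means of the training inputs and targets.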
Below, a simple linear regression is fitted using the feature RM (average number of rooms per dwelling), which has index 5 in the Boston house price dataset. The evaluation metrics used are MSE, RMSE, MAE, and the R squared score, defined as follows.
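These are the standard definitions, and the R squared form is exactly what the code computes:

$$ \mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}} $$

$$ \mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\lvert \hat{y}_i - y_i \rvert, \qquad R^2 = 1 - \frac{\mathrm{MSE}}{\mathrm{Var}(y)} $$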
# encapsulate a simple linear regression algorithm in the style of sklearn
import numpy as np
import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, mean_absolute_error

np.random.seed(123)


class SimpleLinearRegression():
    def __init__(self):
        """initialize model parameters"""
        self.a_ = None
        self.b_ = None

    def fit(self, x_train, y_train):
        """train model parameters

        Parameters
        ----------
        x_train : train x, shape [N,]
        y_train : train y, shape [N,]
        """
        assert (x_train.ndim == 1 and y_train.ndim == 1), \
            "Simple Linear Regression model can only solve single feature training data"
        assert len(x_train) == len(y_train), \
            "the size of x_train must be equal to y_train"
        x_mean = np.mean(x_train)
        y_mean = np.mean(y_train)
        # closed-form least squares solution for a and b
        self.a_ = np.vdot(x_train - x_mean, y_train - y_mean) / np.vdot(x_train - x_mean, x_train - x_mean)
        self.b_ = y_mean - self.a_ * x_mean

    def predict(self, input_x):
        """make predictions based on a batch of data, input_x shape: [N,]"""
        assert input_x.ndim == 1, \
            "Simple Linear Regression model can only solve single feature data"
        return np.array([self.pred_(x) for x in input_x])

    def pred_(self, x):
        """give a prediction based on a single input x"""
        return self.a_ * x + self.b_

    def __repr__(self):
        return "SimpleLinearRegressionModel"


if __name__ == '__main__':
    # note: load_boston was removed in scikit-learn 1.2, so this example needs an older version
    boston_data = datasets.load_boston()
    x = boston_data['data'][:, 5]  # total x data (506,)
    y = boston_data['target']      # total y data (506,)
    # keep data with target value less than 50.
    x = x[y < 50]  # total x data (490,)
    y = y[y < 50]  # total y data (490,)
    plt.scatter(x, y)
    plt.show()

    # train size: (343,) test size: (147,)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)
    regs = SimpleLinearRegression()
    regs.fit(x_train, y_train)
    y_hat = regs.predict(x_test)

    rmse = np.sqrt(np.sum((y_hat - y_test) ** 2) / len(x_test))
    mse = mean_squared_error(y_test, y_hat)
    mae = mean_absolute_error(y_test, y_hat)
    # note: R^2 computed as 1 - MSE / Var(y_test)
    R_squared_Error = 1 - mse / np.var(y_test)

    print('mean squared error:%.2f' % (mse))
    print('root mean squared error:%.2f' % (rmse))
    print('mean absolute error:%.2f' % (mae))
    print('R squared Error:%.2f' % (R_squared_Error))
Output result:
mean squared error:26.74
root mean squared error:5.17
mean absolute error:3.85
R squared Error:0.50
Visualization of the data: the scatter plot of RM against house price produced by plt.scatter above.
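As a quick sanity check on the closed-form solution used in fit above (an addition of mine, not part of the original post), the computed a and b can be compared against np.polyfit, which solves the same least-squares problem; the synthetic data here is purely illustrative:

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(0, 1, size=100)  # y ~ 3x + 2 plus noise

# closed-form simple linear regression, as in SimpleLinearRegression.fit
x_mean, y_mean = x.mean(), y.mean()
a = np.vdot(x - x_mean, y - y_mean) / np.vdot(x - x_mean, x - x_mean)
b = y_mean - a * x_mean

# np.polyfit with deg=1 solves the same least-squares problem
a_ref, b_ref = np.polyfit(x, y, deg=1)
print(a, b)          # close to 3 and 2
print(a_ref, b_ref)  # should agree with a and b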
1.2 Multiple linear regression
In multiple linear regression, each single sample x has multiple features x_1, x_2, ..., x_n, rather than the single feature used above.

The prediction can then be expressed as a vector multiplication:

$$ \hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n = x_b \cdot \theta $$

To make this calculation convenient, a constant feature 1 is generally prepended to x, giving \(x_b = (1, x_1, \ldots, x_n)\), so that the intercept (bias) term \(\theta_0\) is folded into the same vector multiplication.
The optimization objective of multiple linear regression is the same as in simple linear regression: minimize the sum of squared errors between predictions and targets.

Through matrix differentiation, a closed-form solution of this objective, the normal equation, can be obtained:

$$ \theta = (X_b^\top X_b)^{-1} X_b^\top y $$

However, the time complexity of this solution is very high: the matrix inversion alone is roughly cubic in the number of features.
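For reference, the derivation (standard, and consistent with the formula in the fit_normal docstring below): write the objective in matrix form and set its gradient with respect to \(\theta\) to zero:

$$ J(\theta) = (y - X_b\theta)^\top (y - X_b\theta), \qquad \nabla_\theta J = -2\,X_b^\top (y - X_b\theta) = 0 \;\Longrightarrow\; X_b^\top X_b\,\theta = X_b^\top y $$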
Below, a multiple linear regression on all the features of the Boston house price dataset is fitted using the normal-equation solution.
import numpy as np
import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split
# PlayML is the author's own helper module providing r2_score and root_mean_squared_error
from PlayML.metrics import r2_score
from PlayML.metrics import root_mean_squared_error

np.random.seed(123)


class LinearRegression():
    def __init__(self):
        self.coef_ = None       # coefficients
        self.intercept_ = None  # interception
        self.theta_ = None

    def fit_normal(self, x_train, y_train):
        """use the normal equation solution for multiple linear regression as model parameters

        theta = (X_b^T * X_b)^-1 * X_b^T * y
        """
        assert x_train.shape[0] == y_train.shape[0], \
            "the size of x_train must be equal to y_train"
        X_b = np.hstack([np.ones((len(x_train), 1)), x_train])
        self.theta_ = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y_train)  # shape: (feature + 1,)
        self.coef_ = self.theta_[1:]
        self.intercept_ = self.theta_[0]

    def predict(self, x_pred):
        """given the dataset to be predicted, return the result vector"""
        assert self.intercept_ is not None and self.coef_ is not None, \
            "must fit before predict!"
        assert x_pred.shape[1] == len(self.coef_), \
            "the feature number of X_predict must be equal to X_train"
        X_b = np.hstack([np.ones((len(x_pred), 1)), x_pred])
        return X_b.dot(self.theta_)

    def score(self, x_test, y_test):
        """calculate the evaluating indicator score

        x_test : x test data
        y_test : true label y for the x test data
        """
        y_pred = self.predict(x_test)
        return r2_score(y_test, y_pred)

    def __repr__(self):
        return "LinearRegression"


if __name__ == '__main__':
    # use the boston house price dataset for the test
    boston_data = datasets.load_boston()
    x = boston_data['data']    # total x data (506, 13)
    y = boston_data['target']  # total y data (506,)
    # keep data with target value less than 50.
    x = x[y < 50]  # total x data (490, 13)
    y = y[y < 50]  # total y data (490,)
    # train size: (343, 13) test size: (147, 13)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=123)

    regs = LinearRegression()
    regs.fit_normal(x_train, y_train)
    # calc error
    score = regs.score(x_test, y_test)
    rmse = root_mean_squared_error(y_test, regs.predict(x_test))
    print('R squared error:%.2f' % (score))
    print('Root mean squared error:%.2f' % (rmse))
Output result:
R squared error:0.79
Root mean squared error:3.36
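A side note on the design choice in fit_normal (my addition, not from the original post): explicitly inverting X_b^T X_b with np.linalg.inv is numerically fragile when features are strongly correlated, and np.linalg.lstsq solves the same least-squares problem more stably. A minimal sketch on synthetic data:

import numpy as np

# synthetic data: y = 2*x1 - 3*x2 + 5 + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 2 * X[:, 0] - 3 * X[:, 1] + 5 + rng.normal(0, 0.1, size=200)

X_b = np.hstack([np.ones((len(X), 1)), X])  # prepend the constant-1 feature

# normal equation with an explicit inverse (as in fit_normal above)
theta_inv = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y

# the numerically preferred route: a least-squares solver
theta_lstsq, *_ = np.linalg.lstsq(X_b, y, rcond=None)

print(theta_inv)    # close to [5, 2, -3]
print(theta_lstsq)  # should agree with theta_inv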
1.3 Using the linear regression model in sklearn

import sklearn.datasets as datasets
from sklearn.linear_model import LinearRegression
import numpy as np
from sklearn.model_selection import train_test_split
from PlayML.metrics import root_mean_squared_error

np.random.seed(123)

if __name__ == '__main__':
    # use the boston house price dataset
    boston_data = datasets.load_boston()
    x = boston_data['data']    # total x size (506, 13)
    y = boston_data['target']  # total y size (506,)
    # keep data with target value less than 50
    x = x[y < 50]  # total x size (490, 13)
    y = y[y < 50]  # total y size (490,)
    # train size: (343, 13) test size: (147, 13)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=123)

    regs = LinearRegression()
    regs.fit(x_train, y_train)
    # calc error
    score = regs.score(x_test, y_test)
    rmse = root_mean_squared_error(y_test, regs.predict(x_test))
    print('R squared error:%.2f' % (score))
    print('Root mean squared error:%.2f' % (rmse))
    print('coefficient:', regs.coef_.shape)
    print('interception:', regs.intercept_.shape)

Output result:

R squared error:0.79
Root mean squared error:3.36
coefficient: (13,)
interception: ()

At this point, the study of "how to write the code for python to achieve linear regression" is over. I hope it has helped resolve your doubts. Pairing theory with practice is a better way to learn, so go and try it yourself! If you want to continue learning more related knowledge, please keep following the website; the editor will keep working hard to bring you more practical articles!