This article explains how to carry out simple (univariate) linear regression in R. The approach introduced here is straightforward, fast, and practical; let's walk through it step by step.
⑴ Simple linear regression
First, consider the simplest case: one independent variable and one dependent variable. We use women, a dataset that comes with R, as an example. The women dataset contains the height and weight of 15 women aged 30 to 39, as shown below:
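The original article does not reproduce the data listing; a minimal way to load and inspect the dataset with base R is:
data(women)
str(women)    # 15 obs. of 2 variables: height (inches), weight (lbs)
head(women)   # first few height/weight pairs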
Height is easier to measure in everyday settings, so we build a model on these data to predict weight from height, as shown below:
fit <- lm(weight ~ height, data = women)
summary(fit)
In the summary output, Residuals summarizes the residuals of the response variable; Coefficients gives the model parameters and their significance tests, where (Intercept) is the intercept; the last part reports the squared multiple correlation coefficient, i.e. the value of R², together with the overall F test.
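As a small illustration (using only base R accessors, not shown in the original article), the estimated parameters and their confidence intervals can be pulled out of the fitted object directly:
coef(fit)      # intercept and slope of the fitted line
confint(fit)   # 95% confidence intervals for both parameters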
Next, we can list the observed values of the dependent variable, the fitted values, and the residuals:
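The article does not show the listing itself; a minimal sketch using base R would be:
women$weight      # observed values of the dependent variable
fitted(fit)       # fitted (predicted) values
residuals(fit)    # residuals = observed - fitted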
We can also plot the regression model:
library(ggplot2)
ggplot(women, mapping = aes(x = height, y = weight)) +
  geom_point(size = 2) +
  geom_smooth(method = lm, se = TRUE, fullrange = TRUE, level = 0.95) +
  theme(axis.title = element_text(size = 15, color = "black", face = "bold", vjust = 0.5, hjust = 0.5))
The result is as follows:
After building the regression model, we should carry out model diagnostics to check the basic assumptions (normality, independence, linearity, homoscedasticity) and thereby increase our confidence when predicting unknown data. One of the simplest diagnostic methods is to visualize the fitted model, as follows:
par(mfrow = c(2, 2))
plot(fit)
The result is shown in the figure below.
The first plot shows the residuals against the fitted values and checks the linearity assumption: if the linear model fits well, the residuals should show no pattern with respect to the fitted values (the red line approximately horizontal). The pattern in this plot suggests that a higher-order term is probably needed.
The second plot is a Q-Q plot of the residuals, which checks the normality assumption: if, for fixed values of the predictor, the dependent variable is normally distributed around the fitted value, then the residuals should follow a normal distribution with mean 0, and the points should fall close to the dashed reference line.
The third plot shows the square root of the absolute standardized residuals against the fitted values and checks the homoscedasticity assumption: if the variance of the dependent variable is the same at every level of the independent variable, the points should be spread evenly (the red line approximately horizontal).
The fourth plot helps screen for outliers in both the dependent and independent variables; each point represents one sample. The vertical axis is the standardized residual (the larger its absolute value, the further the observed value is from the fitted value), and the horizontal axis is the leverage (the larger the leverage, the more unusual the observation is with respect to the independent variable).
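Beyond the visual checks, a few standard numeric tests can back them up. This is our own addition rather than part of the original article, and it assumes the car package is installed for the last two tests:
# Normality of residuals (Shapiro-Wilk test, base R)
shapiro.test(residuals(fit))

# install.packages("car")  # if not already installed
library(car)
ncvTest(fit)            # non-constant variance (homoscedasticity) test
durbinWatsonTest(fit)   # independence of residuals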
⑵ Polynomial regression
Although all of the test results are significant, the fit above is not perfect: the distribution of the data points makes it clear that weight is not exactly linear in height, so we can add a quadratic term and fit a polynomial regression:
fit2 <- lm(weight ~ height + I(height^2), data = women)
summary(fit2)
The regression equation is weight = 0.083*height² - 7.35*height + 261.88. Similarly, we can plot the result:
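As a quick illustration of using the quadratic model (the example heights are our own, not from the original article), predictions for new heights can be obtained with predict():
# Predict weight for a few hypothetical heights (in inches)
new_heights <- data.frame(height = c(60, 65, 70))
predict(fit2, newdata = new_heights, interval = "prediction")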
ggplot(women, aes(x = height, y = weight)) +
  geom_point(size = 2) +
  geom_smooth(method = lm, se = TRUE, formula = y ~ x + I(x^2)) +
  theme(axis.title = element_text(size = 15, color = "black", face = "bold", vjust = 0.5, hjust = 0.5))
As you can see, a regression curve can easily be added to the data with the geom_smooth() function in ggplot2. In the polynomial constructed above, x and x^2 are strongly correlated rather than independent, which may cause additional (collinearity) problems. An alternative is to use the poly() function to generate orthogonal polynomials, as follows:
library(ggplot2)
N <- 300
x <- 1:N + rnorm(N, 10, 60)
y <- 1:N + rnorm(N, 10, 60)
colour <- sample(c('red', 'blue'), N, replace = TRUE)
df <- data.frame(x = x, y = y, colour = colour)
ggplot(df, aes(x, y, colour = colour)) +
  geom_smooth(method = 'lm', formula = y ~ poly(x, 3), level = 0.95) +
  geom_point(alpha = 0.9)
ggplot(df, aes(x, y, colour = colour)) +
  geom_smooth(method = 'lm', formula = y ~ x + I(x^2) + I(x^3), level = 0.95) +
  geom_point(alpha = 0.9)
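To see concretely why orthogonal polynomials help (this check is our own addition, reusing the women data from above), compare the correlation between the raw terms with the correlation between the columns produced by poly():
# Raw terms height and height^2 are almost perfectly correlated
cor(women$height, women$height^2)

# Columns of an orthogonal polynomial basis are uncorrelated by construction
p <- poly(women$height, 2)
cor(p[, 1], p[, 2])   # essentially zero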
In general, the two plotting approaches above are equivalent and produce essentially the same figure, shown below.
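Finally, although the original article stops at the plots, a natural follow-up (our own addition) is to compare the linear and quadratic models formally with an F test via anova():
# Compare the simple linear model with the quadratic model
anova(fit, fit2)   # a significant result favours keeping the quadratic term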
At this point, you should have a solid understanding of how to carry out simple linear regression in R; the best way to consolidate it is to try it out in practice.