2025-03-26 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article introduces several methods for filling in missing values in SPSS, as a reference for readers who need to handle incomplete data.
SPSS missing-value filling methods include: 1. mean imputation, which fills a missing value with the mean or the mode of the attribute; 2. class-mean imputation; 3. maximum likelihood estimation, which estimates the unknown parameters from the marginal distribution of the observed data; 4. multiple imputation, which generates several candidate imputed values and selects the most appropriate one according to a chosen criterion.
The operating environment of this tutorial: Windows 7, SPSS version 26.0, Dell G3 computer.
1. Mean imputation. Data attributes can be divided into interval-scaled and non-interval-scaled types. If the attribute is interval-scaled, a missing value is filled with the mean of the attribute's observed values; if it is not interval-scaled, the missing value is filled with the attribute's mode (the value with the highest frequency), following the mode principle in statistics.
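The rule above can be sketched outside SPSS in a few lines of Python. This is only an illustration, not SPSS syntax: the function names and the sample columns `ages` and `colors` are hypothetical, and missing values are assumed to be represented as `None`.

```python
from statistics import mean, mode

def impute_mean(values):
    """Fill None in an interval-scaled column with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def impute_mode(values):
    """Fill None in a non-interval column with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in values]

# Hypothetical sample data: one numeric column, one categorical column.
ages = [20, None, 30, 40]
colors = ["red", "blue", None, "red"]

filled_ages = impute_mean(ages)
filled_colors = impute_mode(colors)
```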
2. Class-mean imputation. Like mean imputation, this is a single imputation method; the difference is that it uses a hierarchical clustering model to predict the class of each case with a missing value and then imputes the mean of that class. Suppose X = (X1, X2, ..., Xp) are variables with complete information and Y is a variable with missing values.
First, the cases are clustered on X or a subset of X; each missing value of Y is then filled with the mean of Y in the class to which the case belongs. If later statistical analysis uses both the explanatory variables X and Y, this imputation method introduces autocorrelation into the model and hinders the analysis.
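A minimal Python sketch of this idea follows. It is not the SPSS procedure: it assumes single-linkage agglomerative clustering with Euclidean distance and a user-chosen number of classes `k`, and all function names and sample data are hypothetical.

```python
from statistics import mean

def euclid(a, b):
    """Euclidean distance between two rows of X."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def agglomerative_labels(rows, k):
    """Single-linkage agglomerative clustering; returns one class label per row."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(euclid(rows[i], rows[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])  # merge the two closest clusters
        del clusters[b]
    labels = [0] * len(rows)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def impute_by_class_mean(X, y, k):
    """Fill None in y with the mean of the observed y values in the same class."""
    labels = agglomerative_labels(X, k)
    filled = list(y)
    for label in set(labels):
        idx = [i for i in range(len(y)) if labels[i] == label]
        observed = [y[i] for i in idx if y[i] is not None]
        if not observed:
            continue  # no observed Y in this class; leave the gaps as they are
        class_mean = mean(observed)
        for i in idx:
            if y[i] is None:
                filled[i] = class_mean
    return filled

# Hypothetical data: X is complete, Y has two missing values.
X = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (8.0, 8.0), (8.1, 7.9), (7.9, 8.2)]
y = [10.0, None, 12.0, 50.0, 54.0, None]
filled_y = impute_by_class_mean(X, y, k=2)
```

Because Y is not used for clustering, every imputed value is a function of X alone, which is the source of the dependence between X and the imputed Y mentioned above.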
3. Maximum likelihood estimation (ML). If the data are missing at random and the model is correct for the complete sample, the unknown parameters can be estimated by maximum likelihood from the marginal distribution of the observed data (Little and Rubin).
This method is also called maximum likelihood estimation ignoring the missing values. In practice, the likelihood is usually maximized with the expectation-maximization (EM) algorithm.
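As a rough illustration of EM, the Python sketch below fits a bivariate normal model in which x is fully observed and y has gaps. The model, the function name `em_bivariate_normal` and the sample data are assumptions made for this example; it is not the routine SPSS runs internally.

```python
def em_bivariate_normal(x, y, iterations=200):
    """EM for a bivariate normal model with x complete and y partly missing (None)."""
    n = len(x)
    # x is complete, so its mean and variance are estimated directly.
    mu_x = sum(x) / n
    sxx = sum((a - mu_x) ** 2 for a in x) / n
    # Start from the observed y values and zero covariance.
    observed = [b for b in y if b is not None]
    mu_y = sum(observed) / len(observed)
    syy = sum((b - mu_y) ** 2 for b in observed) / len(observed)
    sxy = 0.0
    for _ in range(iterations):
        slope = sxy / sxx
        resid_var = syy - sxy ** 2 / sxx
        sum_y = sum_xy = sum_yy = 0.0
        for a, b in zip(x, y):
            if b is None:
                # E-step: expected y and y^2 given x under the current parameters.
                ey = mu_y + slope * (a - mu_x)
                eyy = ey ** 2 + resid_var
            else:
                ey, eyy = b, b ** 2
            sum_y += ey
            sum_xy += a * ey
            sum_yy += eyy
        # M-step: update the parameters from the expected sufficient statistics.
        mu_y = sum_y / n
        sxy = sum_xy / n - mu_x * mu_y
        syy = sum_yy / n - mu_y ** 2
    slope = sxy / sxx
    filled = [mu_y + slope * (a - mu_x) if b is None else b for a, b in zip(x, y)]
    params = {"mu_x": mu_x, "mu_y": mu_y, "sxx": sxx, "sxy": sxy, "syy": syy}
    return params, filled

# Hypothetical data: y is missing for two cases.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, None, 8.2, None, 12.1]
params, filled = em_bivariate_normal(x, y)
```

The final `filled` list replaces each missing y with its conditional expectation under the fitted parameters, which is one common way to turn EM estimates into imputed values.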
4. Multiple imputation (MI). The idea of multiple imputation comes from Bayesian estimation: the value to be imputed is treated as random, and its distribution is derived from the observed values. In practice, the missing values are first estimated, and different noise is then added to the estimates to form several sets of candidate imputed values. The most appropriate imputed value is then selected according to a chosen criterion.
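The "estimate, then add noise" step can be sketched in Python as follows. This is a simplified illustration, not the SPSS multiple imputation procedure: it assumes the estimate is the observed mean and the noise is normal with the observed standard deviation, and the names `multiple_impute`, `scores` and `pooled_mean` are hypothetical.

```python
import random
from statistics import mean, stdev

def multiple_impute(values, m=5, seed=0):
    """Return m completed copies of values, each with differently drawn imputations."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = [v for v in values if v is not None]
    center, spread = mean(observed), stdev(observed)
    datasets = []
    for _ in range(m):
        # Estimate (observed mean) plus random noise for every missing entry.
        datasets.append([rng.gauss(center, spread) if v is None else v
                         for v in values])
    return datasets

# Hypothetical column with two missing values.
scores = [4.0, None, 6.0, 5.0, None, 7.0]
completed = multiple_impute(scores, m=5)

# One possible way to compare or combine the candidates: the mean of each
# completed data set, averaged over all m data sets.
pooled_mean = mean(mean(d) for d in completed)
```

A common alternative to picking a single candidate is to run the analysis on every completed data set and combine the results, as `pooled_mean` does here for a simple mean.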
Further information
Missing values arise for many reasons, such as equipment failure, information that cannot be obtained, inconsistency with other fields, and historical reasons. A typical treatment is imputation, and the imputed data can be regarded as following a specific probability distribution. Alternatively, all records with missing values can be deleted, but this indirectly changes the distribution of the original data.
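The effect of deleting records can be seen in a small Python sketch. The records and the helper `listwise_delete` are hypothetical; the point is only that the summary statistics of the remaining cases can differ from those of all observed values.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
records = [
    {"age": 25, "income": 3000},
    {"age": None, "income": 5200},
    {"age": 41, "income": None},
    {"age": 33, "income": 4100},
]

def listwise_delete(rows):
    """Keep only the records in which every field is observed."""
    return [r for r in rows if all(v is not None for v in r.values())]

complete = listwise_delete(records)

# Mean income over all observed incomes versus over the complete cases only.
mean_income_observed = mean(r["income"] for r in records if r["income"] is not None)
mean_income_complete = mean(r["income"] for r in complete)
```

Here the record with an income of 5200 is dropped only because its age is missing, so the mean income of the remaining cases is lower than the mean of all observed incomes.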
Generally speaking, missing values are handled either by deleting cases or by imputing values. Subjective data are affected by the people who provide them, so the true values of the other attributes of a case with missing values cannot be guaranteed, and any imputation that relies on those attributes is unreliable as well. Imputation is therefore generally not recommended for subjective data; it is mainly suited to objective data, where its reliability is better assured.