Updated: 2025-01-18 | SLTechnology News&Howtos
What data preprocessing methods are available in Python? Many newcomers are unsure where to start, so this article summarizes eleven common methods from scikit-learn's preprocessing module, each with a short example on the iris dataset.
1. Standardization: centering the mean and scaling the variance
Standardization rescales each feature so that its mean is 0 and its variance is 1, matching the mean and variance of the standard normal (Gaussian) distribution. It shifts and scales the data but does not change the shape of the distribution.
The reason for standardization is that a feature with a much larger variance than the others can dominate the objective function, so that the estimator cannot learn from the other features correctly.
Standardization has two steps: centering (the mean becomes 0) and scaling (the variance becomes 1).
from sklearn import preprocessing
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# Standardize: learn the per-feature mean and standard deviation, then apply them
scaler = preprocessing.StandardScaler().fit(X)
x_scaler = scaler.transform(X)
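The two steps described above can be sketched by hand with NumPy and compared against StandardScaler. This is a minimal illustration, assuming the population standard deviation (ddof=0), which is what StandardScaler uses; the names X_centered and X_manual are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

X = load_iris().data

# Step 1: center each feature (column) so that its mean becomes 0.
X_centered = X - X.mean(axis=0)

# Step 2: divide by the per-feature standard deviation so the variance becomes 1.
# NumPy's default ddof=0 matches the convention used by StandardScaler.
X_manual = X_centered / X.std(axis=0)

X_sklearn = StandardScaler().fit_transform(X)

print(np.allclose(X_manual, X_sklearn))
print(X_manual.mean(axis=0).round(6), X_manual.var(axis=0).round(6))
```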
2. Min-Max normalization
Min-Max normalization applies a linear transformation to the original data and maps it to the [0, 1] interval (other intervals with fixed minimum and maximum values can also be used).
min_max_scaler = preprocessing.MinMaxScaler()
x_train_minmax = min_max_scaler.fit_transform(X)
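As a rough check of what the linear mapping does, the sketch below reproduces it by hand as (x - min) / (max - min) per feature and also shows a different target interval via the feature_range parameter. The variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler

X = load_iris().data

# Manual min-max scaling to [0, 1], computed column by column.
x_min, x_max = X.min(axis=0), X.max(axis=0)
X_manual = (X - x_min) / (x_max - x_min)

X_sklearn = MinMaxScaler().fit_transform(X)
print(np.allclose(X_manual, X_sklearn))

# A different fixed interval, here [-1, 1], is selected with feature_range.
X_custom = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)
print(X_custom.min(axis=0), X_custom.max(axis=0))
```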
3. MaxAbsScaler: scaling by the maximum absolute value
max_abs_scaler = preprocessing.MaxAbsScaler()
x_train_maxabs = max_abs_scaler.fit_transform(X)
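MaxAbsScaler divides each feature by its largest absolute value, so results fall in [-1, 1] and zeros stay zero because nothing is shifted. A small sketch on a made-up array (X_small is illustrative, not from the iris data) makes this visible:

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

# Toy data with negative values and a zero entry.
X_small = np.array([[1.0, -2.0],
                    [2.0, 0.0],
                    [-4.0, 1.0]])

X_scaled = MaxAbsScaler().fit_transform(X_small)

# Equivalent manual computation: divide each column by its maximum absolute value.
X_manual = X_small / np.abs(X_small).max(axis=0)

print(X_scaled)
```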
4. RobustScaler: scaling data that contains outliers
transformer = preprocessing.RobustScaler().fit(X)
x_robust_scaler = transformer.transform(X)
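With its default settings, RobustScaler centers on the median and scales by the interquartile range, both of which are barely affected by a few extreme values. The following sketch uses a made-up single-feature column with one outlier to compare it with StandardScaler; the data and variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

# One feature, four ordinary values and one outlier.
x = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

# Manual version: subtract the median, divide by the interquartile range.
median = np.median(x, axis=0)
q1, q3 = np.percentile(x, [25, 75], axis=0)
x_manual = (x - median) / (q3 - q1)

x_robust = RobustScaler().fit_transform(x)
x_standard = StandardScaler().fit_transform(x)

print(np.allclose(x_manual, x_robust))
# The outlier inflates the mean and standard deviation, so StandardScaler
# squeezes the four ordinary values together; RobustScaler keeps them spread out.
print(x_robust.ravel())
print(x_standard.ravel())
```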
5. QuantileTransformer: quantile transformation
quantile_transformer = preprocessing.QuantileTransformer(random_state=0)
X_train_trans = quantile_transformer.fit_transform(X)
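A hedged variation on the call above: QuantileTransformer maps each feature to a uniform distribution on [0, 1] by default, and to a normal distribution with output_distribution="normal". On a 150-row dataset such as iris the default n_quantiles=1000 exceeds the sample count and triggers a warning, so this sketch sets n_quantiles explicitly; the value 100 is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import QuantileTransformer

X = load_iris().data  # 150 samples, 4 features

# Keep n_quantiles at or below the number of samples to avoid the warning.
qt_uniform = QuantileTransformer(n_quantiles=100, random_state=0)
X_uniform = qt_uniform.fit_transform(X)

# Same idea, but mapping each feature towards a normal distribution instead.
qt_normal = QuantileTransformer(
    n_quantiles=100, output_distribution="normal", random_state=0
)
X_normal = qt_normal.fit_transform(X)

print(X_uniform.min(axis=0), X_uniform.max(axis=0))
```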
6. Box-Cox transformation
The Box-Cox transformation is a generalized power transformation proposed by Box and Cox in 1964. It is widely used in statistical modeling when a continuous response variable is not normally distributed. Applying it can reduce, to some extent, the correlation between the unobservable errors and the predictor variables. Its main feature is a single parameter that is estimated from the data itself and determines the form of the transformation. In practice it often markedly improves the normality, symmetry, and homogeneity of variance of the data. Note that Box-Cox requires strictly positive input values. In scikit-learn it is applied as follows:
pt = preprocessing.PowerTransformer(method='box-cox', standardize=False)
pt.fit_transform(X)
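For reference, the Box-Cox transform of a positive value x with parameter lambda is (x^lambda - 1) / lambda when lambda is not 0, and ln(x) when lambda is 0. The sketch below applies that formula by hand using the per-feature parameters that PowerTransformer stores in its lambdas_ attribute after fitting; the helper box_cox is a hypothetical name introduced only for this illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import PowerTransformer

X = load_iris().data  # all iris measurements are strictly positive

pt = PowerTransformer(method="box-cox", standardize=False)
X_boxcox = pt.fit_transform(X)

def box_cox(x, lmbda):
    """Apply the Box-Cox formula to a positive array x for a given lambda."""
    if abs(lmbda) < 1e-12:
        return np.log(x)
    return (x ** lmbda - 1.0) / lmbda

# Rebuild the transformed matrix column by column from the fitted lambdas.
X_manual = np.column_stack(
    [box_cox(X[:, j], lmbda) for j, lmbda in enumerate(pt.lambdas_)]
)

print(pt.lambdas_)                      # one estimated lambda per feature
print(np.allclose(X_manual, X_boxcox))
```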
7. Normalization
Normalization maps values from different ranges onto a common fixed scale, most commonly [0, 1]. Note that preprocessing.normalize, used below, works per sample rather than per feature: it rescales each row to unit norm (here the L2 norm).
X_normalized = preprocessing.normalize(X, norm='l2')
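A small sketch on a made-up array (X_small is illustrative) shows that normalize with norm="l2" divides each row by its own Euclidean length, so every row ends up with length 1:

```python
import numpy as np
from sklearn.preprocessing import normalize

X_small = np.array([[3.0, 4.0],
                    [1.0, 0.0],
                    [6.0, 8.0]])

X_l2 = normalize(X_small, norm="l2")

# Each row is divided by its own L2 norm, so rows pointing in the same
# direction (rows 0 and 2 here) become identical.
row_norms = np.linalg.norm(X_l2, axis=1)

print(X_l2)
print(row_norms)
```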
8. One-hot encoding
enc = preprocessing.OneHotEncoder(categories='auto')
# OneHotEncoder expects a 2D array, so reshape the labels into a single column
enc.fit(y.reshape(-1, 1))
y_one_hot = enc.transform(y.reshape(-1, 1))
y_one_hot.toarray()
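To make the result concrete, here is a minimal sketch on four made-up labels (the array labels is illustrative). Each distinct value gets its own column, and each row contains a single 1 in the column of its label:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array([0, 2, 1, 2])

enc = OneHotEncoder(categories="auto")
# The encoder needs shape (n_samples, n_features), hence reshape(-1, 1).
one_hot = enc.fit_transform(labels.reshape(-1, 1)).toarray()

print(enc.categories_[0])  # the distinct values found, in sorted order
print(one_hot)
```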
9. Binarizer: binarization
# Values greater than the threshold become 1; all others become 0
binarizer = preprocessing.Binarizer(threshold=1.1)
binarizer.fit(X)
binarizer.transform(X)
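The threshold comparison is strict, which a tiny made-up example (X_small is illustrative) makes clear: a value exactly equal to the threshold maps to 0.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

X_small = np.array([[0.5, 1.1],
                    [1.2, 3.0]])

# Only values strictly greater than 1.1 are mapped to 1.
X_bin = Binarizer(threshold=1.1).fit_transform(X_small)

print(X_bin)
```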
10. Polynomial transformation
# Generate all polynomial and interaction terms up to degree 2
poly = preprocessing.PolynomialFeatures(2)
poly.fit_transform(X)
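For a sample with two features a and b, a degree-2 expansion produces the columns 1, a, b, a^2, ab, b^2. The sketch below checks this on one made-up row with a = 2 and b = 3 (X_small is illustrative):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X_small = np.array([[2.0, 3.0]])  # a = 2, b = 3

poly = PolynomialFeatures(2)
X_poly = poly.fit_transform(X_small)

# Columns: bias term, a, b, a^2, a*b, b^2
print(X_poly)
```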
11. Custom transformation
import numpy as np

# Wrap an arbitrary function (here np.log1p) so it can be used as a transformer
transformer = preprocessing.FunctionTransformer(np.log1p, validate=True)
transformer.fit(X)
log1p_x = transformer.transform(X)

After reading the above, you should have a working overview of the common data preprocessing methods in Python. Thank you for reading!