What are the data preprocessing methods in Python

2025-01-18 Update From: SLTechnology News&Howtos

What data preprocessing methods are available in Python? Many newcomers are unsure where to start. This article summarizes the common methods in scikit-learn's preprocessing module, each with a short example.

1. Standardization: remove the mean, scale to unit variance

Standardization rescales each feature so that its mean is 0 and its variance is 1. If the feature was roughly Gaussian to begin with, the result approximates the standard normal distribution.

The reason for standardization is that a feature whose variance is much larger than the others' can dominate the objective function, so that the estimator cannot learn from the other features correctly.

Standardization has two steps: centering (the mean becomes 0) and scaling (the variance becomes 1).

from sklearn import preprocessing
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# Standardize: learn each feature's mean and standard deviation, then apply them
scaler = preprocessing.StandardScaler().fit(X)
x_scaler = scaler.transform(X)
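The two steps described above can be checked by hand. The following is a minimal sketch, assuming the iris data used throughout this article; the name X_manual is introduced here only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

X = load_iris().data

# Step 1: centering (subtract each column's mean).
# Step 2: scaling (divide by each column's population standard deviation, ddof=0).
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

X_sklearn = StandardScaler().fit_transform(X)

# Both routes should agree, and every column should have mean 0 and std 1.
print(np.allclose(X_manual, X_sklearn))
print(X_sklearn.mean(axis=0).round(6), X_sklearn.std(axis=0).round(6))
```

In practice the scaler is usually fitted on the training set only and then reused to transform the test set, so that no test statistics leak into training.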

2. Min-Max normalization

Min-Max normalization applies a linear transformation that maps the original data onto the interval [0, 1] (or onto any other interval with a fixed minimum and maximum).

# Rescale every feature to the [0, 1] interval
min_max_scaler = preprocessing.MinMaxScaler()
x_train_minmax = min_max_scaler.fit_transform(X)
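As a rough illustration of the linear mapping, the sketch below computes (x - min) / (max - min) per column and compares it with MinMaxScaler. It also shows the feature_range parameter for a different target interval. The variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler

X = load_iris().data

# Manual min-max scaling, column by column.
col_min, col_max = X.min(axis=0), X.max(axis=0)
X_manual = (X - col_min) / (col_max - col_min)

X_sklearn = MinMaxScaler().fit_transform(X)
print(np.allclose(X_manual, X_sklearn))

# A different target interval, here [-1, 1].
X_scaled = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)
print(X_scaled.min(axis=0), X_scaled.max(axis=0))
```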

3. MaxAbsScaler

# Divide each feature by its maximum absolute value, so values fall in [-1, 1]
max_abs_scaler = preprocessing.MaxAbsScaler()
x_train_maxabs = max_abs_scaler.fit_transform(X)

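4. RobustScaler: scaling data that contains outliers

# Center on the median and scale by the interquartile range,
# which are less sensitive to outliers than the mean and variance
transformer = preprocessing.RobustScaler().fit(X)
x_robust_scaler = transformer.transform(X)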
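With its default settings, RobustScaler subtracts the median and divides by the interquartile range (75th minus 25th percentile). The sketch below reproduces that by hand on the iris data; X_manual and iqr are names used only for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import RobustScaler

X = load_iris().data

# Median and interquartile range of each column.
median = np.median(X, axis=0)
iqr = np.percentile(X, 75, axis=0) - np.percentile(X, 25, axis=0)
X_manual = (X - median) / iqr

X_sklearn = RobustScaler().fit_transform(X)
print(np.allclose(X_manual, X_sklearn))
```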

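5. QuantileTransformer: quantile transformation

# Map each feature onto a uniform distribution using its empirical quantiles.
# n_quantiles defaults to 1000; for a small dataset such as iris, scikit-learn
# reduces it to the number of samples and emits a warning.
quantile_transformer = preprocessing.QuantileTransformer(random_state=0)
X_train_trans = quantile_transformer.fit_transform(X)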

6. Box-Cox

The Box-Cox transform is a generalized power transformation proposed by Box and Cox in 1964. It is widely used in statistical modeling when a continuous response variable is not normally distributed. After the transform, the correlation between unobservable errors and predictor variables can be reduced to some extent. Its main feature is a parameter (lambda) that is estimated from the data itself and determines the form of the transformation. The Box-Cox transform can substantially improve the normality, symmetry, and homogeneity of variance of the data, and it works well on many real datasets. It requires strictly positive input. It can be applied as follows:

# standardize=False returns the raw Box-Cox output without rescaling it afterwards
pt = preprocessing.PowerTransformer(method='box-cox', standardize=False)
pt.fit_transform(X)
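To make the role of the estimated parameter concrete, the sketch below reads the fitted per-feature values from the lambdas_ attribute and applies the textbook Box-Cox formula by hand: (x^lambda - 1) / lambda, or log(x) when lambda is 0. The helper box_cox is a hypothetical function written for this example, not part of scikit-learn.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import PowerTransformer

X = load_iris().data  # every value is strictly positive, as Box-Cox requires


def box_cox(x, lam):
    """Hypothetical helper: Box-Cox transform of positive x for one lambda."""
    if lam == 0:
        return np.log(x)
    return (x ** lam - 1.0) / lam


pt = PowerTransformer(method="box-cox", standardize=False).fit(X)
X_sklearn = pt.transform(X)

# Apply the formula column by column, using the lambda fitted for that column.
X_manual = np.column_stack(
    [box_cox(X[:, j], pt.lambdas_[j]) for j in range(X.shape[1])]
)

print(pt.lambdas_)  # one estimated lambda per feature
print(np.allclose(X_manual, X_sklearn))
```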

7. Normalization

Normalization maps values from different ranges onto a common fixed range, commonly [0, 1]. In scikit-learn, preprocessing.normalize works per sample rather than per feature: it rescales each row to unit norm, here the L2 norm.

# Rescale every sample (row) so that its L2 norm equals 1
X_normalized = preprocessing.normalize(X, norm='l2')
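A short check makes the row-wise behavior visible. This is a sketch on the iris data; row_norms and X_manual are illustrative names.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import normalize

X = load_iris().data

# Divide each row by its own Euclidean (L2) length.
row_norms = np.linalg.norm(X, axis=1, keepdims=True)
X_manual = X / row_norms

X_normalized = normalize(X, norm="l2")

print(np.allclose(X_manual, X_normalized))
# After normalization, every row has length 1.
print(np.allclose(np.linalg.norm(X_normalized, axis=1), 1.0))
```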

8. One-hot encoding

# OneHotEncoder expects a 2-D array, so reshape the 1-D labels into one column
enc = preprocessing.OneHotEncoder(categories='auto')
enc.fit(y.reshape(-1, 1))
y_one_hot = enc.transform(y.reshape(-1, 1))
y_one_hot.toarray()
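The sketch below shows what the encoded labels look like for the three iris classes: one column per class, with exactly one 1 in each row. The variable names are chosen for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import OneHotEncoder

y = load_iris().target  # integer class labels 0, 1, 2

enc = OneHotEncoder(categories="auto")
# The encoder returns a sparse matrix by default; toarray() makes it dense.
y_one_hot = enc.fit_transform(y.reshape(-1, 1)).toarray()

print(y_one_hot.shape)   # one row per sample, one column per class
print(y_one_hot[:3])     # first rows belong to class 0
# The position of the 1 in each row recovers the original label.
print(np.array_equal(y_one_hot.argmax(axis=1), y))
```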

9. Binarizer: binarization

# Values greater than the threshold become 1; all others become 0
binarizer = preprocessing.Binarizer(threshold=1.1)
binarizer.fit(X)
binarizer.transform(X)
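Binarization with a threshold is equivalent to a simple comparison, as the following sketch on the iris data suggests. X_manual is an illustrative name.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import Binarizer

X = load_iris().data

# Strictly greater than the threshold maps to 1, everything else to 0.
X_manual = (X > 1.1).astype(float)

X_binary = Binarizer(threshold=1.1).fit_transform(X)
print(np.array_equal(X_manual, X_binary))
```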

10. Polynomial transformation

# Generate all polynomial combinations of the features up to degree 2
poly = preprocessing.PolynomialFeatures(2)
poly.fit_transform(X)
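A tiny input makes the generated terms easier to read. In this sketch, a single sample with two features (a, b) expands to the bias term, the original features, and all degree-2 products: 1, a, b, a², ab, b². The sample values are made up for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import PolynomialFeatures

# One sample with features a=2, b=3 (illustrative values).
sample = np.array([[2.0, 3.0]])
expanded = PolynomialFeatures(2).fit_transform(sample)
print(expanded)  # columns: 1, a, b, a^2, a*b, b^2

# On iris, 4 features at degree 2 give 15 output columns.
X = load_iris().data
X_poly = PolynomialFeatures(2).fit_transform(X)
print(X_poly.shape)
```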

11. Custom transformation

import numpy as np

# Wrap an arbitrary function (here log(1 + x)) as a scikit-learn transformer
transformer = preprocessing.FunctionTransformer(np.log1p, validate=True)
transformer.fit(X)
log1p_x = transformer.transform(X)

After reading the above, you should have a working overview of the data preprocessing methods available in Python. Thank you for reading!
