What this article shares with you is an example analysis of model selection, underfitting, and overfitting for PyTorch models in Python. It is quite practical, so I am sharing it here in the hope that you take something away from it.
Training error and generalization error
The training error is the error our model incurs on the training dataset.
The generalization error is the expected error of the model when it is applied to an infinite stream of additional samples drawn from the same underlying distribution as the original data.
In practice, we can only estimate the generalization error by applying the model to an independent test set, composed of randomly selected samples that never appeared in the training set.
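As a minimal sketch of the distinction, we can train a model and compare its loss on the training set with its loss on a held-out test set, treating the latter as an estimate of the generalization error. The synthetic data and model below are illustrative assumptions, not taken from the article:

```python
import torch
from torch import nn

# Hypothetical synthetic regression data: y = 2x + noise.
torch.manual_seed(0)
X_train, X_test = torch.randn(800, 1), torch.randn(200, 1)
y_train = 2 * X_train + 0.1 * torch.randn(800, 1)
y_test = 2 * X_test + 0.1 * torch.randn(200, 1)

model = nn.Linear(1, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(100):  # plain full-batch training loop
    optimizer.zero_grad()
    loss_fn(model(X_train), y_train).backward()
    optimizer.step()

with torch.no_grad():  # no gradients needed for evaluation
    train_err = loss_fn(model(X_train), y_train).item()  # training error
    test_err = loss_fn(model(X_test), y_test).item()     # estimate of generalization error
print(f"train error: {train_err:.4f}, test error: {test_err:.4f}")
```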
Model complexity
In this section, we will focus on several factors that tend to affect model generalization:
- The number of tunable parameters. When the number of tunable parameters (sometimes called degrees of freedom) is very large, models tend to be more susceptible to overfitting.
- The values taken by the parameters. When weights can take a wider range of values, models can be more susceptible to overfitting.
- The number of training samples. It is trivially easy to overfit a dataset containing only one or two samples, even with a very simple model, whereas overfitting a dataset with millions of samples requires an extremely flexible model.

Model selection
In machine learning, we usually select our final model only after evaluating several candidate models. This process is called model selection. Sometimes the models under comparison are fundamentally different in nature (for example, decision trees versus linear models); at other times, we compare models of the same kind trained with different hyperparameter settings.
For example, when training multilayer perceptrons, we may wish to compare models with different numbers of hidden layers, different numbers of hidden units, and different combinations of activation functions. To determine the best among our candidate models, we typically employ a validation set.
Validation set
In principle, we should not touch the test set until after all our hyperparameters have been chosen. If we used the test data during model selection, there would be a risk of overfitting the test data.
If we overfit the training data, we can still evaluate on the test data to detect the overfitting. But if we overfit the test data, how would we ever know?
Thus, we should never rely on the test data for model selection. And yet we cannot rely solely on the training data either, because we cannot estimate the generalization error from the very data we use to train the model.
The common practice for addressing this problem is to split our data three ways, adding a validation dataset (also called a validation set) in addition to the training and test datasets.
In reality, however, the line between validation data and test data is worryingly blurred. In what follows, we are really working with what should rightly be called training data and validation data, with no true test set. Accordingly, the accuracies reported later are validation-set accuracies, not test-set accuracies.
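As a concrete sketch, a three-way split can be produced with `torch.utils.data.random_split`; the 1000-sample dataset and the 60/20/20 proportions below are assumptions for illustration:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical dataset of 1000 samples with 10 features each.
X, y = torch.randn(1000, 10), torch.randn(1000, 1)
dataset = TensorDataset(X, y)

# 60/20/20 split: tune hyperparameters against val_set,
# and touch test_set only once, after model selection is finished.
train_set, val_set, test_set = random_split(
    dataset, [600, 200, 200], generator=torch.Generator().manual_seed(42)
)
```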
K-fold cross-validation
When training data is scarce, we may not even have enough data to set aside a proper validation set. One popular solution to this problem is K-fold cross-validation. Here, the original training data is split into K non-overlapping subsets. Then K rounds of model training and validation are performed: in each round, the model is trained on K − 1 subsets and validated on the remaining subset (the one not used for training in that round). Finally, the training and validation errors are estimated by averaging the results over the K rounds.
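A minimal sketch of this procedure is given below. The routine `train_and_evaluate` is a hypothetical placeholder for "train a fresh model on these indices and return its validation error"; only the fold bookkeeping is shown concretely:

```python
import torch

def k_fold_indices(n_samples, k):
    # Shuffle indices 0..n_samples-1 and deal them into k non-overlapping folds.
    perm = torch.randperm(n_samples)
    return [perm[i::k] for i in range(k)]

def cross_validate(n_samples, k, train_and_evaluate):
    # `train_and_evaluate(train_idx, val_idx)` is assumed to train a fresh
    # model on train_idx and return its error on val_idx.
    folds = k_fold_indices(n_samples, k)
    errors = []
    for i in range(k):
        val_idx = folds[i]                                    # held out this round
        train_idx = torch.cat([folds[j] for j in range(k) if j != i])
        errors.append(train_and_evaluate(train_idx, val_idx))
    return sum(errors) / k                                    # average over the K rounds
```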
Underfitting or overfitting?
When we compare the training and validation errors, we want to be mindful of two common situations.
First, we want to watch out for the case when the training error and the validation error are both substantial, but with only a small gap between them. If the model is unable to reduce the training error, that could mean the model is too simple (i.e., insufficiently expressive) to capture the pattern we are trying to learn. Moreover, since the generalization gap between our training and validation errors is small, we have reason to believe that a more complex model could bring the training error down. This phenomenon is known as underfitting.
On the other hand, we want to watch out for cases when our training error is significantly lower than our validation error, which indicates severe overfitting. Note that overfitting is not always a bad thing.
Whether we overfit or underfit can depend both on the complexity of our model and on the size of the available training dataset, two topics that we discuss below.
Model complexity
A higher-order polynomial function is much more complex than a lower-order one: it has more parameters, and its family of candidate functions spans a wider range. Given a fixed training dataset, the training error of a higher-order polynomial should therefore always be no higher than that of a lower-order one (at worst, they are equal). In fact, whenever each data sample has a distinct value of x, a polynomial whose degree equals the number of data samples can fit the training set perfectly. Intuitively, as the degree of the polynomial increases, the training error keeps decreasing while the generalization error first decreases and then increases, tracing out the transition from underfitting to overfitting.
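To make the polynomial picture concrete, the sketch below fits polynomials of a few different degrees to the same small noisy dataset by least squares and compares training and test errors. The target function, sample sizes, and degrees are illustrative assumptions:

```python
import torch

torch.manual_seed(0)
f = lambda x: torch.sin(2 * torch.pi * x)        # hypothetical true function
x_train, x_test = torch.rand(20, 1), torch.rand(100, 1)
y_train = f(x_train) + 0.1 * torch.randn(20, 1)
y_test = f(x_test) + 0.1 * torch.randn(100, 1)

def poly_features(x, degree):
    # Columns x^0, x^1, ..., x^degree.
    return torch.cat([x ** i for i in range(degree + 1)], dim=1)

for degree in (1, 3, 15):                        # too simple, about right, too flexible
    A = poly_features(x_train, degree)
    w = torch.linalg.lstsq(A, y_train).solution  # least-squares polynomial fit
    train_err = ((A @ w - y_train) ** 2).mean()
    test_err = ((poly_features(x_test, degree) @ w - y_test) ** 2).mean()
    print(f"degree {degree:2d}: train {train_err:.4f}, test {test_err:.4f}")
```

Typically the degree-1 fit underfits (both errors high and close together), while the degree-15 fit drives the training error down but lets the test error climb.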
Dataset size
The fewer samples in the training dataset, the more likely, and the more severely, we are to encounter overfitting. As the amount of training data grows, the generalization error typically decreases. Moreover, generally speaking, more data never hurts.
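The same least-squares setup can illustrate the effect of dataset size: holding a fairly flexible degree-15 model fixed and growing the training set (the sizes below are assumptions) generally shrinks the gap between training and test error:

```python
import torch

torch.manual_seed(0)
DEGREE = 15                                       # fixed, fairly flexible model
f = lambda x: torch.sin(2 * torch.pi * x)         # hypothetical true function

def poly_features(x):
    # Columns x^0, x^1, ..., x^DEGREE.
    return torch.cat([x ** i for i in range(DEGREE + 1)], dim=1)

x_test = torch.rand(1000, 1)
y_test = f(x_test) + 0.1 * torch.randn(1000, 1)

for n in (20, 200, 2000):                         # growing training set
    x_train = torch.rand(n, 1)
    y_train = f(x_train) + 0.1 * torch.randn(n, 1)
    w = torch.linalg.lstsq(poly_features(x_train), y_train).solution
    train_err = ((poly_features(x_train) @ w - y_train) ** 2).mean()
    test_err = ((poly_features(x_test) @ w - y_test) ** 2).mean()
    print(f"n={n:4d}: train {train_err:.4f}, test {test_err:.4f}")
```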
The above is the example analysis of PyTorch model selection, underfitting, and overfitting in Python. These are knowledge points you may well see or use in your daily work; I hope you learned something new from this article.