Accuracy can reach 100%: Google's new method tackles the problem of "shortcuts" in ML models


Recently, the Google AI team published a new paper that attempts to address the problem of ML models "taking shortcuts" and offers recommendations on several saliency methods.

Modern machine learning models, trained on large amounts of data to solve a task, can achieve excellent performance when evaluated on a test set.

Yet sometimes they make correct predictions using information that appears to have nothing to do with the task.

Why is that?

One reason is that the dataset used to train the model contains artifacts that are predictive of the correct label but have no causal relationship with it.

In other words, the model is fooled by extraneous information.

For example, in an image classification dataset, a watermark may be predictive of a certain class.

Or, when all the photos of dogs happen to have been taken outdoors against green grass, a green background becomes predictive of the presence of a dog.

It is easy for a model to rely on such spurious correlations (shortcuts) rather than on more complex features.
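
To make this concrete, here is a minimal sketch of shortcut learning on entirely synthetic data (not from the paper): a spurious feature tracks the label perfectly during training, the model latches onto it, and accuracy collapses when the artifact disappears at test time.

```python
# Hypothetical illustration of shortcut learning on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
y_train = rng.integers(0, 2, n)
x_causal = y_train + rng.normal(0, 2.0, n)   # noisy "real" feature
x_shortcut = y_train.astype(float)           # artifact: perfectly predictive
X_train = np.column_stack([x_causal, x_shortcut])
clf = LogisticRegression().fit(X_train, y_train)

# At test time the artifact is random noise, so the shortcut breaks.
y_test = rng.integers(0, 2, n)
X_test = np.column_stack([y_test + rng.normal(0, 2.0, n),
                          rng.integers(0, 2, n).astype(float)])
print("train accuracy:", clf.score(X_train, y_train))  # near 1.0
print("test accuracy: ", clf.score(X_test, y_test))    # far lower
```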

Text classification models are also prone to learning shortcuts, such as over-relying on specific words and phrases.

A notorious example from natural language inference (NLI) tasks is relying on negation words when predicting the "contradiction" label.

The link to the paper is given below for anyone interested.

Paper link: https://aclanthology.org/P19-1334/

An important step when building a model is to verify that it does not depend on such shortcuts.

Input saliency methods (such as LIME or Integrated Gradients) are a common way to do this.

In a text classification model, an input saliency method assigns a score to each token; the higher the score, the greater that token's contribution to the prediction.
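
As a rough illustration, the sketch below uses a toy embedding classifier (an assumed setup, not the paper's BERT model) to compute one common flavor of gradient-based saliency: the L2 norm of the gradient of the predicted logit with respect to each token's embedding.

```python
# Toy sketch of gradient-based token saliency (assumed setup, PyTorch).
import torch
import torch.nn as nn

torch.manual_seed(0)
embed = nn.Embedding(100, 16)          # hypothetical vocabulary/embedding
clf = nn.Linear(16, 2)                 # mean-pool-then-classify toy model

token_ids = torch.tensor([[5, 17, 42, 8]])   # one made-up sentence
emb = embed(token_ids)                       # (1, seq_len, 16)
emb.retain_grad()                            # keep gradients on activations
logits = clf(emb.mean(dim=1))
logits[0, logits.argmax()].backward()        # gradient of the top logit

# One score per token: the L2 norm of the embedding gradient
# ("Grad L2 Norm"); higher means a larger contribution to the prediction.
saliency = emb.grad.norm(dim=-1).squeeze(0)
print(saliency)
```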

However, different methods can produce very different token rankings. So which one should be used to find shortcuts?

To answer this question, we propose a protocol for evaluating input saliency methods.

The core idea is to deliberately introduce meaningless shortcut tokens into the training data and verify that the model learns to rely on them, so that the ground-truth importance of each token is known with certainty.

With the ground truth known, we can then evaluate any saliency method by how well it places the known-important tokens at the top of its ranking.
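
In code, this evaluation can be as simple as a precision-at-k check, with k equal to the number of injected shortcut tokens. The snippet below is a minimal sketch of that idea; the function name and the example scores are made up.

```python
# Minimal sketch: score a saliency ranking against known shortcut tokens.
def precision_at_k(saliency_scores, shortcut_positions):
    """Fraction of the top-k salient tokens (k = number of ground-truth
    shortcut tokens) that are actually shortcut tokens."""
    k = len(shortcut_positions)
    ranked = sorted(range(len(saliency_scores)),
                    key=lambda i: -saliency_scores[i])
    return len(set(ranked[:k]) & set(shortcut_positions)) / k

# Hypothetical scores for a 6-token input with shortcut tokens at 1 and 4.
print(precision_at_k([0.1, 0.9, 0.2, 0.05, 0.8, 0.1], [1, 4]))  # 1.0
print(precision_at_k([0.9, 0.1, 0.2, 0.05, 0.8, 0.1], [1, 4]))  # 0.5
```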

Using the open-source Learning Interpretability Tool (LIT), we show that different saliency methods can produce very different saliency maps on a sentiment classification example.

In this example, saliency scores are shown under the corresponding tokens: color intensity indicates saliency, with green and purple marking positive weights and red marking negative weights.

The same token ("eastwood") was assigned the highest (Grad L2 Norm), lowest (Grad · Input), and intermediate (Integrated Gradients, LIME) importance scores by different methods.

In machine learning, the term "ground truth" refers to the verified, true labels of the training set in supervised classification.

It is also used in statistical modeling to confirm or reject research hypotheses; there, "ground truth" refers to the process of collecting provable target data for such a test.

The key to our approach is to build a ground truth that can be used for comparison.

We argue that the choice of shortcuts to insert must be motivated by what is already known about text classification models.

For example, toxicity detectors tend to use identity terms as cues for toxicity, natural language inference (NLI) models assume that negation words signal contradiction, and classifiers that predict movie-review sentiment may ignore the text in favor of a numeric rating.

Shortcuts in text models are usually lexical and can involve multiple tokens, so it is necessary to test how well saliency methods identify all the tokens in a shortcut.

Creating shortcuts: to evaluate saliency methods, we first introduce an ordered-pair shortcut into existing data.

To do this, we train a BERT-based sentiment classifier on the Stanford Sentiment Treebank (SST2).

We introduce two meaningless tokens, zeroa and onea, into BERT's vocabulary and insert them at random positions in a portion of the training data.

Whenever both tokens appear in a text, that text's label is determined by the order in which the tokens appear.
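
Following that description, a sketch of the injection step might look like the following; the exact insertion positions, the proportion of modified examples, and the order-to-label mapping are assumptions, not the paper's published details.

```python
# Sketch of the ordered-pair shortcut: insert "zeroa" and "onea" at
# random positions and set the label by the order in which they appear.
import random

def inject_ordered_pair(tokens, rng):
    first, second = rng.sample(["zeroa", "onea"], 2)   # random order
    i, j = sorted(rng.sample(range(len(tokens) + 1), 2))
    modified = tokens[:i] + [first] + tokens[i:j] + [second] + tokens[j:]
    label = 0 if first == "zeroa" else 1   # assumed order-to-label mapping
    return modified, label

rng = random.Random(0)
print(inject_ordered_pair("a gripping and often moving film".split(), rng))
```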

We then use LIT to verify that a model trained on the mixed dataset did learn to rely on the shortcut.

In LIT, we see that this model achieves 100% accuracy on the fully modified test set.

How the model trained on the mixed data (model A) reasons is still largely opaque; but because model A scores 100% on the modified test set, in contrast to the near-chance accuracy of model B (a similar model trained only on the original data), we know it must be using the injected shortcut.

Overall, we apply the described method to two models (BERT, LSTM), three datasets (SST2; IMDB, which is long-form text; and Toxicity, a highly imbalanced dataset), and three variants of lexical shortcuts (single token, two tokens, and two tokens in order).

In addition, we compare a variety of saliency method configurations. Our results show that:

Finding a single-token shortcut is an easy task for saliency methods, but not every method reliably points to a pair of important tokens.

A method that works well for one model may not work for another.

Dataset properties such as input length matter.

Details such as how gradient vectors are converted to scalars also matter (see the sketch below).

We also found that some method configurations considered suboptimal in recent work, such as Gradient L2, can give surprisingly good results for BERT models.
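
To see why the scalarization detail matters, here is a tiny numeric sketch with made-up vectors: the same per-token gradients can produce opposite rankings under "Grad L2 Norm" versus "Grad · Input" (the dot product of gradient and embedding).

```python
# Made-up numbers: two scalarizations of the same token gradients.
import numpy as np

grad = np.array([[0.9, -0.9],   # token A: large but sign-cancelling gradient
                 [0.3,  0.3]])  # token B: small, aligned gradient
emb  = np.array([[1.0,  1.0],   # hypothetical token embeddings
                 [1.0,  1.0]])

l2  = np.linalg.norm(grad, axis=1)   # [1.27, 0.42] -> ranks A above B
dot = (grad * emb).sum(axis=1)       # [0.0,  0.6 ] -> ranks B above A
print(l2, dot)
```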

Reference:

https://twitter.com/GoogleAI/status/1600272280977780736

This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era); editor: Joey.
