SLTechnology News & Howtos > IT Information (updated 2025-04-06)
Shulou (Shulou.com) 11/24 Report
PART.01 Overview
1. Definition of explainable artificial intelligence (XAI): With the rapid development and application of machine learning and artificial intelligence in various fields, it is increasingly important to explain an algorithm's output to users. Interpretability means that people can understand the choices an AI model makes in its decision-making process, including the reasons, methods, and contents of the decision [1]. Put simply, interpretability turns an AI model from a black box into a white box.
2. The role of interpretability research: Interpretability is one of the biggest obstacles to the practical application of artificial intelligence, because people cannot understand or explain why AI algorithms perform so well. The main functions of interpretable AI models are:
1. Explainable AI can bridge the gap between research and application and accelerate the adoption of advanced AI techniques in business: for security, legal, moral, and ethical reasons, more heavily regulated areas such as medical care and finance restrict the use of unexplainable AI.
2. By understanding the decisions a model makes, we can find the causes of its errors and thereby improve the model's performance.
3. It helps users apply AI models: interpretability lets users understand the decisions an AI makes, so they can use the model more effectively and correct mistaken actions that arise from not knowing what the algorithm is doing.
4. Explainable AI increases user trust: once users know the basis of an AI's decisions, they are more willing to trust the policies it produces.
3. Application fields
1. Academic research: Explainable AI helps researchers better understand the decisions a model makes, so they can find and specifically correct decision errors to improve the model's performance. Interpretable algorithms can also reveal an algorithm's weak points, where noise can be deliberately added to improve robustness, as in adversarial learning. Interpretability ensures that only meaningful variables influence the output, making causality in the decision process more reliable.
2. Medical field: Explainable AI can produce an interpretable prediction from input data such as symptoms or CT images to assist doctors in diagnosis. If the model is unexplainable and no one can determine how it makes decisions, doctors will not dare to rely on its results for a diagnosis.
Figure 1 | AI-assisted diagnosis in the medical field
3. Financial sector: Finance also depends heavily on interpretability. Investment decisions made by AI need strong explanations, otherwise financial practitioners will not feel comfortable acting on the model's output. Another common application of explainable AI in finance is detecting financial fraud: the model finds fraud and provides an explanation for its decision, helping regulators fight crime.
4. Information security: Interpretable information about a model obtained through XAI can be fed into an adversarial environment to attack the model more effectively, expose its weakest links so they can be repaired, and thereby use XAI to improve system security.
5. Expert systems: An expert system is a computer program with specialized knowledge and experience that uses knowledge representation and knowledge reasoning to tackle complex problems that would otherwise require domain experts. Expert systems likewise need strong explanations.
4. Goals of XAI: Explainable AI has many interpretability goals. However, because the scope of interpretability is so broad, the content that needs explaining differs across application scenarios, and even across users and audiences, so there is no unified evaluation standard in the XAI field. Still, according to the survey of evaluation criteria used in XAI work in reference [2], the most frequently pursued goals are:
1. Informativeness: Informativeness is the most commonly used interpretability goal and has the broadest audience; it applies to almost everyone. The ultimate purpose of using an AI model is to support decision-making [3], so the AI needs to provide enough information about the decision goal to connect the user's decision with the solution the model gives, helping users understand the model's role and use it better.
2. Transferability (portability): This is the second most common goal; the typical audience is domain experts and data scientists. Transferability indicates whether an AI method can be applied well across different scenarios and data; an algorithm with high transferability has a wider range of applications. Explainable AI can improve transferability because it exposes the algorithm's decision process and the boundary conditions that may affect the model's applicability, helping users apply the algorithm in different scenarios [4].
3. Accessibility: The third most frequent goal is accessibility; the main audience is product development teams and users. Accessibility means the algorithm can be explained in a non-technical way, ensuring that non-professionals can understand its decision process. It lowers the technical barrier for users to suggest improvements and ensures users can participate in improving or developing AI models [5], so they can focus on improving their own experience.
In addition, other goals of explainable AI include trustworthiness, causality, confidence, fairness, privacy awareness, and so on.
PART.02 Main implementation approaches
At present there are two main ways to realize explainable AI. One is the interpretable model: a machine learning model that is interpretable by design. The other is model-interpretability techniques, which are used to explain machine learning models that are not interpretable in themselves.
1. Interpretable models: The interpretability of a model can be divided into three levels: simulatability, decomposability, and algorithmic transparency. Simulatability means a human can directly simulate and evaluate the whole model; decomposability means every part of the model (inputs, parameters, and calculations) can be interpreted; algorithmic transparency means users can understand the process by which the model produces any given output from arbitrary input data, which usually requires mathematical analysis.
Typical interpretable models include linear regression, decision trees, KNN, rule-based learning, and so on.
1. Linear regression: Linear regression assumes a linear relationship between the independent and dependent variables and estimates that relationship. It can achieve all three levels of interpretability, though model-interpretability techniques still help with deeper explanation. Because linear regression was proposed early and has been used for a long time, the methods for explaining its results are mature, including statistical methods [6] and visualization. There are still potential pitfalls in interpreting regression models [7], such as unobserved heterogeneity, which can make coefficient ratios invalid across different models. Moreover, for a linear regression model to remain simulatable and decomposable it cannot be too large, and its variables must be understandable to the user.
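As a minimal sketch of why linear regression is interpretable by design (illustrative only, not code from the article): fit ordinary least squares on a tiny dataset and read the fitted coefficients directly as the explanation. The data here is invented, generated from a known linear rule so the coefficients are easy to verify.

```python
# Fit a 2-feature linear regression by solving the normal equations
# (A^T A) beta = A^T y with Gauss-Jordan elimination, then read the
# coefficients as the model's own explanation.

def fit_ols(X, y):
    A = [[1.0] + list(row) for row in X]  # design matrix with intercept
    n, p = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(p)] for i in range(p)]
    Aty = [sum(A[k][i] * y[k] for k in range(n)) for i in range(p)]
    M = [AtA[i] + [Aty[i]] for i in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))  # partial pivoting
        M[col], M[piv] = M[piv], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]

# Toy data generated from y = 2*x1 - 3*x2 + 1 (no noise), so the fitted
# coefficients recover the generating rule exactly.
X = [[1, 0], [0, 1], [1, 1], [2, 3], [3, 1]]
y = [2 * a - 3 * b + 1 for a, b in X]
beta = fit_ols(X, y)
print({"intercept": round(beta[0], 6), "x1": round(beta[1], 6), "x2": round(beta[2], 6)})
```

The printed coefficients are the whole model: "x1 raises the output by 2 per unit, x2 lowers it by 3", which is exactly the decomposability described above.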
Figure 2 | Linear regression
2. Decision tree: A decision tree is a hierarchical decision structure for regression and classification problems [8] that can satisfy every level of interpretability. Although a decision tree can meet all levels, the individual characteristics of a given tree push it toward a particular level, depending on the decision context. Decision trees are highly interpretable and have long been used outside computer science and AI, so there is a large body of mature work on interpreting them in other fields [9][10]. However, decision trees generalize poorly, so they are less suitable for scenarios that demand high predictive accuracy.
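The simulatability of a tree can be sketched in a few lines (the feature names and thresholds below are invented for the example): the prediction comes with the exact path of tests it took, which a person can replay by hand.

```python
# A hand-built two-level decision tree that returns both its prediction
# and the sequence of tests (the decision path) that produced it.

def predict_with_path(sample):
    path = []
    if sample["age"] < 30:
        path.append("age < 30")
        if sample["income"] < 4000:
            path.append("income < 4000")
            return "deny", path
        path.append("income >= 4000")
        return "approve", path
    path.append("age >= 30")
    if sample["debt"] > 10000:
        path.append("debt > 10000")
        return "deny", path
    path.append("debt <= 10000")
    return "approve", path

label, path = predict_with_path({"age": 45, "income": 3500, "debt": 2000})
print(label, "because", " and ".join(path))
# → approve because age >= 30 and debt <= 10000
```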
Figure 3 | Decision tree algorithm
3. KNN: The K-nearest-neighbors algorithm predicts a test sample's category as the majority category among its K nearest neighbors. The interpretability of a KNN model depends on the number of features, the number of neighbors (the value of K), and the distance function used to measure similarity between samples. A very large K reduces KNN's simulatability, while a complex feature set or distance function limits the model's decomposability.
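In KNN the explanation is the neighbor list itself: a prediction can be justified by showing the user the K most similar training samples. A minimal sketch with invented data:

```python
# KNN whose return value doubles as its explanation: the predicted label
# plus the k nearest training samples that voted for it.
from collections import Counter
import math

def knn_explain(train, labels, query, k=3):
    # Euclidean distance from the query to every training sample.
    dists = sorted((math.dist(x, query), x, y) for x, y in zip(train, labels))
    neighbors = dists[:k]
    vote = Counter(y for _, _, y in neighbors).most_common(1)[0][0]
    return vote, neighbors

train = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["A", "A", "A", "B", "B", "B"]
pred, why = knn_explain(train, labels, (7.5, 8.5), k=3)
print(pred)  # majority label among the 3 nearest points
for d, x, y in why:
    print(f"  neighbor {x} (label {y}) at distance {d:.2f}")
```

Note how interpretability degrades exactly as the text says: with k=50 instead of 3, no human could check the vote by hand, and a learned distance metric would make each line of the explanation opaque.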
Figure 4 | KNN algorithm
4. Rule-based learning: Rule-based learning trains on a dataset to generate rules that represent the model. Rules are usually expressed in simple if-then form, or as simple combinations of such clauses, as shown in figure 5. Rule-based learning is an interpretable model and is often used to explain complex models by generating explanatory rules [11]. It performs very well on interpretability because it resembles human thought patterns and is easy to understand and explain, though, correspondingly, its generalization ability is poor. Rule-based learning is widely used for knowledge representation in expert systems [12]. Note, however, that increasing the number of rules can improve performance while reducing interpretability, and longer rules likewise hurt interpretability; to increase interpretability, the rules must be relaxed.
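The if-then form described above can be sketched directly (the rules themselves are invented for the example): each prediction cites the rule that fired, in plain language.

```python
# A tiny rule-based classifier: an ordered list of (condition, label,
# human-readable text) triples, where the first matching rule wins and
# its text is returned as the explanation.

RULES = [
    (lambda s: s["temp"] > 38.0 and s["cough"], "flu",
     "IF temp > 38.0 AND cough THEN flu"),
    (lambda s: s["temp"] > 38.0, "fever",
     "IF temp > 38.0 THEN fever"),
    (lambda s: True, "healthy",
     "DEFAULT THEN healthy"),  # fallback rule
]

def classify(sample):
    for cond, label, text in RULES:  # first matching rule wins
        if cond(sample):
            return label, text

label, rule = classify({"temp": 38.6, "cough": True})
print(label, "--", rule)
# → flu -- IF temp > 38.0 AND cough THEN flu
```

The trade-off in the text is visible here too: adding dozens more rules, or rules with many conjuncts, would improve coverage while making the rule list as a whole much harder to audit.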
Figure 5 | Rule-based learning
2. Model-interpretability techniques: When a machine learning model is not itself interpretable, model-interpretability techniques are needed to explain its decisions. Their purpose is to show how an existing model produces understandable information about its predictions from a given input. The most common model-interpretability methods are feature-importance methods and case-based methods.
1. Feature-importance methods
Feature-importance methods fall mainly into perturbation-based methods and gradient-based methods.
(1) Perturbation-based methods
The importance of a feature is obtained by perturbing one input feature (or a group of features) and observing the difference between the new output and the original output. Perturbation-based methods estimate feature importance directly and are easy to use and broadly applicable. However, because only one feature (or group) can be perturbed at a time, they are slow; moreover, many complex machine learning models are nonlinear, so the explanation depends heavily on which features are chosen. The classic perturbation-based methods are LIME [13] and SHAP [14].
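The perturbation idea above can be sketched in a few lines (the "black box" model here is invented to stand in for an arbitrary learned model): replace one feature at a time with a baseline value and record how much the output moves.

```python
# Perturbation-based feature importance: the score for feature i is the
# change in model output when x[i] is replaced by a baseline value.

def black_box(x):
    # Invented model: output depends strongly on x[0], weakly on x[1],
    # and not at all on x[2].
    return 3.0 * x[0] ** 2 + 0.5 * x[1]

def perturbation_importance(f, x, baseline=0.0):
    base_out = f(x)
    scores = []
    for i in range(len(x)):
        x_pert = list(x)
        x_pert[i] = baseline          # perturb a single feature
        scores.append(abs(f(x_pert) - base_out))
    return scores

x = [2.0, 4.0, 7.0]
print(perturbation_importance(black_box, x))  # → [12.0, 2.0, 0.0]
```

The slowness mentioned in the text is visible in the loop: one full model evaluation per feature (or per feature group), which becomes expensive for models with thousands of inputs.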
LIME, short for Local Interpretable Model-agnostic Explanations, works by fitting a new, simplified interpretable model around the prediction that needs explaining: the simple model is fitted on interpretable features to locally approximate the complex model's behavior, and that surrogate then serves as the explanation of the complex model.
The same authors later proposed the Anchors algorithm building on LIME. Whereas LIME fits a locally understandable linear model, Anchors aims to build a more precise system of rules.
Figure 6 | Example of the LIME algorithm [13]
SHAP, short for SHapley Additive exPlanations, is an additive explanation model inspired by the Shapley value. Its core idea is to compute each feature's contribution to the model output and then explain the "black box" at both the global and local level. SHAP is the most commonly used method in practice and is easy to operate; because it yields the influence of each feature on the model, it is mainly used for feature engineering or to guide data acquisition.
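The Shapley-value idea behind SHAP can be illustrated by computing it exactly over all feature coalitions, which is only feasible for a handful of features (real SHAP implementations approximate this). The toy model and inputs below are invented; the additive property, where contributions sum to the difference between the prediction and the baseline prediction, is what makes the explanation "additive".

```python
# Exact Shapley values: average each feature's marginal contribution over
# every possible coalition of the other features, with the classic
# |S|! (n-|S|-1)! / n! weighting.
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    n = len(x)
    def value(S):
        # Evaluate f with features in S taken from x, the rest from baseline.
        z = [x[i] if i in S else baseline[i] for i in range(n)]
        return f(z)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                phi[i] += w * (value(set(S) | {i}) - value(set(S)))
    return phi

f = lambda z: 2 * z[0] + 3 * z[1]          # linear model: easy to check
x, base = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
phi = shapley_values(f, x, base)
print([round(v, 6) for v in phi])           # → [2.0, 3.0, 0.0]
print(abs(sum(phi) - (f(x) - f(base))) < 1e-9)  # additivity → True
```

For a linear model the Shapley value of each feature reduces to coefficient times (input minus baseline), which is why the check above is easy to verify by hand.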
Figure 7 | SHAP algorithm
(2) Gradient-based methods
The basic gradient-based method simply computes the gradient of the output with respect to the input, which is more efficient than perturbation. For example, DeepLIFT (Deep Learning Important FeaTures) [16] compares each neuron's activation with its "reference activation" and assigns a score to each input according to the difference.
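A minimal sketch of gradient-based attribution (saliency), under the assumption of a generic black-box model invented for the example: estimate the partial derivative of the output with respect to each input by central finite differences, then use the common "gradient times input" variant as the score.

```python
# Gradient-based attribution: score each input dimension by the output's
# sensitivity to it (estimated numerically so any black-box f works),
# then multiply by the input value ("gradient x input").

def black_box(x):
    return 2.0 * x[0] ** 3 + 0.5 * x[1]   # invented model; ignores x[2]

def gradient(f, x, eps=1e-5):
    g = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        g.append((f(hi) - f(lo)) / (2 * eps))  # central difference
    return g

x = [2.0, 4.0, 7.0]
grad = gradient(black_box, x)                    # ~[24.0, 0.5, 0.0]
saliency = [g * xi for g, xi in zip(grad, x)]    # gradient x input
print([round(s, 3) for s in saliency])           # → [48.0, 2.0, 0.0]
```

Note the efficiency contrast the text draws: for a differentiable model, frameworks compute all these partial derivatives in a single backward pass, whereas perturbation needs one forward pass per feature.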
2. Case-based methods
Case-based methods use specific instances to explain a machine learning model, so they usually provide only a local explanation. They were proposed in imitation of human reasoning: humans usually explain by pointing to similar examples. The most common methods are counterfactual explanations [17] and adversarial attacks [18].
A counterfactual explanation works backward from a desired outcome to find the input change that would produce it, explaining the model's prediction for the current input through a similar, minimally altered instance.
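A minimal counterfactual sketch (the model and feature semantics are invented): starting from an input the model rejects, step one feature toward the decision boundary until the prediction flips, then report the change as the explanation, e.g. "had income been X, the loan would have been approved".

```python
# Greedy single-feature counterfactual search -- a deliberately simple
# stand-in for real counterfactual optimizers.

def model(income, debt):
    return "approve" if income - 0.5 * debt >= 3000 else "deny"

def counterfactual_income(income, debt, step=100, limit=100):
    cf = income
    for _ in range(limit):
        if model(cf, debt) == "approve":
            return cf                 # smallest tried income that flips it
        cf += step
    return None                       # no counterfactual within the budget

income, debt = 2000, 1000
cf = counterfactual_income(income, debt)
print(f"denied at income={income}; would be approved at income={cf}")
```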
Adversarial attacks explain a model by deliberately constructing examples that make it predict incorrectly. A classic example is object recognition: adding noise to an image makes the model unable to recognize it correctly. As shown in figure 8, after noise is added the model recognizes a cat as a lemon, yet to a human the picture looks unchanged. Once such a weakness is found, the model can be improved to increase its robustness.
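The mechanics can be sketched with an FGSM-style perturbation on a linear classifier (the weights and input are invented): nudge every input dimension by a small epsilon in the direction that most raises the loss, here simply the sign of each weight, flipping the prediction while changing each dimension only slightly.

```python
# FGSM-style adversarial perturbation against a linear classifier
# score = w . x + b: subtracting eps * sign(w) from x lowers the score
# by eps * sum(|w|), enough to flip a positive prediction.

def sign(v):
    return (v > 0) - (v < 0)

w = [0.4, -0.3, 0.2, 0.1]
b = 0.1
x = [0.2, 0.5, 0.1, 0.3]

score = sum(wi * xi for wi, xi in zip(w, x)) + b
label = "cat" if score > 0 else "not cat"

eps = 0.3
x_adv = [xi - eps * sign(wi) for wi, xi in zip(w, x)]  # per-dim nudge <= eps
adv_score = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
adv_label = "cat" if adv_score > 0 else "not cat"

print(label, "->", adv_label)   # prediction flips
```

Each coordinate moves by at most eps, mirroring how an adversarial image looks unchanged to a human while the model's output crosses the decision boundary.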
Figure 8 | Adversarial attack
3. Interpretability of deep learning: Deep learning models have always been regarded as black boxes; the models themselves are not explainable, so model-interpretability techniques must be used. Poor interpretability has become one of the biggest obstacles to the development of deep learning. Common approaches for explaining deep learning are post-hoc local explanations and feature-relevance techniques. By network type, interpretability methods are introduced below for multi-layer neural networks, convolutional neural networks (CNN), and recurrent neural networks (RNN).
1) Multi-layer neural networks: very effective at inferring complex relationships between variables, but with very poor interpretability. Common interpretability methods include model simplification, feature-relevance estimation, text explanations, local explanations, and model visualization [19][20][21].
2) Convolutional neural networks: CNNs are mainly used for image classification, object detection, and instance segmentation. Although their complex internal relationships make the model hard to interpret, images are relatively easy for humans to understand, so CNNs are easier to explain than other deep learning models. There are two general approaches: one maps the output back to the input space to see which inputs affect the output, revealing the model's decision process; the other goes inside the network and explains it from the perspective of the intermediate layers [22][23][24].
3) Recurrent neural networks: RNNs are widely used for predicting inherently sequential data, as in natural language processing and time-series analysis. Interpretability methods for RNNs are few and fall into two classes: one uses feature-relevance methods to understand what the RNN has learned; the other uses local explanations and modified RNN architectures to explain decisions [25][26].
PART.03 Future research directions
This section briefly introduces the problems XAI still needs to solve and possible future research directions.
1. The trade-off between interpretability and performance: Improving model performance often reduces interpretability, because performance is usually tied to algorithmic complexity, and the more complex the model, the worse its interpretability. The relationship between accuracy and interpretability is shown in figure 9. Although this negative correlation cannot be reversed, it can be softened by refining interpretability methods [27].
Figure 9 | Relationship between interpretability and accuracy
2. Unified interpretability metrics: As mentioned in Section 1.3, there is currently no unified evaluation metric in explainable AI, which is a major obstacle to the field's development. For XAI to develop sustainably, evaluation metrics must first be unified. Encouragingly, some scholars have begun to address this problem and to study how to judge interpretability by a single standard [2].
3. Interpretability of deep learning models: As discussed in the deep learning part of Section 2.2, deep learning has always been treated as a black box, and compared with traditional machine learning methods it faces greater resistance in practical applications because it is so hard to explain. This not only limits the use of deep learning in more regulated fields but also hinders model optimization: it is difficult to improve a deep model without knowing why it makes the decisions it does. Good explanations for deep learning models would accelerate the development of deep learning.
4. Applications of XAI in information security: XAI is little used in information security today, but this may become an important application scenario. Because XAI can infer a model's data and function from its inputs and outputs, it can be used to steal them [28]. Conversely, the information obtained through XAI can be fed into an adversarial environment to attack the model more effectively, expose the model's weakest links so they can be repaired, and thus use XAI to improve system security.
5. XAI in support of interdisciplinary information exchange: XAI can effectively explain model decisions to users without a professional background (the accessibility goal mentioned in Section 1.3), and it can support critical data studies, that is, multidisciplinary work that gives different audiences the explanations they need [29]. XAI can thus facilitate the exchange of information across audiences and disciplines.
PART.04 Summary
The goal of explainable AI is to let people understand the decision process of an AI model, so as to use and improve the model better, increase user trust, and expand the application scenarios of AI technology. The uninterpretability of deep learning algorithms is an important problem limiting the development of deep learning, so interpretability will be an important research direction for deep learning in the future. In addition, explainable AI can be applied to information security and can promote interdisciplinary knowledge exchange. Explainable AI is still in its infancy and has very broad research prospects; in the near future it may well lead to new breakthroughs in AI technology.
References:
[1] Confalonieri R, Coba L, Wagner B, et al. A historical perspective of explainable Artificial Intelligence [J]. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2021, 11(1): e1391.
[2] Arrieta A B, Díaz-Rodríguez N, Del Ser J, et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI [J]. Information Fusion, 2020, 58: 82-115.
[3] Huysmans J, Dejaeger K, Mues C, et al. An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models [J]. Decision Support Systems, 2011, 51(1): 141-154.
[4] Caruana R, Lou Y, Gehrke J, et al. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission [C] // Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2015: 1721-1730.
[5] Miller T, Howe P, Sonenberg L. Explainable AI: Beware of inmates running the asylum [C] // Proceedings of the International Joint Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2017: 36-40.
[6] Hosmer Jr D W, Lemeshow S, Sturdivant R X. Applied logistic regression: Vol. 398 [M]. New York: John Wiley & Sons, 2013.
[7] Mood C. Logistic regression: Why we cannot do what we think we can do, and what we can do about it [J]. European Sociological Review, 2010, 26(1): 67-82.
[8] Quinlan J R. Simplifying decision trees [J]. International Journal of Man-Machine Studies, 1987, 27(3): 221-234.
[9] Maimon O Z, Rokach L. Data mining with decision trees: Theory and applications: Vol. 69 [M]. Singapore: World Scientific, 2014.
[10] Rovnyak S, Kretsinger S, Thorp J, et al. Decision trees for real-time transient stability prediction [J]. IEEE Transactions on Power Systems, 1994, 9(3): 1417-1426.
[11] Nunez H, Angulo C, Catala A. Rule-based learning systems for support vector machines [J]. Neural Processing Letters, 2006, 24(1): 1-18.
[12] Langley P, Simon H A. Applications of machine learning and rule induction [J]. Communications of the ACM, 1995, 38(11): 54-64.
[13] Ribeiro M T, Singh S, Guestrin C. "Why should I trust you?" Explaining the predictions of any classifier [C] // Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2016: 1135-1144.
[14] Lundberg S M, Lee S I. A unified approach to interpreting model predictions [C] // Proceedings of the Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2017: 4765-4774.
[15] Ribeiro M T, Singh S, Guestrin C. Nothing else matters: model-agnostic explanations by identifying prediction invariance [C] // Proceedings of the 30th Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2016.
[16] Shrikumar A, Greenside P, Kundaje A. Learning important features through propagating activation differences [C] // Proceedings of the 34th International Conference on Machine Learning. New York: PMLR, 2017: 3145-3153.
[17] Sharma S, Henderson J, Ghosh J. CERTIFAI: Counterfactual explanations for robustness, transparency, interpretability, and fairness of artificial intelligence models [J]. arXiv preprint arXiv:1905.07857, 2019.
[18] Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks [C] // Proceedings of the 2nd International Conference on Learning Representations. New York: Curran Associates, Inc., 2013.
[19] Che Z, Purushotham S, Khemani R, et al. Interpretable deep models for ICU outcome prediction [C] // American Medical Informatics Association (AMIA) Annual Symposium: Vol. 2016. New York: AMIA, 2016: 371-380.
[20] Montavon G, Lapuschkin S, Binder A, et al. Explaining nonlinear classification decisions with deep Taylor decomposition [J]. Pattern Recognition, 2017, 65: 211-222.
[21] Kindermans P J, Schütt K T, Alber M, et al. Learning how to explain neural networks: PatternNet and PatternAttribution [C] // Proceedings of the International Conference on Learning Representations. New York: Curran Associates, Inc., 2017.
[22] Bach S, Binder A, Montavon G, et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation [J]. PLoS One, 2015, 10(7): e0130140.
[23] Zhou B, Khosla A, Lapedriza A, et al. Learning deep features for discriminative localization [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2016: 2921-2929.
[24] Zeiler M D, Taylor G W, Fergus R. Adaptive deconvolutional networks for mid and high level feature learning [C] // Proceedings of the 2011 International Conference on Computer Vision. New York: IEEE, 2011: 2018-2025.
[25] Arras L, Montavon G, Müller K R, et al. Explaining recurrent neural network predictions in sentiment analysis [C] // Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. New York: Association for Computational Linguistics, 2017: 159-168.
[26] Choi E, Bahadori M T, Sun J, et al. RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism [C] // Proceedings of the Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2016: 3504-3512.
[27] Gunning D. Explainable artificial intelligence (XAI) [R]. Arlington: Defense Advanced Research Projects Agency (DARPA), 2017.
[28] Orekondy T, Schiele B, Fritz M. Knockoff nets: Stealing functionality of black-box models [C] // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2019: 4954-4963.
[29] Iliadis A, Russo F. Critical data studies: An introduction [J]. Big Data & Society, 2016, 3(2): 2053951716674238.
This article comes from the WeChat official account Hao Zhen Popular Science Talk (ID: sigsxskxjsxh). Text and figures: Zhang Mingrui, Du Binghang, Liu Ziqi, Wang Yuyang, Huang Yiping. Typesetting: Zhang Mingrui. Review: Hao Zhihan, An Xiuyao, Zhao Runze, Du Binghang.