Quantitatively Assessing Explainable AI for Deep Neural Networks – a Crash Course

Author/s

M. Zullich

About the resource/s

Uploaded by AUTH

Deep Neural Networks (DNNs) are increasingly pervasive in society, especially in decision-making, in applications involving humans, and in high-stakes settings. This prompts the need for transparency, one of the cornerstones of the EU guidelines for Trustworthy AI. For DNNs, it is often infeasible to obtain human-understandable interpretations of the predictive dynamics, hence approximate post-hoc explanations are used instead. These range from feature-importance attributions to the generation of counterfactual data and the identification of important concepts and influential training data points. The approximate nature of these tools, though, raises critical questions about the quality of the explanations: are they really faithful to the model's inner dynamics? How sensitive are they to variations in the input or the model? Are they actually interpretable to humans? Due to the absence of ground-truth explanations, many works applying Explainable AI (XAI) tools either fail to evaluate explanation quality altogether, or rely on anecdotal evaluations, e.g., user studies, which often yield poor assessments because of subjectivity or the difficulty of defining what constitutes a "valid explanation" for a given task.
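
For illustration, the sketch below shows one common way to quantify the faithfulness of a feature-importance explanation: a deletion-style test that removes features in decreasing order of attributed importance and tracks how quickly the model's score drops. The model, input, and attribution here are toy placeholders (a logistic scorer and a gradient-times-input-style attribution), not a reference to any specific library or to the methods covered in the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=20)   # parameters of a toy "model"
x = rng.normal(size=20)   # a toy input

def model(z: np.ndarray) -> float:
    """Toy scorer standing in for a DNN's predicted-class probability."""
    return float(1.0 / (1.0 + np.exp(-(w @ z))))

# Toy feature-importance scores (proportional to gradient-times-input on the logit).
attribution = w * x

def deletion_curve(x: np.ndarray, attribution: np.ndarray, baseline: float = 0.0) -> np.ndarray:
    """Delete features from most to least important and record the model score."""
    order = np.argsort(-attribution)   # most positively contributing features first
    z = x.copy()
    scores = [model(z)]
    for i in order:
        z[i] = baseline                # "delete" the feature
        scores.append(model(z))
    return np.array(scores)

curve = deletion_curve(x, attribution)
auc = np.trapz(curve, dx=1.0 / (len(curve) - 1))
print("deletion curve:", np.round(curve, 3))
print(f"area under the deletion curve: {auc:.3f} (lower = more faithful)")
```

A faithful attribution should make the curve fall steeply as the top-ranked features are removed, so a lower area under the deletion curve indicates a more faithful explanation.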

This tutorial aims to explore recent developments in the quantitative assessment of XAI tools. The first part will introduce the main concepts and methods of XAI applied to DNNs. Next, the axes along which explanations can be evaluated will be described. Finally, the main metrics and experimental settings for a sound evaluation of XAI tools will be illustrated, providing insights into what constitutes a "good metric" for evaluating explanations. The tutorial will conclude with an outlook on recent trends in the field.
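
As a taste of the kind of metric discussed in the last part, the sketch below estimates the (max-)sensitivity of an explanation method: the largest change in the explanation observed when the input is perturbed within a small radius. The explanation function is again a toy gradient-times-input attribution for a linear logit; all names and default values are illustrative assumptions rather than the tutorial's own material.

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=20)   # parameters of a toy linear logit
x = rng.normal(size=20)   # a toy input

def explain(z: np.ndarray) -> np.ndarray:
    """Toy gradient-times-input attribution for the logit w @ z."""
    return w * z

def max_sensitivity(explain_fn, x: np.ndarray, radius: float = 0.05, n_samples: int = 100) -> float:
    """Largest L2 change of the explanation under small uniform input perturbations."""
    base = explain_fn(x)
    worst = 0.0
    for _ in range(n_samples):
        noise = rng.uniform(-radius, radius, size=x.shape)
        worst = max(worst, float(np.linalg.norm(explain_fn(x + noise) - base)))
    return worst

print(f"max-sensitivity (radius 0.05): {max_sensitivity(explain, x):.4f}")
```

Lower sensitivity indicates an explanation that is more robust to small, irrelevant input variations, which is one of the evaluation axes mentioned above.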

Other Sources