Explainability versus Interpretability

Brief Description

Machine learning explainability and interpretability both aim to shed light on how a model makes decisions, but they approach that goal from slightly different perspectives. Here’s a breakdown of the differences between the two:

  • Interpretability refers to the degree to which cause and effect can be observed within a system. In other words, it is the extent to which you can predict what will happen when input variables or algorithmic parameters change. It is the ability to look at an algorithm and readily understand how it works.

  • Explainability refers to the extent to which the internal mechanics of a machine or deep learning system can be explained in human terms. The distinction from interpretability is subtle, and a useful way to draw it is this: interpretability lets you discern the mechanics without necessarily knowing why they produce the outcomes they do, whereas explainability means being able to literally explain what is happening.

Think of it this way: say you’re doing a science experiment at school. The experiment might be interpretable insofar as you can see what you’re doing, but it is only really explainable once you dig into the chemistry behind what you can see happening.
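To make the contrast concrete, here is a minimal sketch in Python using NumPy on a hypothetical synthetic dataset. The linear model is interpretable: its fitted coefficients directly tell you how the prediction moves when an input changes. Permutation importance is one common explainability-style probe: it treats the model as a black box and measures how much shuffling each feature degrades performance. The data, feature names, and thresholds here are illustrative assumptions, not from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: y depends strongly on x0, weakly on x1.
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Interpretable model: ordinary least squares. The coefficients can be read
# directly -- moving x0 by one unit moves the prediction by roughly 3.
coef, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)

def predict(X):
    """Linear prediction from the fitted coefficients."""
    return X @ coef[:2] + coef[2]

def permutation_importance(predict, X, y):
    """Explainability-style probe: how much does shuffling each feature
    increase the model's mean squared error? Works on any black box."""
    base = np.mean((predict(X) - y) ** 2)
    scores = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        scores.append(np.mean((predict(Xp) - y) ** 2) - base)
    return scores

importances = permutation_importance(predict, X, y)
```

With an interpretable model the permutation scores merely confirm what the coefficients already say; with a deep network, a black-box probe like this may be the only window you have into its behavior.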

Mingrong Gong
Graduate Student at SIAT, University of Chinese Academy of Sciences.

My research interests include trustworthy AI and machine learning, especially domain generalization and reinforcement learning.