Summary
Machine learning and deep learning techniques are powerful tools for a large and growing number of applications. Unfortunately, it is difficult or impossible to understand the reasons for the answers that they give to the questions they are asked. To help shine some light on what information your machine learning models are using to produce their outputs, Scott Lundberg created the SHAP project. In this episode he explains how it can be used to provide insight into which features are most impactful when generating an output, and how that insight can be applied to make more useful and informed design choices. This is a fascinating and important subject and this episode is an excellent exploration of how to start addressing the challenge of explainability.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey and today I’m interviewing Scott Lundberg about SHAP, a library that implements a game theoretic approach to explain the output of any machine learning model
Interview
- Introductions
- How did you get introduced to Python?
- Can you describe what SHAP is and the story behind it?
- What are some of the contexts that create the need to explain the reasoning behind the outputs of an ML model?
- How do different types of models (deep learning, CNN/RNN, bayesian vs. frequentist, etc.) and different categories of ML (e.g. NLP, computer vision) influence the challenge of understanding the meaningful signals in their reasoning?
- Taking a step back, how do you define "explainability" when discussing inferences produced by ML models?
- What are the degrees of specificity/accuracy when seeking to understand the decision processes involved?
- Can you describe how SHAP is implemented?
- What are the signals that you are tracking to understand what features are being used to determine a given output?
- What are the assumptions that you had as you started this project that have been challenged or updated as you explored the problem in greater depth?
- Can you describe the workflow for someone using SHAP?
- What are the challenges faced by practitioners in interpreting the visualizations generated from SHAP?
- How much domain knowledge and context is necessary to use SHAP effectively?
- What are the ongoing areas of research around tracking of ML decision processes?
- How are you using SHAP in your own work?
- What are the most interesting, innovative, or unexpected ways that you have seen SHAP used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on SHAP?
- When is SHAP the wrong choice?
- What do you have planned for the future of SHAP?
Keep In Touch
Picks
- Tobias
- Scott
Links
- SHAP
- Microsoft Research
- Matlab
- Game Theory
- Computational Biology
- LIME
- Shapley Values
- Julia Language
- ResNet
- CNN == Convolutional Neural Network
- RNN == Recurrent Neural Network
- A* Algorithm
- CFPB == Consumer Financial Protection Bureau
- NP Hard
- Huggingface
- Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations
- Numba
- Log Odds
- InterpretML
- Polyjuice
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode. With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform, including simple pricing, node balancers, 40 gigabit networking, dedicated CPU and GPU instances, and worldwide data centers.
Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. Your host, as usual, is Tobias Macey. And today, I'm interviewing Scott Lundberg about SHAP, a library that implements a game theoretic approach to explain the output of any machine learning model. So, Scott, can you start by introducing yourself? My name is Scott. I work at Microsoft Research right now.
[00:01:13] Unknown:
And my primary area of investigation is, you know, model transparency, model debugging, and explainability in general in the AI ML space. And do you remember how you first got introduced to Python? I don't actually know how I first got involved with Python, but when I first started using it, you know, in a meaningful way, it was at a research company where a lot of stuff was done in MATLAB, but Python was the open alternative, and I was pretty drawn to that. I feel like libraries have matured over the years, but, you know, when compared to MATLAB, it was clearly, sort of, like, easy to take my code wherever I needed to and take the expertise wherever I needed to. So Yeah. It's definitely a very common theme that I've come across, especially for people who are working in research or academia saying, MATLAB is great until I don't have a university paying for the license.
[00:01:57] Unknown:
Yes. Exactly. Or a company in this case. So but, yeah, that's how I got engaged in, you know, Python, Jupyter, that kinda ecosystem kinda caught me, and I stayed. And so now you're working on the SHAP library, as I mentioned, at the open. And I'm wondering if you can just describe a bit about what it is and some of the story behind how it came to be and why it is that you are focusing your time and energy on this specific area of effort and sort of why you decided that it was useful to have this as an open source library for people to be able to use. And then the purpose, as you say, is, like, to explain the predictions of machine learning models based on game theory. But how did it come to be? Well,
[00:02:36] Unknown:
actually, I was doing my PhD in comp bio. So ML applied to, you know, biology and medicine and things like this. And 1 project in particular that we were working on when SHAP got developed was adverse event prediction during procedures in hospitals. And, you know, there's a lot of interesting data that's collected during those types of procedures. You can imagine, you know, it's 1 of the most measured points of your life if you're going in for, you know, a surgery or something. But if you just turn on a red flag that says, I think this bad thing might happen — as a doctor, like, what are you supposed to do? Right? Like, we're in the room talking to doctors. They're like, okay. The red light turns on, you mentioned, hypothetically, in the system. Do you just look around, like, you know, something's gonna fall out of the air? Like, is the patient — you just have to check everything.
Right? Because there's no understanding other than the outcome. That was the need, I guess, that drove initial interest in, you know, developing something concrete for transparency. And so as I got into the transparency side of things, I dug into a lot of the work that was going on. And actually, a collaborator and friend of mine, Marco Tulio Ribeiro, wrote LIME, which wasn't actually written at that point, but he was working on it. And so I was, like, trying to use some of the stuff that he originally developed in some of these local models. And then when I connected that in with — like, we tried to train some linear models, so, like, some inherently interpretable approaches.
And then I got really interested when I connected all those things back to some approaches that had been introduced before on Shapley values for explainability, but hadn't, you know, been widely applied in my experience. So, yeah, I got into that, got really interested, found a lot of interesting connections, and at least what I found engaging. And then I actually wrote it all up in Julia at the time, just for speed purposes, but then some other grad students convinced me that it would probably be useful as an open source library. It's like, you know, hey. I wanna use that too. You know, you should probably really push this out, and I think it was good advice.
[00:04:26] Unknown:
In terms of the sort of nature of explainability and as you mentioned, when you have the output of a machine learning model, you know, in some contexts, it can be useful and sufficient to just say, you know, this is the answer. But for the case that you were giving, you know, in the medical context of this is the output, but now what do I do with it, and why is it saying this? The real value comes in understanding the reasons that a given model gave that particular output. And I'm wondering what are some of the sort of contexts that create that need to be able to understand the reasoning behind the outputs, and what are some of the evolutions in the sort of architectures and approaches to machine learning that have made explainability
[00:05:07] Unknown:
such a challenging problem? Obviously, we just described 1, which was the motivating factor. And that was actually, you know, maybe less typical for most use cases today, because I would say model debugging is actually the easiest off the shelf use of explainability tools today, particularly by data scientists, people familiar with coding. You know, as soon as you build a model, almost your first question is, what did I build? Right? You know, not just, like, I know the name of it. I know, like, the form that the parameters may take, but I'm curious, like, what is the actual statistical signal that I pulled out of my data? It's really what interpretability is trying to get at, both at the local level of, like, what is the signal that is driving an individual prediction, and what is the signal? What's the information content that's really driving, you know, the model as a whole, either by representing some global pattern of, like, this feature as it, you know, changes its value. How does that affect model output, kind of partial dependence kind of views, or or global measurements?
Yeah. So all of those are kind of like an intuitive need that I think most people feel when they build the model. I at least feel it when I got down this road. But then there's also, like, specific, you know, outputs that are almost quantitative. Right? So you could say, I want to know how much my model depends on features that I believe may be spurious correlations. You know, 1 potential, just so that I can understand how my generalization may or may not suffer when that happens. If you have a medical model, for example, like, being able to understand the structure and the behavior, that I think is quite important because often your downstream action will depend on the reason the prediction was made. You know, that was the case we ran into with the doctors, for example. So 1 thing that we found was lots of spurious correlations in a hospital setting. So let's say, for example, that I'm a doctor, and I give you treatment for something that's supposed to make your risk less.
Now the only reason I would have given you that treatment is because I, as a doctor, am concerned. Right? Actually, you need this. Right? Otherwise, I wouldn't be giving you this drug. So as soon as the model sees that you got dosed with this drug, what does it learn? It learns that you got dosed with the drug, but also learns doctor concern. Turns out doctor concern is actually a much more significant signal than the drug's risk reducing effect. And so overall, it turns out that, as far as the model is concerned, the drug is actually increasing your risk. So if you ask the model, like, what's the best way to make this person look healthy? It would say, well, don't treat them, and don't do anything that you would do if they were sick. That's the kind of stuff that you learn by looking inside a model that you just wouldn't if you weren't. You know, maybe outside the medical setting, if you were in, like, a sales type call, like someone called and you're tasked to call up someone because they're likely to, you know, cancel their service.
Well, if you were told that they're likely to cancel their service because they're a kid who just turned 18, and for some reason, the model thinks that it's really likely for them to cancel service at that point, like, that's a lot of context. It's valuable when you walk into a conversation, so you're not coming in blind. Or if, like, the model's recommending pricing for a particular customer for a variable pricing scheme, like, why that pricing is recommended really helps you have some, you know, context for the negotiations as you get started. Or if you're there trying to approve applications for whatever — insurance, loans, you imagine, whatever — oftentimes there is a threshold where, like, you know, really small ones, you just let the computer do it, but, like, higher risk things, people are involved. That borderline, you know, that's another case where these types of systems get involved at the model consumer stage and not just at the model debugging stage. And then another
[00:08:33] Unknown:
sort of interesting element of the problem of explainability is the rise of deep learning as the sort of first stab that most people will take at trying to create some interpretation of their data where they want to be able to just say, you know, throw a neural network at it, and it will come out with some output. And then I'll determine, you know, what is the level of accuracy? How do I wanna tune it from there without necessarily understanding what the sort of statistical significance of their inputs are to the output? That's where sort of a lot of the lack of understanding comes into sort of machine learning as a discipline.
Whereas if you were building the model based on all of these sort of statistical theoretical approaches of, I have these inputs. I want to be able to, you know, do k nearest neighbor clustering because I want to understand what are the sort of populations of data that I'm dealing with, then you have a better understanding of, like, why a model gave a particular output because you, you know, used this clustering approach. Whereas if it's just throw it into a deep neural network and hope that the, you know, different layers are attaching to the appropriate signals and not just, you know, the background color that happened to be present in all of these sort of correlated pictures.
And so I'm wondering sort of what you have seen as the impact of, you know, the evolution of machine learning approaches and architectures in the sort of utility and challenge of being able to create these explanations of how these outputs are generated?
[00:10:01] Unknown:
Well, certainly, deep learning is prone to create opaque systems. I think everyone would agree with that. Not that you couldn't create things that are not opaque if you're careful enough. Right? They're just linear models stacked up in many cases. But I would say that, well, first of all, they probably ignited a significant amount of interest in transparency, and at a model agnostic level. And I think that may actually have a side effect of being beneficial for traditional statistical models that you might consider otherwise interpretable, because often, it's easy to conflate interpretability with implementation, if that makes sense. You know, if I have a k nearest neighbors model, I may and rightly say that there are ways to interpret that. You know, I understand exactly what happened. But it turns out that there are many questions that might be difficult to answer with a k nearest neighbors model, which might actually be the real question that you have for whatever downstream task. So let's say, for example, you're supposed to rank your features by most informative. Not saying it's impossible with the k nearest neighbors model, but it doesn't come out, you know, right away. You don't just look at it, and you're like, oh, well, this is the most important feature that's really driving what's happening.
And same for a decision tree. Like, I could write a small little decision tree, and I could say, oh, I know exactly what's happening because I can execute it in my mind. But it doesn't mean I know what's the most important feature. Like, I have no idea what the most important feature is. Hopefully, it's the top 1 because that's how it splits, but, like, there's no guarantee. And so I think there's actually a side benefit that neural networks have pushed this sort of agnostic view. And this decoupling of what it means to have a good explanation, I think, should first be posited separate from what you have as a model. That's your goal. Right? And your goal, it lives separately from your implementation to get there. But that being said, going back to, like, the different structures of the neural nets themselves, I would say that as they get larger and larger and larger — well, there's really 2 challenges. 1, I would say that in the nitty gritty, there's, like, explanation methods that work by back propagating stuff through the network. Right?
Those, I have found, get harder and harder as networks get bigger and bigger. And they do just because the dependence of the outputs on the inputs is just more and more abstract. So if you take a, you know — ResNets are a good example of really deep networks. If you were to try and use things I've put in SHAP even, like the deep explainer, the gradient explainer, from the pixel level all the way to the output of a ResNet 50, you just look — the results are not good. Right? And it's just scrambled by the time you do 50 layers. If you go into, like, you know, maybe from the last layer to, like, something closer to the head, well, then you're gonna get more reasonable. You know, maybe at least go 10 layers in, and now you're okay. So I think that's 1 thing is, like, the farther and farther away you get from the inputs, the more sort of implementation specific methods begin to have trouble because they're approximations, and those approximations get worse as they get stacked more and more. You could say, okay. That's fine because agnostic methods, which simply rely on executing the whole thing, don't care about the input to output distance, so to speak. They just care about the complexity. In there, I would say that some of the biggest challenges for these neural net models is the fact that they are best — right? — neural net models are best for distributed feature representations.
So things like text, pictures, sound waves, you know, like, all these things where deep learning is state of the art, and there's not really any competition for it even, are areas where individual features are not independent from each other. They should not be interpreted very independent from each other. And it's very much about extracting what we would consider sort of a separable signal, but from a large area of the input. And that, I think, causes a lot of challenge for interpretability methods because if you use as a language pixels, or as a language, you know, slices of a waveform.
There's only so much you can get from sort of the heat map view of the world. And so I think the actual, you know, creation of features that are a good language to use for explanations becomes almost an equivalent challenge to the sort of summarization task. Right? Because SHAP does a lot of summarization. You know, like, here's how you can summarize the interaction effects. But if you don't have good starting terms, then you're limited.
[00:14:05] Unknown:
And are there particular sort of applications of machine learning and sort of data formats that you're dealing with that increase the sort of complexity? You mentioned things like working in the level of pixels or in the level of, you know, segments of a waveform. But are there any differences in terms of, you know, deep learning approaches to, like, natural language processing where you might be using a transformer model or, you know, doing transfer learning using some, you know, large multibillion parameter model that you then wanna just strip out and replace the last few layers of or, you know, some of the different sort of, like, neural network architectures where, you know, a CNN versus an RNN, if there are any of those factors that will influence some of the complexities of being able to actually understand what was the reasoning that went through this. Yeah. Well, so if you're using a CNN, pretty much everyone who's been at this for months has almost always looked at the last layers. Right? The last conv layer before you go into a flat feedforward or whatever you have up in your head. And,
[00:15:00] Unknown:
you know, that has still some spatial encoding in it, and that I think is unique, obviously, to the structure of the data and is something that will always make explanations better. Right? So whenever you're thinking of explaining, you know, networks like that, you always have to think what is this spatial proximity. And if I want to look at the internals of the network, then I can always do better by looking at it. But if you think about it, what we're really leveraging there is the fact that we sort of downsample. Right?
And we've also shortened the model, so maybe we could do, like, gradient propagation. You know, like, Grad-CAM is gradients coming back. Like, that works pretty well once you've got close enough to the output. 1 other thing that we've tried in SHAP is to try and leverage the structure of the input data to minimize the number of perturbations that happen, because the complexity of doing an agnostic explanation is the fact that it can only execute a model so many times before it becomes computationally intractable. Right? So you could say, oh, it's really easy to understand the importance of every pixel. Just toggle it on and off in the context of all other pixels being on and off. You know, what is this, 2 to the 256 times 256? Like, it's, you know, more than the number of molecules in the universe. So, like, you can't do brute force type approaches, but you can leverage the structure of the image. And even simple things like we built — like, our current image 1 in there, I think, is like a recursive partitioning tree. Right? So it literally cuts the image in half, toggles both halves. Whichever half mattered more, it recurses on that, sort of does an A* search. So, like, that is very dependent on the input structure. Right? The fact that it's, you know, structured like that. I think the same thing for transformer models, except transformer models, in my experience, can be, you know, even bigger depending on which language model you're talking about. So what that means is the number of evaluations again has to be even more sensitive. So, obviously, you have attention maps, attention masks that can give you some idea of, like, what the heads are doing. But my experience is that you're always better off just actually perturbing the input and seeing what it does. And with the right heuristics, you get better at that point. So maybe the big takeaway there is as you get bigger and bigger models, it's easy to look inside them and find pieces that you think behave.
I find myself more and more leaning towards just changing the input and seeing the behavior because I've anchored myself across the whole system, and I don't have to make as many assumptions.
[00:17:25] Unknown:
Digging deeper into sort of the concept of explainability, we've talked about sort of being able to understand the context and the reasoning behind a given output. And we've talked about being able to gain some sort of heuristics and try to have an intuitive understanding of where the model is, you know, taking a certain direction based on the given set of inputs. But, you know, more broadly in the concept of explainability, are there sort of different ways that that term is used sort of in industry and particularly in the context of, like, regulatory systems such as GDPR or CCPA where you need to be able to say, you know, with some level of confidence or definition that this output was given from this machine learning model and now the consumer needs to understand what was the motivation for that decision, you know, for, like, in the context of an insurance claim, for instance, And sort of what are some of the degrees of specificity and accuracy that are necessary for being able to establish these explainability thresholds?
[00:18:32] Unknown:
Isn't that a question worth a lot of thought? Yeah. There's somebody's thesis right there. Yeah. I think that 1 piece that's important to remember about explainability, particularly for technical folks, is that there are people involved. And honestly, I would really like this sort of theoretical side of things and, like, axioms that show uniqueness and stuff like this, but their assumptions. And whenever you have explainability to a person right now, I personally think explainability could be consumed by a down stream machine as well. Right? There's no reason that, like, the signals that are produced by these explanation methods are not also valuable for downstream processing. We'll talk about that later. But but when you're communicating to a person, what really matters is the mental model that the person arrives at at the end of the day, and how that mental model impacts, you know, how their mental model of the world shifts because of your explanation, and then how it affects their behavior. So maybe, you know, I won't try and define for for governments what they mean by the laws that they put out, but I do think that there are a couple of things to keep in mind. 1, when you say specificity, like, how much specificity is important, I think what you mean is, how many details can we leave out versus how much do we need to communicate when we're writing explanations?
And I think that explanations come in various forms. So first of all, like, when I we wrote up SHAP, like, we basically said, like, an explanation is a model, right, formally. And so it's a simple model that represents the more complex model in some way, that is that is meaningful. So for SHAP, that's just an additive model. So, like, a set of feature attributions that add up to what it would have been for 1 person versus what it would have been for another. And and maybe it's good to be concrete here. So, like, if I have 2 people who both applied for an auto loan, like you said, one's named Bob, one's named Sue. Bob gets a score of, like, 520. Sue gets a score of 600.
Sue gets the loan, Bob doesn't. Right? Explaining to Sue and Bob, like, what happened — like, if there was no 1 but Sue ever in the history of the universe, then explaining what happened doesn't really make any sense. Right? There always has to be an alternative. Right? Whenever you say, like, this happened, it's always because this happened instead of something else. So for Shapley values, that's like the background distribution or the alternative. It's some way of masking the features. So if I was explaining Sue and Bob, let's say they filled out a 10 point form, I could literally take the input form values for Bob and, 1 at a time, replace them with Sue's values and observe, as I did each 1, how the score changed. And, eventually, I'll get to Sue's score because I filled out the whole thing.
So if I have a very simple model like a point system, right — like, if the true approval process was just a point system, plus 1, minus 1, plus 2 — then I would just read off those points when I did that. And so then I would give that as my explanation, and people would call that maybe an inherently interpretable model, a model where, you know, I can just read the whole thing off. I don't need an explanation. Or maybe a better way to say it is that the model is its own explanation. Or maybe it's not even a point system, maybe it's a linear model. Well, if it's a linear model, I can still just do that thing, and I'll read off sort of the, you know, the difference between beta i times x i for Sue and x i for Bob, and that'll be my explanation of each feature, and that makes sense. But once I get into nonlinear models, then the order that I fill out the form actually matters. Right? Because, you know, if 2 things are sort of in an "and" relationship, then whenever the second 1 gets put in, that's when I see the bumps, which changes the order.
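To make that order dependence concrete, here is a minimal sketch in Python. The toy model, the point values, and Bob's and Sue's numbers are all made up for illustration (none of this comes from the SHAP library); it just replays the "swap one form value at a time" procedure described above.

```python
# Toy sketch of the "fill out the form one feature at a time" idea.
# The model, Bob, and Sue below are all made up for illustration.

def toy_score(income, credit_age):
    # An "and"-style interaction: the bonus only appears when BOTH
    # features are high, so credit depends on the order they are swapped in.
    return 500 + (100 if income > 50 and credit_age > 5 else 0)

bob = {"income": 30, "credit_age": 2}
sue = {"income": 80, "credit_age": 10}

def credits(order):
    """Replace Bob's values with Sue's one at a time, recording each score change."""
    current = dict(bob)
    previous = toy_score(**current)
    out = {}
    for feature in order:
        current[feature] = sue[feature]
        score = toy_score(**current)
        out[feature] = score - previous
        previous = score
    return out

# The 100-point interaction lands on whichever feature is swapped in last,
# so the two orderings disagree; Shapley values average over all orderings.
print(credits(["income", "credit_age"]))  # {'income': 0, 'credit_age': 100}
print(credits(["credit_age", "income"]))  # {'credit_age': 0, 'income': 100}
```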
And that's fundamentally where Shapley values come in and say, how do we distribute the credit for these, what we'll call, interaction effects, right, these non additive effects between the input features on the output? So when you say, how much specificity do we need? My mind immediately goes down to, well, how much are we hiding from the user mathematically? Because that's different than the psychological question, but at least on the mathematical side, we can say, how much are we hiding from the user? And I would call that — like, I've been using the word explanation error. Right? So, like, what is the error in explanation? In other words, how much does my explanation model deviate from the real model over the set of perturbations that I'm considering, which in our case are just toggling between either a background state or the current state.
So I would say that whenever you're providing an explanation, you need to consider the explanation error. Right? And if you apply, for example, this sort of additive point based explanation to a linear model or a generalized additive model or a point based additive model, like, if you were to apply SHAP to any of those, you would end up with 0 explanation error. Right? In the sense that there's no deviation, so everything would be great. As soon as you start adding more and more interactions, like some deep multilayer neural net, you're gonna get nontrivial explanation error potentially. And then, I mean, you can get the standard deviation of the discrepancy, and you wanna look at that and be like, okay. Well, if I'm hiding 10% of the variability, what's going on? How much is that hiding from the user? And those are the trade offs I think people should keep in mind when you're in this sort of high stakes, if you will, decision making system. So that's all on the side of, like, math. How much do we hide? How specific can we be? But I I'd be amiss if I didn't mention another question, which is what is the purpose of explanation?
So we actually had the opportunity to be a part of, like, a hackathon for the CFPB, which is a regulator here in the US on consumer financial protection. They had a hackathon on action codes, and a bunch of people participated. And so it's a fun time to sit down for a week with experts who are in this field, and and ask the question, how do we explain? You know, right now, this is how we explain rejections for, like, a, you know, a mortgage application. How could we potentially improve that, or or what's different? And 1 key thing that I think that there are actually multiple reasons that people want explanations.
1 is they want sort of the ability to see if there's, like, you know, bad data on their credit. Right? Like, if you rejected my application because of something that's wrong, right, the typo or, like, you've got some identity theft or something like that, like, that's a clear thing that you would want to know. You might also want to put a little extra pressure for fairness and accountability on providers, so that if they say, no. The most important reason I declined you was your music taste, you know, then people will be like, You know, that would rightly make people uncomfortable.
Now that's more of like an informational thing. Right? Like, what is the most important feature? What was the feature that if I didn't know it would have changed the prediction a lot? You could think of measuring that in bits almost. But there's another very valid reason, and that is I might want to fix my score. Right? Like, I might actually want to have a better score, and that is a very different question. Right? That's a question of what is the most actionable. So turns out the current status of things is they provide something called a reason code, which leaves a lot to be desired, but is essentially the first. And SHAP is also very much the first. It's like, what's the most informative? But you could also say, like, what is the most impactful counterfactual I could do? What is the thing I could change that would be the easiest to change and have the most effect on my score? And in a very rough analogy, it's kind of like the difference between how far am I from the mean on a partial dependence plot. Right? So, like, make a partial dependence plot of your model, take the mean of your data, pick that part of the partial dependence plot, and then pick the person you're interested in, you know, me, and see at this point on the y axis on my partial dependence plot, if I had moved to the mean, how far would I have gone down? Right? That's how far I am from typical for this feature, and that's actually the SHAP value of a GAM model, and also, basically, what's currently in regulation from what I understand here in the US. But if you ask, what's the most effective thing I could do? Well, that's different. That's more like the gradient. Like, I don't care how far from the mean I am. I wanna know the slope.
Like, presumably that the x axis is easy to change, you know, in a uniform way. Like, the slope tells me how much bang for my buck do I get if I don't open more credit cards or if I don't do something else. So I guess that's a 2 pronged question. 1 side is consider explanation error. When you're doing it, you're producing your explanations. And the other side is ask the right question. And oftentimes, you may need different explanations for different outcomes. Like, I think people should have action codes just like the codes because, you know, it makes sense. But, you know, that that's for others to decide. Well, that's definitely a very fascinating answer to that question, so I appreciate the level of depth and rigor you gave there. And so now digging more into SHAP specifically, I'm wondering if you can talk through some of the sort of implementation details of how you're building it, some of the sort of workflows that you are designing around it, and sort of the ways that it is sort of architecturally
[00:26:58] Unknown:
structured to be able to integrate with all these different sort of various machine learning workflows and frameworks and model types. I've learned a lot through the last 4 years, I suppose, that it's been out, with people interacting with it.
[00:27:11] Unknown:
So maybe a good way to start is with just how things are structured, in sort of a class hierarchy, if you will. Right? So inside SHAP, there are things called models. Right? And that's what you give to SHAP. Right? You're gonna give SHAP a model, and, presumably, you're gonna, at some point, give it some data that you want that model to run on and get explanations for. So the thing that takes a model and the data and then produces explanations is what we call an explainer. An explainer is, you know, the algorithm for summarizing these interaction effects. The tricky part for Shapley values is they're very easy to write down, but they happen to be NP hard to compute. And so that makes for all the interesting parts.
And you can address NP hard problems (I'm not sure if it's NP complete) in 2 ways, I would say. 1, statistical approximation. Right? Random sampling, you know, and just take a convergent estimate. Or 2, restrict your domain. Right? So it may be NP hard to do it in general, but it might not be NP hard in a certain class. So SHAP has examples of both in the package, and there are various explainers that are tailored for different scenarios. So maybe 1 of the most popular ones is the tree explainer, and that 1 is targeted at ensembles of trees. So this is random forest, gradient boosted trees. Trees are a superset of, you know, additive things. So that would be any generalized additive model, anything like that. And that is a nice scenario because there, we can solve it analytically in low order polynomial time. So it's got, like, its own C code, and we can go into details and implementation later. But, you know, that's, like, a nice little bundle, and you would definitely wanna use that for any tree based model. And, in fact, that code lives inside LightGBM and XGBoost as well inside their packages to deliver, you know, explanations even if you don't have SHAP installed.
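For readers who want to try the tree path described here, a minimal sketch is below, assuming recent versions of shap and xgboost; exact plot helpers and defaults can vary between releases.

```python
# Minimal sketch: Shapley values for a tree ensemble via SHAP's tree explainer.
import shap
import xgboost

# A small tabular example dataset that ships with shap.
X, y = shap.datasets.adult()
model = xgboost.XGBClassifier(n_estimators=100).fit(X, y)

# shap.Explainer dispatches to the tree-specific algorithm for tree models,
# which computes the values in polynomial time instead of sampling.
explainer = shap.Explainer(model)
explanation = explainer(X)

shap.plots.bar(explanation)           # global view: mean |SHAP value| per feature
shap.plots.waterfall(explanation[0])  # local view: a single prediction
```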
But if you don't have that type of model, it doesn't help you at all. Right? So there's other methods that are entirely agnostic. So 1 is like a permutation explainer, where it just literally walks through, like we talked about the form, fills them out 1 at a time, and that will apply to anything. But it doesn't account for the things that we talked about for images or text. Right? So, you know, in images, there's a lot of structure and spatial things. So we have other explainers that are targeted specifically on very structured inputs, like text or images, and those actually come with direct integration with well, the text ones at least come with direct integration with Hugging Face. You can just pass a Hugging Face pipeline directly to the explainer, and it just automatically produces all this stuff. So there's this whole list of explainers inside there, and what we found is that users often and rightly would be like, oh, well, which explainer am I gonna use? Right? Like, it's sometimes it's obvious because I have a XGBoost model, and here's trees, and, you know, I understand what I wanna do. But sometimes, I have this particular deep learning model, which of the options should I use?
So what we ended up writing was just shap.Explainer, which is a class that will essentially automatically subclass to whatever explainer is most appropriate. So you just pass it whatever, and it'll do its best to guess. So, yeah, that's kind of like the hierarchy of explainers. I would say that an important piece of this, though, is that there is also something called a masker. And this really determines how we actually perturb the inputs. So maybe a visual way to do this is to talk about images. Right? Like, how would I mask out an image? Right? Like, I could replace it with black. Is that what I mean when I ask, like, how important is this half of the image? What does that mean? Like, it being black, versus it being, like, imputed by a GAN, versus it being infilled, etcetera. And so different people have different answers to those questions. Do I want to integrate over the whole distribution of possible things that it could have been? Same for tabular data. For most tabular data, you wanna integrate over a distribution, and so that's the masker. The masker determines how we're masking things, and we have a whole set of those for masking text, you know, replacing with mask tokens, you know, deleting the text, like, you know, setting the attention layer to 0. You know, these are all different ways that you might want to mask things, and they're important choices, so we try and expose that as maskers and let people use a set of them or write their own. Right? And then we also have some utilities to help people merge models together. Right? So if I have a model that takes 2 inputs and then somehow they get merged and then they get added to another thing, you can essentially create composites of these maskers until eventually you have, like, something that can handle a list of tensors that is whatever goes into your model with the appropriate way of masking it.
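A rough sketch of what choosing a masker looks like in code; the names `model`, `X_background`, `X_rows`, `text_model`, and `tokenizer` are placeholders for objects you would already have, and the specific values shown are illustrative rather than prescriptive.

```python
# Sketch: the masker defines the "alternative" used when a feature is hidden.
import shap

# Tabular: hide a feature by averaging over a background sample of rows.
masker = shap.maskers.Independent(X_background, max_samples=100)
explainer = shap.Explainer(model.predict, masker)  # model-agnostic path
explanation = explainer(X_rows)

# Text: hide tokens using the tokenizer (e.g. replacing them with a mask token).
text_masker = shap.maskers.Text(tokenizer)
text_explainer = shap.Explainer(text_model, text_masker)
```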
[00:31:33] Unknown:
And so for people who are using SHAP and they are trying to build a sort of deep neural network that is, you know, doing transfer learning on a BERT model to be able to understand the sentiment of a piece of text, what would actually be the process of saying, I'm now going to take SHAP. I'm going to integrate this into my workflow, and then I'm going to try to understand what are the most important features for determining the sentiment given this piece of text that I have contributed to this model? So let's say you have your model in a notebook, working in a Jupyter notebook,
[00:32:00] Unknown:
and so, using Hugging Face, you download your pretrained sentiment analysis model from wherever. You have your dataset. So now you have your model, and you're gonna have a tokenizer. Right? Because you need to be able to tokenize your input text before you throw it into the model. You can pass the Hugging Face model directly to shap.Explainer as the first argument to the constructor, and the second argument would be the tokenizer. And the tokenizer becomes the masker, and so SHAP detects that you've passed the tokenizer from Hugging Face, and it says, oh, I know how to turn that into a masker. And so it wraps it up. And then it sees that you passed it a model from Hugging Face, and it says, well, this isn't just a standard Python function. This is a Hugging Face model, and I know about those, so I'll automatically wrap that as a model that I know how to deal with. And then what comes out of that, what you constructed, is an explainer, which is a callable object which takes as arguments the same arguments that your model takes.
So in this case, be a list of strings. So you pass a list of, you know, 10 strings, and then you would get back out an explanation object, which is essentially a array of shaft values for each sample. Now it's a bit trickier for text because, of course, the dimensions don't line up the same way that they would if you had passed a pandas data frame in a nice tabular data set, like everything's just sort of nice parallel arrays. You essentially still do have parallel arrays, but now they're more ragged. And it was about a year ago that we actually switched over to a new explanation object that kind of bundled all that together and allows for parallel slicing. It just made things a lot easier to keep track of because you have data, you have feature names, you have the explanation values, You have, your base values, which is like what would happen if I just give you a blank input. I have to have start at some base value. And a variety of other stuff, like explanation error or whatever else you might wanna spit out, those are all parallel with each other. Right? They're all like a set of parallel arrays or tensors that align with each other. So this object is just a convenient way to slice them all together. So I can say, I have this explainer. I apply it to my 10 things. I get out an explanation object. I can say, bad explanation, you know, bracket 0, that's gonna be the first row's explanation, you know, or colon comma 0. It's gonna be, I guess, the first word of each thing of the little odd, I guess, in text. But then you could plot it. So we have a whole plotting, you know, subtree of the library where it says shap.plots.all sorts of stuff. So if you're trying to plot text, there's a text plot in there. And so if you plot the explanation for sentiment, for example, as a text, you can just say explanation 0, pass that to shab.pos.text, and you will get a plot that gives sort of an interactive JavaScript overlay heat map that has tokenized all your text according to the tokenizer that you passed and visualizes the SHAP values as computed by the partition explainer. And then you can mouse over it and see for each output which input, you know, drove it. Now in this case, there's only maybe 1 output, which is a positive or a negative. But it could be there's, like, 3 outputs. Right? Positive, negative, neutral. That's pretty common. And then the same thing happens when you're doing you know, let's say you're doing translation instead of something like that.
Well, it turns out that it's a bit more tricky under the hood, but in theory, for the user, it should be the same. Right? You just pass the exact same thing in, and now when you plot your explanation, it's now an explanation of a whole bunch of outputs where each output is the output token, right, coming out of this translation model. And so then you can just mouse over, click on any of those outputs, and you can see, okay. For this word in the output, this is the input effect that it had.
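Putting the sentiment workflow just described into code, roughly (the model choice, example sentences, and some argument names are illustrative and may differ across shap and transformers versions):

```python
# Sketch of the Hugging Face sentiment workflow described above.
import shap
import transformers

# A pretrained sentiment pipeline; shap can wrap the pipeline and use its
# tokenizer as the masker, or you can pass the model and tokenizer separately.
classifier = transformers.pipeline("sentiment-analysis", return_all_scores=True)

explainer = shap.Explainer(classifier)
explanation = explainer([
    "This library made debugging my model so much easier.",
    "The documentation left me more confused than before.",
])

# Interactive heat map over the tokenized text for the first example.
shap.plots.text(explanation[0])
```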
[00:35:31] Unknown:
And so, digging deeper into some of the typical workflows that people have built around SHAP and some of the potential ways that it can be applied, you mentioned that because it's just another model that you're computing to be able to determine the explanation for the model that you are trying to understand, you might be able to potentially feed that into another downstream system and be able to potentially create a feedback loop. And I'm wondering what are some of the sort of interesting workflows that you've seen people build where they are trying to automate some of the feedback from this explanation to say that, no, this is not what I'm actually trying to achieve as an output based on these features. I actually want something different and maybe do sort of like a meta training aspect of using these 2 models to play, you know, maybe as like a GAN style approach.
[00:36:18] Unknown:
On the tabular side of things, I've seen people use these downstream things for fairness computations. And I've written about it too, but I've seen people do it even before I talked about it, where they essentially take these values and then compute fairness metrics on them individually in order to essentially run all the downstream analysis that you might be interested in. Because essentially, what are the units of an explanation? That's important to keep in mind. The units of the explanations in SHAP are the same as the units of your output in your model. Right? So if the output of your model is in bits or log odds or whatever probabilities, that's the same units of your explanation. So that means, normally, that whatever analysis you're doing to the output, if it's not like a binary kind of thing, can be applied to the input per feature. That could be helpful. So that's 1 automated, I guess, way of doing it. So automatically create summary statistics that you wouldn't maybe expect or apply it to the loss of the model or something like that. As for the explanations, like, actually trying to guide the model itself in sort of a GAN style — the earliest work I've seen that's roughly close to this is, like, there's some work called Right for the Right Reasons, or a paper like that, a while back, which was basically talking about constraining why a model is doing something and not just what it's doing. We tried to extend on that a bit, and this is in a Nature Machine Intelligence paper of mine, I think, recently, on how to do that for deep learning models. The real trick is the differentiability.
So if you don't have gradients, it's really hard to do the end to end sort of loop training. Like, that's the key aspect of GANs is, you know, you're able to get a differentiable end to end system out of the whole thing. That restricts the type of explanation that you're able to do, and so we use sort of an expected gradients, which is sort of an, you know, an expected version of the integrated gradients explanation method, which I won't get into all the depth of. But, basically, it relies on, you know, an infinite game extension of Shapley values, but it boils down to relying on the gradients of your model. Unfortunately, in my experience, it doesn't work as well for really deep neural nets.
So I think it's a great solution for sort of shallow approaches, and we did show that it can be useful for regularizing anything you like. It's a really cool concept, right, if you think about it. Right? Just like your parameters, you might wanna regularize parameters. But maybe the parameters aren't really in the right form. Like, maybe maybe it might be challenging to write down an l 1 regularization of the parameters of a model where if you squish 1 parameter, it just pops up somewhere else. Right? Like, you know, if there's a 100 outputs. What do I really want small where you know? But if I just say, oh, I want the feature attribution to be small, then the implementation is sort of abstracted away. Right? I can say, like, it doesn't matter. Or what does it mean to be smooth? Because I can always, like, up sample in the next layer. Right? Like, I could make it smooth in the first layer and then just have a times a 100 in the next layer, and I'd get a great So but all that goes away if I regularize explanations. So we we've done some of that. I think that it's challenging primarily because you have to rely on differentiable explanations, and I think those come with a lot of dangers.
But they're certainly valuable and worth looking at, particularly in domains where you're willing to spend a good bit of time. 1 thing I will note, though, is there's been some work on, like, how to trick explanations. Right? How do you intentionally deceive explanations, if you will — adversarial explanation attacks? Almost always, those depend on providing fake data in very special ways, such that whenever you're explaining the model, doing your perturbations, the model tells you 1 thing, but whenever you give it real data, it tells you another thing. Right? That depends on there being strong dependencies in the data structure that you're explaining. And if there are, then you can obviously just sort of have a little switch that says, pretend I'm such and such when you're perturbing me, and do something else when real data hits me, and you can never tell the difference. You can do that. But, like, that's something that a person engineered it to do. But as soon as you start putting explanations in the loop, you're incentivizing your model to lie. Right? Like, if you ask the model — like, what, can models lie? Right? I've thought about writing something up on this, but I haven't gotten around to it. So, like, can models lie? Like, models don't have, like, the emotions that would connote the typical incentives that people would have to lie, but models are very good at following incentives.
And what would the incentive to lie be? It would be to tell me 1 thing while achieving another goal. Right? So essentially, if you put the constraint of an explanation on a model and you say, make my explanation look like this, but have high accuracy on this potentially sensitive task. Well, what's the best way to do it if, you know, like, there's a reason you're putting this constraint here. It's because, apparently, it's not a perfect match with the task you have. And so you're essentially incentivizing your model to lie to you. Right? Please tell me this while doing this. Like, that's essentially what you're saying to your model. So I'd be very cautious. As fun as it sounds to set up a GAN like that, you have to realize that you are incentivizing deceptive behavior.
[00:40:58] Unknown:
It's definitely a very interesting insight and something that I hadn't really thought of, but that is pretty hilarious to consider. Just like this completely sort of emotionally agnostic system being incentivized to tell you what you want to hear regardless of its applications to reality.
[00:41:15] Unknown:
Yeah. Yeah. So it's it's definitely 1 of those things where, yeah, you wanna be careful. Yes. Please tell me what I wanna hear. Well, okay. But remember, it will. Right.
[00:41:25] Unknown:
And so in terms of the ideas and assumptions that you had as you first began exploring this overall area of explainability and building the SHAP project. I'm wondering what were some of the assumptions that you had that have been challenged or updated as you have dug deeper into the problem domain and as more people have started using SHAP in their own work? Well,
[00:41:48] Unknown:
I would say that the breadth of use was a bit surprising at first. Like, I didn't really expect how many ways that it would be useful, which has been fun and scary in certain times. You know? Like, you wanna make sure that it's used in the right ways, not the wrong ones. So I guess that was a assumption. I don't know if it was spoken or not, but it was a surprise. I would also say that there's, like, a lot to maintaining a repo that you don't imagine when you start off. So I've probably not done the best on staying on top of everything, but at certain points, I just there's an investment in maintenance that is a 0 sum game with an investment in future research as well. And so that's been a blessing and a challenge, I would say, to figure out how best to both be helpful and be effective at the same time. So that's been good. I also learned a lot about what it was like to depend on other modules.
You know, just how does it live in the Python ecosystem. And, you know, yeah, I think, as I mentioned, it started out in Julia, then I went to Python, and then I was like, oh, I gotta run some of these things faster. So then I wrote a C module, and then I found Numba worked out pretty well, and so I used that. And then NVIDIA was kind enough last year to contribute some CUDA code for the tree explainer, so you can use GPU explanation acceleration. And so that's kinda opened my eyes to all the ways that compilation can fail in all these systems in Python, and I underestimated, I think, how much that takes. I also underestimated what it was like to depend on large libraries like TensorFlow or PyTorch. So I remember I have a deep explainer that, like, reached inside and traversed the graph of a TensorFlow or PyTorch model, but then TensorFlow went to 2.0 and, you know, suddenly moved everything from Python to C.
Everything just disappeared. It's like yeah. So just an understanding of what is the maintenance debt that you take on when you put something out. So a real learning experience.
[00:43:41] Unknown:
For people who are interested in being able to apply the capabilities of SHAP to the work that they're doing. I guess, how much overall domain knowledge and understanding of the principles of explainability is necessary, and what are some of the challenges that they face in being able to use SHAP most effectively and interpret and understand the outputs that it is generating and how that may be best used to feedback into the next iteration of their sort of exploratory development cycles?
[00:44:12] Unknown:
I think it's hard to know from the outside. Like, I've even seen studies on this, and it's hard to — like, sometimes people are like, oh, well, this is easier than this, but it's really, like, some subtle detail that was really the whole issue. And so then you're like, well, what have I really learned? I've just learned about, you know, this particular aspect, but then I could change that, and then I don't know. So I wouldn't wanna promise, if you know this, you're safe, or to say, like, you should never use it if you don't understand this body of knowledge. But I would say that it's important to know, first of all, what your features mean. If you don't know what your features mean, SHAP is not gonna be very helpful. It's not gonna be effective because it will tell you something's going on with feature 5, and you're like, great. Okay.
PCA 5 means something, I guess. I don't know. And so that's the clear kind of limitation: if you don't have any domain knowledge, or very little domain knowledge, of your features, then what your model is doing won't mean anything to you. So, yeah, most people who are developing models have at least some understanding of what their features are doing, even if they aren't, quote, the domain expert. Like, they can still look at it and be like, oh, it's the loan rate that's doing it — I don't know what loan rate is, but, like — so there's definitely some digging that will need to happen, but at least, you know, I think it's fair to use it, find out what the important ones are, and then learn about those. You don't have to, like, read all 100 first. I would also say that it is important to know what the alternative is. So whenever you're doing an explanation, people implicitly in their head work with alternatives all the time. Right? Like, when you go to work and you say, I was late, it's because I hit a red light, everyone assumes the alternative is a green light. Right? They don't assume the alternative was that I didn't get airlifted. Like, you know, the cops didn't come and, like, ride me in a caravan. You know? Like, there's lots of alternatives on how you could have gotten to work, but, like, people always have, like, a, you know, a typical alternative in their head. And you may have 1 in your head, and it's important to make sure that that's roughly what the model has and, you know, what the explainer has actually implemented. Because if they're wildly different, you could lead yourself astray. And that's essentially understanding the masker. So if you're running a tree model, by default, it can actually pull out the background training set from the leaf counts, and so you're essentially comparing it to the average.
So that's a typical thing. But if you forget that you're comparing to the average, then that could get you lost. Or if you're doing some really tricky model, like text or images, you need to know what is actually happening in the perturbations. Like, am I blurring chunks of the image? That's important to know. I guess the two last things I would say are, one, you need to know the output of your model. Just like you have to know the inputs of your model, if you don't know what the outputs are, then you're in trouble. And knowing what they mean can be fuzzy, right? Because a lot of people work with log odds even though they don't really know what log odds are. They're just sort of the thing that comes before the probability, before it gets squished. And so there are a lot of subtleties that can get washed over, and sometimes that's fine, but you just have to understand that your understanding is limited by your understanding of the output units that you're working with. And, of course, you should always consider the explanation error here. In other words, if you're just plotting a scatter plot and there's, like, tons of vertical variability, that means there's tons of interactions, and you should keep that in mind.
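[Editor's note: two of those checks, the output units and the vertical spread in a dependence scatter, might look roughly like this, assuming shap_values is an Explanation for a tree model and "LoanRate" is a hypothetical column name.]

```python
import shap

# Dependence scatter: heavy vertical spread at a fixed feature value
# signals interactions; color=shap_values picks a strong interaction
# partner to color the points by.
shap.plots.scatter(shap_values[:, "LoanRate"], color=shap_values)

# Output units: ask a tree explainer for attributions in probability
# space instead of raw log odds (this needs an explicit background and
# the interventional perturbation mode).
explainer = shap.TreeExplainer(
    model,
    X_background,
    feature_perturbation="interventional",
    model_output="probability",
)
prob_values = explainer(X_explain)
```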
One other thing I think will be helpful when people are thinking of context: oftentimes, when people train a model, they want to know how the world works, but that's not what you learn. What you learn when you're explaining a model is how your model is working, not necessarily how the world is working. So if I'm someone doing sales and marketing, and I wanna know how much my ad spending is worth, I could train a model that predicts, given how much I spend on ads, here's how much money I make. The model says x, but SHAP isn't a replacement for causal inference. Like, it's very useful for explaining heterogeneous treatment effects inside causal inference packages, but it does not magically turn an observational model into a causal model just because you know what it's doing. And so that's important to keep in mind. There is a close connection between SHAP, at least, and most other explanation packages and causality, where you actually are doing causal inference, but you're doing it on the model, not on the world. Right? Like, I am actually running a randomized controlled trial, but it's on my model's inputs. It's not on the actual world. So, you know, that's an important distinction that I have seen people kinda go astray on. In terms of the sort of alternatives, you mentioned, you know, there might be other libraries people are using to determine the sort of explanations of their model outputs. And I'm wondering if you can just kind of briefly give a sort of landscape, sort of 30,000 foot view
[00:48:28] Unknown:
of some of the different projects or approaches that people might use for explainability,
[00:48:32] Unknown:
you know, in lieu of or in addition to SHAP. So kind of categorize the field, I suppose. There's the model agnostic section of things. So there's, you know, LIME and SHAP. Those are model agnostic approaches, and they don't necessarily have to make assumptions about your input, output, or model type. Within that, I would say there have been some game theoretic approaches, there are the local linear approaches, and then there's what I would call, I don't know if classical or heuristic is the right word, things like partial dependence plots, for example, or standard global feature importance measures where you just, you know, permute a column in your data.
Many of those correspond to different choices that people make. And if you're thinking about feature attribution, I actually worked with a grad student at UW who did a nice overview of feature attribution and all the different choices, with, like, 30 different methods in there, and made a big table of: if you make these choices, you get these methods; you make those choices, you get those methods. So for people who want the details, I recommend that. And then, on the commercial side of things, there are implementations from a variety of startups, and also larger companies, who have sort of the whole ML life cycle in view.
And so they're interested in engaging folks in a way that helps them understand an ML life cycle. And some of those are very vertically focused. Right? So they could be like, I am going to build a, you know, very well tailored system for financial services products, and it outputs explanations of a particular type that will work well for that domain. And I'd say that normally comes with the significant benefit of people who know what they're doing in that domain. Of course, there's always lock in whenever you're considering non open source stuff, but I think there are a lot of good people doing great work there. So, you know, I thought about going that route at one point, like, building stuff and commercially supporting it. There's also, I'd say, a very large emphasis on building interpretable models, and I think that should always be somewhere people look as well. So there's really great work there. Rich Caruana is a coworker of mine who works a lot on generalized additive models and on InterpretML as a package. I've been involved there, with SHAP integrated in for some of the external models, but the core interpretable model offering is sort of, you know, a way to have zero explanation error: explainable boosting machines, essentially. And I think those are excellent directions that people should consider. And then, what else... there's a lot of, maybe model specific is a good way to say this, there's a lot of perturbation methods.
So Marco is another coworker of mine. He did LIME, and he's also done one called Polyjuice, which is kind of interesting, and, like, other counterfactual based explanations. They are based on, essentially, perturbing text and showing you potentially surprising, like, semantically equivalent changes that change the output of the model. So SHAP is all about summarizing lots of these counterfactual perturbations, but it could be that you actually just want to see specific ones. Right? That's also a helpful thing. So feature attribution is a summary of many counterfactuals, but there are lots of packages out there, particularly in, like, the NLP space, which I think are good at finding really interesting counterfactuals, and then they just show you one of them. It's very concrete. It doesn't give you as big a picture, maybe, but it gives you very concrete information about a specific behavior. Same thing for images. There's some fun stuff on, like, finding the closest image that will change the class while remaining, like, visually sane. Like, I'm not adding noise to the image; we're adding smooth, large scale changes to the image, and then you can see some really interesting sort of adversarial explanations, if you will.
That's really fun. There's stuff like Been Kim, a while back, did some work on, like, concept vectors and sort of latent representational spaces. So that's a really helpful concept, which isn't necessarily orthogonal to SHAP or any of the explanation methods, but it's a way to come up with concepts from things that would otherwise be distributed. We talked earlier about pixels maybe not being the right language, so this could be really helpful. There's also work on concept bottleneck models, where it's kind of an interpretable model again, where you're trying to create features that make sense from your data by anchoring them to some external labeling or something like that. And so
[00:52:52] Unknown:
in terms of your own work and your own usage of SHAP, I'm wondering what are some of the applications that you have found useful in the research that you're doing, if you're able to speak to that. Yeah. It changes over time. I would say, at this point, and probably for the last few years, I don't really ever build a model without
[00:53:09] Unknown:
applying SHAP to it, because it's just a big part of how I understand what the model is doing and how it's working. So, when I was doing a lot of the medical stuff, whenever we were working with collaborators, we'd be building models, I would be trying to explain those models, and that would inform downstream experiments or follow ups as we were essentially trying to understand the signal in the data. More recently, I've been working more on NLP models, finding errors in them, using them against each other, all sorts of different things. And in that case, whenever we're looking at, like, an overt mistake by one of these NLP systems, it's useful to just throw that into SHAP and make sure that the part of the sentence that I think is really driving this, right, like, is actually driving it. That's helpful.
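[Editor's note: a minimal sketch of that kind of spot check on a text model, assuming the transformers library is installed; the specific pipeline and sentence are illustrative only.]

```python
import shap
import transformers

# Illustrative text classifier; any pipeline-style callable works.
classifier = transformers.pipeline("sentiment-analysis", return_all_scores=True)

explainer = shap.Explainer(classifier)
shap_values = explainer(["The plot was thin but the acting saved it."])

# Token-level attributions, rendered inline in a notebook, so you can
# check whether the words you suspect are really driving the output.
shap.plots.text(shap_values[0])
```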
I've also used it to substructure datasets. So I've found it helpful to run SHAP, compute, you know, the driving effect on
[00:54:10] Unknown:
either the loss, and then sort of look into that per feature, or on the model output, and then just cluster by that. Then I find a chunk of data that has the same output for the same reason, and that's usually interesting. In your experience of using SHAP for your own work, and in working with the community of people who have started using it for their own applications of machine learning models and understanding the sort of motivations that go into the outputs, what are some of the most interesting or innovative or unexpected ways that you've seen it used?
[00:54:38] Unknown:
Well, let's see. There are always, like, the applications that are surprising, you know, when people say, hey, here's my paper on deep sea life diversity, and here's how SHAP quantifies, you know, the fish populations of the Pacific. That's really cool, and I would never have imagined it; that was not what I had in mind when I started this. That's kinda fun. Or, like, I think I mentioned earlier, I've seen fun applications where people are using it to kind of manage the boundary between automated and human decision making, you know, where they have a whole system that could make the decision, but people make the decisions too. So how do you inform the people with what the computer would have been doing? I've seen the sales example, you know, where salespeople are informed by it, and that was kind of fun. I've seen, like, MBA teams use it for modeling their systems, like, how can we improve, whatever.
I think maybe one of the most creative uses was one where they built a whole, like, data cleaning and anomaly detection system based on the feature attributions. That's pretty cool. People using it for fairness I thought was kinda interesting to look at, because you get a much richer understanding. A lot of the fairness rules here in the US, at least, are based on just straight up disparity. Like, that's the first flag: if the output of your model is starkly different between two protected classes, questions start being asked. And so I thought it was interesting to see a variety of startups do stuff where they would segment that disparity by feature. So you can be like, well, feature 5 is actually what we should be asking questions about. I thought that was kinda fun. In your experience
[00:56:12] Unknown:
of building the SHAP project and using it to explore the overall space of explainability in machine learning and AI, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:56:23] Unknown:
I've learned that theory is nice, but practice is a necessity, if that makes sense. So getting both is kinda hard sometimes. Sometimes things have seemed like they would, maybe I'm just too optimistic, I guess, but, like, this is gonna work great, and then it doesn't. So it's been really good to have a feedback loop; that's something I've really valued. You already mentioned, you know, managing code debt in good ways; it's a learning experience. I think I learned a lot about speed and how to handle it correctly, you know, how to trade off performance or not when you're writing a Python package. Do I write this whole thing in C++ and put a little binding on it, or write the whole thing in Python and not care about speed? And just seeing the different applications, because sometimes it just doesn't matter, and sometimes it does. So that was definitely a fun experience.
I've also been pleasantly surprised by, you know, how great it is to work on open source code. Like, it's a helpful community. People love to contribute stuff, and as bad as I am
[00:57:18] Unknown:
at, you know, being a full time maintainer, I think it's been, from my perception, a good thing for a lot of folks, and I've appreciated that. And for people who are interested in being able to gain more visibility into the motivations for the decisions that their models are making, what are the cases where SHAP is the wrong choice and they might be better suited choosing a different model architecture or using one of the other sort of libraries or approaches that you mentioned earlier? I would say that
[00:57:45] Unknown:
so going back to that example of the hackathon where there were two different goals. Right? Or at least as I saw it, I thought there were two. There's a value for sort of the information content of a feature, you know, how informative is this for my prediction. There's also the gradient: what's the closest action that would increase my score? Those are different goals. For the first one, I'd say, you know, SHAP is as good as many others for that application, probably better than mine was, at least. And for the other one, it would just not be the right answer; it's just not the question it's answering. So it would be the wrong explanation method if your goal is to find the nearest counterfactual that will lead to something. I would say there are other methods for that, like the DiCE method, I think, that's here at Microsoft, and Marco's done one on anchors, I believe. These are all counterfactual based things. Like, what would need to change, or not change, in order for the value to stay the same? So if you're looking for specific counterfactuals, that's not the purpose of feature attribution, so I think you should look somewhere else.
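[Editor's note: for that kind of nearest-counterfactual question, a library like DiCE is closer to the mark. This is a rough sketch based on its documented sklearn workflow, with train_df, model, query_df, and the column names as placeholders.]

```python
import dice_ml

# Placeholders: train_df is a DataFrame containing the outcome column
# "approved", and model is a fitted sklearn classifier.
data = dice_ml.Data(
    dataframe=train_df,
    continuous_features=["income", "loan_rate"],
    outcome_name="approved",
)
wrapped = dice_ml.Model(model=model, backend="sklearn")
explainer = dice_ml.Dice(data, wrapped, method="random")

# Ask: what small changes to this applicant would flip the decision?
cfs = explainer.generate_counterfactuals(
    query_df, total_CFs=3, desired_class="opposite"
)
cfs.visualize_as_dataframe(show_only_changes=True)
```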
I think I also mentioned this before, but it's the wrong tool if it causes you to fail to use another tool, even if they could be used well together, if that makes sense. So if SHAP makes you think you don't need to think about causal inference when you really do, then that would be a mistake. Not because it's not useful for your problem, but because it doesn't solve that question. That would be another case. And I don't know if it's quite a wrong tool question, but, like, you could certainly use it wrongly if you don't consider the explanation error and the impact that could have on your system, your debugging choices, or downstream applications. So one of the upsides of having an inherently interpretable or simple model is that you're forced to have zero explanation error, under the definition that the model is the explanation. You are very fixed to that model.
With a tool like SHAP, you are now free to stick anything you like in there. And if you don't care at all about the complexity of those models, and hence the explanation error of, you know, what you're interpreting, you could hide a lot in there if you aren't careful. I think there's a lot of value in that flexibility, but you have to recognize it is flexibility, and there is some cost to going up in complexity. There's also some cost, I should note, in going too simple, right? We demonstrated this in some of our papers: if you use a linear model when your data is not linear, that linear model will begin to depend on weird cancellation effects between your features in order to become slightly nonlinear, because those features are related in slightly nonlinear ways, and so it can do some weird stuff. And suddenly your model is interpretable because, you know, it's only got 10 coefficients.
But those coefficients, you have no idea what they really mean, because they're, you know, the cancellation of three nonlinear effects with a fourth one, and you didn't even know they were nonlinear to start with; it was just a dataset. And so there is a danger to going too simple if that simplicity does not match the real world, because then it forces your model to do weird things, and the discrepancy between the model and the world becomes so large that it becomes problematic.
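[Editor's note: a tiny toy illustration of that danger, not the exact setup from the papers: when the target is a pure interaction, the "interpretable" linear coefficients come out near zero and explain almost nothing.]

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 2))
y = X[:, 0] * X[:, 1]            # target driven entirely by an interaction

lin = LinearRegression().fit(X, y)
print(lin.coef_)                 # both coefficients are approximately 0
print(lin.score(X, y))           # R^2 is approximately 0
```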
[01:00:40] Unknown:
And as you continue to use SHAP and support the community that's grown up around it, what are some of the things that you have planned for the near to medium term? And are there any particular areas of help or contribution
[01:00:52] Unknown:
that you're looking for? Yeah. So this last year, here at Microsoft, we did their rotation program, and they were generous enough to have a whole team help out for a little bit on some of these NLP explanations and some image stuff too. So we have a lot of nice code in there. One of my plans is to write up a paper to document that, you know, and share it, and then also maybe write up a blog or something, just document it and share it with folks. Because I think there really is, you know, a very nice way to dig into a lot of NLP models there. There are a lot of other great alternatives out there too, but if you're looking for game theoretic approaches, I think this is one of the most accessible ones out there. So really getting that well documented would be good. We had also talked about trying to create this explanation object, which is sort of a set of parallel arrays, and it really applies to the explanations of all sorts of things. Like, when we were talking to the InterpretML people, they realized it applied directly to the output of their interpretable models, because their explanations are actually the same form as SHAP's, and so it's like, can we produce a common object? And I think doing a little bit more of that would be great.
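[Editor's note: a short sketch of that shared Explanation object, assuming shap_values came from any explainer, for example shap.TreeExplainer(model)(X), and that "Age" is a hypothetical column name.]

```python
# Parallel arrays: attributions, the baseline ("alternative"), and the
# raw inputs, all kept aligned in one object.
print(shap_values.values.shape)      # (rows, features) of attributions
print(shap_values.base_values[:3])   # expected model output per row
print(shap_values.data[:3])          # the inputs being explained

# Slicing keeps the arrays aligned, so downstream plots and tools can
# all consume the same object.
first_row = shap_values[0]
age_column = shap_values[:, "Age"]
```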
I also think, going forward, understanding your model is important, but so is understanding the data that goes into the model. Right? Like, there's a lot of structure in models, and it's great to understand that, but if you don't understand the structure in your data as well, you are flying a bit blind. So we added some stuff in SHAP where, like with the bar plot, you can create a hierarchical clustering of tabular data, at least, and it will plot that and show you the statistical redundancy, so, like, the predictive power redundancy between two features. And I think that is really helpful, because oftentimes a model is just arbitrarily choosing one feature over another because they're redundant, or there's a whole cluster that's really got a complex interaction.
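[Editor's note: a minimal sketch of that redundancy-clustered bar plot, with X, y, and shap_values as placeholders for your tabular data and an Explanation object.]

```python
import shap

# Hierarchical clustering of features by predictive-power redundancy,
# then a bar plot that groups redundant features together.
clustering = shap.utils.hclust(X, y)
shap.plots.bar(shap_values, clustering=clustering, clustering_cutoff=0.5)
```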
So I think going forward, I'd like to incorporate that more into the explanation process, so that people are not only looking at the model, but looking at the model with the data structure and context. I think that would be really helpful. In terms of assistance and help on things here, I feel like PRs are wonderful, and people have been doing a lot of those. But if anyone's interested in a more long term engagement where they could end up reviewing PRs and, you know, being the touch point for a chunk of the code, that's, I think, what would be the most valuable going forward. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And so with that, I'll move us into the picks. This week, I'm going to choose a movie I watched recently called Reminiscence.
[01:03:10] Unknown:
That's a recent one that came out with Hugh Jackman, and it's set in the near future where global warming has caused the waters to rise. A lot of people have decided to kind of descend into living in their memories as their way of passing the time and not having to deal with the realities of the current world, and there's a lot of interesting storyline that comes up around that. So it's definitely worth taking a look if you're looking for something to keep you entertained. With that, I'll pass it to you, Scott. Do you have any picks this week? I read Augustine's Confessions,
[01:03:39] Unknown:
and I was pleasantly just engaged with what it was like
[01:03:43] Unknown:
to kinda walk through intellectually with someone in a very different world, in a very different time frame. So I'll pick that. Well, thank you very much for taking the time today to join me and share the work that you're doing on SHAP and trying to help address the challenge of explainability for machine learning. It's definitely a very interesting problem area, and I appreciate all of the insight and thought that you've put into it. So I appreciate you taking the time on that, and I hope you enjoy the rest of your day. Thanks, Tobias. Appreciate it. Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast at dataengineeringpodcast.com, for the latest on modern data management.
And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host@podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction and Sponsor Message
Interview with Scott Lundberg
Scott Lundberg's Background and Introduction to Python
Overview of SHAP Library
Importance of Explainability in Machine Learning
Challenges in Explainability with Deep Learning
Regulatory Context and Explainability
Implementation Details of SHAP
Workflows and Applications of SHAP
Assumptions and Learnings from SHAP Development
Challenges and Best Practices in Using SHAP
Interesting Applications of SHAP
Future Plans and Community Contributions
Closing Remarks and Picks