Why is base value in shap score plot different for different inputs?

Question:

I am trying to compute SHAP scores with the shap Python package, following the example at the same link.

data['text'][:3] gives me three examples:

['i didnt feel humiliated', 'i can go from feeling so hopeless to so
damned hopeful just from being around someone who cares and is awake',
'im grabbing a minute to post i feel greedy wrong']

I run the emotion classifier from that example and get these SHAP plots:

shap plots

My question: given that I selected the "sadness" class in all 3 plots, why is the base value different in each of them?

I wanted to understand how the base value is obtained and went through the following links:

  1. https://datascience.stackexchange.com/questions/73553/how-is-the-base-value-of-shap-values-calculated
  2. https://medium.com/@makcedward/shap-will-provide-both-base-value-and-output-value-bfe2339edd44

My understanding is that the base value (for a given class) is the average prediction score for that class across all training samples. Given that, shouldn't it be the same across the 3 test samples shown in the image, since the training data is fixed for the model?

I want to understand why the base values are different here. Thanks!

Asked By: Prasanjit Rath


Answers:

The base value is obtained by masking all the tokens of the input sentence. For example, the sentence "I love go karting" becomes "[MASK] [MASK] [MASK] [MASK]" (assuming tokenization yields 4 tokens) and is passed to the model. The logit or probability score obtained for each class becomes the base value for that example and that class.

In this sense, the base value is the model's output when nothing is known about the input.

Why does it differ between examples? Because two examples can produce masked sentences with different numbers of [MASK] tokens. For example, the text "I know" can lead to "[MASK] [MASK]". The model produces different numbers for it, because "[MASK] [MASK]" is a different input from "[MASK] [MASK] [MASK] [MASK]". Since the inputs aren't the same, the outputs aren't the same either (though they will differ only slightly).
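To make the length effect concrete, here is a minimal toy sketch in plain Python. It does not use the shap package or a real transformer: the `TOKEN_WEIGHTS`, `BIAS`, `sadness_probability` and `base_value` names are all made up for illustration, and a whitespace split stands in for a real tokenizer. It only shows that scoring a fully masked input gives a value that depends on how many tokens were masked.

```python
import math

MASK = "[MASK]"

# Hypothetical per-token contributions to a "sadness" logit.
# These numbers are invented for illustration, not taken from any real model.
TOKEN_WEIGHTS = {"hopeless": 2.0, "humiliated": 1.5, "hopeful": -1.0, MASK: -0.3}
BIAS = -1.0


def sadness_probability(tokens):
    """Toy stand-in for a classifier: sigmoid of summed token weights plus a bias."""
    logit = BIAS + sum(TOKEN_WEIGHTS.get(tok, 0.0) for tok in tokens)
    return 1.0 / (1.0 + math.exp(-logit))


def base_value(text):
    """Score the input with every token replaced by the mask token."""
    tokens = text.split()  # assumption: whitespace split instead of a real tokenizer
    return sadness_probability([MASK] * len(tokens))


short = base_value("I know")             # scored as "[MASK] [MASK]"
long = base_value("I love go karting")   # scored as "[MASK] [MASK] [MASK] [MASK]"
print(short, long)
```

Two texts with the same token count get the same base value in this sketch, and texts with different token counts do not. With the real package, the per-example base values should be readable from the `base_values` attribute of the explanation object returned by the explainer, so you can compare them against the token counts of your own inputs.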

base value

base value2

Answered By: Prasanjit Rath