Is there a reason people don't use SHAP [1] to interpret language models more often? The in-context attribution of outputs seems very similar.<p>[1] <a href="https://shap.readthedocs.io/en/latest/" rel="nofollow">https://shap.readthedocs.io/en/latest/</a>
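For anyone unfamiliar with what SHAP actually computes: each input feature (here, a token) gets the average marginal contribution it makes to the model's output across all subsets of the other features. The SHAP library approximates this efficiently for real models, but the idea fits in a few lines. Below is a minimal exact-Shapley sketch over a toy scoring function (`toy_score` is a made-up stand-in for a model, not anything from SHAP itself):

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values: each feature's average marginal
    contribution to value_fn, weighted over all coalition sizes."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for k in range(n):
            for subset in combinations(others, k):
                # Standard Shapley coalition weight: |S|! (n-|S|-1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = value_fn(frozenset(subset) | {f})
                without_f = value_fn(frozenset(subset))
                phi[f] += weight * (with_f - without_f)
    return phi

# Toy "model": output is 1.0 only if both key tokens are present.
def toy_score(present_tokens):
    return 1.0 if {"crispr", "gene"} <= present_tokens else 0.0

attributions = shapley_values(["crispr", "gene", "the"], toy_score)
# "crispr" and "gene" split the credit (0.5 each); "the" gets 0.
```

The exact computation is exponential in the number of features, which is one practical obstacle for long LLM contexts; the SHAP library's sampling-based explainers exist precisely to work around that.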
Maybe I’m not creative enough to see the potential, but what value does this bring?<p>Given the example I saw about CRISPR, what does this model's output give me over that of a different, non-explaining model?
Does it really make me more confident in the output if I know the data came from Arxiv or Wikipedia?<p>I find that LLM outputs are subtly wrong, not obviously wrong.
This is very interesting. I don't see much discussion of interpretability in the day-to-day discourse of AI builders. I wonder if everyone assumes it is either solved, or too out of reach to bother stopping and thinking about.
Now this is something which is <i>very</i> interesting to see and might be the answer to the explainability issue with LLMs, which could unlock a lot more use cases that are currently off limits.<p>We'll see.