Is there a reason people don't use SHAP [1] to interpret language models more often? The in-context attribution of outputs seems very similar.<p>[1] <a href="https://shap.readthedocs.io/en/latest/" rel="nofollow">https://shap.readthedocs.io/en/latest/</a>
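For anyone unfamiliar with what SHAP actually computes: each input feature (here, a token) gets the average marginal contribution it makes to the model's output across all subsets of the other features. The SHAP library approximates this efficiently for real models, but the idea fits in a few lines. Below is a minimal exact-Shapley sketch over a toy scoring function (`toy_score` is a made-up stand-in for a model, not anything from SHAP itself):

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values: each feature's average marginal
    contribution to value_fn, weighted over all coalition sizes."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for k in range(n):
            for subset in combinations(others, k):
                # Standard Shapley coalition weight: |S|! (n-|S|-1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = value_fn(frozenset(subset) | {f})
                without_f = value_fn(frozenset(subset))
                phi[f] += weight * (with_f - without_f)
    return phi

# Toy "model": output is 1.0 only if both key tokens are present.
def toy_score(present_tokens):
    return 1.0 if {"crispr", "gene"} <= present_tokens else 0.0

attributions = shapley_values(["crispr", "gene", "the"], toy_score)
# "crispr" and "gene" split the credit (0.5 each); "the" gets 0.
```

The exact computation is exponential in the number of features, which is one practical obstacle for long LLM contexts; the SHAP library's sampling-based explainers exist precisely to work around that.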
Maybe I’m not creative enough to see the potential, but what value does this bring?<p>Given the example I saw about CRISPR, what does this model's output give me over that of a different, non-explaining model?
Does it really make me more confident in the output if I know the data came from Arxiv or Wikipedia?<p>I find that LLM outputs are subtly wrong, not obviously wrong.
This is very interesting. I don't see much discussion of interpretability in the day-to-day discourse of AI builders. I wonder if everyone assumes it is either solved, or too out of reach to bother stopping and thinking about.
Now this is something which is <i>very</i> interesting to see and might be the answer to the explainability issue with LLMs, which could unlock a lot more use cases that are currently off limits.<p>We'll see.