3 comments
Incidentally, Chroma also produced the single best study on long-context degradation that I've come across:<p><a href="https://research.trychroma.com/context-rot" rel="nofollow">https://research.trychroma.com/context-rot</a><p>Before that, I constantly cited NoLiMa (<a href="https://www.reddit.com/r/LocalLLaMA/comments/1io3hn2/nolima_longcontext_evaluation_beyond_literal/" rel="nofollow">https://www.reddit.com/r/LocalLLaMA/comments/1io3hn2/nolima_...</a>) to illustrate how performance on difficult tasks involving reasoning or multi-step information gathering degrades much faster than the needle-in-a-haystack benchmarks cited by the major labs would suggest. Now Chroma is the first stop. Nice job on the research!
Very cool!