1 comments

  • sdenton41 hour ago
    Depending on how &#x27;one-off&#x27; the query is, sequential read is the right answer. The alternative is indexing the data for ANN, which will generally require doing the equivalent of many queries across the dataset.<p>On the bright side, smart folks have already thought pretty hard about this. In my work, I ended up picking usearch for large-scale vector storage and ANN search. It&#x27;s plenty fast and is happy working with vectors on disk - solutions which are &#x2F;purely&#x2F; concerned with latency often don&#x27;t include support for vectors on disk, which forces you into using a hell of a lot of RAM.<p><a href="https:&#x2F;&#x2F;github.com&#x2F;unum-cloud&#x2F;USearch" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;unum-cloud&#x2F;USearch</a>