Show HN: ZSE – Open-source LLM inference engine with 3.9s cold starts

44 points by zyoralabs 6 hours ago

2 comments

reconnecting 8 minutes ago
Discussion on Reddit: <a href="https://www.reddit.com/r/LocalLLaMA/comments/1rewis9/removed_by_moderator/" rel="nofollow">https://www.reddit.com/r/LocalLLaMA/comments/1rewis9/removed...</a>
medi_naseri 4 hours ago
This is so freaking awesome. I am working on a project trying to run 10 models on two GPUs, and loading/offloading is the only solution I have in mind.<p>Will try getting this deployed.<p>Are the advertised cold-start timings for the case where no other model is loaded on the GPUs?