4 comments

  • duskwuff 3 hours ago
    Before you get too excited, keep two things in mind:

    1) Using a single compression context for the whole stream means you have to keep that context active on the client and server while the connection is active. This may have a nontrivial memory cost, especially at high compression levels. (Don't set the compression window any larger than it needs to be!)

    2) Using a single context also means that you can't decompress one frame without having read the whole stream that led up to it. This prevents some potentially useful optimizations if you're "fanning out" messages to many recipients: if you compress each message individually, you can compress it once and send the same compressed message to every recipient.
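A rough sketch of the trade-off described above, assuming Python's `zlib` with raw deflate as a stand-in for whatever codec the original post uses. The sample messages and helper names are made up for illustration.

```python
import zlib

# Hypothetical sample messages; any repetitive payload behaves similarly.
messages = [
    b'{"type":"tick","symbol":"ABC","price":101.5}',
    b'{"type":"tick","symbol":"ABC","price":101.6}',
    b'{"type":"tick","symbol":"ABC","price":101.7}',
]

def compress_independent(msgs):
    """Fresh context per message: every frame decodes on its own."""
    return [zlib.compress(m) for m in msgs]

def compress_shared(msgs):
    """One context for the whole stream: later frames reference earlier data."""
    comp = zlib.compressobj(wbits=-15)  # raw deflate, no zlib header
    # Z_SYNC_FLUSH emits everything buffered so far but keeps the history window.
    return [comp.compress(m) + comp.flush(zlib.Z_SYNC_FLUSH) for m in msgs]

def decompress_shared(frames):
    """The receiver must keep its own context alive and see frames in order."""
    decomp = zlib.decompressobj(wbits=-15)
    return [decomp.decompress(f) for f in frames]

def decodes_alone(frame, expected):
    """Try to decode one frame with a brand-new context."""
    try:
        return zlib.decompressobj(wbits=-15).decompress(frame) == expected
    except zlib.error:
        return False

independent = compress_independent(messages)
shared = compress_shared(messages)
```

With the shared context, later frames shrink because they can point back at earlier messages, but a frame taken out of the middle no longer decodes by itself. The independent frames are larger, yet the same bytes can be sent to every recipient without per-connection state.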
    • adzm 2 hours ago
      The analogy to H.264 in the original post is very relevant. You can fix some of the downsides by using the equivalent of keyframes, basically. That is still a longer context than a single message, but the stream can be broken up at those points for recovery and the like.
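One way the keyframe idea could look, sketched with Python's `zlib` (an assumption; the original post may use a different codec). `Z_FULL_FLUSH` resets the compressor's history, so the next frame has no back-references and a fresh decoder can start there. The function names and the `interval` parameter are hypothetical.

```python
import zlib

def compress_with_keyframes(msgs, interval):
    """Shared-context compression where every `interval`-th frame is a keyframe."""
    comp = zlib.compressobj(wbits=-15)  # raw deflate, no zlib header
    frames = []
    for i, msg in enumerate(msgs):
        # A full flush after this frame makes the NEXT frame a keyframe.
        last_before_keyframe = (i + 1) % interval == 0
        mode = zlib.Z_FULL_FLUSH if last_before_keyframe else zlib.Z_SYNC_FLUSH
        frames.append(comp.compress(msg) + comp.flush(mode))
    return frames

def decompress_from(frames, start):
    """Decode with a fresh context, starting at frame index `start`.

    `start` must be a keyframe index (a multiple of the interval).
    """
    decomp = zlib.decompressobj(wbits=-15)
    return [decomp.decompress(f) for f in frames[start:]]
```

For example, with `interval=3` a late joiner can begin at frame 0, 3, 6, and so on, at the cost of slightly worse compression right after each reset.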
  • masklinn 33 minutes ago
    Surely that is obvious to anyone who has compared zip and tgz? As long as you’re not affected by the drawbacks obviously.
  • lambdaloop 3 hours ago
    Does streaming compression work if some packets are lost or arrive in a different order? Seems like the compression context may end up different on the encoding/decoding side... or is that handled somehow?
    • gkbrk 2 hours ago
      WebSockets [1] run over TCP, and the messages are ordered.

      There is RFC 9220 [2] that makes WebSockets go over QUIC (which is UDP-based). But that's still expected to expose a stream of bytes to the WebSocket, which still keeps the ordering guarantee.

      [1]: https://datatracker.ietf.org/doc/html/rfc6455

      [2]: https://datatracker.ietf.org/doc/rfc9220/
    • dgoldstein0 3 hours ago
      I think the underlying protocol would have to guarantee in-order delivery, either via TCP (for HTTP/1, HTTP/2, or SPDY) or, in HTTP/3, within a single stream.
    • duskwuff 3 hours ago
      It sounds as though the data is being transferred over HTTP, so packet loss/reordering is all handled by TCP.
      • dgoldstein0 56 minutes ago
        Yes, or by HTTP/3's in-order guarantees on the individual streams (as HTTP/3 runs over UDP).
  • efitz 2 hours ago
    When I worked at Microsoft years ago, my team (a developer and a tester) and I built a high-volume log collector.

    We used a streaming compression format that was originally designed for IBM tape drives.

    It was fast as hell and worked really well; it was gentle on CPU, and it was easy to control memory usage.

    In the early 2000s, on a modest 2-proc AMD64 machine, we ran out of Fast Ethernet way before we felt CPU pressure.

    We got hit by the SOAP mafia during Longhorn; we couldn’t convince the web services to adopt it; instead they made us enshittify our “2 bytes length, 2 bytes msgtype, structs-on-the-wire” speed demon with their XML crap.
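For readers who have not seen that style of framing, here is a minimal sketch of a "2 bytes length, 2 bytes msgtype" header using Python's `struct`. The comment does not give the real layout, so the little-endian byte order, the length counting only the payload, and the function names are all assumptions.

```python
import struct

# Assumed layout: uint16 payload length, then uint16 message type, little-endian.
HEADER = struct.Struct("<HH")

def pack_frame(msgtype, payload):
    """Prefix a payload with the 4-byte header."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length field")
    return HEADER.pack(len(payload), msgtype) + payload

def iter_frames(buf):
    """Yield (msgtype, payload) for each complete frame in a byte buffer."""
    offset = 0
    while offset + HEADER.size <= len(buf):
        length, msgtype = HEADER.unpack_from(buf, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buf):
            break  # incomplete trailing frame; wait for more bytes
        yield msgtype, buf[start:end]
        offset = end
```

Parsing is a fixed-size header read followed by a slice, which is a large part of why formats like this are cheap on CPU compared with text-based encodings.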