7 comments

  • Sean-Der 21 minutes ago
    Amazing debugging, I loved reading that. HN doesn't get enough good posts like this anymore :)
    If https://github.com/pion/sctp/issues/12 had happened (not just in Pion but across all implementations) this could have been fixed years ago. The hardcoding we all settle for is tragic.
    • syllogistic 6 minutes ago
      Author here, thank you, that means a lot coming from you. Pion was the prior art I pointed the webrtc-rs maintainers at. And pion/sctp#12 is super relevant. A known, proposed fix years before we hit it.
      "The hardcoding we all settle for" might be the epigraph for the whole incident. webrtc-rs invited a PR for the configurable-MTU + better default half [webrtc-rs/webrtc#806] to unblock folks today. Whether PMTUD gets implemented will be interesting to see.
  • inigyou 43 minutes ago
    I don't understand how a product as popular as Tailscale can get this far while dropping certain ordinary types of packets.
    It is impossible to parse the UDP or TCP port number out of a non-first fragment. This is surely the reason the ACL module entirely rejects them. TCP will adjust its segment size based on PMTUD so as to not require fragmentation. This is why it hasn't been noticed so far. But fragmented UDP packets are a corner case of normal behavior and it boggles the mind that someone could just decide to completely drop them.
    UDP fragment filtering could be implemented by a global fragments on/off setting (works for "allow everything" = fragments on, cautious = fragments off) or by blocking the first fragment which includes the port number (and blocking it if the port number is split across fragments which I think is technically allowed but completely abnormal).
    • syllogistic 28 minutes ago
      Author here,
      Agreed. The port-number point is the most plausible rationale I've heard, more convincing than the RFC line in their source comment. The historical fix for "can't classify fragments" was virtual reassembly or flow tracking [conntrack on linux, scrub in pf], so dropping them outright punts past known prior approaches. Even your lighter idea would have saved us: a first-fragment match would have let our pair through.
      We've reported upstream to both projects, tailscale/tailscale#20083 and webrtc-rs/webrtc#806, and webrtc-rs already invited a PR.
      • inigyou 17 minutes ago
        You are shadowbanned.
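The first-fragment idea discussed in this subthread can be sketched roughly as follows. This is a minimal illustration, not Tailscale's filter: the function names, the `"accept"`/`"drop"` return strings, and the allow-list parameter are all hypothetical, and it assumes the Fragment header directly follows the fixed 40-byte IPv6 header with no other extension headers in between.

```python
import struct

IPV6_HEADER_LEN = 40
FRAGMENT_HEADER_LEN = 8
NEXT_HEADER_FRAGMENT = 44  # IPv6 Fragment extension header
NEXT_HEADER_UDP = 17


def filter_ipv6_fragment(packet: bytes, allowed_udp_ports: set) -> str:
    """Hypothetical first-fragment filter: decide on the fragment that carries the ports."""
    if len(packet) < IPV6_HEADER_LEN:
        return "drop"
    if packet[6] != NEXT_HEADER_FRAGMENT:
        return "not-a-fragment"  # leave it to the normal (unfragmented) rules
    if len(packet) < IPV6_HEADER_LEN + FRAGMENT_HEADER_LEN:
        return "drop"

    frag = packet[IPV6_HEADER_LEN:IPV6_HEADER_LEN + FRAGMENT_HEADER_LEN]
    upper_proto, _reserved, offset_and_flags, _ident = struct.unpack("!BBHI", frag)
    fragment_offset = offset_and_flags >> 3  # in 8-byte units; low bit is the M flag

    if fragment_offset != 0:
        # Later fragments carry no ports. They are useless to the receiver
        # unless the first fragment was also let through.
        return "accept"
    if upper_proto != NEXT_HEADER_UDP:
        return "drop"  # this sketch only classifies UDP

    start = IPV6_HEADER_LEN + FRAGMENT_HEADER_LEN
    ports = packet[start:start + 4]
    if len(ports) < 4:
        return "drop"  # ports split across fragments: allowed on paper, abnormal in practice
    _src_port, dst_port = struct.unpack("!HH", ports)
    return "accept" if dst_port in allowed_udp_ports else "drop"


def build_fragment(offset_units: int, more: bool, dst_port: int) -> bytes:
    """Build a synthetic IPv6 fragment for experimenting with the filter above."""
    body = struct.pack("!HHHH", 40000, dst_port, 24, 0) + bytes(16)
    frag = struct.pack("!BBHI", NEXT_HEADER_UDP, 0, (offset_units << 3) | int(more), 0x1234)
    ipv6 = struct.pack("!IHBB", 6 << 28, len(frag) + len(body), NEXT_HEADER_FRAGMENT, 64) + bytes(32)
    return ipv6 + frag + body
```

A stateless check like this is weaker than virtual reassembly or flow tracking, since later fragments pass unclassified; it relies on the receiver discarding them when the first fragment was dropped.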
  • hylaride 27 minutes ago
    I'm having flashbacks to 1990s-era PPPoE, where the slightly smaller MTU had issues with some server OSes whose TCP/IP stacks didn't support or ignored MTUs smaller than 1500 bytes, so bulk data transfers would get messed up. I don't remember which ones, but it was some commercial UNIX.
  • katericksonnow 38 minutes ago
    MTU black holes are the worst because every health check is small enough to survive.
  • syllogistic 1 hour ago
    Author here.
    This started as a blank page on one device and ended two weeks later at the intersection of two bugs: webrtc-rs hardcodes INITIAL_MTU=1228 [never updated, no path probing, retransmits at the same size forever], and Tailscale's packet filter classifies any IPv6 packet with a Fragment header as unknown protocol, so the default deny fires. On every platform, counted under reason="acl". Neither is unreasonable alone. Together: silent wedge, every health check green, because everything that tests the path is small and only the payload fragments. Two-command repro on any tailnet: ping -s 100 works, ping -s 1400 over the Tailscale IPv6 address is 100% loss. Full WebRTC repro and captures: https://github.com/phact/mtu-webrtc-bug. We've reported upstream to both projects https://github.com/tailscale/tailscale/issues/20083 and https://github.com/webrtc-rs/webrtc/issues/806. Happy to answer questions. Especially interested if anyone knows the history behind the IPv6 fragment decision in Tailscale's filter.
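A back-of-the-envelope sketch of why a fixed 1228-byte size can end up fragmenting on a small-MTU tunnel. Only the 1228 figure comes from the thread; everything else here is an assumption for illustration: that 1228 is the size of one SCTP packet handed to DTLS, that DTLS adds about 37 bytes (13-byte record header plus an AEAD nonce and tag), and that the tunnel interface MTU is 1280. The function names are hypothetical.

```python
IPV6_HEADER = 40
UDP_HEADER = 8
DTLS_OVERHEAD = 37       # assumption: 13-byte record header + 24 bytes of AEAD overhead
SCTP_PACKET_SIZE = 1228  # the hardcoded INITIAL_MTU quoted in the thread
TUNNEL_MTU = 1280        # assumption: MTU of the overlay interface


def ip_packet_size(sctp_packet_size: int, dtls_overhead: int = DTLS_OVERHEAD) -> int:
    """Size on the wire of one SCTP packet wrapped in DTLS, UDP and IPv6."""
    return sctp_packet_size + dtls_overhead + UDP_HEADER + IPV6_HEADER


def needs_fragmentation(sctp_packet_size: int, link_mtu: int) -> bool:
    """True when the wrapped packet no longer fits in a single IPv6 packet."""
    return ip_packet_size(sctp_packet_size) > link_mtu


def max_sctp_packet_size(link_mtu: int, dtls_overhead: int = DTLS_OVERHEAD) -> int:
    """Largest SCTP packet that still fits unfragmented under these assumptions."""
    return link_mtu - IPV6_HEADER - UDP_HEADER - dtls_overhead


if __name__ == "__main__":
    print(ip_packet_size(SCTP_PACKET_SIZE))
    print(needs_fragmentation(SCTP_PACKET_SIZE, TUNNEL_MTU))
    print(max_sctp_packet_size(TUNNEL_MTU))
```

Under these assumed overheads a full-size packet overshoots a 1280-byte link while fitting comfortably in 1500, which matches the pattern of small probes passing and only the payload fragmenting. The exact numbers depend on the cipher suite and on any TURN or tunnel framing, so treat them as illustrative.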