The DLE-escaping problem the article mentions — where 0x10 bytes in PCM audio had to be doubled to 0x10 0x10 — is the same class of bug that plagued the Hayes +++ escape sequence in data mode, and it is striking that it was never properly solved. The fundamental issue is that the Smartmodem architecture multiplexed control and bearer on a single serial channel, and every framing scheme for carrying an 8-bit-clean payload over an async serial link pays for it one way or another: in-band escaping (DLE doubling, SLIP-style byte stuffing) is fragile and adds overhead proportional to the density of escape bytes in the payload, while schemes like COBS bound the worst-case overhead (roughly one byte per 254) but require buffering the stream into delimited frames, which costs latency on a real-time voice path.

The multi-UART solution that modern cellular modems adopted — separate serial channels for control, bearer, GNSS, debug, etc. — is really the only correct answer, and it is interesting that it took nearly two decades to become standard practice. What Harald Welte describes in his 2017 post is not so much an "ugly hack" as the inexorable gravitational pull of the Hayes architecture: once you have committed to AT commands over serial as your control plane, every new capability, including real-time voice, must be shoehorned into that same channel, and the result is always going to look awkward. The persistence of this design through winmodems, ISDN terminal adapters, and into modern 5G USB dongles is a textbook case of interface lock-in outweighing architectural fitness.

On the IVR side, it is worth noting that Asterisk's early versions (circa 1999) supported voice modems via its chan_modem driver before the project pivoted hard to VoIP and never looked back — Mark Spencer reportedly called the voice modem support one of the most painful integrations in Asterisk's history, which says something about the state of V.253 interoperability in practice.
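To make the fragility concrete, here is a minimal sketch of the V.253-style DLE shielding described above: a literal 0x10 in the audio is doubled on the wire, and an unescaped <DLE><x> pair carries an in-band control event such as <DLE><ETX> for end-of-stream. The function names and the trailing-DLE handling are my own illustration, not taken from any particular modem firmware.

```python
# V.253-style <DLE> shielding sketch: 0x10 in the payload is doubled,
# and <DLE><other> pairs are in-band control events (e.g. <DLE><ETX>
# marks end of the voice data stream).
DLE = 0x10
ETX = 0x03

def dle_escape(pcm: bytes) -> bytes:
    """Sender side: double every 0x10 so it can't be read as control."""
    out = bytearray()
    for b in pcm:
        out.append(b)
        if b == DLE:
            out.append(DLE)  # 0x10 -> 0x10 0x10
    return bytes(out)

def dle_unescape(stream: bytes):
    """Receiver side: undo the doubling, collect <DLE><x> control events."""
    audio, events = bytearray(), []
    i = 0
    while i < len(stream):
        b = stream[i]
        if b != DLE:
            audio.append(b)
            i += 1
        elif i + 1 < len(stream) and stream[i + 1] == DLE:
            audio.append(DLE)  # escaped literal data byte
            i += 2
        else:
            # unescaped <DLE><x> is a control pair; a trailing lone
            # DLE (a truncated pair) is silently dropped here
            if i + 1 < len(stream):
                events.append(stream[i + 1])
            i += 2
    return bytes(audio), events
```

Note how much state the decoder carries for such a simple scheme: a single lost or corrupted byte can flip the parser's sense of which 0x10s are data and which are control, which is exactly the fragility the in-band approach is criticized for above.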