5 comments

  • SirensOfTitan · 18 minutes ago
    I don't really think this meets the quality bar for posts here. This is obviously AI slop -- why should I invest more time reading your slop than you took to write it?

    Even so, at what point do we consider the LLM-ification of all of tech a hazard? I've seen Claude lazily "fix" a test by loosening its invariants. AI writes your code, AI writes your tests. Where is your human judgment?

    Someone is going to lose money or get hurt by this level of automation. If the humans on your team cannot keep track of the code being committed, then I would prefer not to use your product.
  • elteto · 53 minutes ago
    I think this is the first article that truly gave me “slop nausea”. So many “It’s not X. It’s Y.” Do people not realize how awful this reads? It’s not a novel either, just a few thousand words, just fucking write it and edit it yourself.
    • zeristor · 46 minutes ago
      I'm guessing they have a workflow for blog posts too. With 100k workflows, something seems a bit weird.
  • simianwords · 1 hour ago
    Interesting that they have an agent that is triggered on flaky CI failures, but it seems far too specific -- you could open pull requests on many other triggers.

    There doesn't seem to be any upside to having it only for flaky tests, because the workflow is really agnostic to the context.
  • zX41ZdbW · 1 hour ago
    Two problematic statements in this article:

    1. A test pass rate of 99.98% is not good -- the only acceptable rate is 100%.

    2. Tests should not be quarantined or disabled. Every flaky test deserves attention.
    • lab14 · 1 hour ago
      A test pass rate of 100% is a fairy tale -- maybe achievable on toy or dormant projects, but real-world applications that make money are messier than that.
        • alkonaut · 21 minutes ago
          I definitely have a 100% pass rate on our tests most of the time (in master, of course). By "most of the time" I mean that on any given day, you should be able to run the CI pipeline 1000 times and it would succeed in all of them, never finding a flaky test in one or more runs.

          In the rare case that one is flaky, it's addressed. During the days when there is a flaky test, of course you don't have a 100% pass rate, but on those days fixing it is a top priority.

          But importantly: this is library and thick-client code. It should be deterministic. There are no DB locks, Docker containers, network timeouts or similar involved. I imagine that in tiered application tests you always run the risk of the various layers not cooperating -- even worse if you involve any automation/UI in the mix.

          Obviously there are systems it depends on (source control, package servers) which can fail, failing the build. But that's not a _test_ failure.

          If the build fails, it should be because a CI machine or a service the build depends on failed, not because an individual test randomly failed due to a race condition, timeout, test-run-order issue or similar.
          • salomonk_mur · 18 minutes ago
            If one is flaky, then you are below 100%, friend.
            • alkonaut · 3 minutes ago
              That's not what I mean. I mean that anything but 100% is a "stop the world, this is unacceptable" kind of event. So if there is a day when there is a flaky test, it must be rare.

              To explain further: there is a difference between having 99.99% of tests pass every day (unacceptable), which is also 99.99% of tests passing for the year, versus having 100% of tests passing on 99% of days and 99% on a single bad day. That might also give a 99.99% pass rate for the year, but here you were productive on 99/100 days. So "100.0% is the normal" is what I mean -- not that it's a 100% pass on 100% of days.

              Having 99.98% of tests pass on any random build is absolutely terrible. It means a handful of tests out of your suite fail on almost _every single CI run_. If you require a 100% test pass to validate PRs before merge, that means you'll never merge. If you require a 100% test pass to validate deploying your main branch, that means you'll never deploy...

              You want a 100% pass on 99% of builds. Then it doesn't matter whether 1% or 99% of tests pass on the last build, so long as you have some confidence that "almost all builds pass entirely green".
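The per-test vs. per-build distinction in the comment above can be sketched with a quick back-of-the-envelope calculation (the 5,000-test suite size is a hypothetical assumption, and test failures are assumed independent):

```python
# Per-test pass rate vs. per-build pass rate, assuming independent failures.
# The suite size of 5,000 is a hypothetical assumption for illustration.
per_test_pass = 0.9998
suite_size = 5000

# Probability that every test in a single build passes:
p_build_green = per_test_pass ** suite_size
print(f"chance of an all-green build: {p_build_green:.1%}")  # roughly 37%
```

At a 99.98% per-test rate, roughly two out of three builds contain at least one failure, which is why the same yearly average can describe either a healthy suite or an unusable one.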
        • _heimdall · 14 minutes ago
          When I was at Microsoft, my org had a 100% pass rate as a launch gate. It was never expected that you would keep 100%, but we did have to hit it once before we shipped.

          I always assumed the purpose was leadership wanting an indicator implying that someone had at least looked at every failing test.
    • YetAnotherNick · 32 minutes ago
      Even something as simple as docker pull fails for 0.02% of the time.
    • rkomorn · 1 hour ago
      On top of 2., new tests should be stress-tested to make sure they aren't flaky, so that the odds of merging a flaky test go down.
      • lionkor · 1 hour ago
        I can run flaky tests on my machine a thousand times without failure, whereas they fail in CI sometimes.
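The stress-testing idea above can be sketched as a small harness. The test function, failure rate, and run count here are all hypothetical, and, as the reply notes, a real setup would rerun the candidate test inside the CI environment rather than locally, since local runs can hide CI-only flakiness:

```python
import random

def new_test():
    # Hypothetical candidate test under review: fails ~1% of runs at random,
    # standing in for a real test with a race condition or timeout.
    return random.random() > 0.01

def flake_count(test, runs=1000):
    """Run a candidate test many times; any failure marks it as flaky."""
    return sum(1 for _ in range(runs) if not test())

failures = flake_count(new_test)
if failures:
    print(f"flaky: {failures}/1000 runs failed -- do not merge")
```

A deterministic test would report zero failures over any number of runs; anything above zero is grounds to reject the merge rather than retry until green.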
  • IshKebab · 1 hour ago
    > Every commit to main triggers an average of 221 parallel jobs

    Jesus, this is why Bazel was invented.