14 comments

  • Deeds673 hours ago
    To be honest, the official superpowers/brainstorming skill already does TDD so well, I don't see that much of a need for this. TDD is definitely the way to go with agentic development.
  • shruubi4 hours ago
    Two questions
    1) Do you not feel self-conscious or weird about calling this "EvanFlow"? Seems like a lot of people these days are naming their AI tools/skills/whatever after themselves which seems self-absorbed. Either that or they hope that if their thing takes off like OpenClaw did then they'll grab the fame that comes along with it.
    2) Why does your TDD flow miss the refactor step of TDD?
    • toyg1 hour ago
      I initially thought it was a pun on Pearl Jam's classic "Even Flow", then I read your comment and noticed the username... Sad.
    • wenc4 hours ago
      I feel like 1 is a self correcting problem. If this goes nowhere it will soon be forgotten.
      I can think of one example that did go somewhere: Linux.
    • normie30004 hours ago
      Ref 1, he should have called it Daughter.
    • ButlerianJihad1 hour ago
      "Evenflo is a hundred year old infant feeding brand." Probably named to market its baby bottles and accessories.
      Everybody who grew up to listen to Pearl Jam had seen or used an Evenflo pacifier, baby bottle, or car seat. That's one reason the song already sounded so familiar.
  • s20n5 hours ago
    EvanFlow - thoughts arrive like butterflies?
    • sbseitz5 hours ago
      Oh, he don't know, so he chases them away
    • ge962 hours ago
      Seeeethinnggg tests failing not complete... again
    • __mharrison__3 hours ago
      Someday soon he'll begin his life again
  • nghnam54 minutes ago
    superpowers/brainstorming is doing TDD as well.
  • evanklem20046 hours ago
    Built this as an opinionated Claude Code development flow based on evidence-based practices and what has been working for me while developing professional code.
    EvanFlow is a single TDD-driven loop. Say "let's evanflow this" and it walks brainstorm → plan → execute → tdd → iterate → STOP. Real checkpoints at design and plan approval. Never auto-commits, never auto-stages, never proposes integration - every git op is your call.
    The three things that actually changed how I work:
    1. Vertical-slice TDD. One failing test → minimal impl → next test. Watch each test fail before writing the impl that passes it. (Sounds obvious. Almost no agent does it by default. ~62% of LLM-generated test assertions are wrong per HumanEval research, so testing TDD discipline matters more than the impl discipline.)
    2. Embedded grilling at decision points. Before locking a plan: what breaks if a user does X? What's the rollback? What's explicitly out of scope? Catches design flaws while they're still cheap.
    3. Iterate-until-clean (hard cap of 5 rounds). Re-read the diff against dead code, naming, the deletion test, assertion correctness, and a Five Failure Modes pass (hallucinated actions, scope creep, cascading errors, context loss, tool misuse). For UI: screenshot via headless Chromium.
    For bigger plans with 3+ independent units sharing types, it forks into a parallel coder/overseer orchestration. Integration tests at touchpoints ARE the cohesion contract.
    Three install paths: Claude Code plugin marketplace, npx skills add, manual copy. MIT.
    • girvo3 hours ago
      Please don’t post AI generated comments :(
      Just write it yourself. I promise it’s worth it.
    • dpark3 hours ago
      I’ve thought of going down the TDD model for LLMs as a way of providing constraints on their behavior. I would think that “vertical slice” TDD would encourage the LLM to start tailoring the tests to the implementation rather than establishing the invariants up front, though. I was considering “horizontal” TDD to force the agent to implement constraints before coding to them.
    • lukewrites3 hours ago
      Curious: in the repo you mention
      > Several rules come from 2025-2026 industry research on agentic coding failure modes
      What are some of the papers you read?
      • esperent2 hours ago
        With no disrespect intended because this is also how I would do it (but I wouldn't publish and name it after myself!) - they didn't read the research. They had the AI that actually created this do that for them.
    • esperent2 hours ago
      > execute → tdd
      How are these separate steps?
      TDD is *how* you execute, not something you tack on afterwards.
  • sdevonoes1 hour ago
    TDD in 2026? Besides, TDD's main benefit is to come up with a decent architecture for your system… LLMs can already do that if instructed. I don’t see the point of TDD.
  • here2learnstuff3 hours ago
    Not bad, but also, forgive how mean this is going to come across: not using a product from someone who just started their undergrad.
    • fragmede1 hour ago
      Linus started Linux when he was 21, an undergrad at the University of Helsinki. You're entirely welcome to use whatever filtering function for products you use, but it doesn't seem like solely using this particular product's creator's age as a disqualifier comes from a place of sound reasoning, to me.
    • avyjit2 hours ago
      This is such a BS take. If you feel the product is immature or not great - that's valid criticism. This is not.
  • jtfrench5 hours ago
    How does this handle “dumb zone” evasion while looping?
  • cratermoon4 hours ago
    https://www.evenflo.com/