8 comments

  • naultic 22 minutes ago
    I'm working on something a little similar, though mine's more of a dev tool than process automation, and I love where yours is headed. The biggest issue I've run into is handling retries with agents. My current solution is to have them set checkpoints so they can revert easily; when they can't make an edit or can't get a test passing, they just restart from an earlier state. The problem is that this uses up a lot of tokens on retries. How did you handle this issue in your app?
    • jawiggins 20 minutes ago
      Generally I've found agents are capable of self-correcting as long as they can bash up against a guardrail and see the errors. So in optio the agent is resumed and told to fix any CI failures or address review feedback.
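
A minimal sketch of the resume-with-feedback loop described in the reply above, as opposed to restarting from a checkpoint. Both `run_agent` and `run_checks` are hypothetical callables introduced for illustration; this is not optio's actual API, and the attempt cap is an assumption added to bound token spend.

```python
from typing import Callable, Optional


def resume_until_green(
    run_agent: Callable[[Optional[str]], None],
    run_checks: Callable[[], Optional[str]],
    max_attempts: int = 3,
) -> bool:
    """Resume one agent session with check output until the checks pass.

    run_agent(feedback): hypothetical. Continues the existing session;
        feedback is None on the first call and the failure text afterwards.
    run_checks(): hypothetical. Returns None when CI/review is clean,
        otherwise the error output to hand back to the agent.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        run_agent(feedback)          # agent keeps its prior context
        feedback = run_checks()      # guardrail: CI or review result
        if feedback is None:
            return True
    return False                     # give up and escalate to a human
```

Because the session is resumed and not reset, each retry only pays for the new error output plus the fix, not for redoing the work that already succeeded.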
  • denysvitali 1 hour ago
    FWIW, a "cheaper" version of this is triggering Claude via GitHub Actions and `@claude`ing your agents like that. If you run your CI on Kubernetes (ARC), it sounds pretty much the same.
  • MrDarcy 3 hours ago
    Looks cool, congrats on the launch. Is there any sandbox isolation from the k8s platform layer? Wondering if this is suitable for multiple tenants or customers.
    • jawiggins 2 hours ago
      Oh good question, I haven't thought deeply about this.

      Right now nothing special happens, so claude/codex can access their normal tools and make web calls. I suppose that also means they could figure out they're running in a k8s pod and do service discovery and start calling things.

      What kind of features would you be interested in seeing around this? Maybe a toggle to disable internet connections or other connections outside of the container?
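
One way the "toggle" floated above could be approximated is a Kubernetes NetworkPolicy that denies all egress for the agent pods. The sketch below only builds the manifest as a Python dict and prints it as JSON, which `kubectl apply -f` accepts; the policy name, namespace, and labels are hypothetical, and nothing here reflects how optio is actually deployed.

```python
import json
from typing import Dict


def deny_egress_policy(namespace: str, pod_labels: Dict[str, str]) -> dict:
    """Build a NetworkPolicy that blocks all outbound traffic from matching pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # Hypothetical name; pick whatever fits your cluster conventions.
        "metadata": {"name": "agent-deny-egress", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": pod_labels},
            "policyTypes": ["Egress"],
            # No egress rules listed: every outbound connection is denied,
            # including DNS and calls to in-cluster services.
            "egress": [],
        },
    }


if __name__ == "__main__":
    # "agents" and {"app": "agent"} are placeholder values.
    print(json.dumps(deny_egress_policy("agents", {"app": "agent"}), indent=2))
```

A policy like this only takes effect when the cluster's network plugin enforces NetworkPolicy, and a real setup would need explicit allow rules for whatever the agent must still reach, such as the model API and the Git host.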
  • antihero 3 hours ago
    And what stops it making total garbage that wrecks your codebase?
    • jawiggins 2 hours ago
      There are a few things:

      a) you can create CI/build checks that run in GitHub, and the agents will make sure they pass before merging anything

      b) you can configure a review agent with any prompt you'd like to make sure any specific rules you have are followed

      c) you can disable all the auto-merge settings and review all the agent code yourself if you'd like.
      • kristjansson 1 hour ago
        > to make sure

        You've really got to be careful with absolute language like this in reference to LLMs. A review agent provides no guarantees whatsoever, just shifts the distribution of acceptable responses, hopefully in a direction the user prefers.
        • jawiggins 1 hour ago
          Fair, it's something like semantic enforcement rather than hard enforcement. I think current AI agents are good enough that if you tell one, "Review this PR and request changes anytime a user uses a variable name that is a color", it will do a pretty good job. But for complex things I can still see them falling short.
    • upupupandaway 2 hours ago
      Ticket -> PR -> Deployment -> Incident
  • conception 2 hours ago
    What’s the most complicated, finished project you’ve done with this?
    • jawiggins 1 hour ago
      Recently I used it to finish up my re-implementation of curl/libcurl in Rust (https://news.ycombinator.com/item?id=47490735). At first I started by trying to have a single Claude Code session run in an iterative loop, but eventually I found it was way too slow.

      I started tasking subagents with each remaining chunk of work, and then found I was really just recreating a normal sprint tasking cycle, but one where subagents completed the tasks with the unit tests as exit criteria. So optio came to mind: I asked an agent to run the test suite, see what was failing, and make tickets for each group of remaining failures. Then I used optio to manage instances of agents working on and closing out each ticket.
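
The "make tickets for each group of remaining failures" step above could look roughly like the following. This is only a sketch: it assumes failing tests are reported as `module::test_name` strings and groups them by top-level module, and the ticket shape is invented for illustration. In the comment an agent does this grouping, not a fixed rule.

```python
from collections import defaultdict
from typing import Dict, List


def group_failures(failed_tests: List[str]) -> Dict[str, List[str]]:
    """Group failing test IDs by the module prefix before the first '::'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for test_id in failed_tests:
        module = test_id.split("::", 1)[0]
        groups[module].append(test_id)
    return dict(groups)


def draft_tickets(failed_tests: List[str]) -> List[dict]:
    """Draft one ticket per group; the listed tests are the exit criteria."""
    return [
        {"title": f"Fix {len(tests)} failing test(s) in {module}", "tests": tests}
        for module, tests in sorted(group_failures(failed_tests).items())
    ]
```

Each drafted ticket can then be handed to a separate agent instance, which is done when every test in its `tests` list passes.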
  • hmokiguess 1 hour ago
    The misaligned columns in the Claude-made ASCII diagrams in the README really throw me off, why not fix them?

    | | | |
  • rafaelbcs 2 hours ago
    [dead]
  • QubridAI 2 hours ago
    [flagged]
    • knollimar 2 hours ago
      I don't want to accuse you of being an LLM, but geez, this sounds like satire