Congrats on the launch. One complaint: RPA this, non-RPA that, but you never explain what it means. I would spell out the acronym once, at its first mention on the landing page.
For those confused like I was, RPA stands for Robotic Process Automation.
Is the underlying mechanism different from something like computer use? Where can I find more details about how it works?
Biggest question is how much of this can be stored / processed on our own infra and with our own lifecycle rules? For example, this can touch a lot of PHI. Screenshots, videos, JSON inputs/outputs etc.
Does this only fall back to LLM vision when it catches an error? I.e., once the RPA / workflow is built, is it efficient to run repeatedly (until it hits an error state)?
yes effectively, but we use LLM vision in multiple places - for context, there are multiple ways an RPA can fail:<p>1. RPA code breaks (e.g. throws an exception if a window does not exist)
2. RPA reports success but was clicking / typing in the wrong place
3. Underlying system breaks (virtual machine / legacy software)<p>the skill in our MCP builds the RPA code to throw exceptions where possible so an LLM can understand the context and recover<p>to avoid false success states we add LLM vision steps in the workflow itself that error out if the system is in the wrong state<p>and for the underlying system breaking it can be as simple as a cron job that checks the status of the process / the health of the VM and runs a script to reboot the system<p>it depends on the system but the pattern we've seen with RPAs is you can catch maybe 80% of the edge cases in the first week after rollout
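A minimal sketch of the first two failure modes above, under stated assumptions: `FakeDesktop`, `StepError`, `type_into`, `verify_state` and `run_lookup` are hypothetical names invented for illustration, not Minicor's API, and the `check` callable stands in for an LLM vision call on a screenshot.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set


class StepError(Exception):
    """Hypothetical error type carrying context a recovery agent could read."""

    def __init__(self, step: str, reason: str, context: Optional[Dict] = None):
        super().__init__(f"{step}: {reason}")
        self.step = step
        self.reason = reason
        self.context = context or {}


@dataclass
class FakeDesktop:
    """Stand-in for the legacy application; a real workflow would drive a GUI."""
    open_windows: Set[str] = field(default_factory=lambda: {"Patient Search"})
    focused_field: str = "name"
    typed: Dict[str, str] = field(default_factory=dict)


def type_into(desktop: FakeDesktop, window: str, text: str) -> None:
    # Failure mode 1: raise with context instead of silently continuing.
    if window not in desktop.open_windows:
        raise StepError("type_into", "window not found",
                        {"window": window, "open": sorted(desktop.open_windows)})
    # The text lands in whichever field has focus, which may be the wrong one.
    desktop.typed[desktop.focused_field] = text


def verify_state(desktop: FakeDesktop,
                 check: Callable[[FakeDesktop], bool],
                 expectation: str) -> None:
    # Failure mode 2: `check` is a placeholder for a vision model looking at
    # a screenshot; it errors out when the screen is not in the expected state.
    if not check(desktop):
        raise StepError("verify_state", "unexpected state",
                        {"expected": expectation, "typed": dict(desktop.typed)})


def run_lookup(desktop: FakeDesktop, mrn: str) -> None:
    type_into(desktop, "Patient Search", mrn)
    verify_state(desktop, lambda d: d.typed.get("mrn") == mrn,
                 "mrn field contains the typed value")
```

With the default `FakeDesktop`, focus sits on the `name` field, so `type_into` "succeeds" and only the verification step surfaces the problem as a `StepError`.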
Can you compare Minicor to Convey? They seem very similar. We had a product demo of Convey wherein they showed us how you could train the agent to use legacy software using a simple shared screen capture and verbal instructions.
Small website nitpick: I feel like the company logos in the "In production with" section should be a bit darker; I could barely tell there was something there.
i'm curious: how does the steady state error rate of a stochastic automated system like this compare with the downtime and errors that come from a (brittle) deterministic bridge that can fail with upgrades? what does the observability look like? (i'm guessing one feature is that the execution log including images/screenshots for each transaction gets saved, which is probably a huge improvement.)
it’s a good q - we experimented a lot with computer use / agentic automation and found that at scale a hybrid solution where the automations run as deterministic code with agents for recovery is the best - running automations as code is faster & cheaper & when you’re doing critical tasks (like updating patient records) you don’t want an agent to potentially mess something up.<p>writing RPA code used to take a long time - using AI (and its infinite patience) we can write more durable code that covers more edge cases<p>And since they’re code based it’s pretty straightforward to have agents monitor them and update their code when upgrades to the underlying system happen etc…<p>for observability - we have workflow execution logs that store text, videos and screenshots so an agent or a human can debug them - lots and lots of webhooks when things break! (:
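To make the hybrid pattern concrete, here is a rough sketch, assuming nothing about the real product: steps run as plain deterministic code, a `recover` callback stands in for the recovery agent, and `notify` stands in for a webhook call. `LogEntry` and `run_workflow` are hypothetical names, and a real execution log would also reference screenshots and video.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LogEntry:
    step: str
    status: str  # "ok", "recovered", or "failed"
    detail: str = ""
    timestamp: float = field(default_factory=time.time)


def run_workflow(
    steps: List[Tuple[str, Callable[[], None]]],
    recover: Callable[[str, Exception], bool],
    notify: Callable[[LogEntry], None],
) -> List[LogEntry]:
    """Run steps in order; call `recover` only when a step raises."""
    log: List[LogEntry] = []
    for name, step in steps:
        try:
            step()  # fast, cheap deterministic path
            log.append(LogEntry(name, "ok"))
        except Exception as exc:
            # Placeholder for handing the error context to an agent.
            if recover(name, exc):
                log.append(LogEntry(name, "recovered", str(exc)))
            else:
                entry = LogEntry(name, "failed", str(exc))
                log.append(entry)
                notify(entry)  # placeholder for a webhook
                break  # stop rather than act on a bad state
    return log
```

The point of the design is that the agent sits off the hot path: a clean run never calls `recover`, and an unrecoverable step halts the workflow and raises a notification instead of continuing.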
Congrats on the launch! Legacy system users are also one of the slowest to adopt AI. How do you navigate that?
I've found that legacy system users (or at least the execs) are pretty excited about AI because they hate their legacy systems but can't really do anything about it (ERP changes are an extreme nightmare, and often no better system exists with all the capability they need). They want to wrap it in AI to automate stuff without changing out the core system.<p>This seems like a good approach to me. I work with a lot of legacy ERP-using companies in the manufacturing sector and can immediately see how we could put this to use for our customers.<p>I especially like that it's not doing computer use for everything, which so far doesn't really seem to be working, especially outside the browser.
Legacy system users are also the ones who pay the most for tools and services. We sell to enterprise, so I can attest to that. If it is a relevant use case and the positioning fits the market, it should be fine.
100% right - we support the AI companies who are selling to the legacy end users - for ex: we don’t sell directly to hospitals, but if an AI scribe for doctors already has a hospital as a customer, we help them integrate with the hospital’s EMR
Could you use this to test new releases of software for bugs? A bit like TDD but for GUI interactions
Is the cloud LLM the judge based on screenshots with patient/customer data included? That seems like a no-go for many countries given privacy concerns?
So AI companies would install this on their customers' (practices') computers?
How does this compare with CyberDesk (also YC)?
congrats on launch!
Please make your trust center public if you are focusing on healthcare AI companies…the footer link is dead.
<i>Computer use agents that run on Windows VMs or in the browser. On-premise, cloud</i><p>I think you meant premises.<p><a href="https://brians.wsu.edu/2016/05/30/premise-premises/" rel="nofollow">https://brians.wsu.edu/2016/05/30/premise-premises/</a>
What the deuce is an "RPA"?