Ruriko: the manager's office for agents

10:04 pm on Feb 22, 2026

i’ve spent the last few weeks in a strange role: the de facto architect for a small group of friends who all want the same thing, just in different costumes.

«i want automated trading». «i want e-commerce ops». «i want marketing and outreach». «i want customer support». each conversation starts the same way: a high-level ambition, spoken as if the internet is a vending machine and AI is the coin you drop into it.

and each conversation ends the same way too: reality.

reality looks like a vps vs macbook debate, a pile of api keys, some half-understood tooling (mcp, n8n, cron, webhooks), a security story that is mostly vibes, and an uncomfortable question nobody wants to say out loud:

if this thing can act on my behalf, what stops it from doing something stupid?

that question is the seed. that question is why i started building Ruriko.

the problem isn’t AI. it’s autonomy.

most people don’t actually want «an autonomous agent».

they want leverage.

they want something that thinks faster, reads more, watches markets while they sleep, drafts messages, summarizes news, spots anomalies, and tells them what matters. but when it comes to execution, they become conservative in a very human way. they want a system that acts like an analyst first, executor second.

this isn’t irrational. it’s honest.

we’ve all seen «helpful» systems hallucinate. you can call it a 10% error rate, you can call it «edge cases», you can call it «model limitations». the name doesn’t matter. the outcome does: if 1 out of 10 actions is wrong, you don’t let it place trades. you don’t let it refund customers. you don’t let it email your clients at scale. you don’t hand it the keys to your house and then act surprised when the tv is missing.

so the problem isn’t that agents are weak.

the problem is that agents are powerful in all the wrong ways.

they are powerful at producing text. and increasingly powerful at calling tools. but they are terrible at being accountable. they don’t naturally come with a manager’s office: rules, approvals, budgets, audit logs, scoped access, and a clean separation between «talking» and «doing».

that manager’s office is Ruriko.

the second problem: the complexity gap

the other pattern i kept seeing was not fear. it was confusion.

people want an agent like they want a new app. click, install, done.

but agent reality is a small DevOps career.

  • where does it run? local machine? vps? always-on?
  • how does it talk to me? whatsapp? telegram? email? slack?
  • where do the keys live? who can see them?
  • how do i add a web scraper? how do i add a market data provider?
  • what happens when something crashes at 3 am?
  • what happens when a tool or integration changes?
  • what happens when the model costs $6–$12/hour and i don’t notice until the invoice arrives?

most people don’t want to learn «the plumbing». they want to focus on «the strategy». but strategy doesn’t execute itself. and every missing abstraction becomes another fragile script, another copy-pasted yaml file, another secret stuffed into an env var, another bot that runs until it doesn’t.

Ruriko exists to collapse that complexity into something you can operate.

Ruriko is a control plane for agents.

you talk to Ruriko over chat. Ruriko provisions, configures, and governs specialized agents. each agent runs in a constrained runtime, with explicit capabilities, explicit limits, and scoped secrets. dangerous actions require human approval. everything gets logged. everything has trace ids. secrets are handled out of band, never pasted into chat.

i like to describe it like this:

if an AI agent is an intern with a lot of enthusiasm and no sense of consequences, Ruriko is the manager’s office: the desk assignment, the keycard permissions, the expense limits, the incident log, and the «come ask me before you touch production.»

the architecture: separate the planes

agent systems fail when everything lives in one place. conversation, control, execution, secrets, and logs get mixed into a soup, and the soup eventually leaks.

Ruriko is designed around separation. not because it’s elegant, but because it’s survivable.

1. the conversation layer (Matrix)

Ruriko uses Matrix as the conversation bus. you have rooms. you have identities. you have a chat client. you type commands. you can also talk naturally, but the system draws a hard line between «chat» and «control».

this matters because chat is a hostile environment in disguise. it’s friendly and familiar, which makes it easy to do unsafe things. like pasting secrets. or running destructive actions without thinking. or letting an agent interpret «sure, go ahead» as «delete everything.»
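to make that line concrete, here’s a minimal sketch of the routing idea in Go. the command prefix, types, and sender checks are invented for illustration, not Ruriko’s actual API; the point is that «control» is an explicit, authenticated path, and chat never is.

```go
package main

import (
	"fmt"
	"strings"
)

// a message arriving from the Matrix conversation bus.
// field names here are hypothetical, not Ruriko's real types.
type Message struct {
	RoomID string
	Sender string
	Body   string
}

// control commands must be explicit (prefixed) and must come from
// a sender the control plane already trusts. everything else is chat.
func route(msg Message, trustedSenders map[string]bool) string {
	if strings.HasPrefix(msg.Body, "!ruriko ") {
		if !trustedSenders[msg.Sender] {
			return "rejected: control command from untrusted sender"
		}
		return "control: " + strings.TrimPrefix(msg.Body, "!ruriko ")
	}
	// plain chat never reaches the control plane's command handler.
	return "chat: forwarded to the agent's conversation loop"
}

func main() {
	trusted := map[string]bool{"@bogdan:example.org": true}
	fmt.Println(route(Message{"!room", "@bogdan:example.org", "!ruriko agents list"}, trusted))
	fmt.Println(route(Message{"!room", "@stranger:example.org", "!ruriko secrets dump"}, trusted))
	fmt.Println(route(Message{"!room", "@bogdan:example.org", "sure, go ahead"}, trusted))
}
```

note that «sure, go ahead» in chat routes to conversation, never to control. that’s the hard line.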

2. the control plane (Ruriko)

Ruriko itself is deterministic. that’s not a marketing line. it’s a design constraint.

  • lifecycle decisions are deterministic
  • secret handling is deterministic
  • policy changes are deterministic
  • approvals are deterministic

the model never gets to decide «should i start this container» or «should i rotate this key». it can help explain. it can help summarize. it can’t be the authority.

the control plane tracks inventory, desired state, actual state, config versions, and approvals. it runs a reconciliation loop. it creates audit entries. it can show you a trace for a whole chain of actions.
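a minimal sketch of what one pass of that loop looks like, assuming an invented schema (not Ruriko’s actual code). notice there’s no model call anywhere in it:

```go
package main

import "fmt"

// desired vs actual state for one agent. fields are illustrative,
// not Ruriko's actual schema.
type AgentState struct {
	Running       bool
	ConfigVersion int
}

// one pass of a reconciliation loop: compare desired state against
// observed state and emit the actions needed to close the gap.
// fully deterministic — no model in the loop.
func reconcile(name string, desired, actual AgentState) []string {
	var actions []string
	if desired.Running && !actual.Running {
		actions = append(actions, "start "+name)
	}
	if !desired.Running && actual.Running {
		actions = append(actions, "stop "+name)
	}
	if desired.Running && actual.Running && desired.ConfigVersion != actual.ConfigVersion {
		actions = append(actions, fmt.Sprintf("apply config v%d to %s", desired.ConfigVersion, name))
	}
	return actions
}

func main() {
	desired := AgentState{Running: true, ConfigVersion: 7}
	actual := AgentState{Running: true, ConfigVersion: 6} // drift
	for _, a := range reconcile("market-data", desired, actual) {
		fmt.Println(a) // each action would also get an audit entry + trace id
	}
}
```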

3. the data plane (Gitai agents)

agents run in Gitai, a runtime designed to be governed. they have a control endpoint (for Ruriko), a policy engine, and a tool loop. they’re allowed to propose tool calls, but the policy decides whether those calls are permitted.

this is where «agent» becomes a practical, bounded thing, not a fantasy.
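roughly, the gate on that tool loop looks like this. the names and verdicts are hypothetical stand-ins for Gitai’s policy engine, but the shape is the point: the model proposes, the policy disposes.

```go
package main

import "fmt"

// a tool call the model proposes. illustrative types, not Gitai's real API.
type ToolCall struct {
	Tool string
	Args map[string]string
}

func checkPolicy(call ToolCall, allowed, sensitive map[string]bool) string {
	if !allowed[call.Tool] {
		return "deny" // default deny: tools not on the allowlist never run
	}
	if sensitive[call.Tool] {
		return "needs approval" // allowed, but only after a human says yes
	}
	return "allow"
}

func main() {
	allowed := map[string]bool{"http_get": true, "place_order": true}
	sensitive := map[string]bool{"place_order": true}

	for _, call := range []ToolCall{
		{Tool: "http_get"},
		{Tool: "place_order"},
		{Tool: "delete_database"},
	} {
		fmt.Println(call.Tool, "->", checkPolicy(call, allowed, sensitive))
	}
}
```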

4. the secret plane (Kuze)

secrets are the first thing that makes agent systems real, and the first thing that breaks them.

Ruriko treats secrets as a separate plane: Kuze.

humans don’t paste secrets into chat. instead, Ruriko issues one-time links. you open a small page, paste the secret, submit. the token burns. the secret gets encrypted at rest. Ruriko confirms in chat that it was stored, without ever seeing the value again in the conversation layer.

agents don’t receive secrets as raw values over the control channel either. instead, they receive short-lived redemption tokens and fetch secrets directly from Kuze. tokens expire quickly. they’re single-use. secrets don’t appear in logs. in production mode, the old «push secret value to agent» path is simply disabled.
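a sketch of that redemption flow, with invented names standing in for Kuze’s API. the two properties that matter are both in the code: the token burns on first use, and it expires on its own.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// a single-use, short-lived redemption token. an agent presents this
// to Kuze to fetch a secret directly; the secret never crosses chat
// or the control channel. names here are illustrative, not Kuze's API.
type RedemptionToken struct {
	ExpiresAt time.Time
	used      bool
	secretRef string // which secret this token can redeem, never the value
}

type Kuze struct {
	secrets map[string]string // encrypted at rest in the real thing
}

func (k *Kuze) Redeem(t *RedemptionToken) (string, error) {
	if t.used {
		return "", errors.New("token already burned")
	}
	if time.Now().After(t.ExpiresAt) {
		return "", errors.New("token expired")
	}
	t.used = true // burn on first use
	return k.secrets[t.secretRef], nil
}

func main() {
	k := &Kuze{secrets: map[string]string{"exchange-api-key": "s3cr3t"}}
	tok := &RedemptionToken{ExpiresAt: time.Now().Add(30 * time.Second), secretRef: "exchange-api-key"}

	if v, err := k.Redeem(tok); err == nil {
		fmt.Println("agent got secret of length", len(v)) // never log the value itself
	}
	if _, err := k.Redeem(tok); err != nil {
		fmt.Println("second redemption:", err) // single-use: already burned
	}
}
```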

this is not paranoia. this is the minimum viable safety story for anything that can act in the world.

5. the policy plane (Gosuto)

agents are useless without tools. agents are dangerous with tools.

Ruriko’s answer is a versioned policy format called Gosuto. it defines:

  • trust contexts (rooms, senders)
  • limits (rate, cost, concurrency)
  • capabilities (allowlists / denylists for tools)
  • approval requirements
  • persona (the model prompt and parameters)

the key idea is boring and powerful: default deny, then explicitly allow what’s needed. and version it. and audit it. and be able to roll it back.
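as a sketch, a Gosuto-shaped policy might look like this in code. the field names are guesses for illustration; the real format is versioned config, and the version number is what makes rollback and audit possible.

```go
package main

import "fmt"

// a Gosuto-shaped policy, sketched as a Go struct. illustrative only.
type Policy struct {
	Version      int
	TrustedRooms []string
	Limits       struct {
		MaxCostUSDPerDay float64
		MaxConcurrent    int
	}
	AllowedTools  []string // everything not listed is denied
	NeedsApproval []string // allowed, but gated behind a human
	Persona       string   // model prompt + parameters live here too
}

func main() {
	p := Policy{Version: 3}
	p.TrustedRooms = []string{"!ops:example.org"}
	p.Limits.MaxCostUSDPerDay = 20
	p.Limits.MaxConcurrent = 2
	p.AllowedTools = []string{"http_get", "market_data"}
	p.NeedsApproval = []string{"place_order"}
	p.Persona = "you are a cautious market analyst"

	// default deny is just: membership in AllowedTools, nothing else.
	fmt.Printf("policy v%d: %d tools allowed, all others denied\n", p.Version, len(p.AllowedTools))
	// rolling back = re-applying v2. the version number is the audit handle.
}
```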

approvals: analyst first, executor second

in real use cases, the gap between «analysis» and «execution» is the whole point.

  • trading: «i think we should enter here» is not «place the order»
  • support: «this looks like a refund case» is not «refund it»
  • marketing: «this copy might work» is not «blast it to 20k people»

Ruriko models this explicitly. operations that are destructive or sensitive are gated behind approvals. approvals have ttl. they have approver lists. they can be approved or denied with a reason. and they leave an audit trail that you can inspect later when you’re trying to understand why something happened.
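a minimal sketch of that approval object, with invented field names. the ttl means an unanswered request dies on its own instead of lingering as a live grant.

```go
package main

import (
	"fmt"
	"time"
)

// an approval request for a gated operation. illustrative shape only.
type Approval struct {
	Operation string
	Approvers []string // who is allowed to answer
	ExpiresAt time.Time
	Decision  string // "", "approved", or "denied"
	Reason    string
}

// resolve records a decision only if the request is still live and
// the decider is on the approver list. either way, the audit log
// would record the attempt.
func (a *Approval) resolve(who, decision, reason string) error {
	if time.Now().After(a.ExpiresAt) {
		return fmt.Errorf("approval for %q expired unanswered", a.Operation)
	}
	for _, approver := range a.Approvers {
		if approver == who {
			a.Decision, a.Reason = decision, reason
			return nil
		}
	}
	return fmt.Errorf("%s is not an approver for %q", who, a.Operation)
}

func main() {
	req := &Approval{
		Operation: "refund customer #4412",
		Approvers: []string{"@bogdan:example.org"},
		ExpiresAt: time.Now().Add(15 * time.Minute), // ttl: silence never becomes consent
	}
	if err := req.resolve("@bogdan:example.org", "denied", "duplicate claim"); err == nil {
		fmt.Println(req.Operation, "->", req.Decision, "(", req.Reason, ")")
	}
}
```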

this is how you get autonomy without surrendering agency.

cost and performance: the missing dashboard (and why it matters)

high-quality thinking is expensive. latency is real. and the worst kind of cost is invisible cost.

a control plane is the natural place to make this visible:

  • how many tokens did this agent burn today?
  • what’s the average latency per task?
  • which tool calls are causing spikes?
  • which model is being used for which capability?
  • what happens if i cap this agent to a budget?

i’m not pretending this is solved by default. but i built Ruriko so it can be solved cleanly, without duct-taping metrics onto a chat bot. when you have trace ids and a deterministic control channel, you can build a real cost story.

you can also do something simple but important: make «thinking» and «doing» different tiers. use slower, more expensive models for analysis. use cheaper, faster ones for routine tasks. or use local models for sensitive work. a control plane lets you swap those decisions without rewriting everything.
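here’s what that tiering can look like, sketched with placeholder model names and a hard budget cap. none of this is Ruriko’s actual routing code, just the shape of the decision.

```go
package main

import "fmt"

// routing «thinking» and «doing» to different model tiers, with a hard
// budget cap enforced by the control plane. model names are placeholders.
type Router struct {
	spentUSD  float64
	budgetUSD float64
}

func (r *Router) pick(task string) (string, error) {
	if r.spentUSD >= r.budgetUSD {
		return "", fmt.Errorf("budget cap hit: $%.2f of $%.2f", r.spentUSD, r.budgetUSD)
	}
	switch task {
	case "analysis":
		return "big-slow-model", nil // expensive thinking
	case "sensitive":
		return "local-model", nil // never leaves the machine
	default:
		return "small-fast-model", nil // routine work, cheap
	}
}

func main() {
	r := &Router{budgetUSD: 10}
	for _, task := range []string{"analysis", "routine", "sensitive"} {
		m, _ := r.pick(task)
		fmt.Println(task, "->", m)
	}
	r.spentUSD = 10.50 // the invoice you'd otherwise not notice
	if _, err := r.pick("analysis"); err != nil {
		fmt.Println(err)
	}
}
```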

how this differs from assistant-first systems

there’s a class of tools that are trying to be your personal assistant. they’re impressive. they’re fun. they’re also, by default, too trusting of themselves.

Ruriko isn’t trying to be «the agent». it’s trying to be the thing that makes agents operable.

the difference sounds subtle until you run anything for a week.

assistant-first systems optimize for capability and speed of iteration. control-plane systems optimize for governance and survivability. once you accept that agents will fail sometimes, you start building around blast radius, audit trails, and recovery.

you stop asking «how do i make it smarter?» and start asking «how do i make it safe enough to be useful?»

what Ruriko can do today, and what comes next

the foundation is in place:

  • you can run the stack locally with docker compose
  • you can talk to Ruriko over Matrix
  • you can store secrets securely via one-time links
  • you can provision agents, apply configs, and push secret tokens
  • you have approvals for sensitive operations
  • you have audit logging with trace correlation
  • you have a reconciler loop that notices drift
  • you have a policy engine that constrains tool usage

what comes next is the part that makes it feel alive: the canonical workflow.

i think in terms of specialist agents, like a small team (sketched in code below):

  • one agent triggers work periodically (a scheduler)
  • one agent pulls market data and builds analysis
  • one agent pulls news and context
  • the control plane routes tasks, enforces policy, and decides when to notify the human
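sketched in code (all names invented), the routing decision the control plane owns might look like this:

```go
package main

import "fmt"

// the canonical workflow, sketched: specialist agents produce work
// items, the control plane routes them and decides when the human
// hears about it. all names here are invented for illustration.
type WorkItem struct {
	From    string
	Kind    string // "analysis", "news", "signal"
	Urgent  bool
	Summary string
}

func routeItem(item WorkItem) string {
	// the control plane, not the agents, owns the notification decision.
	if item.Urgent {
		return "notify human: " + item.Summary
	}
	return "file into daily digest: " + item.Summary
}

func main() {
	items := []WorkItem{
		{From: "scheduler", Kind: "signal", Urgent: false, Summary: "hourly tick, nothing new"},
		{From: "market-data", Kind: "analysis", Urgent: true, Summary: "anomaly on BTC spread"},
		{From: "news", Kind: "news", Urgent: false, Summary: "3 relevant articles summarized"},
	}
	for _, it := range items {
		fmt.Println(it.From, "->", routeItem(it))
	}
}
```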

the dream is not «a single magical assistant».

the dream is a system where agents collaborate under governance, like adults.

the point of all this

every time i explain agents to friends, the conversation eventually reaches the same emotional endpoint:

«ok, but i don’t want it to do something dumb»

Ruriko is my answer to that fear. not by pretending the fear is irrational, but by treating it as a specification.

it turns AI from a talkative intern with admin credentials into a managed system:

  • scoped access
  • explicit permissions
  • explicit limits
  • explicit approvals
  • versioned policy
  • separate secret plane
  • auditability
  • the ability to stop and recover

if you want to build something real with agents, this is the unglamorous work you eventually have to do anyway.

i just decided to do it first.

and if you’re one of the people who wants «an agent» but doesn’t want a DevOps apprenticeship, this is the bet:

give the intern a manager’s office. then let it work.
