engineering

Guardrails and Trust

Jerwin Arnado · 1 min read

The power and the danger of a coding agent are the same feature: it can run commands. Trust isn’t blind; it’s built from guardrails you can name.

The blast-radius mindset

  • Reversible + local (edit a file, run a test) → let it fly.
  • Hard to reverse or shared state (force-push, drop table, deploy) → confirm first.
  • Keep the action scoped to what was actually asked.
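
One way to make the blast-radius split concrete is a small classifier that decides whether a command runs freely or waits for a human. This is a minimal sketch, not a real agent's API: the `RISKY_PATTERNS` list, the `Decision` enum, and `classify` are all hypothetical names, and the patterns only cover the three examples from the list above.

```python
import re
from enum import Enum


class Decision(Enum):
    ALLOW = "let it fly"
    CONFIRM = "confirm first"


# Hypothetical patterns for hard-to-reverse or shared-state actions.
# They mirror the examples above: force-push, drop table, deploy.
RISKY_PATTERNS = [
    re.compile(r"\bgit\s+push\b.*(--force|-f)\b"),
    re.compile(r"\bdrop\s+table\b"),
    re.compile(r"\bdeploy\b"),
]


def classify(command: str) -> Decision:
    """Return CONFIRM if the command matches any risky pattern, else ALLOW."""
    normalized = command.lower()
    for pattern in RISKY_PATTERNS:
        if pattern.search(normalized):
            return Decision.CONFIRM
    return Decision.ALLOW
```

A deny-list like this defaults to allowing anything it does not recognize, so it only suits a setup where most actions are local and reversible. Flipping the default to `CONFIRM` and listing the safe commands instead is the more cautious variant.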

Permission modes

  • How approval gates work; auto-allow lists for safe, repetitive calls.
  • Tuning prompts down without going fully unsupervised.
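
As an illustration of an approval gate with an auto-allow list, here is a minimal sketch. It does not reproduce any particular tool's permission system: `AUTO_ALLOW`, `gate`, and the `ask` callback are assumptions made up for the example.

```python
import fnmatch
from typing import Callable

# Hypothetical auto-allow list: safe, repetitive, read-only or local commands.
AUTO_ALLOW = ["git status", "git diff", "git diff *", "pytest", "pytest *"]

# Characters that let one command smuggle in another (chaining, pipes,
# substitution, redirection). Anything containing them goes to a human.
SHELL_META = set(";|&`$<>")


def gate(command: str, ask: Callable[[str], bool]) -> bool:
    """Return True if the command may run.

    Commands on the allow list run without a prompt; everything else is
    passed to `ask`, which stands in for the human approval step.
    """
    stripped = command.strip()
    has_meta = any(ch in SHELL_META for ch in stripped)
    if not has_meta and any(
        fnmatch.fnmatchcase(stripped, pattern) for pattern in AUTO_ALLOW
    ):
        return True
    return ask(stripped)
```

Growing `AUTO_ALLOW` one pattern at a time is how the prompts get tuned down without going fully unsupervised: anything not on the list still reaches `ask`.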

Reviewing the diff, not the promise

  • The agent’s summary is intent, not proof. Read the actual change.
  • Where I trust tests vs where I read every line.
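
A cheap first check on "intent, not proof" is to compare the files the summary mentions against the files the diff actually touches. The sketch below parses the text produced by `git diff --numstat` (one `added<TAB>deleted<TAB>path` line per file). The function names and the idea of a `claimed` set are hypothetical, and renamed files, which git prints in an `old => new` form, are not handled.

```python
def changed_paths(numstat: str) -> set[str]:
    """Extract file paths from `git diff --numstat` output."""
    paths = set()
    for line in numstat.splitlines():
        parts = line.split("\t", 2)  # added, deleted, path
        if len(parts) == 3:
            paths.add(parts[2])
    return paths


def unmentioned(numstat: str, claimed: set[str]) -> list[str]:
    """Return files that changed but that the agent's summary never named."""
    return sorted(changed_paths(numstat) - claimed)
```

Anything `unmentioned` returns is a file to open and read in full. An empty list only narrows where to look; the diff still has to be read.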

Shipping to production safely

  • The live Synology deploy: bin/deploy, perf invariants, what I never let an agent touch unattended.
  • Backups and the “measure twice, cut once” rule.
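
The backup rule can be sketched as a wrapper that refuses to run a risky step until a verified copy exists, and restores that copy if the step fails. This is an illustration only and says nothing about how bin/deploy works: `backup_then`, the `.bak` naming, and the single-file scope are all assumptions.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable


def sha256(path: Path) -> str:
    """Hash a file's contents so the backup can be compared to the source."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def backup_then(path: Path, backup_dir: Path, action: Callable[[Path], object]) -> Path:
    """Back up `path`, verify the copy, then run `action(path)`.

    If the action raises, the original file is restored from the backup
    and the exception is re-raised. Returns the backup's location.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = backup_dir / (path.name + ".bak")
    shutil.copy2(path, backup)

    # Measure twice: confirm the backup matches before cutting.
    if sha256(backup) != sha256(path):
        raise RuntimeError("backup does not match source; refusing to continue")

    try:
        action(path)
    except Exception:
        shutil.copy2(backup, path)  # roll back to the verified copy
        raise
    return backup
```

A real deploy would need the same idea at a larger scale (database dumps, snapshots), but the order is the point: copy, verify, then act.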