Guardrails and Trust
· Jerwin Arnado · 1 min read ·
The power and the danger are the same feature: it can run commands. Trust isn’t blind — it’s built from guardrails you can name.
The blast-radius mindset
- Reversible + local (edit a file, run a test) → let it fly.
- Hard to reverse or shared state (force-push, drop table, deploy) → confirm first.
- Match the action to what was actually asked.
Permission modes
- How approval gates work; auto-allow lists for safe, repetitive calls.
- Tuning prompts down without going fully unsupervised.
Reviewing the diff, not the promise
- The agent’s summary is intent, not proof. Read the actual change.
- Where I trust tests vs where I read every line.
Shipping to production safely
- The live Synology deploy:
bin/deploy, perf invariants, what I never let an agent touch unattended. - Backups and the “measure twice, cut once” rule.