engineering

Stable Diffusion: AI Images on Your Own GPU

· Jerwin Arnado

Archive note: this is a backdated post, written years later while rebuilding this site. It’s dated to the moment it covers, but the hindsight is real.

In the DALL·E 2 post I wrote: “assume open-weight versions of this capability are coming — the only question is the delay.” The delay was four months. On August 22, 2022, Stability AI released Stable Diffusion, a text-to-image model in DALL·E 2’s weight class, and instead of a waitlist and an API, they shipped the actual model weights: a few gigabytes, downloadable, and runnable on a consumer GPU with roughly 8–10GB of VRAM.

The capability everyone spent spring marveling at from behind OpenAI’s velvet rope now runs on a gaming PC. Locally. Offline. Free.
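For a sense of where "a few gigabytes" comes from, here is a back-of-the-envelope sketch. The parameter count is an assumption on my part (an order-of-magnitude figure of about one billion, not a number from the release), and the function name is just for illustration.

```python
def weights_size_gb(num_params: int, bytes_per_param: int) -> float:
    """Size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: roughly one billion parameters in total. Order of magnitude only.
ASSUMED_PARAMS = 1_000_000_000

fp32_gb = weights_size_gb(ASSUMED_PARAMS, 4)  # 32-bit floats: 4 bytes each
fp16_gb = weights_size_gb(ASSUMED_PARAMS, 2)  # 16-bit floats: 2 bytes each

print(f"fp32 weights: ~{fp32_gb:.1f} GB")
print(f"fp16 weights: ~{fp16_gb:.1f} GB")
```

The weights are only part of the VRAM bill: intermediate tensors produced during generation need room too, which is one plausible reason the practical requirement sits well above the file size. Halving the precision halves the weight footprint, which is likely part of how lower-VRAM forks get there.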

Why open weights changes the physics

The difference between “API access” and “weights on disk” is the difference between visiting and owning:

  1. No gatekeeper. No waitlist, no content policy, no per-image billing, no company that can revoke access or change terms. Also — and this is the same fact wearing its other face — no safety filter anyone can enforce. Both edges of that sword arrived simultaneously, and the discourse has noticed.
  2. No metering, so experimentation is unbounded. When each image costs API credits, you prompt carefully. When generation is free-after-hardware, you iterate hundreds of times, build scripts around it, batch-generate, fine-tune your judgment. Volume of play is how skills and ecosystems form.
  3. It’s hackable. Within days of release the community produced optimized forks running in less VRAM, web UIs (the AUTOMATIC1111 interface is evolving almost hourly), img2img workflows, prompt-weighting syntax, Photoshop plugins. A closed API gets users; open weights get a bazaar. Watching it assemble itself in real time is the most open-source thing I’ve seen since the early Linux days I only know from lore.
  4. It will get embedded. An API call is a feature; a local model is a component. Image generation will now show up inside tools, games, and pipelines whose authors would never have built on a metered third-party endpoint.
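To make point 2 concrete, here is a minimal sketch of what "build scripts around it" can look like: cross a few subjects, styles, and seeds into a job list and hand each job to the generator. The `fake_generate` function is a hypothetical stand-in for whatever local generation call your setup exposes; it only builds a filename so the sketch runs without a GPU.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    prompt: str
    seed: int


def plan_jobs(subjects, styles, seeds):
    """Cross every subject with every style and every seed."""
    return [
        Job(prompt=f"{subject}, {style}", seed=seed)
        for subject, style, seed in itertools.product(subjects, styles, seeds)
    ]


def fake_generate(job: Job) -> str:
    """Hypothetical placeholder for a real local generation call.

    Returns the filename an image would be saved under.
    """
    slug = job.prompt.lower().replace(",", "").replace(" ", "_")
    return f"{slug}_seed{job.seed}.png"


jobs = plan_jobs(
    subjects=["a lighthouse at dusk", "a robot gardener"],
    styles=["oil painting", "pixel art", "35mm photo"],
    seeds=range(4),
)
filenames = [fake_generate(job) for job in jobs]

print(len(jobs), "images planned")
print(filenames[0])
```

Two subjects, three styles, and four seeds come to 24 images. On a metered API that is a line item; on local hardware it is a coffee break, which is the whole argument.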

The homelab angle

For the self-hosting crowd, this is a genuinely new species of service: heavy, GPU-hungry, and worth it. Notes from getting it running:

  • The GPU drought chose a funny time to matter again — though with Ethereum’s mining era ending, the secondhand GPU market is finally thawing. A used 8GB+ card now has a new reason to exist.
  • This is the first mainstream workload that makes “GPU in the home server” a reasonable sentence. File that under trends to watch: if image models run locally today, it’s hard to believe text models stay API-only forever.
  • Practical stack: Python environment, the weights, and a web UI in front. Treat it like any self-hosted service — containerize it, keep it off the open internet, version your configs.
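Two of those habits are easy to sketch in code: refusing to bind the UI to a public or wildcard address, and fingerprinting the config so changes are visible in version control. This is a minimal sketch using only the standard library; the config keys and values are hypothetical, not the settings of any particular web UI.

```python
import hashlib
import ipaddress
import json


def is_safe_bind(host: str) -> bool:
    """True if the address is loopback or a private LAN address.

    Wildcard binds such as 0.0.0.0 are rejected explicitly, because they
    listen on every interface, including public ones.
    """
    addr = ipaddress.ip_address(host)
    if addr.is_unspecified:
        return False
    return addr.is_loopback or addr.is_private


def config_fingerprint(config: dict) -> str:
    """Short, stable hash of a config, independent of key order."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Hypothetical settings, for illustration only.
config = {
    "host": "127.0.0.1",
    "port": 7860,
    "model_path": "models/model.ckpt",
    "half_precision": True,
}

if not is_safe_bind(config["host"]):
    raise SystemExit(f"refusing to bind to {config['host']}")

print("bind address ok:", config["host"])
print("config fingerprint:", config_fingerprint(config))
```

The explicit wildcard check matters: Python's `ipaddress` module treats `0.0.0.0` as falling inside a private range, so testing `is_private` alone would wave it through. If you need remote access, a VPN or an authenticated reverse proxy in front is the usual answer rather than opening the port.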

The part that stays uncomfortable

Everything contested about DALL·E 2 — training data scraped without artists’ consent, style mimicry, what “made by AI” does to creative labor — is now contested without a gatekeeper to petition. OpenAI could at least be lobbied; weights on a million hard drives cannot. I don’t think the genie metaphor is even right. Genies grant three wishes. This grants unlimited ones, to everyone, simultaneously, and the wishes conflict.

What I’m sure of: the closed-API era of generative AI lasted one season before an open alternative arrived. Whatever comes next in this field, remember that interval.