Vision Pro Ships + Sora Stuns Video
Jerwin Arnado
Archive note: this is a backdated post, written years later while rebuilding this site. It’s dated to the moment it covers, but the hindsight is real.
February delivered a perfectly staged contrast in how the future arrives: one product you can buy and barely justify, one demo you can’t touch and can’t stop thinking about.
Vision Pro: the marvel in the drawer-to-be
The Vision Pro shipped February 2, and the reviews converged with unusual speed on exactly the verdict I banked at announcement: praised and niche. The consensus, paraphrased: the most impressive consumer hardware in years; the eye-and-pinch input works eerily well; the movie-watching is genuinely spectacular — and after two weeks, an unsettling number of reviewers admitted they’d stopped reaching for it. It’s heavy, it’s lonely, the killer app hasn’t introduced itself, and early-adopter forums are already documenting the return-window soul-searching.
None of this is failure; it’s a concept car doing concept-car things. The input model is the keeper — watching normal people instantly understand look-and-pinch confirms it’s the first new cursor since multitouch. The product, meanwhile, waits for its reason and its price drop, in that order. Prediction stands; check back in two years.
Sora: the one that rearranged my week
Then on February 15, OpenAI showed Sora: text-to-video generating up to a minute of coherent, often photorealistic footage. Woolly mammoths in snow. A woman walking through neon Tokyo with reflections in her sunglasses tracking correctly. Drone shots of coastlines that do not exist.
I’ve had a front-row seat to this genre — DALL·E 2’s shock, Stable Diffusion’s democratization — and Sora still moved the needle, because video was supposed to be years harder. Temporal coherence — objects persisting, lighting staying consistent, physics roughly holding across hundreds of frames — was the moat. The demos (curated, unreleased, post-Gemini-video skepticism duly applied) suggest the moat is draining on schedule with everything else.
Two filed observations:
- The disinformation clock started. Images were already contested ground; convincing minute-long video of events-that-never-happened, available eventually to everyone, lands directly on a country that votes in fifteen months and lives on Facebook. The provenance problem (proving what’s real, not detecting what’s fake) just became the decade’s infrastructure project, and the building has barely started.
- Every pixel industry heard the same sound. Stock footage, b-roll, animatics, product visualization: the earlier DALL·E paragraph about creative labor now applies to motion. The pattern from images will likely repeat: not replacement overnight, but the floor of “good enough” rising under everyone’s rates.
The split screen, read together
Here’s what the juxtaposition actually teaches: Apple shipped atoms — a real object, bounded by weight, optics, and supply chains, iterating on hardware time (years). OpenAI shipped a claim about bits — unbounded by manufacturing, iterating on training-run time (months). Both are “the future arriving,” but they compound at different rates, and the gap between those rates is becoming the defining texture of this decade. The headset will get lighter on a schedule you can roughly draw. The video model will get better on a schedule that keeps embarrassing predictions.
Mine included, probably. Filed, as always, for the December audit.