• Alibaba’s new Token Hub unit dropped Happy Oyster, a world model that generates navigable 3D environments from text—not just video clips you sit and watch.
  • The model topped Artificial Analysis’ text-to-video and image-to-video rankings under the pseudonym “HappyHorse-1.0” before anyone knew it was Alibaba’s.
  • Happy Oyster offers two modes—Directing for structured video creation and Wandering for explorable open worlds—with up to three minutes of continuous output at 720p.

Alibaba Group launched Happy Oyster on Wednesday through its newly formed Token Hub unit, releasing an AI world model that doesn’t just generate video—it generates places you can walk through. The model supports two distinct modes: Directing, which creates a running physical world where lighting, gravity, and character motion stay consistent across time, and Wandering, which lets users build and explore an infinitely extendable 3D environment from a single text or image prompt.

Both modes handle multimodal inputs and produce full audio and video outputs. Directing mode can generate up to three minutes of continuous video at 480p or 720p resolution. Wandering mode is where things get more interesting—users can roam in first person, change direction and camera movement on the fly, and move beyond the original frame while the world keeps generating and stays coherent.
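No public interface has shipped, but the announced constraints can be summarized in a small sketch. Everything below is hypothetical illustration, not Alibaba's actual API: the `WorldRequest` class and its field names are invented for clarity. Only the two mode names, the 480p/720p options, and the three-minute cap come from the launch details above.

```python
from dataclasses import dataclass

# Hypothetical sketch only -- this is NOT a real Alibaba API.
# The mode names, resolutions, and 180-second cap are the only
# values taken from the announcement.

VALID_MODES = {"directing", "wandering"}
VALID_RESOLUTIONS = {"480p", "720p"}
MAX_DURATION_S = 180  # "up to three minutes of continuous output"

@dataclass
class WorldRequest:
    prompt: str
    mode: str = "directing"
    resolution: str = "720p"
    duration_s: int = MAX_DURATION_S

    def validate(self) -> None:
        """Check the request against the publicly stated limits."""
        if self.mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {self.mode}")
        if self.resolution not in VALID_RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if not 0 < self.duration_s <= MAX_DURATION_S:
            raise ValueError("duration must be 1-180 seconds")

# Wandering mode adds on-the-fly camera control, so a real session
# would likely stream incremental moves rather than fix a duration.
req = WorldRequest(prompt="a rainy neon alley at night", mode="wandering")
req.validate()
```

The key distinction the sketch captures is that Directing is a fixed-length render while Wandering is open-ended, which is why a duration cap makes sense for one and streaming navigation for the other.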

What makes the launch worth paying attention to is what happened before it. Happy Oyster went viral earlier this year under the name HappyHorse-1.0, topping both text-to-video and image-to-video rankings on Artificial Analysis without anyone connecting it to Alibaba. The model's appeal lies in its departure from the typical one-shot workflow: write a prompt, wait, receive a clip. Happy Oyster builds persistent spaces instead.

How Alibaba’s Happy Oyster Competes in the World Model Arms Race

The competitive landscape here is crowded. Tencent just open-sourced HY-World 2.0, a multimodal world model focused on 3D asset generation compatible with Unity and Unreal Engine. Google DeepMind's Genie learns from unlabeled video. And Sora was the consumer-facing standard for video generation until OpenAI killed it last month.

Alibaba’s play is different from all three. Tencent’s HY-World exports static 3D meshes. Genie learns environments but can’t render them at high fidelity. Sora made pretty videos that you couldn’t interact with. Happy Oyster combines real-time navigation with video-quality rendering—users can actually move through the generated world, not just look at it.

The use cases Alibaba is pitching are broad: real-time film production, rapid storyboarding, interactive short series where viewer choices create unique experiences, game world generation for prototyping and testing, and VR environment creation from text descriptions. For gaming specifically, Alibaba says the model can generate explorable game worlds without manual design.

Alibaba is backing this with serious commercial ambition. Benzinga noted that Alibaba is targeting a fivefold increase in cloud and AI revenue—reaching $100 billion over five years. The company’s Taobao and Tmall commerce arm is already demonstrating AI tools for merchant operations and consumer discovery.

Alibaba shares rose 3.53% to $137.97 in premarket trading Thursday. Happy Oyster is currently available in early access through a waitlist at happyoyster.cn. No public API or open-source release has been announced yet, a notable contrast with Tencent's decision to put HY-World 2.0 on GitHub.
