TL;DR

Meta shipped Muse Spark today, the first model from Meta Superintelligence Labs (the division Alexandr Wang has been running since last summer). It’s a natively multimodal reasoning model that ranks 4th on the Artificial Analysis Intelligence Index, behind Gemini 3.1 Pro, GPT-5.4, and Claude Opus 4.6. The bigger story: Meta launched this as a closed model, breaking from years of open-source-first strategy with Llama.

What Happened

Meta released Muse Spark (internally codenamed “Avocado”) on April 8, 2026. It was built over nine months by the team Alexandr Wang assembled after Mark Zuckerberg recruited him as Chief AI Officer. The model powers Meta AI across the Meta AI app and meta.ai website starting today, with rollouts planned for Facebook, Instagram, and WhatsApp.

Wang’s announcement on X laid it out plainly: “Nine months ago we rebuilt our AI stack from scratch. New infrastructure, new architecture, new data pipelines. Muse Spark is the result of that work.”

The model accepts voice, text, and image inputs but only produces text output. It runs in a fast mode for casual queries and offers multiple reasoning modes for harder problems. All free through Meta AI, though rate limits may apply.

Why the Open-Source Pivot Matters

For the past three years, Meta’s AI identity was Llama. Open weights, a relatively permissive custom community license, community fine-tuning. Llama became the default foundation for startups, researchers, and anyone who didn’t want to pay OpenAI per token. Llama 3, 3.1, and 4 built a developer community that gave Meta influence without charging for API access.

Muse Spark breaks that pattern. The model’s design and weights aren’t public. Meta says it plans to release “a version” under an open-source license eventually, but the flagship model ships proprietary.

The timing makes sense if you look at what happened with Llama 4. The series underperformed expectations. Developer traction stalled. Meta spent billions on training infrastructure and watched OpenAI and Google pull ahead on benchmarks that enterprise buyers actually care about. Meta’s not the only one building in-house: Microsoft launched three competing AI models just last week. Zuckerberg’s response was to bring in Wang, gut the AI org, and build something that could compete head-to-head with GPT-5.4 and Claude Opus 4.6.

The result is a dual-track strategy: Llama stays alive for community adoption. Muse becomes the closed, competitive product line. Whether Meta can sustain both without confusing developers and enterprise buyers is an open question.

How Muse Spark Actually Performs

Muse Spark scores 52 on the Artificial Analysis Intelligence Index. The benchmarks tell a mixed story:

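Benchmark | Muse Spark | Claude Opus 4.6 | GPT-5.4 | Gemini 3.1 Pro
Intelligence Index | 52 | | |
GPQA Diamond | 89.5% | 92.7% | 92.8% | 94.3%
Humanity’s Last Exam | 39.9% | 41.6% | 44.7%
HealthBench Hard | 42.8% | 40.1% | 20.6%
Figure Understanding | 86.4 | 65.3 | 82.8 | 80.2
MMMU-Pro (Vision) | 80.5% | 82.4%
Terminal-Bench 2.0 | 59.0 | | 75.1 | 68.5
ARC AGI 2 (Thinking) | 42.5 | 76.1 | 76.5

Rows with fewer than four values keep the scores in their original order; which model’s score is missing is not marked, except for Terminal-Bench 2.0, where 75.1 belongs to GPT-5.4.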

Muse Spark leads on figure understanding and HealthBench Hard but falls behind on coding and abstract reasoning. Terminal-Bench 2.0 shows a 16-point gap versus GPT-5.4 (59.0 vs. 75.1). ARC AGI 2 is even worse: 42.5 against 76+ for the top models, only a little more than half their score.
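
The gaps quoted above can be checked with a few lines of arithmetic. This is a minimal sketch in Python that uses only scores the article attributes to a specific model or to "top models"; the dictionary layout and the gap_to helper are illustrative names, not part of any benchmark tooling.

```python
# Scores quoted in this article. For ARC AGI 2 the reference is the
# higher of the two top-model scores listed (76.5).
scores = {
    "Terminal-Bench 2.0": {"muse_spark": 59.0, "reference": 75.1},  # reference = GPT-5.4
    "ARC AGI 2": {"muse_spark": 42.5, "reference": 76.5},           # reference = best listed score
}

def gap_to(reference: float, score: float) -> tuple[float, float]:
    """Return (absolute gap in points, score as a fraction of the reference)."""
    return round(reference - score, 1), round(score / reference, 3)

for name, row in scores.items():
    points, fraction = gap_to(row["reference"], row["muse_spark"])
    print(f"{name}: {points} points behind, {fraction:.1%} of the reference score")
```

On these numbers, Muse Spark reaches roughly 79% of GPT-5.4's Terminal-Bench 2.0 score and roughly 56% of the best ARC AGI 2 score.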

One standout detail: Muse Spark used just 58 million output tokens to complete the full Intelligence Index evaluation. Claude Opus 4.6 used 157 million. GPT-5.4 used 120 million. Whatever Meta did with their architecture, it’s remarkably token-efficient.
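
To put the token counts on a common scale, the sketch below expresses each model's usage as a multiple of Muse Spark's. It uses only the three figures quoted above; the relative_usage helper is a hypothetical name introduced for illustration.

```python
# Output tokens (in millions) used to complete the Intelligence Index,
# as quoted in the paragraph above.
output_tokens_m = {"Muse Spark": 58, "Claude Opus 4.6": 157, "GPT-5.4": 120}

def relative_usage(tokens: dict[str, int], baseline: str) -> dict[str, float]:
    """Express each model's token usage as a multiple of the baseline model's."""
    base = tokens[baseline]
    return {model: round(count / base, 2) for model, count in tokens.items()}

ratios = relative_usage(output_tokens_m, "Muse Spark")
for model, ratio in ratios.items():
    print(f"{model}: {ratio}x the output tokens of Muse Spark")
```

By this measure, Claude Opus 4.6 used about 2.7 times and GPT-5.4 about 2.1 times as many output tokens as Muse Spark on the same evaluation.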

The Shopping Mode Angle

Meta’s not hiding what this model is really for. Muse Spark powers a new “shopping mode” in Meta AI that pulls from your Instagram and Threads activity (brands you follow, styling choices, content you engage with) and turns it into personalized product recommendations.

Every post becomes a potential storefront. You can compare items, get pros-and-cons breakdowns, and follow links to buy directly. If you’ve used ChatGPT’s shopping features, the experience is similar, but Meta has something OpenAI doesn’t: your social graph and years of behavioral data across three billion monthly users.

Meta doesn’t need to charge for API access or sell subscriptions. Muse Spark makes Meta AI better at keeping you on-platform and pushing you toward purchases. The model pays for itself through ad and commerce revenue.

Contemplating Mode and Multi-Agent Orchestration

Meta also previewed a “Contemplating” mode that won’t ship immediately but is coming soon. The idea: for complex queries, Muse Spark spins up multiple sub-agents that reason in parallel, then synthesizes their outputs into a single response.
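
Meta hasn't published how Contemplating mode works internally, so the following is only a generic fan-out/fan-in sketch of the idea described above: run several sub-agents in parallel, then merge their outputs. Every name here (sub_agent, synthesize, orchestrate, the role labels) is hypothetical, and the sub-agents are stubs that return strings rather than calling any model.

```python
from concurrent.futures import ThreadPoolExecutor

def sub_agent(role: str, query: str) -> str:
    """Hypothetical stand-in for one sub-agent; a real one would call a model."""
    return f"[{role}] notes on: {query}"

def synthesize(drafts: list[str]) -> str:
    """Hypothetical synthesis step; here it simply joins the drafts in order."""
    return "\n".join(drafts)

def orchestrate(query: str, roles: list[str]) -> str:
    # Fan out: run one sub-agent per role in parallel threads.
    with ThreadPoolExecutor(max_workers=len(roles)) as pool:
        # pool.map returns results in the same order as `roles`.
        drafts = list(pool.map(lambda role: sub_agent(role, query), roles))
    # Fan in: merge the parallel drafts into a single response.
    return synthesize(drafts)

answer = orchestrate("compare two laptops", ["researcher", "critic", "summarizer"])
print(answer)
```

In a real system the synthesis step would itself be a model call that reconciles conflicting drafts; joining strings just keeps the sketch self-contained.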

Meta is positioning it against Google’s Gemini Deep Think and OpenAI’s GPT-5.4 Pro. The framing differs (Meta calls it “orchestration” rather than extended thinking) but the goal is the same: throw more compute at hard problems when the user asks for it.

I’m skeptical about the branding. “Contemplating” feels like a marketing name searching for a product. But if the multi-agent approach actually works, it could help close the gap on reasoning benchmarks where Muse Spark currently trails.

On GDPval-AA (a benchmark for real-world work tasks), Muse Spark posts an Elo of 1,427. Claude Sonnet 4.6 hits 1,648 and GPT-5.4 reaches 1,676. That’s a gap of more than 200 points, which is large for agentic use cases. Contemplating mode is probably Meta’s best shot at closing it without retraining the base model.
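
For a rough sense of what a rating gap that size means, the sketch below applies the standard logistic Elo formula with a 400-point scale. That GDPval-AA ratings follow this exact convention is an assumption on my part, and elo_expected_score is an illustrative helper, not part of any benchmark library.

```python
def elo_expected_score(rating: float, opponent: float, scale: float = 400.0) -> float:
    """Expected score for `rating` against `opponent` under the standard logistic Elo model.

    Assumes the conventional 400-point scale; GDPval-AA may use a different one.
    """
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / scale))

muse_spark = 1427
rivals = {"Claude Sonnet 4.6": 1648, "GPT-5.4": 1676}

for model, rating in rivals.items():
    expected = elo_expected_score(muse_spark, rating)
    print(f"Muse Spark vs {model}: gap {rating - muse_spark}, expected score {expected:.2f}")
```

Under that assumption, Muse Spark would be expected to come out ahead in only about one head-to-head comparison in five against either model.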

What This Means for the AI Industry

Alexandr Wang shipped fast. Nine months from a ground-up rebuild to a competitive frontier model is aggressive by any standard. Wang’s background running Scale AI, where the whole job was making AI teams move faster, shows in the timeline. The question is whether the second model closes the coding and reasoning gap, or if Meta stays a “great at vision, meh at code” shop.

The open-source era at Meta might be winding down. Meta says Llama continues, but the company’s best talent, biggest compute allocation, and leadership attention are all pointed at Muse now. If you’re still getting value from open models, Gemma 4 is worth a look — it’s the best open model I’ve tested recently. If Muse Spark’s closed successor pulls further ahead of Llama, developers will face a choice between the free-but-slower Llama and the closed-but-better Muse.

Then there’s the monetization angle. OpenAI and Anthropic charge for API access and enterprise contracts; Google bundles AI into cloud and search. Meta’s play is different: use AI to sell ads and products. Muse Spark’s shopping mode makes this explicit. The model exists to make Meta’s commerce engine spin faster, and if it works, Meta could end up with the best unit economics in the business because the customer never pays for the model directly.

FAQ

Is Muse Spark open source?

No. Muse Spark launched as a closed model. Meta says it plans to release “a version” under an open-source license later, but no timeline has been given. The flagship model powering Meta AI is proprietary.

How does Muse Spark compare to GPT-5.4 and Claude Opus 4.6?

It ranks 4th on the Artificial Analysis Intelligence Index. Strong on multimodal reasoning and health queries, weaker on coding (Terminal-Bench 2.0: 59.0 vs GPT-5.4’s 75.1) and abstract reasoning (ARC AGI 2: 42.5 vs 76+ for top models).

What happened to Llama?

Llama still exists and Meta says it’s staying. The dual-track idea is Llama for open-source community adoption, Muse for competitive closed products. I’d watch where the compute budget goes over the next two quarters to see which one Meta actually prioritizes.

Who is Alexandr Wang?

Former CEO of Scale AI, recruited by Zuckerberg in summer 2025 to lead Meta Superintelligence Labs as Chief AI Officer. He was 28 when he took the job. Muse Spark is the first model released under his leadership.

Can I use Muse Spark via API?

Not yet publicly. Meta is offering private preview API access to select partners, with paid API access planned for a wider audience later. Pricing hasn’t been announced. Right now, you can use it through the Meta AI app and meta.ai website for free.

Bottom Line

Muse Spark ranks 4th and trails the top three on most benchmarks. But Meta built it to sell products to three billion people through Instagram and WhatsApp, and on that metric, benchmark rankings matter less than conversion rates.

The real test comes in six months. If Meta releases the promised open-source version and it’s competitive, the dual-track approach works. If the open version is a neutered afterthought while Muse keeps improving behind closed doors, Meta will have traded its biggest community asset for a commerce engine. I think it’ll be the latter. Zuckerberg didn’t pay $14 billion for Alexandr Wang to ship open weights.