Meta Launches Muse Spark — Its First Model from Superintelligence Labs, and It's Not Open Source

Meta's first AI model under Alexandr Wang narrows the gap with frontier competitors, leads on health benchmarks, and breaks from the company's open-source tradition.

S5 Labs Team · April 8, 2026

Meta released Muse Spark on April 8 — the first AI model from its restructured Superintelligence Labs (MSL) under the leadership of Alexandr Wang. The model represents a ground-up rebuild of Meta’s AI stack, a dramatic improvement over the Llama 4 family, and a notable break from the company’s open-source tradition.

Muse Spark is proprietary. After years of releasing models with open weights under permissive licenses, Meta is keeping this one in-house. That choice signals a strategic shift in how the company thinks about its AI investment.

The Model

Muse Spark was built over nine months by a team led by Wang, the former CEO of Scale AI, who joined Meta as Chief AI Officer in a deal reportedly worth $14.3 billion. The model accepts text, voice, and image inputs but produces text-only output.

On the Artificial Analysis Intelligence Index v4.0, Muse Spark scores 52 — placing it 4th overall behind Gemini 3.1 Pro (57), GPT-5.4 (57), and Claude Opus 4.6 (53). That ranking undersells the progress. Llama 4 Maverick, Meta’s previous best, scored just 18 on the same index. Moving from 18 to 52 in nine months is not an incremental improvement — it is a generational leap.

| Benchmark | Muse Spark | GPT-5.4 | Claude Opus 4.6 | Gemini 3.1 Pro |
|---|---|---|---|---|
| AI Intelligence Index v4.0 | 52 | 57 | 53 | 57 |
| HealthBench Hard | 42.8% | 40.1% | | |
| Humanity's Last Exam (Contemplating) | 50.2% | 43.9% | | |
| CharXiv (chart understanding) | 86.4 | | | |
| Terminal-Bench (coding) | 59.0 | 75.1 | | |
| ARC-AGI-2 (abstract reasoning) | 42.5 | 76.1 | | |

The benchmark story is nuanced. Muse Spark leads every competitor on health and medical benchmarks — HealthBench Hard at 42.8% surpasses GPT-5.4’s 40.1%. Its multi-agent Contemplating mode beats both GPT-5.4 and Gemini on Humanity’s Last Exam at 50.2%. It also leads on chart understanding tasks.

But it trails significantly where it matters most for developers: coding (Terminal-Bench 59.0 vs. GPT-5.4’s 75.1), abstract reasoning (ARC-AGI-2 42.5 vs. 76.1), and agentic task completion. These are the capabilities that drive adoption among the technical users who build on AI platforms.
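To make the spread concrete, here is a small, purely illustrative Python sketch that tabulates the head-to-head numbers quoted in this article and reports where Muse Spark leads or trails GPT-5.4. The scores come directly from the table above; no model API is involved.

```python
# Benchmark scores quoted in this article, as (Muse Spark, GPT-5.4) pairs.
# Values are points or percentage points depending on the benchmark.
scores = {
    "AI Intelligence Index v4.0": (52.0, 57.0),
    "HealthBench Hard": (42.8, 40.1),
    "Humanity's Last Exam (Contemplating)": (50.2, 43.9),
    "Terminal-Bench (coding)": (59.0, 75.1),
    "ARC-AGI-2 (abstract reasoning)": (42.5, 76.1),
}

for name, (muse, gpt) in scores.items():
    delta = muse - gpt
    verdict = "leads" if delta > 0 else "trails"
    print(f"{name}: Muse Spark {verdict} GPT-5.4 by {abs(delta):.1f}")
```

The pattern the article describes falls straight out of the deltas: narrow leads on health and multi-agent reasoning, double-digit deficits on coding and abstract reasoning.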

Why It’s Not Open Source

Meta’s Llama models defined a generation of open-weight AI. Llama 2 and 3 were adopted by thousands of developers and enterprises, and Meta’s commitment to open release was a genuine competitive advantage — it built an ecosystem that proprietary providers could not replicate.

Muse Spark breaks that pattern. The model is available through the Meta AI app and meta.ai, with rollout planned across Facebook, Instagram, WhatsApp, Messenger, and Ray-Ban Meta AI glasses. But there are no downloadable weights, no fine-tuning access, and no self-hosting option.

Meta has not explicitly ruled out open-sourcing Muse Spark in the future, but the initial approach is clearly different. The most likely explanation is commercial: Meta spent a reported $14.3 billion to bring in Wang and has rebuilt its entire AI stack. Open-sourcing the result immediately would give competitors free access to capabilities that cost billions to develop.

There is also a competitive dynamic. As models become more capable, the gap between what is appropriate for open release and what poses dual-use risks narrows. Anthropic’s decision to gate Claude Mythos Preview through a controlled partnership rather than releasing it publicly reflects a similar calculation.

The Product Strategy

Muse Spark’s deployment strategy reveals how Meta intends to use AI differently than its competitors. Where OpenAI sells subscriptions and API access, and Anthropic builds enterprise relationships, Meta is embedding AI directly into consumer products used by billions of people.

The model will power the Meta AI assistant across the company’s entire product portfolio:

  • Meta AI app and meta.ai — available immediately in the US
  • Facebook, Instagram, WhatsApp, Messenger — rolling out in coming weeks
  • Ray-Ban Meta AI glasses — integrated into wearable devices

This distribution advantage is Meta’s strongest competitive asset. Even if Muse Spark is not the most capable model on every benchmark, it will interact with more users daily than any competing AI product simply because of where it lives. ChatGPT is a dedicated application users must choose to open; Muse Spark is embedded in applications people already use throughout their day.

The health benchmark leadership is particularly interesting in this context. If Muse Spark genuinely provides the best AI-assisted medical information, Meta’s billions of users gain access to health guidance that outperforms what any other AI platform offers. The regulatory and liability implications are significant, but the user value is real.

What This Means

Muse Spark narrows the gap between Meta and the frontier labs but does not close it. The coding and reasoning deficits matter — they determine whether developers build on a platform, and developers drive ecosystem growth. Meta acknowledges these gaps and has positioned Muse Spark as the first in the Muse series rather than a finished product.

The open-source question will define Meta’s AI strategy going forward. The company’s previous open releases created enormous goodwill and ecosystem adoption. Keeping Muse Spark proprietary risks alienating the developer community that made Llama successful. Whether Meta returns to open weights for future Muse models — or whether Muse Spark signals a permanent shift — will shape the open-source AI landscape for years.

For businesses evaluating AI platforms, Muse Spark’s free availability makes it an easy option to test, particularly for health-adjacent and consumer-facing applications. But its proprietary nature and Meta-only deployment mean it cannot replace the flexibility that open models like Gemma 4 or self-hosted alternatives provide for organizations that need control over their AI infrastructure.
