So cheeky to highlight every score in blue so people who aren't paying attention think they've scored higher on every single benchmark.
What's the point of the column header if the blue is saying "this is our one"?
Yes indeed
Not surprising from Meta. Sometimes they can act cartoonishly evil (not necessarily in this case, but sometimes).
Muse Spark is very good. It's Meta's new superintelligence AI, and it's what we'll be seeing from them from now on.
I tried it and it's a better experience than Gemini 3.1 in daily tasks
Considering it's completely free, Muse Spark is pretty good.
Seems kind of skeezy that they are putting the self-reported numbers on there if they are lower. Just give us the numbers on a level playing field.
Benchmarks aren't the moat; deployment latency, inference cost, and safety evals decide whether this is real or theater.
Did you try it though? Because it's absolute trash
They have a history with benchmarks, don't they?
The key to winning is simple: no censorship, support for NSFW, and no quantization of the LLM; always deploy the full-precision version.
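(For anyone unfamiliar with the quantization this comment objects to: below is a minimal NumPy sketch of symmetric int8 weight quantization and the rounding error it introduces. Illustrative only, with made-up weight sizes, not any provider's serving pipeline; real deployments often use per-channel scales, which shrink the error further.)

```python
import numpy as np

# Illustrative sketch of symmetric int8 weight quantization: the kind of
# memory/accuracy trade-off the comment is objecting to. Toy numbers only.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(4096, 4096)).astype(np.float32)  # fake weights

scale = np.abs(w).max() / 127.0                   # one scale for the tensor
w_int8 = np.round(w / scale).astype(np.int8)      # quantize
w_restored = w_int8.astype(np.float32) * scale    # dequantize

err = np.abs(w - w_restored)
print(f"max abs error:  {err.max():.6f}")
print(f"mean abs error: {err.mean():.6f}")
print(f"memory: fp32 {w.nbytes >> 20} MiB -> int8 {w_int8.nbytes >> 20} MiB")
```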
What you call censorship in models is just alignment
You know what model didn't have alignment? Microsoft Tay, and it wasn't exactly successful.
Have you ever actually used an uncensored model?
I have, since the first Llama models were released and finetuned to be uncensored. Let me tell you, these things will comply with anything, say anything: the most sexist, racist, and even sexually monstrous things, way worse than just "how to build a bomb".
Now, if a model with "no censorship" (i.e., no alignment) were released and started complying with any user prompt no matter how monstrous, do you genuinely think it wouldn't be insta-banned and shut down because of what people would make it do and say just to post the results online?
It would further tarnish Meta, whose image is already shit. Can you imagine the headlines? "Meta makes a P3d0 AI", "Meta makes a nazi AI", "Meta makes an AI that teaches you how to beat your dog", etc., and for once these articles wouldn't be clickbait; they'd be true.
Can you imagine what this would do to their stock, to the users of their products? Elon's "roman salute (lmao)" would be nothing compared to what it would do to Meta's stock and investors.
So no, uncensored models are not what Meta or any company should release. They know better; it would mean their ruin.
And beyond the public image and the loss of billions, it's just outright dangerous. It was fine with the stupid early small open-weights models, but release a SOTA uncensored model today and the harm it can do is exponentially greater.
Equating "censorship" with alignment is wrong. I always wondered what people really mean when they say they want uncensored models. Some models, especially the early ones, suffer from over-refusal, but for many it has gotten much better. Fortunately, the top comment made it clear they just want NSFW. You can have a perfectly aligned model that has no problem writing smut but won't help you develop bioweapons.
ok, Karen
Got to play around with it; pretty unimpressed. It feels benchmaxxed for sure: it can handle benchmark-style tasks but definitely lacks the general competence and ability to understand context and cut a bit deeper like Opus 4.6.
Opus costs the same as Gemini Deep Think. It should be compared to that and Grok Heavy, not to general-use models.
When will Apple get in the game too?
343 Muse Spark, descendant of 343 Guilty Spark from Halo
Impressive
Is it open source?
It's time Meta played up to its billions of "investment" in AI through poaching talent left and right.
It's sad that they pioneered the Llama series and then lost it all in the middle of the race and went for a total overhaul.
Talk is cheap, but Meta definitely has to step up its game now. This is a race to the bottom on price and a race to the top on intelligence.
Gotta go, my Claude Pro subscription gets its limit reset at 3 AM... can't miss the tokens.
It's one of those weeks, isn't it?
Kudos to Meta for not giving up. It looked hopeless.
Thank God they highlighted the entire first column. I wouldn't be able to tell which scores correspond to their model.
Finally, a real fourth contender.
Nice, another model we can't actually use.
So, where is Apple? Siri seems stuck in the 20th century.
Would this be the first Blackwell model? I imagine it is, right? Can't imagine them still using Hoppers.
i like it
Who said scaling laws were dead?
Is this avocado?
Looks like an ass model from the benchmarks
Good. Disappointing GDPVal score.
Is there a Mythos GDPVal score anywhere?
How many parameters does this model have?
Pretty funny that it is better than Grok. Zuck can finally teabag Elon after failing so hard.
I'm glad their lab didn't just implode and actually made something out of all those resources thrown at it
i assume this one isn't OSS...?
Reminder: Meta just lied about all their benchmarks last time with Maverick.
Yes, but context is important.
The lie from the AI team (among other reasons) led to LeCun blowing the whistle. Zuckerberg was allegedly unaware of the manipulation and was furious when the revelation came out. It led to a total restructuring of Meta's AI organization, a change in leadership, and a lot of the old team being let go.
"Spark" sounds like it's a relatively small model, maybe similar to "Flash".
I guess, bro
We need a product built around it. Claude is Claude because of its product, not just because of their model.
Considering that this would've been SOTA not long ago, it's highly impressive that they were still able to ship (what seems to be) a good model. Hopefully this isn't a case of benchmaxxing.
This looks like something competitive.
"Meta isn’t positioning Muse Spark as a top-of-the-line model, but is instead highlighting its efficiency and “competitive performance” on various tasks." https://www.cnbc.com/2026/04/08/meta-debuts-first-major-ai-model-since-14-billion-deal-to-bring-in-alexandr-wang.html
Doesn't beat mainstream models from 2 months ago. If it isn't open-sourced, nobody should even care about this model.
Okay, I've got to say, I was dubious about Alexandr, but maybe Zuck saw something. Like, I think Zuck's thing is ruthless execution. He moves forward no matter what. That's how he built the empire. Often messing things up, of course, but he fucking moves.
Anyways, I digress. Alexandr probably has the same energy. And they both learn shit fast. They might actually understand the problem and its solution space well enough to know how to hire and manage some actual experts, who have now built, in a relatively short time, a pretty decent model. Most likely benchmaxxed and won't replace my Opus 4.6, but still, good job guys lol
https://imgur.com/a/CnPWDrh
Given the 60 trillion tokens Meta supposedly spent on Claude last month, we know that whatever this model says on benchmarks, it's a generation behind for actual work.
I suppose the only question is: is it actually better than the Chinese models? But I'm not sure it matters if they don't release open weights like the Chinese models do.
Especially after the delay to get this right, this seems quite underwhelming. They are just now barely catching up to what others delivered last quarter.
I'll put them in the Grok pile for now.
But can it pass the carwash benchmark?
Remember how they benchmaxed last time and the actual experience was garbage. Let's hope this one is not like that.
Pretty solid numbers. So all five big players are in the game.
Impressive, but Gemini and Claude already scored that 2 months ago, so regardless I won't bother with it.
Will it be on OpenRouter?
They put the most impressive number on top, while the rest are either not that good, or just marginally better.
Being marginally better than the other 3 on some benchmarks is genuinely impressive though. It's a very high bar. But they're a few months behind, so let's see if they can catch up.
Are we still at the point of being impressed with benchmarks?
it's either that or vibes
I am
Basically just good at multimodal
Someone tell me how to feel about this
Depends on which company's products you are stanning right now.
Ask your AI of choice to feel for you
The benchmarks are good and reflect the amount of compute they've put into it. But benchmark numbers are cherry-picked, in that they've allocated the highest possible amount of compute to inference and the longest reasoning time to achieve the result. That performance is typically not something you as the consumer get to experience, particularly on the lower price plans.
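(A toy simulation of the effect described above: best-of-n sampling with majority voting, where the headline score climbs with the inference budget. The 40% per-sample accuracy and 10-answer space are invented numbers, not anyone's measured figures.)

```python
import random
from collections import Counter

# Toy model of test-time compute scaling: each extra sample costs more
# inference compute, and majority voting over samples lifts the score.
# The 40% per-sample accuracy and 10-answer space are invented numbers.
def sample_answer(correct: int, p_correct: float = 0.4) -> int:
    if random.random() < p_correct:
        return correct
    return random.choice([a for a in range(10) if a != correct])

def voted_accuracy(n_samples: int, n_questions: int = 2000) -> float:
    hits = 0
    for _ in range(n_questions):
        correct = random.randrange(10)
        votes = Counter(sample_answer(correct) for _ in range(n_samples))
        hits += votes.most_common(1)[0][0] == correct
    return hits / n_questions

random.seed(0)
for n in (1, 5, 25, 125):
    print(f"{n:>3} samples/question -> accuracy ~ {voted_accuracy(n):.2f}")
```

The same curve run in reverse is why consumer-tier experience lags the press release: fewer samples, shorter reasoning.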
It would be interesting to see how it writes, since the model name sounds like they're going for creativity.
edit: I tried it. For writing it's very... basic. For code, same thing. The responses feel like a 20B+ parameter local LLM.
More players = more pressure on pricing to get users
More competition = good for us.
Not SOTA, but definitely among the top spots. Putting at least a bit of pressure on the other top labs.
it's absolutely SOTA, probably the weakest of the 4 though.
3. Grok is trained on misinformation and should not be considered an LLM because of that.
ARC-AGI 2 is a really important benchmark; most open-source models are not very good at it.
SOTA is Mythos now
"it's absolutely SOTA"
"probably the weakest of the 4 though."
That defeats the meaning of SOTA lol.
It beats other SOTA models on a number of these benchmarks, so it is SOTA. Now, whether that's true remains to be seen given their history, but if it is, it is currently the best model available on some benchmarks.
it can compete with other SOTA models, which makes it SOTA.
State of the art does not colloquially mean "the single best model".
It allows for a range. If you can compete within that range, you're included in it.
OpenAI: GPT‑5 is a significant leap in intelligence over all our previous models, featuring state-of-the-art performance across coding, math, writing, health, visual perception, and more.
Google: Gemini 3. Introducing our most intelligent model yet. With state-of-the-art reasoning to help you learn, build, and plan anything.
Anthropic: Claude Opus 4.6 is state-of-the-art across a wide range of coding and agentic capabilities.
Meta: No mention of SOTA anywhere. Why?
Could be any number of PR and posturing/positioning reasons, including not wanting to talk about being SOTA while they're the one trailing. I wouldn't know; I don't work for Meta. But that doesn't change the valid use of the term.
It beats Opus on a handful of multimodal benchmarks, and even among those, Gemini is better on a few. This model is a nothing burger.
The more competition the better; they're at the races, and enough people at the races is what keeps the race running.
yeah, clearly optimized for vision, but gonna be a downgrade on code. unlikely they're even going after that market atp
With Instagram, Facebook, and perhaps VR/AR (which has heavy CV stuff, not to mention 3D worlds), vision is their strength. Similarly, ByteDance has a good video model.
Goddamn, I thought Meta was down and out. Guess they were just gathering themselves.
Have you seen their capex spend? That money has to go somewhere...
Weren't they guilty of benchmaxxing in the past though? With Llama 4?
lol, after spending billions poaching talent I should hope they came out with something competitive. kinda underwhelming that this is closed weight and still lagging behind Opus, meanwhile the frontier labs are prepping Mythos and Spud
if this were open weight it'd be sweet though
There is no MOAT.
Moat is money and compute
Looks like ARC-AGI 2 was released just past the benchmaxxing deadline.
First thing I saw too.
Yeah, it would look that way
Interesting, seems like Meta is back on the front lines. Not SOTA-leading, but definitely breathing down the top labs' necks now, if the benchmarks are representative of actual user experience....
Competition is good, bring in more.
Mythos is the new yardstick, Mark. Not Opus.
A benchmaxxed model that is worse than a year-old model is still behind, and irrelevant when it's closed source.
At least it's not Scout-level embarrassing.
"Not SOTA-leading"
It's literally just Llama 5. Doesn't look like they've gained much from all the money they spent.
Meta’s reputation should preclude people from trusting their benchmarks
They notoriously benchmaxed with Llama
I mean, look at the numbers they reported. I doubt they're lying, as they're pretty modest.
It was also stylemaxxed.
They lied about Llama 4. All the previous versions of Llama were great and used a lot by the open-source community (before the Chinese models took over).
I won't be too concerned. Meta was extremely embarrassed by this, and Mark was personally furious when he found out they were cooking the books. Reputation matters, so I doubt they'll do that again, considering even the CEO is serious about it.
Meta has no reputation left to lose at this point
Which is exactly why they wouldn't benefit from lying: it would be quickly uncovered and ruin their reputation completely and for good... at a time when they are desperately trying to recover.
Delete"Benchmaxed" 🤣 love the term.
DeleteWhat next, token mogging? 😂
Benchmaxed just means overfitting to benchmarks
Yeah, I know, it's just funny how the term arrives at the same time people are making fun of the term "looksmaxxing".
YES! It's unclear whether the people who made those decisions are still there (personally, I doubt it was YL or his engineers), but it definitely suggests a culture that's not averse to bending the truth.
The entire team got fired, so... I think management wasn't on board with it.
You think a bunch of researchers would overfit benchmarks by themselves? There was clear top-down pressure. And when things went sideways, the ones at the top sacrificed the ones below. Tale as old as time.
It was the leadership, yes, and they all got fired, leaders included.
Could be from the middle, who capitalized on the ignorance of the top.
I wouldn't doubt the possibility of a team and their PM deciding under pressure to do that. This happens at a lot of high-pressure places.
I work on one of the other 4 models in this table. It's never individuals making these decisions; it's either the leadership or the shitty systems they put in place.
I was just going to say that they had issues with benchmarks before
It wasn't benchmaxed; they just lied.
Is there evidence that it was a lie vs. benchmaxxing, or is this speculation? Not challenging, I'd just be interested in the evidence if so.
Independent testing never showed they did well on any benchmarks. They only got good scores before independent testing.
"One issue is Llama 4, Meta's flagship language model that shipped in April 2025. LeCun admits the published benchmarks were misleading. "Results were fudged a little bit," he says. The team used different models for different benchmarks to game the numbers. This came out almost immediately after Llama 4's release."
https://the-decoder.com/you-certainly-dont-tell-a-researcher-like-me-what-to-do-says-lecun-as-he-exits-meta-for-his-own-startup/
Benchmaxing is just a model overfitting benchmarks (which is a benchmark problem). What they did was fraud.
Then what do you mean by "It wasn't benchmaxed; they just lied"?
Benchmaxxed means the model is overtrained on the tests, or on problems just like the tests. It's roughly equivalent to passing a test at school by studying the answers to the questions instead of learning the material.
Fraud is outright lying. The model actually scores 40% → "Our model scored 62.3%"
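(The distinction is easy to see in a toy setup. Here's a hypothetical scikit-learn sketch, not anything Meta actually did: "benchmaxxing" is letting the test questions leak into training, so the score is real but meaningless, while fraud needs no model at all.)

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy illustration of benchmaxxing vs. fraud. Synthetic data stands in
# for a benchmark; nothing here reflects any lab's actual pipeline.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

honest = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("honest holdout score: ", honest.score(X_te, y_te))

# "Benchmaxxed": the benchmark leaked into training, so the model has
# effectively studied the answer key. The score is real but meaningless.
leaked = RandomForestClassifier(random_state=0).fit(X, y)  # X includes X_te
print("contaminated score:   ", leaked.score(X_te, y_te))

# Fraud, by contrast, skips the model entirely:
print("fraudulent score:      0.623  # just typed in")
```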
That ARC-AGI 2 score is rough. Will have to test it to know more though.
I'm pretty sure most labs RL or train for ARC (with ARC-style questions, etc.), which can inflate scores. ARC is far from immune to benchmaxing.
So who knows
Yes they do, but ARC is supposed to test true intelligence, like an IQ test, so theoretically training on it shouldn't improve scores very much.
Training on a handful of ARC puzzles isn't supposed to help the models much on unseen puzzles. However, it's suspected the labs trained on thousands upon thousands of synthetically generated ARC puzzles. This may have resulted in the models having effectively seen "everything" and understanding the possible ARC puzzle landscape too well.
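(A sketch of what such synthetic generation might look like; this is a guess at the shape of it, not any lab's actual generator. Sample a hidden grid transformation, apply it to random grids, and emit input/output pairs in an ARC-like layout.)

```python
import numpy as np

# Hypothetical generator of ARC-style puzzles: pick a hidden rule, apply it
# to random grids. A guess at the shape of such pipelines, nothing more.
rng = np.random.default_rng(42)

RULES = {
    "flip_lr":   np.fliplr,
    "flip_ud":   np.flipud,
    "rot90":     np.rot90,
    "transpose": np.transpose,
}

def make_puzzle(n_pairs=4, n_colors=5, max_size=6):
    rule = str(rng.choice(list(RULES)))
    pairs = []
    for _ in range(n_pairs):
        h, w = rng.integers(2, max_size + 1, size=2)
        grid = rng.integers(0, n_colors, size=(h, w))
        pairs.append({"input": grid.tolist(),
                      "output": RULES[rule](grid).tolist()})
    # ARC-like layout: demonstration pairs plus one held-out test pair
    return {"rule": rule, "train": pairs[:-1], "test": pairs[-1:]}

p = make_puzzle()
print(p["rule"], "| demo:", p["train"][0]["input"], "->", p["train"][0]["output"])
```

Generate a few hundred thousand of these and a model can plausibly learn the space of possible rules rather than demonstrating the "true intelligence" the benchmark is meant to test.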
training on IQ tests does improve scores very very much lol