Will Grok 3 beat DeepSeek r1 in LiveBench?
Ṁ14k · Resolved N/A on Apr 11
Resolves as soon as Grok 3 has a rating in https://livebench.ai. DeepSeek-r1 currently has a global average of 71.57, which Grok 3 would have to beat for this market to resolve as YES.
Credit to @ChaosIsALadder for the market format.
Update 2025-03-01 (PST) (AI summary of creator comment): Resolution Criteria Update:
The market will resolve based on the first global score of a Grok 3 model.
At this point it seems fairly likely that a very different model will be the first with API access... can we N/A this now?
Related questions
What will be true of Grok-2?
Will DeepSeek's next reasoning model be called R3? (4% chance)
Grok 3 MMLU Benchmark Score
Will DeepSeek R2 be open source? (84% chance)
When will DeepSeek R2 be released?
When will DeepSeek release R2?
Will there be an open replication of DeepSeek v3 for <$10m? (55% chance)
Will Grok 4 Top the Chatbot Leaderboard? (51% chance)
What will be true of DeepSeek's r2 model?
Grok 3 MATH Benchmark Score