Will Gemini outperform GPT-4 at mathematical theorem-proving?
15
Ṁ298
Jan 1
56% chance
Based on speculation from https://youtu.be/tkqD9W5U9F4?t=468
To operationalize this, the question will resolve based on the Pass@1 metric of the LeanDojo benchmark (https://leandojo.org/), where "The prover is given only one attempt and must find the proof within a wall time limit of 10 minutes."
GPT-4 is reported to achieve an accuracy of 28.8% on the "random" split of the test data in Table 2 of the LeanDojo paper (https://arxiv.org/pdf/2306.15626.pdf).
This question closes when an evaluation of Gemini's performance on this task is brought to my attention.
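As a rough illustration of the resolution criterion described above, here is a minimal Python sketch of how a Pass@1 score could be computed and compared with the reported GPT-4 figure. The `Attempt` record, the function names, and the strict "greater than" comparison are assumptions made for illustration only; they are not part of LeanDojo's tooling, and the question text does not say how a tie would be handled.

```python
from dataclasses import dataclass

# Reported GPT-4 accuracy on the "random" split (Table 2 of the LeanDojo paper).
GPT4_PASS_AT_1 = 0.288
# Wall time limit per theorem: 10 minutes.
TIME_LIMIT_SECONDS = 600


@dataclass
class Attempt:
    """Hypothetical record of a single proof attempt on one test theorem."""
    theorem: str
    proved: bool
    wall_time_seconds: float


def pass_at_1(attempts):
    """Fraction of theorems proved in one attempt within the time limit."""
    if not attempts:
        raise ValueError("no attempts to score")
    solved = sum(
        1 for a in attempts
        if a.proved and a.wall_time_seconds <= TIME_LIMIT_SECONDS
    )
    return solved / len(attempts)


def outperforms_gpt4(attempts):
    """Assumed rule: strictly higher Pass@1 than GPT-4's reported score."""
    return pass_at_1(attempts) > GPT4_PASS_AT_1


# Toy data: one proof in time, one proof past the limit, one failure.
sample = [
    Attempt("thm_a", True, 120.0),
    Attempt("thm_b", True, 700.0),   # found, but after the 10-minute limit
    Attempt("thm_c", False, 600.0),
]
print(pass_at_1(sample), outperforms_gpt4(sample))
```

In this sketch, a proof found after the 10-minute limit counts as a failure, matching the quoted definition of the metric.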
Related questions
Will Gemini achieve a higher score on the SAT compared to GPT-4?
58% chance
Will Google Gemini perform better (text) than GPT-4?
35% chance
Will Gemini exceed the performance of GPT-4 on the 2022 AMC 10 and AMC 12 exams?
72% chance
Will Gemini Ultra outperform GPT-4V on visual reasoning by the end of 2024?
65% chance
Will "Gemini [Ultra, 1.0] smash GPT-4 by 5x"?
18% chance
Will an open source model beat GPT-4 in 2024?
49% chance
Will Google Gemini do as well as GPT-4 on Sparks of AGI tasks?
76% chance
Will Gemini 2 ship before GPT-5?
74% chance
Will an open-source LLM beat or match GPT-4 by the end of 2024?
81% chance
Will a 15 billion parameter LLM match or outperform GPT4 in 2024?
24% chance