What organization will have the highest ELO score in the LMSYS Org Chatbot Arena Leaderboard at the end of March, 2024?
resolved Apr 1
Alphabet launches Gemini Pro 1.5.
Anthropic releases Claude 3.
Anthropic: 99.4%
Alphabet (Google): 0.1%
Meta (Facebook): 0.0%
OpenAI: 0.4%
Mistral AI: 0.0%
Other: 0.0%

I'm going to call it now. Let me know if you think I incorrectly resolved this question. I'm looking at the Chatbot Arena Leaderboard (last updated: March 29, 2024). Anthropic is at the top with Claude 3 Opus having an Arena Elo of 1255.

@JacobPfau I thought about weaseling this in, but I think "on top" should definitely account for ELO rather than the shared first place.

@JacobPfau I'll break ranking ties with ELO scores. If there's still a tie then I'll resolve them 50:50.
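
For anyone who wants the tie-break rule spelled out, here is a minimal sketch of how I read it: take the models sharing the best rank, keep the ones with the highest ELO, and split the payout evenly if more than one organization is still tied. The function name `resolve`, the `(organization, rank, elo)` row layout, and the even split among more than two organizations are my own assumptions, not anything the leaderboard or Manifold defines.

```python
def resolve(leaderboard):
    """Return {organization: payout share} for a leaderboard snapshot.

    `leaderboard` is a list of (organization, rank, elo) rows, one per model.
    This row layout is hypothetical and chosen only for illustration.
    """
    # Step 1: keep only the models that share the best (lowest) rank.
    best_rank = min(rank for _, rank, _ in leaderboard)
    top = [row for row in leaderboard if row[1] == best_rank]

    # Step 2: break the ranking tie with the ELO score.
    best_elo = max(elo for _, _, elo in top)
    winners = sorted({org for org, _, elo in top if elo == best_elo})

    # Step 3: if organizations are still tied, split evenly (50:50 for two).
    # Extending the split to three or more is an assumption.
    return {org: 1 / len(winners) for org in winners}
```

With placeholder rows such as `[("OrgA", 1, 1250), ("OrgB", 1, 1245)]`, this pays out fully to OrgA; if both had the same ELO, each would get half.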

Would a “Bing Chat” model powered by an OAI model but with additional features (internet) or fine-tuning still count as OAI?

@WillSorenson I might count that as partially OpenAI and partially Microsoft, but I'm leaning towards OpenAI. How likely do you think that is to be the model at the top of the leaderboard?

@HankyUSA Since Google has done it and it scores quite well, pretty good chance! Vanilla Pro scores worse than GPT-3.5, but Pro with Bard scores better than 2 of the 3 GPT-4s.

Looks like the "Bard" version of Gemini is doing a lot better in the arena?

Seems like Gemini kinda sucks on chatbot arena. Pro is supposed to be significantly better than GPT-3.5 from Google's internal benchmarks, but it's actually a little bit worse than GPT-3.5 on chatbot arena. I wouldn't expect Ultra to top the chart.

Mistral AI

Who is Mistral? What models do they own?

Edit: I just couldn't find Mistral AI on Wikipedia. They developed the Mistral 7B and Mixtral 8x7B models.