What organization will have the highest ELO score in the LMSYS Org Chatbot Arena Leaderboard at the end of March, 2024?


Ṁ45k

resolved Apr 1


Alphabet launches Gemini Pro 1.5.

Anthropic releases Claude 3.

+36%

Answer	Probability
Anthropic	99.4% (100%)
Alphabet (Google)	0.1%
Meta (Facebook)	0.0%
OpenAI	0.4%
Mistral AI	0.0%
Other	0.0%

I'm referring to this Chatbot Arena Leaderboard.

Next quarter: /HankyUSA/who-will-own-the-model-at-the-top-o-5527f82db47f



🏅 Top traders

#	Name	Total profit
1		Ṁ2,052
2		Ṁ998
3		Ṁ923
4		Ṁ422
5		Ṁ392

11 Comments


bought Ṁ717 Anthropic YES

I'm going to call it now. Let me know if you think I incorrectly resolved this question. I'm looking at the Chatbot Arena Leaderboard (last updated: March 29, 2024). Anthropic is at the top with Claude 3 Opus having an Arena Elo of 1255.

How does this resolve on ties? https://twitter.com/lmsysorg/status/1767997086954573938/photo/1

@JacobPfau I thought about weaseling this in, but I think "on top" should definitely account for ELO rather than the shared first place.

@JacobPfau I'll break ranking ties with ELO scores. If there's still a tie then I'll resolve them 50:50.
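The tie-break rule described above can be sketched roughly as follows. This is only an illustration, not how the market was actually resolved: the function name `resolve_market`, the `(organization, rank, elo)` tuple layout, and the placeholder organizations are all hypothetical. Splitting evenly among more than two tied organizations is an assumed generalization of the stated 50:50 rule.

```python
from fractions import Fraction


def resolve_market(models):
    """Return each winning organization's share of the resolution.

    `models` is a list of (organization, rank, elo) tuples. This layout is
    hypothetical and not taken from the leaderboard itself.
    """
    # Step 1: keep only the models that share the best (lowest) rank.
    best_rank = min(rank for _, rank, _ in models)
    top_ranked = [m for m in models if m[1] == best_rank]

    # Step 2: break ranking ties with the Elo score.
    best_elo = max(elo for _, _, elo in top_ranked)
    winners = sorted({org for org, _, elo in top_ranked if elo == best_elo})

    # Step 3: if organizations are still tied, split evenly
    # (50:50 for two; an even split for more is an assumption).
    share = Fraction(1, len(winners))
    return {org: share for org in winners}


# Placeholder data, not real leaderboard values.
print(resolve_market([("Org A", 1, 1250), ("Org B", 1, 1250), ("Org C", 2, 1200)]))
```

Two models from the same organization tied at the top would still resolve fully to that organization, since shares are counted per organization rather than per model.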

Would a “Bing Chat” model powered by an OAI model but with additional features (internet) or fine-tuning still count as OAI?

@WillSorenson I might count that as partially OpenAI and partially Microsoft but leaning towards OpenAI. How likely do you think that is to be the model at the top of the leaderboard?

@HankyUSA Since Google has done it and it scores quite well, pretty good chance! Vanilla Pro scores worse than GPT-3.5, but Pro with Bard outscores 2 of the 3 GPT-4s.

Looks like the "Bard" version of Gemini is doing a lot better in the arena?

Seems like Gemini kinda sucks on chatbot arena. Pro is supposed to be significantly better than GPT-3.5 from Google's internal benchmarks, but it's actually a little bit worse than GPT-3.5 on chatbot arena. I wouldn't expect Ultra to top the chart.

Mistral AI

Who is Mistral? What models do they own?

Edit: I just couldn't find Mistral AI on Wikipedia. They developed the Mistral 7B and Mixtral 8x7B models.
