Breaking News from the LLM Arena
Some breaking news from the LLM arena. If you’re not aware, there’s a site called Chatbot Arena that benchmarks LLMs in the wild. It pits different LLMs against each other, and people from all over the world test them with their favorite prompts to see which one performs better.
You don’t know which one is which: you just enter a prompt and vote on whether Model A or Model B performed better. For the longest time, the top spots were held by GPT-4 models such as GPT-4 Turbo, GPT-4-0314 (March 2023), and GPT-4-0613 (June 2023).
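To make the blind A/B voting concrete, here is a minimal sketch of how pairwise votes could be turned into ratings. It assumes an Elo-style update, which is an assumption on my part: the post does not say how the arena computes its ratings. The model names, the starting rating of 1000, and the K-factor of 32 are all hypothetical.

```python
# Minimal Elo-style sketch (assumed scheme, not confirmed by the source).

def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=32.0):
    """Return new (rating_a, rating_b) after one vote.

    score_a is 1.0 if the voter preferred A, 0.0 if they preferred B.
    k is a hypothetical step size controlling how far one vote moves a rating.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical models and votes: (model shown as A, model shown as B, score for A).
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [
    ("model_a", "model_b", 1.0),
    ("model_a", "model_b", 0.0),
    ("model_a", "model_b", 1.0),
]

for a, b, score_a in votes:
    ratings[a], ratings[b] = update(ratings[a], ratings[b], score_a)

print(ratings)
```

Each vote moves points from the loser to the winner, and an upset (a lower-rated model winning) moves more points than an expected result.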
Google Bard Surpasses GPT-4
Today, the people behind Chatbot Arena posted some breaking news: Google Bard just made a stunning leap, surpassing GPT-4 to take the second spot on the leaderboard. A big congrats to Google for this remarkable achievement. The race is heating up like never before, and I’m super excited to see what’s next for Bard and the Gemini Ultra release.
Update on Rankings
As of now, Google Bard has received just over 3,000 votes, whereas the models that have been on the leaderboard longer have around 30,000 votes each. The current ranking shows GPT-4 Turbo with a rating of 1249 and Bard in a close second at 1215. Take this with a grain of salt, though: Bard's entry is very new, and things might shift around quite a bit.
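As a rough worked example of what a 34-point gap means, the sketch below converts the two ratings into an expected head-to-head win rate and compares the statistical noise at 3,000 versus 30,000 votes. It assumes the ratings are Elo-style (the post does not confirm this) and treats votes as independent coin flips, which is a simplification. The function names are hypothetical.

```python
import math


def expected_win_rate(rating_a, rating_b):
    """Expected probability that A is preferred over B (Elo-style assumption)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def win_rate_standard_error(p, n_votes):
    """Standard error of an observed win rate p after n independent votes."""
    return math.sqrt(p * (1.0 - p) / n_votes)


# Ratings quoted in the post.
p_turbo_over_bard = expected_win_rate(1249, 1215)

# Noise on a near 50/50 matchup at the two vote counts quoted in the post.
se_new_model = win_rate_standard_error(0.5, 3_000)
se_old_model = win_rate_standard_error(0.5, 30_000)

print(f"GPT-4 Turbo expected win rate vs Bard: {p_turbo_over_bard:.3f}")
print(f"Standard error at 3,000 votes:  {se_new_model:.4f}")
print(f"Standard error at 30,000 votes: {se_old_model:.4f}")
```

Under these assumptions the gap corresponds to GPT-4 Turbo being preferred roughly 55% of the time, and an estimate built on a tenth of the votes is about three times noisier, which is why the early ranking may still move.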
Initial Impressions of Google Bard
I ran a quick experiment on Chatbot Arena, testing Bard (Jan 24, Gemini Pro) against GPT-4-0613. The responses I received were impressive and showcased Bard's capabilities. The model seems to understand complex questions well and provides insightful answers.
Google’s AI Ambitions
There have been recent announcements regarding Google’s AI ambitions for 2024. Google CEO Sundar Pichai outlined key goals for the company, focusing on delivering advanced, safe, and responsible AI, improving knowledge, learning, creativity, and productivity, and fostering innovation on Google Cloud.
Challenges Faced by Google
However, Google faces challenges internally, with reports of layoffs and employee cynicism plaguing the company. The prevalent sense of burnout and disillusionment points to the urgent need for Google to prioritize employee well-being and foster a positive work culture.
AI-Powered Features and Collaborations
Google has been rolling out AI-powered features for education, classroom management, accessibility, and more. The recent collaboration between Hugging Face and Google aims to make open-source AI models more accessible to the research community.
Future of Google’s AI
With the upcoming release of Gemini Ultra and the potential of Google Bard, the competition in the AI arena is heating up. Google’s focus on AI innovation and collaboration with key players in the industry signals a promising future for the company.
What do you think about Google’s AI advancements and challenges? Share your thoughts in the comments below. Stay tuned for more updates from the LLM arena!
Tested Gemini extensively. It gave me so many wrong replies, forgot everything, and had lots of other issues. It's crap. They have to train it for years before it's close to useful.
Just noticed Bard was updated, but the Feb edition still seems worse than the Jan 24 one.
How does it destroy GPT-4 when it can't produce images and doesn't support all languages? On top of that, it constantly forgets topics.
It's def better. I have a very localized, creative, and specific Brazilian way to test LLMs, and for the first time I voted for Bard instead of GPT-4 Turbo. Impressive. It even added a layer of localized cultural expression to the test.
Let's see if ChatGPT-4 can fold proteins like AlphaFold. That's far more impressive.
It looks like the 'bard-jan-24-gemini-pro' API might be using RAG with internet access, unlike 'gemini pro dev' or the GPT-4 API. This could help explain some of the huge jump in rankings, and it's a bit worrying for the fairness of the rankings. Why compare models with internet access to models without? There's also likely a secret update to the Bard model that improves performance: I tested the 'bard-jan-24-gemini-pro' API and it definitely seems better as an LLM than the 'gemini pro dev' API, which suggests it might be both a newer model and one with internet access. I'm not sure how much internet access would help in a regular conversation, though.
I can’t stand Bing Microsoft Gemini. I know it's popular with Bing users, but meh. It's like Android vs iPhone, Liberal vs Republican, Communist vs Fascist. A bunch of false dichotomies.
Interesting. I routinely use free ChatGPT and free Bard for coding. ChatGPT was always better, but a few days ago surprisingly Bard solved a problem for me where GPT3.5 failed. Then I played around with it some more and it seemed smarter than usual. Hope Google keeps it up, great to have more options.
Google invented modern AI 😂
They didn't want to get blamed for it so they leaked it and blamed Open AI for opening that can of worms.
Red flag for Google, because no vision!
I've been using Bard and Copilot GPT-4 at the same time, and Bard has had better responses. I have found both to hallucinate. Bard many times just refuses to answer things about health or controversies. GPT-4 always has an answer. I also use both for coding. I had a very difficult time with ChatGPT, which just kept giving me the wrong code, but Bard was always on the right track.
I've been using Bard for months and it's significantly better, more informative, and intelligent than the free ChatGPT.
Google search and LLMs are a good example of the innovator's dilemma. Google can either milk its search business model until it becomes irrelevant due to ChatGPT and others, or step into the chatbot game itself, which might speed up the decline of its "golden goose" even more.
I've been in machine learning for over a decade. Some perspective: Google has been at the bleeding edge of AI for many years, since before OpenAI was founded. Same thing for Microsoft Research. There are always ups and downs, but at the end of the day it's worth remembering this: whatever breakthrough tech one of these tech giants has, within 6 months all the other tech giants have it too.
Bard still seems stupid and only really good for coding.
Open source will win the race. There's no effort that Google can do without embracing open source.
Google has a track record of deception, so we should be very skeptical and verify thoroughly.
Google still has 90% of the search traffic, but what is the overall total search traffic these days? My Google usage is way down, as I use GPT-4 for almost everything. (I do still use search when I need to check something critical, but Google’s search results quality has really declined of late.)
The biggest difference is that I spend so much more time with GPT: There’s an entirely new usage dynamic that Google’s missing out on.
Let's not forget Google has that quantum computer lab. What will happen if they manage to put Gemini on it?
some robotic voice in here??? 🤔
If you want a reality check on current LLMs, ask them about something from a fictional universe that isn't popular but is searchable. Like something that has a Wikia (maybe on Fandom) but doesn't have a Wikipedia page. Ask about something that is easily searchable within said Wikia but is not available directly from the front page.
As of this week, no LLM available for free was capable of retrieving such info for me, even when I tried to help it within prompts as much as I could.
For me it was a huge disappointment, and a cold shower as far as the rapid coming of AGI goes. This is something that any 10-year-old human with knowledge of internet search can easily do, while the best LLMs quickly start googling some ridiculous stuff and either hallucinate answers or just admit they are unable to help.
I like the question you asked related to DOTA 😀
Gemini Pro is garbage! I last used it a few days ago. It hallucinates so badly and makes up stuff like crazy. You cannot trust anything it says. ChatGPT is WAY less bad at this.
Google's Gemini Pro model sucks. Bard itself is mid but somewhat useful, and Google is still struggling. They could have had the best AI. I mean, they own the entire web and YouTube. They could have trained the AI on the entire web and given Bard (which I know isn't Gemini Pro) real-time API access to watch videos with vision and read the transcripts, with the data of the entire web to help you with whatever. The potential is huge but they just don't see it. So far OpenAI is the winner.
Thanks for getting rid of clickbaity video naming, Wes! The video is superb, by the way! 🙂
The options aren't blind to the models' names, which could lead to bias.
All the people saying this are clearly not using Gemini Pro for real applications!! It's worse than GPT-3.5. Try building a RAG solution with it and come back to tell me it beats the others!!
Bing Chat GPT-4 gives me better results than Bard, but maybe Bard isn't using the Gemini Pro model?
Either way bing gpt4 is free so that's what I'm gonna use
If Google wants to be the dominant player in AI, they will. They will buy up other companies and shut them down. If they can't buy them, they will sue them out of existence. That has been their track record.