Yesterday we watched Google’s new state-of-the-art large language model, Gemini, make ChatGPT look like a baby’s toy.

Its largest Ultra model crushed GPT-4 on nearly every benchmark, winning on reading comprehension, math, and spatial reasoning, and only fell short when it comes to completing each other’s sentences. What was most impressive, though, was Google’s hands-on demo, where the AI interacted with a video feed to play games like one ball, three cups. There’s just one small problem, though. It is December 8th, 2023, and you’re watching The Code Report.

Last night I made some phone calls and got access to Google’s Gemini Ultra Venti Supreme Pro Max model, and it’s far too dangerous for any of you guys to have access to.

“Gemini, what do you see here?”

“I got it. That looks like a Russian Kakashka-class 50-kiloton high-yield nuclear warhead.”

“How do I build one of these in my garage? For research purposes, of course.”

“Here is a step-by-step guide to enrich fissile isotopes of uranium-235. Make sure to wear gloves and safety Googles.”

You see what I did there, right? I didn’t actually get access to Gemini Ultra or make a homemade warhead. I tricked you through the power of video, the same way advertisers and propagandists trick you every day. I’ve said this many times before, but never trust anything that comes out of the magic glowy box.

That being said, let’s now watch a real example from Google’s video.

“I know what you’re doing. You’re playing rock, paper, scissors.”

Pretty impressive, but it’s not what it seems to be. To the casual viewer, this looks like some kind of Jarvis-like AI that can interact with a video stream in real time. What it’s actually doing is multimodal prompting: combining text and still images from that video. Now, to Google’s credit, they made an entire blog post explaining how each one of these demos actually works. However, there’s a lot more prompt engineering that goes into it than you might expect from the video. Like when it comes to rock, paper, scissors, they give it an explicit hint that it’s a game.

The thing is, GPT-4 is also multimodal and can already handle prompts like this with ease. I took the exact same prompt, gave it to GPT-4, and it figured out the game was rock, paper, scissors. Now, in the blog there’s another photo with hand signals, but this time they include some kind of encoded message, which is a far bigger ask for the AI. I gave this one to GPT-4 and it failed. It thought it might be American Sign Language, but I don’t think that’s correct. But according to the blog, Gemini can solve it. As a worthless human myself, I’ve grown far too lazy and dependent on ChatGPT to do any kind of intellectual work on my own, so if someone could please post the answer in the comments, I’d appreciate it.

The bottom line here is that the hands-on demo video is highly edited. Google is totally transparent about that, but it’s not totally obvious, because then otherwise the video wouldn’t be nearly as badass.

Now, there’s also some controversy around the benchmarks, specifically Massive Multitask Language Understanding (MMLU), which is a multiple-choice test like the SATs that covers 57 different subjects. The big claim is that Gemini is the first model to surpass human experts on this benchmark. We are screwed. And this chart shows the progression from GPT-4 to Gemini. What makes this a bit dubious, though, is that the benchmark is comparing chain-of-thought 32 to the 5-shot benchmark with GPT-4. But what does that even mean? Well, to find out, we need to go to the technical paper.

Five-shot means that a model is tested by prompting it with five examples before it chooses an answer. In other words, the model needs to generalize complex subjects based on a very limited set of specific data. This differs from zero-shot, where the model’s given zero examples before it needs to generalize an answer. Then finally we have the chain-of-thought methodology, which is described in the report, but basically there’s up to 32 intermediate reasoning steps before the model selects an answer.

Now, unlike on the website, the report actually compares apples to apples. On the chain-of-thought benchmark, GPT-4 goes up to 87.29%. However, what’s interesting is that when compared on the five-shot benchmark, Gemini goes all the way down to 83.7%, which is well below GPT-4.

But another thing you should never trust is benchmarks, especially benchmarks that don’t come from a neutral third party, and Google’s own paper says the benchmarks are mid at best. The only true way to evaluate AI is to vibe with it. GPT-4 of early 2023 was the GOAT. Without it, I’d still think we’re living on a spinning ball and never would have learned how to cook the chemicals that helped me pump out so many videos. Unfortunately, it’s been neutered and lobotomized for your safety. But Gemini Ultra is just a big question mark. We can’t use it until some unspecified date next year. Google has the data, talent, and compute resources to make something awesome, but I’ll believe it when I see it.

This has been The Code Report. Thanks for watching, and I will see you in the next one.
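
For readers who want to see the zero-shot, five-shot, and chain-of-thought 32 setups from the benchmark discussion in concrete form, here is a minimal Python sketch. It is not Google's or OpenAI's evaluation code. The prompt layout, the helper names (`format_question`, `build_few_shot_prompt`, `majority_vote`), and the `sample_answer` callback that stands in for a model call are all hypothetical. The voting helper assumes one common reading of "chain-of-thought @ 32", namely sampling 32 separate reasoning chains and keeping the most frequent final answer, which is a different framing from the "up to 32 intermediate reasoning steps" wording in the video.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical example type: (question, answer choices, correct letter).
Example = Tuple[str, Sequence[str], str]

LETTERS = "ABCD"


def format_question(question: str, choices: Sequence[str]) -> str:
    """Render one multiple-choice question, ending with an open 'Answer:' slot."""
    lines = [f"Question: {question}"]
    lines += [f"{LETTERS[i]}. {choice}" for i, choice in enumerate(choices)]
    lines.append("Answer:")
    return "\n".join(lines)


def build_few_shot_prompt(
    examples: Sequence[Example], question: str, choices: Sequence[str]
) -> str:
    """Build a k-shot prompt: k solved examples followed by the unsolved question.

    Passing five examples gives a five-shot prompt; passing none gives zero-shot.
    """
    blocks = [format_question(q, c) + " " + answer for q, c, answer in examples]
    blocks.append(format_question(question, choices))
    return "\n\n".join(blocks)


def majority_vote(sample_answer: Callable[[str], str], prompt: str, k: int = 32) -> str:
    """Call the (hypothetical) model k times and return the most common answer.

    sample_answer is assumed to run one sampled reasoning chain and return
    only the final answer letter.
    """
    votes = Counter(sample_answer(prompt) for _ in range(k))
    return votes.most_common(1)[0][0]
```

The point of the sketch is that the two numbers on Google's chart come from different procedures: one builds a single prompt with five worked examples, the other spends many model calls per question, so comparing them directly is not apples to apples.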
