Elon Musk’s AI company, xAI, has released Grok 3 after detailing its capabilities last night, and it’s already making history. The model scored over 1400 points in the LM Arena, making it the top-ranked AI model right now, ahead of OpenAI’s best systems. On math, science and coding benchmarks, it has already surpassed the latest Google Gemini and OpenAI GPT models.
One of the first people to test Grok 3 was Andrej Karpathy, the former director of AI at Tesla and part of the founding team at OpenAI. His early tests showed that Grok 3 is extremely advanced in logical reasoning, coding, and problem-solving.
When he asked it to generate a webpage that mimicked the Settlers of Catan board game, it completed the task successfully—something only a few AI models have been able to do.
grok 3 is the world’s smartest AI
now available to all Premium+ subscribers
— Grok (@grok) February 18, 2025
It also did well on math-heavy challenges, like estimating the amount of computing power needed to train GPT-2, an older OpenAI model. On the other hand, Grok 3 struggled with creativity and humor, failing to generate strong jokes, and it stumbled on certain tricky ethical dilemmas. It also couldn’t solve Karpathy’s “emoji mystery” challenge, in which a hidden message was embedded in Unicode symbols.
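For readers curious how a message can hide inside an emoji, here is a minimal Python sketch of one way it could work. It assumes the trick relies on Unicode variation selectors, which are invisible code points that can trail a visible character. That is our assumption for illustration, not a confirmed description of Karpathy's exact puzzle, and the function names `hide` and `reveal` are made up for this example.

```python
# Illustrative only: hide bytes after an emoji using Unicode variation selectors.
# Assumption: byte values 0..15 map to U+FE00..U+FE0F and byte values 16..255
# map to U+E0100..U+E01EF. Other mappings are possible.

VS_BASE = 0xFE00        # first of the 16 basic variation selectors
VS_SUP_BASE = 0xE0100   # first of the 240 supplementary variation selectors


def byte_to_selector(b: int) -> str:
    """Map one byte (0..255) to an invisible variation selector."""
    if b < 16:
        return chr(VS_BASE + b)
    return chr(VS_SUP_BASE + b - 16)


def selector_to_byte(ch: str):
    """Return the hidden byte for a variation selector, or None otherwise."""
    cp = ord(ch)
    if 0xFE00 <= cp <= 0xFE0F:
        return cp - VS_BASE
    if 0xE0100 <= cp <= 0xE01EF:
        return cp - VS_SUP_BASE + 16
    return None


def hide(carrier: str, message: str) -> str:
    """Append the UTF-8 bytes of `message` to `carrier` as selectors."""
    return carrier + "".join(byte_to_selector(b) for b in message.encode("utf-8"))


def reveal(text: str) -> str:
    """Collect every variation selector in `text` and decode the bytes."""
    decoded = (selector_to_byte(ch) for ch in text)
    return bytes(b for b in decoded if b is not None).decode("utf-8")


secret = hide("😀", "hello")
print(reveal(secret))
```

The result still displays as a single emoji in most apps, which is why a puzzle like this is hard to spot unless a model looks at the raw code points.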
GROK 3: SOLVING PHYSICS, GAMES, AND THE UNIVERSE
Full presentation and demo of xAI’s latest model
0:00 xAI’s mission: Understand the universe
1:20 Team presentation
2:01 Grok means to profoundly understand
2:29 From Grok 2 to Grok 3
6:30 Grok 3 benchmarks
9:07 Grok 3 improves… https://t.co/7qbB6O16Yb pic.twitter.com/BomGwAOa1I
— Mario Nawfal (@MarioNawfal) February 18, 2025
Despite these weak spots, Grok 3 shows huge potential. It’s now available to X Premium+ subscribers (no word yet on when Premium users will get access, though it should come later), while Palantir is already working on bringing it to businesses. xAI also raised the price of Premium+ in the USA today to coincide with the launch of Grok 3 (what better way to drive subscriptions from those seeking access, right?), but in Canada, the price increase has not yet been applied, based on our checks of the website.
The AI wars are just heating up. It will be pretty crazy to see how far things go in even just one year. Skynet is becoming a reality. Who’s ready?