Grok 3 is Impressive

A Comparison of Three Models

Feb 25, 2025

Grok 3, the latest model from xAI, is a serious contender among leading AI models, challenging established players like ChatGPT and Claude. Grok is a strange beast. It was originally positioned as a more open large language model (LLM); in other words, it was less censored. I've read that the safety controls in ChatGPT, Claude, and Gemini are problematic for some users, although I haven't run into many problems with them refusing tasks myself. The few refusals I did encounter were rare and seemingly random. In these cases, I was able to work past them by asking the chatbot if my request was really unsafe, which resulted in an apology and completion of the task. This hasn't happened in some time, so my guess is that the developers were still tuning the safety controls back then.

More recently, Grok has been garnering attention not for its relaxed censorship, but for its impressive performance. I kept reading about how well Grok was performing and decided to see for myself.

Grok can be accessed through X (the social media platform formerly known as Twitter), which is a little odd, or through the Grok app or website. You can also access it through other systems, such as Poe.com.

A few things are worth noting in the screenshot above. First, Grok has three modes: a default mode; DeepSearch, which is a deep research model; and Think, which is a reasoning model. There are variations on these, which is a little confusing, so I won’t go into them here. Second, I’m using a free account and Grok 3, the most powerful model, is available, although there are usage limits. Third, although it's hidden behind the “Grok 3 Enabled” bubble, Grok can also code and create images. Others have told me that Grok is very good at images, but my personal experience is limited. Grok has a SuperGrok subscription for $30 per month, which gets you increased rate limits and access to Grok 3 Thinking and DeepSearch. This is a little confusing because these are also available in the free version, but with lower rate limits. SuperGrok is a bit more expensive than the paid versions of Claude, ChatGPT, and Gemini, which run around $20 per month.

Putting Grok to the Test

To test Grok, I ran the same prompt through Grok, ChatGPT, and Claude. I thought three was enough of a test, so I didn't include Gemini. My goal was to see how Grok stood up against the mainstream models. Here's the prompt:

Is higher education in the USA facing a polycrisis?

All three results were solid. The responses were reasonable and included some approaches I hadn't thought of before. In my subjective opinion, Grok's response was equal to the others, perhaps even slightly better in some ways.

For fun (at least what I call fun), I compiled all three responses into a single document, with coded labels indicating the model, and asked ChatGPT to critique the responses and pick a winner. ChatGPT's assessment matched my own. It named Grok the winner by a narrow margin but thought all three responses were good. Here's ChatGPT's final assessment. I labeled Grok’s response as Response G in the test:

Response G stands out as the best response overall. It not only provides a thorough analysis with concrete data and a balanced narrative, but it also connects the dots between different crises effectively while acknowledging counterpoints. This blend of detail, structure, and nuance makes it the most robust answer to the question of whether U.S. higher education is facing a polycrisis.

If you're interested, you can see all three responses and ChatGPT's assessment here. (https://open.substack.com/pub/aigoestocollege/p/comparison-of-chatgpt-claude-and)

Practical Implications

Will I start using Grok? Probably not, at least not very often. For many tasks, I don't really care about having the very best possible response. I just want something good enough. Your uses may vary, but I usually use AI to help me think through or get started with something, so having the absolute best response isn't that important. Put another way, the return on the investment in comparing multiple models isn't there for most of my uses. That being said, if I'm not happy with the responses I'm getting from one model, I absolutely will try others. Grok is now on that list of "others."

Key Takeaways

There are some larger messages here. First, competition is good for AI consumers. Sure, it's a little exhausting to keep up with all of the new models and tools, but that's what you have me for! As important new tools come out, rest assured that I'll check them out and report back. In fact, if there's something you'd like me to cover, comment on this article or email me at craig@AIGoesToCollege.com.

The other big message is that you probably don't need to test out every new model that comes along. Yes, there are use cases in which one model seriously outperforms others, but if you're happy with ChatGPT, stick with ChatGPT, or Claude, or Gemini, or Grok, or whatever. If one model starts to widen the gap, you'll read about it here. Now, if you enjoy playing with new AI models and tools as I do, keep exploring but do not feel like you have to chase every new development. (By the way, I know I've given this same basic advice before, but I think it's worth repeating since we're going to see some very interesting new models in the near future, including new Claude and ChatGPT models, at least that's the rumor.)

So, if you're curious, check out Grok. But if you're happy with your current model, just keep using it without fear of missing out. Seriously, it's fine to stay the course. If something amazing comes out, I’ll let you know.

Want to continue this conversation? I'd love to hear your thoughts on how you're using AI to develop critical thinking skills in your courses. Drop me a line at Craig@AIGoesToCollege.com. Be sure to check out the AI Goes to College podcast, which I co-host with Dr. Robert E. Crossler. It’s available at https://www.aigoestocollege.com/follow. Looking for practical guidance on AI in higher education? I offer engaging workshops and talks—both remotely and in person—on using AI to enhance learning while preserving academic integrity. Email me to discuss bringing these insights to your institution, or feel free to share my contact information with your professional development team.
