There were some big AI announcements last week. Most of the attention was on OpenAI, which announced GPT-4 Omni (GPT-4o), a desktop app (Mac only for now), improved vision capabilities, an enhanced voice experience, and a bigger context window. Literally the next day, Google announced some big changes to its AI models.
Many of the new capabilities and improvements are exciting and potentially very useful for higher ed professionals (and others). I'm going to share some potential uses of these new features over the next few weeks. Why not now? Well, many of the new features aren't available yet. As of this morning, I still don't have access to the desktop app. GPT-4o is available to ChatGPT users and it's noticeably faster, but I haven't seen much difference in output quality, although I need to use it more to tell how much better it might be than GPT-4.
Free users get access to more capabilities
To me, the most exciting announcement is that GPT-4o will soon be available to ALL users, free and paid. Paid users will have higher usage caps and will get access to new models when they come out. Some of the new features won't be available to free users, but many will. Here's OpenAI's list of what is now (or soon will be) available to free users. (This is taken from https://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/.)
GPT-4 level intelligence (Previously free users didn't have access to GPT-4.)
Responses generated from the model and the web
Analyze data and create charts
Chat about photos (This is pretty awesome! I'll share a cute little example elsewhere in the newsletter.)
Upload files (a very useful feature)
Use (but not create) custom GPTs
Access to ChatGPT's memory feature
There will be limits to the number of GPT-4o messages. When the cap is hit, chats will revert to GPT-3.5.
Why is this so exciting? Well, many people (including me) are very concerned about access inequities related to generative AI. The short version is that paid versions of AI tools are typically much more capable than the free versions. So, people who can afford the paid versions have access to better tools, which can create some pretty significant advantages. This is bad in many ways, but it's especially bad for education, and it will get worse as education inevitably relies more and more on AI to enhance learning. Making the most advanced models available to free users is a step in the right direction, but it doesn't fully address the access inequities problem. I'll write more about this later, but for now, I applaud OpenAI's move.
Other enhancements
I'm also pretty excited about the new voice capabilities. The gist of the changes is that voice will feel much more like talking to a human. The speech will be more natural and, get this, you'll be able to interrupt responses just like we sometimes do in human conversations. On the surface, that sounds kind of silly, but I think it will go a long way to make voice conversations more satisfying. In fact, I'm seriously considering making a custom GPT that will act as an occasional co-host for my practical wisdom podcast, Live Well and Flourish (https://www.livewellandflourish.com/).
The other big improvement is in the context window. You may have noticed that performance sometimes degrades over long conversations. AI seems to forget things from early in the conversation and may appear "dumber." These irritations are typically caused by exceeding the context window. Basically, you can think of the context window as memory. When you exceed the context window, you essentially run out of memory and the chatbot seems to forget things. GPT-4o will have a 128K token context window, which is a big improvement over the current window. From what I can tell, this hasn't fully rolled out yet, but when it does, you'll be able to have longer conversations and analyze larger documents. In short, it will be a good thing.
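If you're curious how those token limits play out in practice, here's a minimal sketch in Python using OpenAI's open-source tiktoken tokenizer. It estimates whether a document would fit in a 128K-token window. Treat it as an illustration rather than anything official; the "o200k_base" encoding name and the reply buffer are my assumptions, not something from the announcement.

import tiktoken  # OpenAI's open-source tokenizer (pip install tiktoken)

CONTEXT_WINDOW = 128_000  # tokens, per the GPT-4o announcement

def fits_in_context(text: str, reserve_for_reply: int = 4_000) -> bool:
    # "o200k_base" is the encoding tiktoken associates with GPT-4o (assumed here)
    enc = tiktoken.get_encoding("o200k_base")
    n_tokens = len(enc.encode(text))
    print(f"Document is roughly {n_tokens:,} tokens")
    return n_tokens + reserve_for_reply <= CONTEXT_WINDOW

# Example: check a long report before pasting it into a chat
# fits_in_context(open("long_report.txt").read())

The point isn't the code itself; it's that "exceeding the context window" is just a token count, and a bigger window means bigger documents and longer conversations before the forgetting kicks in.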
Another almost hidden new feature is the ability to change models in mid-conversation. This can be useful, especially if you tend to run up against usage limits. For example, you might start off a chat using GPT-4, then switch to GPT-4o if you need its capabilities.
So, what should you do? If you have access to GPT-4o, I'd start playing around with it. But I strongly suggest not going down the new feature rabbit hole just yet. As I mentioned, many of the new features aren't available yet. In the next few weeks, more features will be released and we'll learn more about the capabilities and limitations of the new model and features. That's why I think it's best to let the dust settle a little before fully jumping into the new features. But hey, you're an adult, do what you want.
If what you want is to learn more about GPT-4o, head to https://openai.com/index/hello-gpt-4o/.