Google’s Illuminate

The evolution of AI-generated audio summaries

Dec 10, 2024

Have you ever had to wade through complex documents and wished for a quick summary that you could listen to rather than read? Google NotebookLM’s audio overview justifiably created tremendous buzz because of its ability to create what sounded like an NPR-style podcast episode based on a set of documents that you provide. The conversation between female- and male-sounding characters sounded like two humans talking. The results were impressive and provided a window into what generative AI (GAI) could accomplish.

As I wrote about recently, one problem with NotebookLM’s audio overview is that the male character typically dominated and directed the conversation, although it often made the female character sound more knowledgeable. There was no way to control this. You also had no control over the characters’ voices. They were what they were.

What is Google Illuminate?

Google recently released Illuminate (https://illuminate.google.com/). Essentially, Illuminate is NotebookLM’s audio overview with more control over the conversation, but with a focus on scientific documents (computer science more specifically). Here’s how Google describes Illuminate on its welcome screen:

To use Illuminate, you first point the system to one or more documents in arXiv (https://arxiv.org/), which is a repository of over 2 million scholarly scientific articles. It’s important to know that these articles may not have been peer reviewed; anyone can post an article on arXiv, although there are some quality controls in place. Even more importantly, you are limited to arXiv documents, although Google stated that more sources will be added soon (whatever that means).

Dialogue styles and voice options

Once you’ve entered the links (URLs) for the source documents, you select one of four dialogue styles and choose host and guest voices. Two of the voices (Aura and Stellar) are female-sounding and the others sound male. The Free Form style gives you more control over the tone of the conversation and also lets you choose which characters serve as host and guest. The other three pre-select host and guest; unfortunately, the host is always a male character and the guest is always a female character.

Customizing your experience

Selecting one of the three pre-built dialogue styles provides a corresponding prompt that cannot be edited. One trick to get around this limitation is to first choose the desired dialogue style and copy the prompt. Then you can select Free Form, paste in the copied prompt, and edit it as you see fit. This sounds more complicated than it really is.

I used this trick to create the prompt below. The first two sentences were the pre-built prompt; I added the last two.

Here’s the full screenshot of the choices I made. I pointed Illuminate to two articles that provided overviews of how large language models (LLMs) work. Clicking on Generate creates the dialogue.

Results and performance

The results were decent. The dialogue, which you can check out here (Substack) or here (Google), was about 4 minutes long. It hit the most important points and sounded good. This is very subjective and I can’t state it definitively, but it didn’t sound quite as natural as what NotebookLM produces. Dialogues are treated as drafts until you move them to your library. Drafts are deleted after 30 days.

Interactive features: The Q&A capability

There’s a very interesting but somewhat hidden feature in Illuminate: the ability to ask questions about the content while listening to the dialogue. If you click on Play to play a dialogue, you get a screen like this:

Notice the little hand at the bottom? It’s not very obvious, but it’s supposed to be like raising your hand in class (I think). Clicking on the hand brings up a screen that allows you to ask questions. I asked for some examples of hallucinations and got this in response.

Bottom line

Overall, Illuminate is impressive and interesting despite its current limitations. As Google adds more source document options, Illuminate’s usefulness will expand significantly. I can see creating all sorts of dialogues for my classes. It should also be useful for training and even disseminating policies (the bane of any campus). It has potential for all sorts of communication tasks.

The next big evolution will be the ability to use our own voices. Services like ElevenLabs already do a great job of cloning a voice but don’t create realistic dialogues based on source documents (yet). I think it won’t be long before these capabilities merge and we’re able to create dialogues between two cloned voices. I’m not sure if that’s a good thing, although I can see its usefulness. Time will tell, I suppose. Subscribe to AI Goes to College and I’ll let you know as I learn of new developments in dialogue creation.

Well, that’s all for this time. If you have any questions or comments, you can leave them below or email me at craig@AIGoesToCollege.com. I’d love to hear from you. Be sure to check out the AI Goes to College podcast, which I co-host with Dr. Robert E. Crossler. It’s available at https://www.aigoestocollege.com/follow. Thanks for reading!