Deep research AI models represent a significant breakthrough in artificial intelligence—they can search through information, analyze it, and synthesize findings much like a human researcher would. These models are generating considerable buzz in the AI world, with major players like Google (Gemini) and OpenAI recently releasing their own versions. While I've been impressed with Gemini's Deep Research capabilities and use them almost daily, I haven't yet had the chance to try OpenAI's version (currently limited to Pro users).
Basically, deep research models take a user’s request and “think” through the best way to satisfy it, including planning out their research and analysis strategies. This is the thinking stage. Once the plan is complete, the AI executes the research plan and compiles the results. The process is similar to how a skilled human would create a first draft of a report, although the AI works much more quickly than a human could.
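If you're curious what that plan-then-execute loop might look like under the hood, here's a minimal sketch. To be clear, this is my own illustration of the general pattern, not any vendor's actual code; the llm() and web_search() helpers are hypothetical stand-ins you'd wire to real services.

```python
# A minimal sketch of the plan-then-execute pattern behind deep research tools.
# The helpers below are hypothetical stand-ins, not any vendor's actual API.

def llm(prompt: str) -> str:
    """Stand-in for a call to a large language model."""
    raise NotImplementedError("Wire this to an LLM provider.")

def web_search(query: str) -> list[str]:
    """Stand-in for a search tool that returns snippets or documents."""
    raise NotImplementedError("Wire this to a search API.")

def deep_research(request: str) -> str:
    # 1. Thinking stage: ask the model to plan its searches and analyses.
    plan = llm(
        "Break this research request into search queries and analysis "
        f"steps, one per line:\n{request}"
    )

    # 2. Execution stage: run each planned step and collect findings.
    findings = []
    for step in plan.splitlines():
        if not step.strip():
            continue
        sources = web_search(step)
        findings.append(
            llm(f"Summarize these sources for the step '{step}':\n{sources}")
        )

    # 3. Compilation stage: synthesize the findings into a single report.
    return llm(
        f"Write a report answering '{request}' using these "
        f"findings:\n{findings}"
    )
```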
There's a lot of potential in these models, and their current capabilities are impressive. Andrew Maynard recently wrote a fascinating article describing how he used OpenAI's Deep Research to write an entire 400-page doctoral dissertation. (This was a test; it wasn't for an actual Ph.D. program.) Professor Maynard concluded that the dissertation wasn't great, but it was decent. If you're interested, check out the article. He provides the full dissertation and the prompts he used to write each chapter.
Two things make Deep Research models stand out. First, they create a plan for their research that includes sub-plans for searching, analyzing, and compiling the findings. That's pretty close to what a competent human researcher would do. Second, the plan considers the goal of the research, which drives the entire process. We're still in the early days of these models, but we're getting a glimpse into the full potential of AI, and it's simultaneously awesome and terrifying. Mark my words, the nature of scholarship WILL change dramatically. I think we may even need to rethink what it means to be a scholar.
But we’re not there yet. Perplexity.ai, one of the first AI tools to actually cite its sources, recently released its own Deep Research function. What’s more, they made it available to free users. Here’s how Perplexity describes it:
Source: https://www.perplexity.ai/hub/blog/introducing-perplexity-deep-research
I love Perplexity, so I was eager to test it out. The results were disappointing. I asked Perplexity to create a report on prior research related to a topic I've been developing. Done well, Perplexity could have saved me quite a bit of time on the background work. (To be clear, it would have just provided a nice jump start to part of the research. Considerable human work is still required.)
My first attempt was a complete failure. The system got stuck on the planning stage and never proceeded past it to do the actual research. Since the capability had just been released, I thought the servers might be overloaded and decided to try again later. My second attempt was successful, or so I thought.
When reading Perplexity's report, I noticed a reference to a friend of mine. The citation mentioned a matrix she and a colleague created. The matrix looked like it could inform an important part of my work, so I was eager to track down the referenced paper. Perplexity provides links to the papers it cites, so I clicked on the link and went to a completely different paper, one that had nothing to do with my topic.
OK, that sort of thing happens with AI, so I started searching for the matrix using Google Scholar. That failed to turn up the paper. My next step was a regular Google search. No luck. She's VERY prolific, so I couldn't do a paper-by-paper search of her work. Finally, I emailed my friend to ask about the matrix. She had no recollection of the work at all. To be fair, very prolific researchers sometimes forget the details of papers, so she had to think for a while before finally emailing me a paper that was on roughly the same topic but did not contain the matrix in question. Perplexity made up the matrix, but attributed it to a citation that seemed legitimate because the in-text citation listed authors who really had published a paper in the year cited. (In fact, that's the paper my friend sent, so I could sort of see how Perplexity created the hallucination.)
This little story illustrates both the potential and the danger of deep research models. Had it worked correctly, I could have very quickly had a nice report providing important background for my work, saving me a few hours of searching and reading by giving me a great starting point. It didn't work this time, but the models will get better. Eventually, they will be good enough to produce solid drafts of reports on virtually any topic for which there are documents to research.
The danger comes in when people take these deep research reports at face value. I was expert enough in the use of AI not to simply accept the report as correct, and knowledgeable enough about the topic to 1) recognize something unusual and 2) know how to (try to) track it down. I wasn't going to simply use the matrix without understanding it.
Unfortunately, less knowledgeable users, such as students, might have simply accepted the report's findings as true, done a little editing, and called it a day. I have to admit, if an undergraduate or master's student had submitted the report, at first glance I would have been impressed. Who knows? Unless I had taken the time to really dig into the report, I might have remained impressed.
This is why it's SO IMPORTANT to teach students the ethical and effective use of AI tools. We need to help them understand AI's capabilities and limitations, and how to leverage these tools to enhance their learning, not substitute for the work of learning. I've written a lot about this, so I won't belabor the point here.
Let me tie all of this together. Deep research models have tremendous capabilities that will change the way we think about and perform research and scholarship, although AI won’t replace human scholars. That could go well or poorly, and will likely be a bit of each, at least in the medium term. But deep research still has significant problems, not the least of which is the appearance of authority and completeness; in other words, the results look impressive even when they're wrong. Finally, deep research is dangerous in unskilled hands. As educators, it will be our job to help students gain the skills necessary to use deep research (and all AI tools) ethically and effectively. We have a big job ahead of us.
Want to continue this conversation? I'd love to hear your thoughts on how you're using AI to develop critical thinking skills in your courses. Drop me a line at Craig@AIGoesToCollege.com. Be sure to check out the AI Goes to College podcast, which I co-host with Dr. Robert E. Crossler. It's available at https://www.aigoestocollege.com/follow. Looking for practical guidance on AI in higher education? I offer engaging workshops and talks—both remotely and in person—on using AI to enhance learning while preserving academic integrity. Email me to discuss bringing these insights to your institution, or feel free to share my contact information with your professional development team.
It is a bit worrying to hear that Deep Research still spews out hallucinations. I wonder if Perplexity's Deep Research has a double-check reasoning function...? If so, it would seem to still fall prey to the stochastic whims of LLMs.
In your experience, does Gemini's Deep Research also hallucinate? I haven't given it a real test drive yet.
And since you have played around with both, do you have a sense of the level of quality of the sources they draw on and cite?
I remember about a year ago playing with Perplexity to do some initial research ... it gave me subreddits and even a LinkedIn article I wrote! I was hoping for higher-quality, vetted sources.