Generative AI of the “large language” kind has been an attention hog over the past 10 or 11 months. The buzz has been so loud and constant that it’s all but asking to be dismissed as hype. That would be a mistake—especially in healthcare.
Consider how quickly the technology’s medical expertise has grown, judging by just one player not named ChatGPT. In the space of only six months, Google’s Med-PaLM line went from correctly answering 67.2% of medical-licensing questions to nailing 86.5% with Med-PaLM 2.
Equally if not more impressive, physician judges frequently ranked Med-PaLM 2’s responses higher than those offered by human doctors in a head-to-head match that used more than 1,000 consumer medical questions. The judges’ preferences held across eight of nine metrics related to clinical utility.
Fast-forward to the present month. Google’s chief clinical officer, Michael Howell, MD, MPH, took questions from JAMA’s editor-in-chief, Kirsten Bibbins-Domingo, MD, PhD. They talked about how large-language AI has evolved so quickly and where its trajectory may lead it next. Among Howell’s key points were these four, lightly edited here for clarity and conciseness:
1. Healthcare AI is going to change a lot of things, and all of those changes should happen with clinicians, not to clinicians.
“I’ve sometimes thought what it must have been like to be in practice when penicillin showed up. Doctors of the time must have said something like, ‘Mark this moment.’”
2. Watch for advances aimed at assisting clinicians in tasks that take them away from the bedside—and away from the cognitive, procedural and/or emotional work of being a clinician.
“We’re likely to see tools that help clinicians avoid things like diagnostic anchoring bias and diagnostic delay. I’ve been saved by a nurse tapping me on the shoulder and saying, ‘Hey doc, do you really want to do that?’ Watch for AI to fill that role at some point in the future.”
3. Accounting software has been around for years, but we don’t have fewer accountants. We still need professional bean counters to make sure our numbers stay on track.
“We’ll probably see the same type of thing in healthcare. AI will assist clinicians, but it won’t replace us. It’s going to be interesting just figuring out how best to use it.”
4. All large language models really do is predict next words. Given this, the concept of reinforcement learning with human feedback—or refining AI’s outputs based on people’s preferences—is only going to become more important.
“At the same time, if you get reinforcement learning with human feedback wrong, your models can degrade over time. And then, when you update anything in your model, sometimes you’ll make it better in one area but worse in another. This is similar to the principle that, as a physician, the longer you work in the ICU, the better you get at intensive care—but the worse you get at providing primary care.”
View the interview and read the full (unedited) transcript here.