Readit News
dumbmrblah · 6 months ago
I’ve been beta testing this for several months. It’s OK. The notes it generates are too verbose for most medical notes even with all the customization enabled. Most medical interviews jump around chronologically and Dragon Copilot does a poor job of organizing that, which means I had to go back and edit my note which kind of defeated the purpose of the app in the first place.

It does a really good job of recognizing medications though, the names of which most patients butcher.

Hallucinations are present, but usually they’re pretty minor (screwing up gender, years).

It doesn’t really seem to understand what the most important part of the conversation is; it treats all the information equally as important when that’s not really the case. So you end up with long text of useless information that the patient thought was useful but is not at all relevant to their current presentation. That’s where having an actual physician is useful to parse through what is important or not.

At baseline it doesn’t take me long to write a note so it really wasn’t saving me that much more time.

What I do use it for is recording the conversation and then referencing back to it when I’m writing the note. Useful to “jog my memory” in a structured format.

I have to put a disclaimer in my note saying that I was using it. I also have to let the patient know upfront that the conversation is getting recorded and I’m testing something for Microsoft, etc. etc. You can tell who the programmer patients are because they immediately ask if it’s “copilot” lol

amoxichillin · 6 months ago
I've been helping test it as well - your experience sounds identical to mine. I was initially very excited for it, but nowadays I don't really bother turning it on unless I feel the conversation will be a long one. Although I am very much looking forward to them rolling out the automated pending of orders based on what was said during the conversation.

LLMs have so much potential in medicine, and I think one of the most important applications they will have is the ability to ingest a patient's medical chart within their context and present key information to clinicians that would've otherwise been overlooked in the bloated mess that most EMRs are nowadays (including Epic).

There's been so many times where I've found critically important details hidden away as a sidenote in some lab/path note overlooked for years that very likely could've been picked up by an LLM. Just a recent example - a patient with repeated admissions over the years due to severe anemia, would usually be scoped and/or given a transfusion without much further workup and discharged once Hgb >7. Blood bank path note from 10 years ago mentions presence of warm autoantibodies as a sidenote; for some reason the diagnosis of AIHA is never mentioned nor carried forward in their chart. A few missed words which would've saved millions of dollars in prolonged admissions and diagnostic costs over the years.

skywhopper · 6 months ago
Given everything I hear about LLMs for similar summary purposes including your description and that given above, it seems unlikely that the LLM would be all that likely to “notice” a side note in a huge chart. I agree that’d be great but I’m curious why you think it would necessarily pick up on that sort of thing.
marcellus23 · 6 months ago
> A few missed words which would've saved millions of dollars in prolonged admissions and diagnostic costs over the years.

I don't mean to come off antagonistic here. But surely the more important benefit is the patient who would've avoided years of sickness and repeated hospital visits?

stuartjohnson12 · 6 months ago
I just wanted to jump in and say - don't give them too much credit on transcribing medication, I'm guessing this is Deepgram behind the scenes and their medication transcription works pretty well out of the box in my experience.
voidUpdate · 6 months ago
Screwing up gender and years sounds pretty serious to me?
beng-nl · 6 months ago
Maybe they mean that it either doesn’t matter in context or it’s easy to catch and correct. Either way it seems reasonable to trust the judgement of the professional reporting on their experience with a new tool.
dumbmrblah · 6 months ago
It's more in scenarios where I enter the room and I ask the patient whether this is their wife/husband etc. It's not like I'm going into the room and saying "hello patient you appear to be a human female". The model is having difficulty figuring out who the actors are if there are multiple different people talking. Not a big issue if all you're doing is rewriting information. But if multi-modal context is required, it's not the best.
userbinator · 6 months ago
> The notes it generates are too verbose for most medical notes even with all the customization enabled.

I've noticed that seems to be a common trend for any AI-generated text in general.

TeMPOraL · 6 months ago
I think this might be because of what GP said later:

> it treats all the information equally as important when that’s not really the case

In the general case (and I imagine, in the specific case of GP), the model doesn't have any prior to weigh the content - people usually just prompt it with "summarize this for me please <pasted link or text>"[0], without telling it what to focus on. And, more importantly, you probably have some extra preferences that aren't consciously expressed - the overall situational context, your particular ideas, etc. translate to a different weighting than the model has, and you can't communicate that via the prompt.

Without a more specific prior, the model has to treat every piece of information equally, and this also means erring on the side of verbosity, so as not to omit anything the user may care about.

--

[0] - Or such prompt is hidden in the "AI summarizer" feature of some tool.
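
To illustrate the point about missing priors, here is a minimal Python sketch of a prompt builder that accepts an optional list of focus items. The function name `build_summary_prompt`, its parameters, and the prompt wording are all hypothetical; nothing here reflects how Dragon Copilot or any other product actually constructs its prompts.

```python
from typing import Optional, Sequence


def build_summary_prompt(transcript: str,
                         focus: Optional[Sequence[str]] = None) -> str:
    """Build a summarization prompt (hypothetical helper, for illustration).

    Without `focus`, the model only gets a generic instruction and has no
    basis for ranking details. With `focus`, the reader's priorities are
    stated explicitly in the prompt.
    """
    lines = ["Summarize the following conversation."]
    if focus:
        # Spell out the priorities that would otherwise stay in the reader's head.
        lines.append("Prioritize the following and omit anything unrelated:")
        lines.extend(f"- {item}" for item in focus)
    lines.append("")  # blank line between instructions and content
    lines.append(transcript)
    return "\n".join(lines)


# Generic prompt: every detail is treated as equally important.
generic = build_summary_prompt("Patient reports chest pain since Monday...")

# Focused prompt: the priorities are part of the request.
focused = build_summary_prompt(
    "Patient reports chest pain since Monday...",
    focus=["presenting complaint", "current medications"],
)
```

The sketch only covers preferences someone can write down; as noted above, much of the relevant context is never consciously expressed and so cannot be passed this way.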

gmerc · 6 months ago
Are they charging per token?
hakaneskici · 6 months ago
Same for AI coding assistants, most tools generate way too much unnecessary code. Scary part is that the code seems to be running OK.
ksaxena · 6 months ago
Yes, the biggest problem with Healthcare AI assistants right now is that there is no way to "prompt" the AI on what a physician needs in a given scenario - eg. "only include medically relevant information in HPI", "don't give me a layman explanation of radiographic reports", "include direct patient quotes when a neurological symptom is being described" etc.

And the prompt landscape in the field is vast. And fascinating. Every specialist has their own preference for what is important to include in a note vs what should be excluded; and this preference changes by disease - what a neurologist wants in an epilepsy note is very different from what they need in a dementia note, for example.

Note preferences also change widely between physicians, even in the same practice and same specialty! I'm the founder of Marvix AI (www.marvixapp.ai), an AI assistant for specialty care; we work with several small specialty care practices where every physician has their own preferences on which details they want to retain in their note.

But if you can get the prompts to really align with a physician's preferences, this tech is magical - physicians regularly confess to us that this tech saves them ~2 hours every day. We have now had half a dozen physicians tell us in their feedback calls that their wives asked them to communicate their 'thanks' to us for getting their husbands back home for dinner on an important occasion!

[Edit: typo and phrasing]

visarga · 6 months ago
> there is no way to "prompt" the AI on what a physician needs in a given scenario - eg. "only include medically relevant information in HPI", "don't give me a layman explanation of radiographic reports", "include direct patient quotes when a neurological symptom is being described" etc.

There is, it is called RLHF.

burnte · 6 months ago
We tried it at my job, I got us in the beta. Go try Nudge AI and tell me what you think. Our providers found Nudge to be a far better product at a fifth of the price.
monkeydreams · 6 months ago
> Hallucinations are present, but usually they’re pretty minor (screwing up gender, years).

And if all hospitals were doing was having doctors treat patients, this would be ok. But healthcare is fueled by these "minor" details and this will result in delays in payment and reimbursement, trouble with patient identification, corruption of clinical coding, etc.

pintxo · 6 months ago
Did you encounter any instances of hallucinations or omissions?

One would imagine those to be the biggest dangers.

dumbmrblah · 6 months ago
Hallucinations are pretty minimal but present. Some lazy physicians are gonna get burned by thinking they can just zone out during the interview and let this do all the work.

I edited my original post. Omissions are less worrisome, it’s more about too much information being captured which isn’t relevant. So you get these super long notes and it’s hard to separate the “wheat from the chaff”.

knowitnone · 6 months ago
it's not minor when they screw up dosage

eig · 6 months ago
As a medical student, I used the Dragon dictation software (no AI) to write notes in the ED, and more recently I used a pilot of this AI version to write clinic notes.

Overall, I was quite impressed. It definitely made writing notes, which all doctors hate to do, much faster. While it had some problems with where to put key pieces of information (like putting details from the physical exam back in the history), it only took 5 mins of rearrangement after the visit to complete the note.

For simple diagnoses, it does a decent job coming up with the assessment and plan, probably because all the simple diagnoses were in the training set. For more complex ones though, it needs to be exactly dictated by the doctor. I can see this being used very well in primary care.

Edit: When I said “coming up with an assessment and plan” I meant documenting the assessment and plan based on the AI’s recording of the conversation with the patient. The conversation with the patient is meant to be understandable. The “assessment and plan” documentation on the other hand is jargony and meant to be read by other physicians.

conartist6 · 6 months ago
This still sounds bad. 5 mins to rework your notes after each patient visit? I didn't assume doctors had that kind of time.

And let me make this clear. I, as your patient, I never NEVER want the AI's treatment plan. If you aren't capable of thinking with your own brain, I have no desire to trust you with my health, just like I would never "trust" an AI to do any technical job I was personally responsible for due to the fact that it doesn't care at all if it causes a disaster. It's just a stochastic word picker. YOU are a doctor.

diggan · 6 months ago
> This still sounds bad. 5 mins to rework your notes after each patient visit? I didn't assume doctors had that kind of time.

Compared to what though? It reads as less work, not additional work, compared with doing all that manually, which seems likely to need more than 5 minutes.

> And let me make this clear. I, as your patient, I never NEVER want the AI's treatment plan.

Where are you getting this from? Neither the parent's comment nor the article talks about the AI assistant coming up with a treatment plan, and it seems to be all about voice-dictating and "ambient listening" with the goal of "free clinicians from much of the administrative burden of healthcare", so seems a bit needlessly antagonistic.

ilikecakeandpie · 6 months ago
> 5 mins to rework your notes after each patient visit? I didn't assume doctors had that kind of time.

I worked in healthcare for over a decade (actually for a company that Nuance acquired prior to their own acquisition) and the previous workflow was that they'd pick up a phone, call a number, say all their notes, and then have to revisit their transcription to make sure it was accurate. Surgeons in particular have to spend a ton of time on documentation.

eig · 6 months ago
I think you may be misunderstanding how the tool is used (at least the version I used).

The doctor talks to the patient, does an exam, then formulates and discusses the plan with the patient. The whole conversation is recorded and converted to a note after the patient has left the room.

The diagnosis and plan were already worked out while talking to the patient. The AI has to convert that conversation into a note. The AI can't influence the plan because the plan was already discussed and the patient is gone.

zeagle · 6 months ago
AI is an assistive tool at best, but it can probably speed things up by reflowing text. I use Dragon dictation with one of the Philips microphones and it makes enough mistakes that I would probably spend the same time editing/proofing. Had a good example yesterday where it missed a key NOT in an impression.

As an aside, the after-hours work is what burns out physicians. There is time after the visit to do a note; 5 min for a very simple one is reasonable to create, dictate, fax, do the workflow for billing, and request a follow-up within a given system. A new consult might take 10 min between visits if you have time.

For after hours, ER is in my opinion a bad example because when you are done, you are done.

Take a chronic disease speciality or GP and it is hours of paperwork after clinic to finish notes (worse if teaching students), triage referrals, deal with patient phone calls that came in, deal with results and act on them, read faxes etc. I saw my last patient at ~4:30 yesterday and left for home at 7 after dealing with notes and stuff that came in since Thursday night.

bpodgursky · 6 months ago
> I, as your patient, I never NEVER want the AI's treatment plan.

You as a patient are going to get an AI treatment plan. Come to peace with it.

You may have some mild input as to whether it's laundered through a doctor, packaged software, a SaaS, or LLM generated clinical guidelines... but you're not escaping an AI guiding the show. Sorry.

_qua · 6 months ago
You'd be horrified to learn how many doctors spend hours at the end of their day finishing notes on patients. It's a nightmare.
Ukv · 6 months ago
> And let me make this clear. I, as your patient, I never NEVER want the AI's treatment plan. If you aren't capable of thinking with your own brain, I have no desire to trust you with my health,

To my understanding this tool is for transcription/summarization, replacing administrative work rather than any critical decision making.

> just like I would never "trust" an AI to do any technical job

I'd trust a model (whether machine-learning or traditional) to the degree of its measured accuracy on the given task. If some deep neural network for tumor detection/classification has been independently verified as having higher recall/precision than the human baseline, then I have no real issue with it. I don't see the sense in having a seemingly absolute rejection ("never NEVER").
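
For reference, the recall and precision mentioned above can be computed from confusion-matrix counts. A minimal Python sketch; the function name and the example counts are made up for illustration and are not results from any real model or study.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from raw counts.

    tp: true positives  (e.g. tumors correctly flagged)
    fp: false positives (healthy cases flagged as tumors)
    fn: false negatives (tumors that were missed)
    """
    # Precision: of everything flagged, what fraction was actually positive?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, what fraction was flagged?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only: 90 correct detections, 10 false alarms, 30 misses.
model_precision, model_recall = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={model_precision:.2f} recall={model_recall:.2f}")
```

Comparing a model against a human baseline means computing the same two numbers for both on the same independently labeled cases.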

zora_goron · 6 months ago
This doesn’t necessarily apply to this particular offering, but having worked in clinical AI previously from a CS POV and currently as a resident physician, something I’m a little wary of is the “shunting” of reasoning away from physicians to these tools (implicitly). One can argue that it’s not always a bad thing, but I think the danger can lie in this happening surreptitiously by these tools deciding what’s important and what’s not.

I wrote a little bit more of my thoughts here, in case it’s of interest to anyone: [0]

In that same vein, I recently made a tool I wrote for myself public [1] - it’s a “copilot” for writing medical notes that’s heavily focused on letting the clinician do the clinical reasoning, with the tool exclusively augmenting the flow rather than attempting to replace even a little bit of it.

[0] https://samrawal.substack.com/p/the-human-ai-reasoning-shunt

[1] https://x.com/samarthrawal/status/1894779710258733330

mbb70 · 6 months ago
It does feel like we are hurtling towards a world where every industry will have a high volume producer of generated content, which will force the creation of a high volume summarizer of generated content.

"Having trouble processing a medical claim with 50+ pages of notes? Not to worry, Dragon Copilot Claim Review(tm) trims the fluff and tells you what really happened!"

"Having trouble understanding a large convoluted PR? Not to worry, Copilot(tm) Automated Review has your back!"

"Having trouble deciding which cordless vacuum to buy? Not to worry, Amazon's Customers Say(tm) shows you what people think!"

There is definitely _some_ world utility to this arms race, but is it enough?

lm28469 · 6 months ago
It's dumb and I hate it. It's exactly the same with job applications: AI generated resumes and AI generated cover letters read by AIs, we might as well save the compute time and send bullet points, but no we all have to continue the dance even though the music stopped. So many bright minds working on such degenerate technology... the flip side is that I spend less and less time online as LLMs greatly accelerated the slow rot that had taken hold of the web
TeMPOraL · 6 months ago
The way you described it, that's not a problem at all, but a clear improvement. Thing is, every industry already has "a high volume producer of generated content" that, except for the last case, arose organically, due to reasons other than trying to confuse the reader. The creation of "a high volume summarizer" doesn't automatically mean an arms race.

Medical claims won't be growing in pages just because a doctor can parse them a bit faster. They may grow initially, because it's likely that people's mental capacity is what keeps other factors from ballooning the claims further - but it'll level out when some other practical limit is reached. Same with coding and PRs, same with research and all kinds of activities - except advertising.

There, AI will cause (and already is causing) an arms race, because the "high volume producer"'s goal is to overwhelm their victims, so if the victims start protecting themselves with AI tools, the producer will keep increasing production to compensate. But that's not the fault of AI - it's the fault of allowing the advertising industry to exist.

supriyo-biswas · 6 months ago
Personally all I can hope for is that people start seeing it for what it is and just shorten their communication, forgoing the use of LLMs.
AtreidesTyrant · 6 months ago
Didn't this have issues recently where symptoms or stories were hallucinated and attributed to the patient?

This seems like a tool whose data stream insurance companies would love to get a copy of, and that could get very sticky quite quickly.

burnte · 6 months ago
I got my company into the beta of DAX Copilot, and it's ok. It's not fabulous. After a year only a third of doctors were still using it. We switched to another product that works better for our providers, but also costs a fifth as much as Dragon. Dragon Copilot is MASSIVELY overpriced, and it is not the premier healthcare note summary product now.
jsnider3 · 6 months ago
Is that other product Nudge?
burnte · 6 months ago
Wow. I'm genuinely surprised you guessed on the first try. Good job, yes, it's Nudge!
DebtDeflation · 6 months ago
This sounds like a basic STT/Transcription app. What makes it a "Healthcare Virtual Assistant"? Presumably it's been trained on a medical dictionary to recognize vocabulary from this domain? Dragon has been making transcription apps since 1997, originally based on Hidden Markov Models, I assume since updated to use transformers.
potatoman22 · 6 months ago
It reformulates the visit transcription as medical notes. That's the "virtual assistant" part afaik.
davikr · 6 months ago
Interesting, but there is a lot of "intent" in writing notes and I am not convinced it could capture the full picture without significant human supervision. Would it really save time writing paperwork if you have to go through it anyways and check if there's anything wrong? At least when I write, I know it's correct.
Ukv · 6 months ago
> Interesting, but there is a lot of "intent" in writing notes and I am not convinced it could capture the full picture without significant human supervision. [...] At least when I write, I know it's correct.

To my understanding, notes would otherwise largely be written from memory after the visit - which adds a fairly significant opportunity for omissions and errors to sneak in.

It seems plausible to me that by fixing that low-hanging fruit, this tool could potentially reach current human levels of accuracy overall even if it has shortcomings in other areas, like not being as good at non-shallow reasoning. Not to necessarily say it's currently at human-level.

> Would it really save time writing paperwork if you have to go through it anyways and check if there's anything wrong?

Five minutes saved per encounter, allegedly[0]. The decrease in clinician burnout and increase in patient satisfaction also seem pretty significant. But, not sure how much Microsoft have massaged those figures.

[0]: https://news.microsoft.com/?p=449586