The typical primary care physician is incompetent in every measurable respect. This is a huge problem.
Here, I make the case that:
Primary care physicians are broadly, grossly incompetent
This is due to empty credentialism
Making it much (~10X) easier to become a PCP is a good solution
Primary Care Physicians are Broadly, Grossly Incompetent
The standard of competence I am comparing primary care physicians against is:
They should be able to reliably diagnose diseases they are trained to diagnose.
They should be knowledgeable to a standard similar to what is required to qualify as a doctor
They should be attentive and empathetic towards patients
Visiting them is empirically superior to not visiting them
When actually examined according to these standards, PCPs fail on all counts.
Failure to diagnose uncommon diseases is rampant
A survey of patients with rare diseases found that, in about half of cases, patients received at least one incorrect diagnosis, and two thirds required visits to at least three different doctors before being diagnosed. For 30% of them, a correct diagnosis took over five years.
Another survey of children with rare diseases showed that 38% of them needed to see six or more doctors before being diagnosed correctly. 27% received an initially incorrect diagnosis.
If you happen to suffer from a rare disease, the likelihood you will actually receive a correct diagnosis and treatment for it within a year of first setting foot in a doctor’s office is astonishingly low.
PCPs are not good at physical examinations
Physical examinations are often hailed as a reason for the necessity of PCPs and their rigorous training. However, every time they are tested on their ability to perform these tests and derive accurate conclusions, they fail abysmally.
PCPs detect heart murmurs at sensitivities of 30-40%, with high inter-rater disagreement. This is a worse level of accuracy than just taking self report at face value.
“Crackles” in the lungs are detected at rates ranging from 19-67%
Even abdominal haemorrhages are detected at sensitivities of 30-40% by emergency care physicians’ physical examinations.
Kappa values (chance-corrected inter-observer agreement) for the various physical exams done by PCPs and non-specialists land in the 0.18-0.45 range: slight to moderate agreement beyond chance, and at the low end the statistical equivalent of “barely better than flipping a coin”.
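To make the kappa figures concrete, here is a minimal sketch of how Cohen's kappa is computed for two examiners making a present/absent call on the same patients. The counts are hypothetical, chosen only to show that a raw agreement rate that sounds respectable can still land inside the 0.18-0.45 range once chance agreement is subtracted; they are not taken from any of the cited studies.

```python
def cohen_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two raters making a binary (present/absent) call."""
    n = both_pos + a_only + b_only + both_neg
    # Raw agreement: fraction of patients on which both raters made the same call.
    observed = (both_pos + both_neg) / n
    # Each rater's overall rate of calling the finding "present".
    a_pos = (both_pos + a_only) / n
    b_pos = (both_pos + b_only) / n
    # Agreement expected if the two raters were statistically independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical example: 100 patients, two examiners who agree on 70 of them.
kappa = cohen_kappa(both_pos=25, a_only=15, b_only=15, both_neg=45)
print(f"raw agreement = 0.70, kappa = {kappa:.3f}")
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, so the interesting quantity is how much of the raw agreement survives the correction.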
The current state of the evidence suggests that if a PCP performs no physical examinations whatsoever, there would be no detectable decrease in their diagnostic accuracy or patient outcomes.
PCPs are Apathetic and Rude
At the level of basic social skills and interest in their patients, primary care physicians fail in almost every way they are capable of failing. A 1984 study found physicians interrupt their patients on average 18 seconds after they begin to state reasons for their visits, and most patients stop elaborating once interrupted.
A 2019 replication found that primary care physicians take a mere 11 seconds to interrupt a patient describing their reasons for coming in.[1]
Over half of US patients surveyed report their symptoms being ignored, dismissed or not believed. 50% reported their doctor made false assumptions about them.
Physicians also consistently over-rate themselves on empathy and manner relative to patient perception. In fact, the ratings they give themselves correlate inversely with patient ratings.
The reason for the overwhelming consistency of negative anecdotes about experiences with doctors is not some arbitrary mass hallucination. Doctors simply are, by and large, apathetic and rude.
Doctors get substantially worse at their jobs over time
There is a strong inverse relationship between the “experience” of a doctor and the quality of care they provide. A recent review of 62 studies found that more than half showed a decline on all measures as experience increased, and only one indicated the opposite.
A 2025 study on pulmonary/critical care medicine fellows showed that they scored substantially worse than medical students on foundational pulmonary physiology questions.
The average primary care physician in the United States is 48 years old. Medical residency typically finishes at age ~30, implying the typical doctor you will encounter has about 2 decades of “experience” during which their competence has been logarithmically decaying. In expectation, they will have lost approximately half of the knowledge on uncommon presentations they had at the beginning of their career.
The evidence literally indicates that simply plonking a student who just passed the MCAT yesterday directly into a modern PCP office would produce an above average PCP in expectation.
The standard PCP is no better than a layperson with a computer
Primary care physicians are increasingly redundant in view of LLMs. Numerous studies have compared the performance of your standard PCP to frontier language models, and consistently find that GPT-4 (now far surpassed by modern frontier models) is slightly ahead on hard performance metrics, and vastly ahead in qualitative evaluations of empathy and thoroughness.
Modern LLMs obliterate GPT-4 on all benchmarks, including (and in fact, particularly) biomedical expertise.
Today, a man on the street with a week-long crash course in physical examination practices (and likely not even that), with access to the latest version of GPT, will outperform a median primary care physician with 20 years of experience.
Doctors cannot detect drug seekers
There is no known method of reliably identifying drug seeking behaviour.
When doctors are shown videos of potentially drug seeking patients, they indicate suspicion of drug seeking only 3% of the time when the drug itself is not mentioned. Even in the most blatant, prototypical case of a patient making a direct request that specifically names oxycodone, drug seeking was suspected only 21% of the time.
Modern databases designed to flag “doctor shopping”, as a means of assisting PCPs in identifying drug seeking behaviour, miss roughly half of genuine presumptive opioid abusers and have extremely high false positive rates. Only 5% of even the most “extensive” prescription-shoppers are presumptive opioid abusers. 20% of people flagged as “shoppers” actually turned out to have cancer, meaning that you, as a person flagged by the system, are roughly 4X more likely to have cancer than to be a genuine opiate addict.
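The “4X” figure is just the ratio of the two percentages quoted above. A quick arithmetic sketch, assuming (as the paragraph implies but does not state) that both percentages describe the same flagged population:

```python
# Figures quoted in the paragraph above; treating them as shares of the same
# flagged population is an assumption made for this illustration.
share_with_cancer = 0.20
share_presumptive_abusers = 0.05

ratio = share_with_cancer / share_presumptive_abusers
print(f"A flagged patient is about {ratio:.0f}x more likely to have cancer")
```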
The offense/defense balance for a savvy drug seeker is heavily skewed in their favour. Pain is a fundamentally subjective and largely unverifiable phenomenon. Anyone with half a brain and a functioning mouth can say the right things to get prescribed virtually anything they like.
The image of the shrewd, discerning doctor noticing the subtle body language of an opiate addict and denying him pain meds is a load-bearing caricature that is largely nonexistent in reality.
The role of the doctor in mitigating drug seeking is merely to function as a trivial inconvenience.
Empty, Unmeritocratic Credentialism is A Major Cause For The Inadequacy Of Primary Care Physicians
How hard is being a PCP, really?
PCPs (attempt to) follow standardised decision trees for diagnosis and referral. This is something a web app can do. In fact, databases of diagnostic decision trees (CDSS: clinical decision support systems) already exist for this purpose—just plug in the symptoms and you’re good to go. Give it a try yourself.
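To illustrate why a decision tree is the kind of thing software handles easily, here is a toy sketch of one as a data structure plus a walker. Everything in it is hypothetical: the question labels, the actions, and the names `Question`, `Leaf`, `walk` and `TOY_TREE` are invented placeholders, not clinical guidance and not the structure of any real CDSS product.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    action: str  # the recommendation reached at the end of a branch


@dataclass
class Question:
    prompt: str  # a yes/no question, used as the key into the answers dict
    if_yes: Union["Question", Leaf]
    if_no: Union["Question", Leaf]


def walk(node, answers):
    """Follow yes/no answers (keyed by prompt) down the tree until a leaf is reached."""
    while isinstance(node, Question):
        node = node.if_yes if answers[node.prompt] else node.if_no
    return node.action


# Placeholder tree: invented labels only, not medical advice.
TOY_TREE = Question(
    prompt="red_flag_symptom_present",
    if_yes=Leaf("refer_urgently"),
    if_no=Question(
        prompt="symptoms_longer_than_two_weeks",
        if_yes=Leaf("refer_to_specialist"),
        if_no=Leaf("self_care_and_review"),
    ),
)

result = walk(
    TOY_TREE,
    {"red_flag_symptom_present": False, "symptoms_longer_than_two_weeks": True},
)
print(result)
```

Real systems are far larger and typically weight findings rather than branching on strict yes/no answers, but the core lookup is no more exotic than this.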
Adoption of these systems is low, and the reasons for this are damning. The dominant failure mode is that doctors simply don’t use them. Typing symptoms into a computer is seen as too time consuming, despite studies consistently showing improved diagnostic accuracy without extended consultation times. There are also potential liability issues if the system suggests a rare condition, the doctor ignores it, and it later turns out to be correct. Better to be ignorant of the possibility and keep your hands clean, goes the logic. When required to use CDSS, PCPs routinely ignore the outputs, preferring their own early hypotheses, despite the fact that deferring to these systems improves diagnostic accuracy.
Better still than traditional CDSS, modern LLM-powered systems are now capable of transcribing live conversations and making realtime diagnostic recommendations, as well as suggesting follow-up questions.
All you need to do to outperform the vast majority of PCPs in diagnosing patients is plug in their self-described symptoms verbatim into one of many widely available software products, and relay whatever it says on the screen.
With tools like this, what possible justification is there to require ten (or even five) years of training to be the human face of a computer-automated triage process?
The Case for Highly Trained PCPs—Gatekeepers
The “official” reasons for the necessary existence of PCPs are:
Their ability to diagnose (rare) conditions
Their ability to prescribe, and deny prescriptions to drug seekers
Their ability to provide referrals to the proper specialists
Their ability to perform physical examinations
Let’s look at these reasons one by one. Do these functions require approximately a decade of preparation?
Diagnose (rare) conditions
The typical PCP routinely fails to correctly diagnose rare (and even common) conditions. They are outperformed by LLMs and their personal diagnostic capability has been largely redundant for decades in view of CDSS. They also get logarithmically worse at this task over time.
Prescribe medications, and deny prescriptions to drug seekers
The reason prescriptions exist is that some drugs are not suitable for some patients.
Thus, the PCP’s role is to do one of the following:
A: Identify the patient as being mistaken about or unaware of the proper treatment for their condition
B: Identify a patient gaming the system to obtain drugs for illegitimate purposes.
C: Give the patient the drug they want or need
There is no known method of actually performing function B, and doctors are largely powerless to identify all but the most blatant drug seekers.
Which leaves only A as an alternative to simply dispensing the prescription upon request. A, as discussed, is simply a matter of plugging the symptoms into the computer and doing what it says.
While it is reasonable to throw up a tokenistic level of resistance to drug-seeking behaviour (at least you have to visit a physical office), the idea that we must have academic veterans holding down the fort against a tidal wave of detectable drug seekers is a complete fantasy.
Provide referrals to the proper specialists
Why are referrals necessary in the first place? The thinking is we don’t want to waste the precious time of patients and specialists by allowing patients without relevant symptom profiles to book consultations. Think of the money and time wasted chasing red herrings!
This idea would be more compelling without the knowledge that:
Following a CDSS or asking an LLM is something that you can do from home in a couple of minutes as a patient, and get comparable (if not superior) accuracy to a PCP, and;
The status quo already egregiously fails to address this “time wasting” issue.
Given a preliminary diagnosis, providing a referral is simply a matter of choosing from a predetermined list of specialists, a function that can be delegated to a Zapier automation.
Given the baseline unreliability of physicians as a screen, and the trivial task of identifying a specialist for a given presentation, the argument for PCPs being a necessary link in the chain connecting patients to specialists is very weak.
Their ability to perform physical examinations
The sensitivity of physical examinations in detecting illness and injury is so low, and the cross-evaluator disagreement is so high, that it is not exaggerating to say that completely abolishing the practice of physical examinations in primary care offices and replacing it with more detailed questioning would substantially improve their diagnostic accuracy across virtually all presentations.
You don’t need that much training to do that, actually
The standard career track to become a PCP requires roughly 10 years of full time study in the United States, and 6-8 years in other high HDI countries.
What percentage of this decade-long education is actually applied in practice?
Typically, pre-medical education involves 3-4 years of general study in biology, chemistry, mathematics, or in some localities, any four year degree whatsoever. This functions as a screening mechanism for broad competence and stability. The utility of learning advanced mathematics in order to suggest aspirin for headaches is a difficult thing to square.
Of what use is a decade of training when the function of a PCP is simply to screen for initial indications and provide specialist referrals?
You simply do not need to use a $50-$100k four-year degree to screen for broad competence. A g-loaded entry exam on broad physiology is already used: the MCAT. If you can pass the MCAT, additional screening for broad academic competence is redundant. With tens to hundreds of thousands of prospective doctors every year signing up to waste four years and spend 5-6 figure sums to pass the first filter of “general academic competence”, one shudders to imagine the sheer scale of the economic loss.
The education required to be a functional PCP in line with the standards we observe and accept in practice is closer to a single year for a competent, motivated student than to a decade of coursework that, after 5-10 years, the typical physician largely forgets anyway.
In fact, factoring in knowledge decay, the uselessness of physical examinations, and the broadly low standards we observe today, simply plonking a student who passed the MCAT yesterday directly into a modern medical office would already produce an above average PCP in expectation.
Making it much easier to become a PCP is a solution
Standards are low largely due to limited competition
In the United States, there is approximately one primary care physician per 2000 people. This ratio, coupled with high inelastic demand for medical care, is a major factor producing the poor standards of care we see today.
The standard 10-minute PCP consultation is not a result of some principled analysis of the optimal standard of care. It is the result of virtually unlimited demand and negligible competition. When you have approximately 3000 appointments per doctor per year, dedicating more than a negligible amount of time and effort to each one is a logistically impossible and financially counterproductive strategy.
The typical primary care physician is burning all of their cognitive bandwidth with constant context switching and churning through a crammed agenda of patients on a daily basis.
The outcomes speak for themselves.
Competition drives costs down and increases utilisation
Lowering the barrier to entry to become a primary care physician increases the supply. Patient costs would fall, and availability of alternatives would increase enormously.
The status quo is that location matters far more than competence and reputation for a primary care physician. This results in the perverse incentives and outcomes we see today.
If we 10X’d the supply of PCPs, we would expect to see:
Markedly lower wait times
Greater availability and options for patients, particularly in remote areas
Longer, more detailed consultations
Higher utilisation and more pre-emptive care
Massive reductions in the income of PCPs
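For a rough sense of scale, the sketch below applies a 10X supply increase to the two figures quoted earlier in this post (one PCP per 2000 people, roughly 3000 appointments per doctor per year). It assumes total demand stays fixed, which is a deliberate simplification: the list above itself expects utilisation to rise, so the real per-doctor load would fall by less than this.

```python
# Figures quoted earlier in the post.
PEOPLE_PER_PCP = 2000
APPOINTMENTS_PER_PCP_PER_YEAR = 3000
SUPPLY_MULTIPLIER = 10


def scaled_load(people_per_pcp, appointments_per_pcp, multiplier):
    """Per-doctor load after multiplying supply, assuming total demand is unchanged."""
    return people_per_pcp / multiplier, appointments_per_pcp / multiplier


people_after, appointments_after = scaled_load(
    PEOPLE_PER_PCP, APPOINTMENTS_PER_PCP_PER_YEAR, SUPPLY_MULTIPLIER
)
# Implied current utilisation: appointments per person per year.
visits_per_person = APPOINTMENTS_PER_PCP_PER_YEAR / PEOPLE_PER_PCP

print(f"People per PCP: {PEOPLE_PER_PCP} -> {people_after:.0f}")
print(f"Appointments per PCP per year: {APPOINTMENTS_PER_PCP_PER_YEAR} -> {appointments_after:.0f}")
print(f"Implied visits per person per year today: {visits_per_person:.1f}")
```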
The only real concern in doing this would be a reduced standard of care provided by the new entrants. This concern, however, falsely assumes we are living in a world where the standards are not already below the level of a layperson with a software subscription.
The Persistent Cultural Reverence for Doctors
The glue that holds the facade of the necessary “academic veteran” PCP together is widespread cultural reverence toward doctors. Doctors (a label which extends to PCPs in the minds of most people) are an almost untouchable class that is popularly considered to belong at the top of the social and economic hierarchy. They are an essential resource, a trusted authority, and a moral example.
Insulting the quasi religious status of doctors by taking a hatchet to entry requirements for primary care physicians will undoubtedly produce substantial political resistance. However, given the enormous costs involved to the patients (and indeed the doctors themselves) as well as the incalculable scale of physical harm caused by poor access and bad incentives, passively tolerating the status quo is an option that is too expensive to accept.
So, do it gradually. Do it incrementally. Do it with tact and understanding. But do it.
We don’t have much to lose.
Sorry, I stopped reading because of the disingenuous shifting very early on.
Okay, agree. What’s their reliability for all diseases they are trained to diagnose?
Oh, okay, we’re going to focus on their reliability to diagnose uncommon diseases. So how do they do with that?
Okay, we somehow went from general reliability, to uncommon, to rare without skipping a beat. You’re talking about different things. This lack of basic consistency undermined your credibility immediately.
I noticed this was a repeated theme of the article. Another example is equivocating between “doctors are rude” and “doctors interrupt patients.” People interrupt each other in conversation all the time! It’s a normal way of talking that doesn’t automatically make you rude.
There is no quantifiable form of “rude”. But 11-second interruption times (which, by the way, are not explained by irrelevant blathering; from Claude: “77% of patients (258/335) finished their initial statement within 2 minutes, and only 2% (7/335) spoke for more than 5 uninterrupted minutes. In all cases, physicians considered the information they were given to be relevant. So letting patients finish takes about two minutes for ~80% of cases, and the doctors themselves judged the information relevant. The interruptions aren’t saving meaningful time.”), plus an inverse correlation between physicians’ self-reported empathy scores and those given by patients, plus enormous (often majority) percentages of patients reporting some form of neglect or dismissal in surveys, plus virtually ubiquitous and unidirectional anecdotes, make as strong an empirical case for a profession being “rude” as I think you are ever likely to get.
Having taken a quick look at the source for Claude’s data, it seems like a reasonable read is that the doctors are simply not well-calibrated on the relevance of the patients’ statements and redirect them away too quickly. Not sure this means they were ‘rude’.
The HCAHPS survey measures ‘courteousness’ by asking patients, and consistently finds that in ~85% of cases, doctors are rated ‘always’ respectful and courteous. From my understanding, a further ~10% are rated as ‘usually’ respectful and courteous. This is not a perfect measure, but seems better than using interruption times as a proxy.
I also wonder about the snowball sampling approach to this. How do people with rare diseases get to know each other? I’d assume the major way is via forums for people trying to work out their rare diseases, which probably skews towards cases that haven’t been adequately handled by the doctors.
I also noticed things like using behavior from non PCP doctors, like critical care doctors, to make conclusions about PCPs.
Yes, the information is generalisable. Critical care doctors would be expected to outperform PCPs on a specialised function they perform regularly, and nonetheless have very low sensitivity.
Unless you think it’s plausible PCPs would outperform critical care doctors on abdominal auscultation for detecting haemorrhages, I see no reason why it is irrelevant to the case.
The PCP could say “I don’t know this, go to a specialist to find out” or in some cases (especially the critical care doctor) the PCP may not be faced with such a case in the first place.
That doesn’t show that PCPs are bad at diagnosing the cases that they do face and that they don’t send to other doctors.
Fair pushback. Although I think you are being somewhat too dismissive.
PCPs are trained to diagnose (and certainly indicate/suspect) both rare and common diseases.
The distinction between uncommon and rare is subjective. There is not an official tier system that graduates from “common” to “uncommon” to “rare”, etc. I used the word “uncommon” as a synonym for rare—“uncommon” doesn’t typically appear in literature. I can appreciate this downplays it somewhat. However, the term “rare”, I would argue, exaggerates rarity for the more common members of that category. I don’t think that pointing out a word choice that commits a connotational but not a factual error is a valid basis to dismiss the whole thing.
I hope a well-informed third party reviews this post. I like the contents of the post, but its rhetorical/charged flavor makes it difficult to believe it contains no biases.
Fortunately what you are suggesting is already happening, kind of: the rise of NPs.
Also PAs.
The effect of this on my emotions is similar to one of those long web pages that is optimized for conversion (i.e., for making sales) that were common 5 to 10 years ago. Sort of the textual equivalent of a high-pressure sales person. Not a fan of the effect.
It is opinionated, but I don’t think there are many parallels beyond that.
Do you have any factual disagreements?
I don’t know enough about the health-care systems (of whichever countries your argument applies to) to either agree or disagree with this post’s assertions. I do know that for something as complicated as a health-care system, little progress can be made by confidently asserting generalities like you do here. It is possible that you have knowledge in sufficient detail to know that your assertions are correct, but you certainly haven’t shared that detailed knowledge with us yet.
The argument applies to the United States primarily, as well as the majority of high HDI countries, including Canada, Australia, parts of Asia, and most of Europe.
Your response strikes me as dismissing the argument on the basis of a lack of authority on my part. Which knowledge are you actually referring to?
I am happy to discuss, clarify and elaborate on specific factual claims I’ve made that you find dubious or under-supported, but so far your critiques have been exclusively aesthetic: “confident assertions”, “salesy”.
I’m probably as skeptical of credentials and experts as you are.
This isn’t a conversation among some adults who share a house on how to handle a chronic problem in the house or some conversation among drivers stuck in a traffic jam about how best to un-jam the traffic. There are vastly more feedback circuits and cause-and-effect relationships—on this site we call them gears—in the health care system than there are in a household or a traffic jam. The limitations of the human brain are such that the only people who can lead a worthwhile discussion on how to improve the health system at the level of comprehensiveness or generality that you are doing here are people who have been employed full time or enrolled full time for years on understanding the gears.
Although the system is difficult to understand in general, in particular instances even an amateur such as myself can trace a particular cause through the system to an effect.
In 1993 or so, my endocrinologist decided to fire all of his patients with Medicare insurance. I’d been seeing him since 1986 and he had till then always been happy to accept my Medicare insurance, and now I no longer had access to him. (IIRC, even if I were willing to start paying him in cash for his time, the mere fact that I had Medicare (even though I would no longer use it at this endocrinology office) would have been enough to cause him to refuse to continue to see me.)
I heard that Washington had introduced a policy change that gave the office that oversees Medicare (the Health Care Financing Administration) the right to review all the charts of any doctor who had even one Medicare patient. My guess is that this policy increased the endocrinologist’s legal liability somehow (e.g., perhaps by giving malpractice lawyers easier access to his charts) and that he was probably unusually averse to legal liability.
If I had managed to get a lot of details out of the endocrinologist on his reasons for firing all his Medicare patients, or if I had retrieved the text implementing the policy change out of Washington, then interviewed a few malpractice lawyers, and if I had collected data to show that the effect is general as opposed to being isolated to this single doctor, then that might have made a good LessWrong post.
The way I would proceed if I wanted to get good at making effective suggestions on health care policy is I would do hundreds of analyses like the one I describe above that I could have performed if I had put in the effort back in 1993. In choosing the topic of each analysis, I would consider it crucial to wait patiently for a situation, experience or perspective that has the rare property of actually being analyzable by a mere human brain.
Right, expertise matters and the system is complex. But if a system looks this obviously broken to an amateur, and the most obvious answer is “selfish incentives run amok and nobody has bothered to fix it”, an outcry among amateurs can at least force the professionals to explain what is happening and perhaps to try to fix it.
In this case I think your intellectual humility is preventing you from diagnosing the emperor as naked.
I think there are many good observations and some good proposals in this post. But overall I think it overlooks a lot of the possibly relevant “Why” questions that should inform any solution to a problem this complicated. In many ways I’d say the problems are worse and deeper and sometimes stupider than you are even saying, but not quite the same shape or in the same place as this post suggests.
If you haven’t read these already, I’d recommend taking a look at:
Zvi’s medical roundup posts
Some SSC posts like this one that discusses drug seeking behavior, also this, and also (with differing levels of seriousness), this, this, this, this, and this… actually quite a lot of others, so I’ll stop listing them
this is overbold. they score worse on “foundational pulmonary physiology questions”, sure, but we haven’t established a link between scoring well on these questions and providing adequate care.[1]
i’d expect the recent med students (who likely drilled this info as part of training for the mcat) to do well on quizzes.
the “during which their competence has been logarithmically decaying” link is also by proxy: scores on certain exams drop with experience.
The claim is not that merely because med students outscore physicians on foundational physiology tests that they would be better doctors.
It is an inductive extension of the following:
1. Measured performance across nearly all of the aspects of a PCP’s job is very low, with performance on more difficult and nonstandard tasks being the worst.
2. Performance does not increase with experience, but decreases
3. Deferring heavily to CDSS and LLMs reliably produces better diagnostic and prescription outcomes than the baseline of a typical PCP
These facts, PLUS the fact that medical students (as well as nurses) tend to match or outscore PCPs every time we compare them, suggest rather strongly that early medical students are close to, if not above, the performance threshold of PCPs already.
If you think this is unlikely, I would ask: which aspect of the job of a PCP do you think fresh MCAT passers would do worse in than the average PCP, given the above?
In particular, which patient outcomes do you believe would be worse?
As a fresh MCAT passer (with a high percentile score), I know I’m not competent to jump into that role. I’ve shadowed a few times, and seen my share of PCPs; I just have too many gaps to be comfortable doing it.
That said, I believe I could become median competent as a PCP (with no specialties) with no more than a year of hands-on / practical training.
A lot of your post rings true for me based on my experience in the system, but IMO early students have a lot of gaps and not all of them can be filled with an LLM (yet).
Are these gaps related to medical knowledge, or admin/procedural knowledge?
What is the most complex task you have personally observed a PCP perform that you feel prevents you from doing the role?
I would somewhat suggest older tech like books and flow charts over the latest LLMs. I’m not saying LLMs wouldn’t work. Just that I don’t really trust LLMs, and a simpler flowchart based system won’t suddenly start talking about goblins for inscrutable reasons.
We have digital flowcharts. That’s essentially what a traditional CDSS is.
On the LLM skepticism, it’s possible they have rare hallucinations, but I don’t think it’s likely that spontaneous rambling about goblins is a meaningful concern. In my experience, you have to try pretty hard to get GPT or Claude to have a mental breakdown. It’s not usually something that happens in mundane use.
The bottom line is whether, all things considered, the results for patients are better.
You also need to weigh every “goblin moment” against the human equivalent—“couldn’t be bothered” “underslept”, “whoopsie daisy”, and so on.
I do think that, among other things, you’re underestimating the difficulty most people will have using LLMs effectively for healthcare. I agree they can be extremely useful, but there are also many non-obvious failure modes where people can lead themselves astray. It’s still a pretty uncommon skillset and benefits from the user having an above-average level of background understanding and introspective awareness. I don’t think that will keep being true, but I think it’s true now.
I agree, however most people have difficulty “using” PCPs effectively for healthcare as well.
People’s ability to articulate their symptoms doesn’t change much by stepping into a medical office.
Whether you’re prompting an AI or talking to a human, ultimately it’s a matter of the words you say and how those words are interpreted.
Fair enough.
Are we comparing the current system to LLMs? Or a well designed digital flowchart system to LLMs?
I think the first one is a win for the LLMs. Not sure on the second one.
I tried it myself and it didn’t work that way. Inputting symptoms got me “Step 2: Possible causes” with a list of conditions. Clicking on one of those gave me only generic web links to search for the condition, at least on mobile.
I spoon fed it symptoms for a slightly atypical presentation of MCAS, and withheld all my symptoms not related to MCAS. Didn’t even come up as an option.
It is a free patient-facing nerfed version, I figured linking to a paywall wouldn’t be of interest.
That said, I’m curious to know if you’ve ever tried spoon feeding a PCP symptoms for a slightly atypical presentation of MCAS. If so, how did that go?
This is an argument with some merit presented in a slightly antagonistic way. Some specific thoughts.
First, the case for competence. This post argues that they are not competent on the grounds that:
They should be able to reliably diagnose diseases they are trained to diagnose and should be knowledgeable at the standard to qualify as a doctor. If evidence against this were presented here it would be a concern, however the evidence presented mainly demonstrates that:
they are not good at diagnosing rare diseases (this is not a typical use case for primary care in my experience, and the cited study discusses things like genetic counselling which would follow specialist review where I work (non-US))
that physical exams by both primary care physicians and emergency physicians are inferior to ultrasound: this seems likely to be true but is mostly an argument for increasing access to point of care ultrasound/easy point of care tests, not against the existence of physicians per se.
That professional competence deteriorates over time: this was the best claim here in my opinion, and the evidence matches my personal experience. It is of serious concern that continuous professional development is neither maintained well nor adequately assessed by registration authorities. Unfortunately, the evidence here is again not specific to primary care; the problem is more widespread. My belief (unsupported by evidence here) is that this represents senior clinicians defaulting to heuristics and biases rather than attempting to genuinely problem-solve each presentation. It is a serious problem.
Next the post argues that primary care doctors should be attentive and empathetic towards patients but are not: the evidence provided partially supports this claim. It is, however, not specific to primary care and highlights this as an issue for medicine at large. One study cited appears to state that primary care doctors were actually better than specialists at finding out what the patient wanted from the appointment. A concerning point about medicine at large.
The final point against competence is the claim that visiting them is not superior to not visiting them. The case made here rests primarily on the argument that decision support systems/LLMs giving medical advice are very good and could be easily used well by a layperson. I’m open to this being true but no evidence was cited to support the argument. A quick review of the literature didn’t give me any slam dunks at the level of confidence the author displays. I think this is true for some laypeople and some conditions, but it merits a more thorough exploration (apologies that I have not had time to do this here and I may come back to this, though I believe aspects of this are in previous medical roundups).
Second, the case of empty credentialism:
The claim is made at the top that standardised decision trees are used and could be easily implemented by a web app. I gave the app linked a try, giving it common symptoms of a lower respiratory tract infection (cough, chest pain, low grade fever and general fatigue). It gave me a list of differentials including several kinds of respiratory tract infection, ‘post-myocardial infarction syndrome’, anthrax, lung cancer and a teratoma. When I gave it a timeline (worsening over a few days, present only for a few days) and a severity (impairing activity) consistent with that diagnosis, it suggested I should go directly to the emergency department. This is not what I did last time I had these symptoms (I went to my GP and got antibiotics that worked, and codeine for post-infection cough suppression). Honestly I couldn’t rate this particular app that highly based on this experience, but perhaps this isn’t the intended use case. I’m open to feedback on how others have used it well.
Then the statement is made that doctors don’t use them because they are too time-consuming, or carry liability risks. No evidence is cited. Again, this may be due to the environment that I work in, but my experience is not that this is the case. In fact, for a more limited case (weird rashes), DermNet, which provides a very simple decision tree, is in widespread use by those who don’t have specialist knowledge of dermatology. At an uneducated guess, the case against decision support systems has more to do with employer concerns/perceived risk of patient information disclosure than individual clinician choice.
The next set of claims is that the necessity for primary care providers is:
To diagnose rare conditions: I don’t think this is true and am not sure why it is argued repeatedly—my understanding of the purpose of primary care is to diagnose common conditions.
To prescribe, and deny prescriptions to drug seekers: This is made threefold: that primary care doctors should be able to 1) identify patient error about what treatment is needed, 2) prevent seeking of drugs or abuse, and 3) provide the correct treatment.
The argument they make for 1) and 3) relies on their claim that decision support systems are better at this than doctors. As I don’t feel this claim has been supported by the argument so far it’s hard to accept.
The argument for 2) I skipped over above: that primary care doctors are bad at identifying opiate seekers. This seems true, though I didn’t bother to check the evidence. But in my understanding the main reason for gatekeeping in prescription is not opiates or benzos but antibiotics, for which overuse has societal implications as well as individual-level ones. No good argument has been made here that LLM-based drug choice would be superior for this purpose, though again I think that’s a possibility. The main risk is that a patient given the choice will tend to opt in favour of getting the treatment, which would increase inappropriate antibiotic use overall. I would be interested to see solutions to this that don’t involve primary care physicians existing.
To refer on: this post again argues for LLM superiority; if I accept that here, then the point stands. As I have said a few times, I don’t think this post has really made this case but it does seem possible. I do think there’s an argument to be made that tracking multiple referrals and specialist interventions is actually the primary care role here: if you’re attending multiple specialists they may not be paying adequate regard to the interactions between medications or difficulties that are cropping up at the intersections of conditions. A very capable patient can manage this themselves, but anecdotally I have seen a lot of patients struggle with this.
To undertake physical exams: this has been addressed a bit above.
Finally the case for less-trained providers:
The claim is made that typical training times for primary care (6-10 years) are not needed. If we take the reasons for primary care doctors existing in this post as valid (I don’t think they entirely have been as stated above), I think this is a reasonable claim.
Then the claim is made that making it easier to be a primary care physician is the solution. Another commenter has pointed out that this has been/is being tried: nurse practitioners and physician assistants/associates have been attempted in a number of places. I don’t really find the evidence from their practice to be great. However it does appear true that very high demand and (in a sense) poor competition have contributed to the 10-minute review, which is definitely not optimal for anyone (except perhaps certain NHS managers). Based on this post though, I’m not sure I think the argument has been made for less-trained care providers. Instead I think the argument has been made for self-referral to specialists based on LLM advice. It’s my understanding that different countries have very different procedures about this and I’d be interested to read if anyone has a good understanding of the economic comparison of different models.
Overall I think this post has unfortunately not improved my understanding of this area, though I agree that this is an area ripe for improvement. Another commenter has already directed the poster to the medical roundups. I would also more generally recommend looking at different primary care models in different countries; I’d be very interested to read evidence for or against particular models as I’ve done limited reading on this myself.
Thank you for your thoughtful and substantive critique. I’ll address it point by point.
On diagnosing rare diseases: The case for PCPs receiving (anything close to) a decade of training is not generally justified on the basis of the ability to identify common issues like hypertension or cold/flu symptoms. The reason I bring this point up is that a crucial function of PCPs is to flag and escalate rare conditions to the appropriate specialists. The word “diagnose” here, I admit, is not quite right—what I mean is “flag”, “notice”, or “indicate”. This function IS an absolutely central role of a PCP, and the evidence does indicate they don’t perform this function well at all.
On physical exams: The reason I point out that physical examinations are ineffective is that “an LLM can’t touch you” is heavily undermined as an argument in view of this fact. Additionally, the fact that ultrasounds and other devices make this facet of their job redundant further attests to the claim that the vast extent of training PCPs receive is superfluous for the service they provide.
On competence deterioration: Yes, it’s not specific to PCPs. I am considering writing a follow-up post on specialists as well, because the story is not much better in their case and is in some ways worse. However, given the splintering of subspecialities, it is a more tangled picture.
On empathy/manner: Yes, specialists are likely worse. I gather you largely agree with the argument nonetheless.
On whether CDSS/LLM support systems can be used effectively by a layperson: This is somewhat of an extrapolation; however, I stand by it. There are unfortunately not many studies directly investigating this, and the combined LLM software products are largely hiding behind paywalls. However, based on a few facts:
- real-time audio-to-text transcription exists and is integrated into modern CDSS software
- LLMs can suggest clarifying questions (which improve diagnostic performance when doctors use them, implying they are questions that would not have otherwise been asked, or would have been asked worse)
- LLMs outperform doctors in many (possibly all) hard outcomes, and even more so in soft outcomes. This is undersold in the literature given many studies are using GPT 3.5-4.
it is, on balance, very likely that a layperson with a brief intro to using an LLM-supported CDSS can outperform a PCP. This may not be the case for people of below-average intelligence, perhaps, but this is a narrow and rapidly shrinking moat. The true moat is almost exclusively regulatory.
On limited usage of CDSS:
Doctors ignore/override CDSS recommendations 46-96% of the time.
GPs identify admin hassle as a major barrier to CDSS adoption.
CDSS increase performance without increasing consult times when actually tested.
41% of GPs identify medico-legal liability concerns as a barrier to adoption.
On multiple referrals, medications and specialists being a source of value for PCPs: PCPs do not hold this information in their heads; it is a matter of basic admin. I would also add that PCPs are largely quite bad at managing drug interactions. For instance, between 1 in 5 and 1 in 3 prescriptions in polymedicated patients are inappropriate.
A person using the same admin process as a PCP plus a CDSS will get the same (and likely better) results. You can argue that PCPs provide value by offsetting this administrative burden, but a retreat to something this trivial is hardly a glowing endorsement of their utility.