Tag: Preferences

  • One Clinician. One Institution. One Aligned AI.

    That would be convenient.

    It would also be misleading.

    Clinical medicine does not work because everyone shares the same values, the same priorities, or the same tolerance for risk. It works—imperfectly—because decisions emerge from the interaction of many perspectives: clinicians with different training, patients with different preferences, institutions with different incentives, and societies with different norms.

    Yet much of the current discussion about “AI alignment” in medicine proceeds as if there were a single set of values to align to, and as if success could be established by concordance with a small number of experts, guidelines, or benchmark cases.

    A just-published multi-institution article in NEJM AI argues that this assumption is no longer tenable.

    Alignment to Whom?

    Consider a familiar scenario. There is one open clinic slot tomorrow. Two patients could reasonably receive it. One clinician prioritizes recent hospitalizations. Another prioritizes functional impairment. A third considers social context. None is behaving irrationally. None is value-free.

    Now imagine that an AI system recommends one patient over the other. Is that recommendation “aligned”?

    Aligned to whom?

    To the clinician who last trained the model?
    To the dominant practice patterns in the training data?
    To a payer’s definition of necessity?
    To a hospital’s operational priorities?
    To a patient’s tolerance for risk?

    The uncomfortable reality is that today we often cannot tell. Alignment is treated as a property of the model rather than as a relationship between the model and a population of humans.

    Why Single-Perspective Alignment Fails

    In recent work, we and others have shown that large language models can give different clinical recommendations depending on seemingly innocuous framing choices—such as whether the model is prompted to act as a clinician, an insurer, or a patient advocate. These models may be extensively “aligned” in the conventional sense, yet still diverge sharply when faced with categorical clinical decisions where values are in tension.

    What is missing is not more data of the usual kind, nor more elaborate prompts. What is missing is empirical grounding in how many clinicians and many patients actually make these decisions—and how much they disagree.

    Clinical decisions are not scalar predictions. They are categorical choices under uncertainty, informed by knowledge, experience, and values. Treating them as if there were a single correct answer obscures the very thing that matters most.

    From Opinions to Distributions

    The central claim of the NEJM AI article is simple: alignment should be measured against distributions of human decisions, not against isolated exemplars.

    That requires scale.
    It requires diversity.
    And it requires confronting disagreement rather than averaging it away.

    Instead of asking whether an AI agrees with “the clinician,” we should be asking:

    • Which clinicians does it tend to agree with?
    • In which kinds of cases does it diverge from patients?
    • Does it systematically favor particular ethical heuristics—such as urgency, expected benefit, cost containment, or autonomy?
    • How stable are those tendencies across contexts?

    These are empirical questions. They can be measured. But only if we stop pretending that alignment is a one-to-one problem.

    The Human Values Project

    This is the motivation behind the Human Values Project (HVP).

    The aim is not to decree the “right” values for clinical AI. Medicine has never operated that way, and should not start now. The aim is to make values visible: to systematically measure how clinicians and patients make value-laden decisions across many scenarios, and to evaluate how AI systems relate to that landscape.

    In other words, to replace anecdotal alignment with population-level evidence.

    If AI systems are going to participate in clinical decision-making at scale, then alignment must also be assessed at scale. One clinician. One institution. One aligned AI. That would be convenient—but it would not be medicine.

    Making human values explicit is harder.
    It is also unavoidable.