An international research initiative studying whose values are embedded in clinical AI—and developing the tools to make those values visible, measurable, and accountable.
Nearly half of Americans now turn to AI chatbots for health advice. But the same AI companies that design their products to recognize signs of self-harm and to refuse to help build bioweapons are quietly allowing other forces to shape the medical advice you receive.
When an AI system processes your case, is it helping you, your clinic, or your insurer? In a recent study, an LLM gave diametrically opposed treatment recommendations for the same child depending on whether it was prompted as a pediatric endocrinologist or an insurance company employee. Today, neither the physician nor the patient can know in advance which value framework an AI system embodies.
In a $5 trillion healthcare system, financial pressure to use AI to influence clinical decisions—for reasons beyond patient benefit—will only intensify. If instead we ensure that AI systems are aligned to serve patients first, medical decisions are likely to become safer, more up-to-date with the latest science, and better communicated to patients.
Four interconnected efforts to make clinical AI values visible and accountable
A large-scale international study collecting tens of thousands of responses from clinicians and patients across multiple categories of clinical decisions. The survey captures diversity in clinician training, geography, specialty, and patient backgrounds—building the empirical foundation for understanding how values shape medical decisions.
A novel, domain-independent measure that quantifies how effectively an AI model can be aligned to a given preference function or gold standard. In our initial study, three frontier LLMs (GPT-4o, Claude 3.5 Sonnet, and Gemini Advanced) showed significant variability in alignment effectiveness—and models that performed well pre-alignment sometimes degraded post-alignment.
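The core idea—scoring a model's agreement with a gold standard before and after an alignment step, and watching for degradation—can be illustrated with a toy sketch. This is not the published ACI formula; the agreement metric, the decision labels, and the model outputs below are all hypothetical, chosen only to show the pre/post comparison the study describes.

```python
# Toy illustration of measuring alignment effectiveness against a
# gold standard. NOT the actual ACI computation — just a sketch of
# the pre/post-alignment comparison described in the text.
def agreement(decisions, gold):
    """Fraction of decisions that match the gold standard."""
    return sum(d == g for d, g in zip(decisions, gold)) / len(gold)

# Hypothetical triage decisions for four cases.
gold         = ["treat", "refer", "observe", "treat"]
pre_aligned  = ["treat", "observe", "observe", "refer"]  # model as shipped
post_aligned = ["treat", "refer", "observe", "observe"]  # after alignment step

pre = agreement(pre_aligned, gold)    # 0.5
post = agreement(post_aligned, gold)  # 0.75
delta = post - pre  # positive = alignment helped; negative = it degraded the model
```

A real measure would also need to capture the sensitivity the study reports—small changes in the gold standard shifting model rankings—which a single agreement score cannot show.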
A transparent labeling system—proposed by the RAISE 2025 symposium consensus—that documents how AI systems navigate value-laden clinical trade-offs. The VIM would make transparent whether an AI system leans toward overdiagnosis, prioritizes cost-sparing, favors patient autonomy, or emphasizes preventing imminent harm, enabling patients, regulators, and health systems to make informed choices.
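To make the idea concrete, a machine-readable VIM label might look something like the sketch below. The four value dimensions come from the description above; the schema, field names, and the signed scale are assumptions for illustration only—the consensus statement does not prescribe a format.

```python
# Hypothetical sketch of a machine-readable "Values in the Model" label.
# Dimension names reflect the trade-offs named in the text; the schema
# and the -1.0..+1.0 scale are illustrative assumptions.
vim_label = {
    "system": "example-clinical-llm",   # hypothetical system identifier
    "label_version": "2025-09",
    "value_dimensions": {
        # Sign indicates which side of each trade-off the system tends
        # toward under evaluation (positive = first-named pole).
        "overdiagnosis_vs_undertesting": 0.4,
        "cost_sparing_vs_maximal_care": -0.2,
        "patient_autonomy_vs_paternalism": 0.6,
        "imminent_harm_prevention": 0.8,
    },
}

def summarize(label):
    """Render the label as one line per dimension for a human reader."""
    return [f"{dim}: {score:+.1f}"
            for dim, score in label["value_dimensions"].items()]
```

A structured label like this is what would let a regulator or health system compare two AI systems' value leanings side by side before deployment.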
A protocol for event-level logging of clinical AI—healthcare's equivalent of syslog. Each MedLog record captures nine core fields (header, model, user, target, inputs, artifacts, outputs, outcomes, feedback) for every AI interaction in clinical care. Four real-world pilots are running at sites in Ho Chi Minh City, Zurich, San Diego, and New York.
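A single record under this scheme can be sketched as follows. The nine field names are those listed above; the internal structure of each field (IDs, timestamps, free-text payloads) is a hypothetical rendering, since the protocol's wire format is not specified here.

```python
from datetime import datetime, timezone

# Illustrative sketch of one MedLog-style record. The nine top-level
# field names come from the protocol description; everything inside
# them is an assumption for illustration.
def make_record(model_id, user_id, patient_id, inputs, outputs):
    return {
        "header": {"event_id": "evt-0001",          # hypothetical ID scheme
                   "timestamp": datetime.now(timezone.utc).isoformat()},
        "model": {"id": model_id},                  # which AI system was invoked
        "user": {"id": user_id},                    # who initiated the interaction
        "target": {"patient_id": patient_id},       # whom the decision concerns
        "inputs": inputs,                           # data/prompt sent to the model
        "artifacts": [],                            # intermediate outputs, retrieved docs
        "outputs": outputs,                         # response as delivered to the user
        "outcomes": None,                           # clinical result, logged later
        "feedback": None,                           # clinician rating, logged later
    }

record = make_record("example-llm", "clinician-17", "pt-042",
                     {"question": "antibiotic dosing query"},
                     {"text": "model response"})
```

As with syslog, the value comes from every deployment emitting the same nine fields, so post-deployment surveillance tools can be written once and run anywhere.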
Research advancing the understanding of values in clinical AI
Goldberg C, Balicer RD, Bhat M, ... Kohane I. A consensus statement from the RAISE symposium proposing the "Values in the Model" (VIM) framework—a transparent labeling system that documents how AI systems navigate value-laden clinical trade-offs.
Noori A, Rodman A, Karthikesalingam A, ... Kohane IS, Zitnik M. Introduces MedLog, a universal protocol for event-level logging of clinical AI. Includes four real-world clinical pilots across three continents demonstrating the protocol's utility for post-deployment surveillance.
Kohane I. Introduces the Alignment Compliance Index (ACI) and evaluates three frontier LLMs on medical triage decisions. Finds significant variability in alignment effectiveness—models that performed well pre-alignment sometimes degraded post-alignment, and small changes in the gold standard led to large shifts in model rankings.
Yu K-H, Healey E, Leong T-Y, Kohane IS, Manrai AK. Comprehensive analysis of how human values influence AI outputs in clinical settings, providing frameworks for incorporating ethical considerations into medical AI systems.
Kohane I. A public-facing argument for value transparency in medical AI, advising patients to interrogate their AI advisers and demand transparency from the companies building these tools.
In September 2025, clinicians, ethicists, legal scholars, technologists, and health system leaders convened at the Responsible AI for Social and Ethical Healthcare (RAISE) symposium in Portland, Maine. The participants agreed that while government guidelines and AI model cards describe broad principles and technical specifications, they fail to reveal something crucial: the values embedded in actual clinical decisions.
The symposium produced a consensus statement calling for two parallel tracks: public debate about how values are addressed in AI for medicine, and carefully monitored pilot projects in leading health systems that begin to craft and test VIM labels for the AI systems already entering clinical use.
We are actively enrolling clinicians and patients worldwide. The survey takes approximately 15–20 minutes and involves reviewing realistic clinical scenarios requiring value-laden decisions.