FYI: The same drug. Four cost-effectiveness studies. Four wildly different answers.

Yesterday we introduced you to a UChicago researcher whose math just predicted the federal government's GLP-1 retreat. Today we're going to teach you to read his papers the way the policy world reads them, and we're going to use the four major cost-effectiveness studies of GLP-1s competing right now as the live example.

The reason this matters is direct. Cost-effectiveness analysis is the single biggest invisible force shaping which GLP-1 prescriptions get covered, denied, dropped from formularies, or excluded from Medicare expansion. Once you can read these studies yourself, you stop being intimidated by citations and start being able to push back on insurance denials with the right framing.

This is Off Label, Not Medical Advice.

Let's go.

The Morning Read: The same drug. Four cost-effectiveness studies. Four wildly different answers. Today, you learn to read them like a pro.

TODAY'S INFO

Two acronyms first.

QALY is a quality-adjusted life year. It's a unit that combines length of life and quality of life into one number. Think of it as currency. One QALY equals one year of perfect health. Half a year of perfect health equals 0.5 QALYs. Two years of life lived at 50% quality of life equals 1 QALY. Researchers assign a "utility weight" between 0 (dead) and 1 (perfect health) to different health states, then multiply that weight by the years lived in that state.
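If it helps to see the arithmetic, here is a minimal Python sketch of the QALY math above. The function names are ours, not from any of the studies, and real models derive utility weights from survey instruments rather than typing them in by hand.

```python
def qalys(utility_weight: float, years: float) -> float:
    """Quality-adjusted life years for time spent in one health state."""
    if not 0.0 <= utility_weight <= 1.0:
        raise ValueError("utility weight must be between 0 (dead) and 1 (perfect health)")
    if years < 0:
        raise ValueError("years cannot be negative")
    return utility_weight * years


def total_qalys(states: list[tuple[float, float]]) -> float:
    """Sum QALYs across a sequence of (utility_weight, years) health states."""
    return sum(qalys(weight, years) for weight, years in states)


print(qalys(1.0, 1))    # one year of perfect health
print(qalys(1.0, 0.5))  # half a year of perfect health
print(qalys(0.5, 2))    # two years at 50% quality of life
```

A lifetime model is essentially `total_qalys` run over decades of simulated health states, which is why small changes in the weights add up.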

ICER is the Incremental Cost-Effectiveness Ratio. It's the extra cost per extra QALY: the difference in cost between a new treatment and what it replaces, divided by the difference in QALYs. If a new drug costs $30,000 more than what it's replacing and produces 1 additional QALY of health benefit, the ICER is $30,000 per QALY. If the same drug costs $200,000 more and produces 1 additional QALY, the ICER is $200,000 per QALY.
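The same two examples as a short Python sketch. The baseline numbers (an old treatment costing $10,000 and yielding 10 QALYs) are hypothetical and only there so the function has something to subtract from.

```python
def icer(cost_new: float, cost_old: float, qalys_new: float, qalys_old: float) -> float:
    """Incremental cost-effectiveness ratio: extra dollars per extra QALY."""
    incremental_cost = cost_new - cost_old
    incremental_qalys = qalys_new - qalys_old
    if incremental_qalys <= 0:
        # The ratio is not meaningful when the new option adds no health benefit.
        raise ValueError("new treatment must add QALYs for the ratio to be meaningful")
    return incremental_cost / incremental_qalys


# Hypothetical baseline: the old treatment costs $10,000 and yields 10 QALYs.
print(icer(40_000, 10_000, 11.0, 10.0))   # $30,000 more for 1 extra QALY
print(icer(210_000, 10_000, 11.0, 10.0))  # $200,000 more for 1 extra QALY
```

Notice that only the differences matter. The absolute cost of either treatment never appears in the result.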

The US standard for "cost-effective" is generally an ICER below $100,000 to $150,000 per QALY. The UK uses about £20,000 to £30,000 per QALY. Anything above the threshold gets called "not cost-effective" and faces formulary restrictions, prior authorization barriers, or outright coverage denials.
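A rough sketch of how a threshold check works, using the ranges above. The three-way labels and the function name are our own simplification. Real payers apply their own rules, and the "borderline" band is just our name for ICERs that pass one end of the range and fail the other.

```python
def verdict(icer_per_qaly: float, low: float = 100_000, high: float = 150_000) -> str:
    """Classify an ICER against a threshold range (defaults: the US range, in USD)."""
    if icer_per_qaly < low:
        return "cost-effective"
    if icer_per_qaly < high:
        return "borderline"
    return "not cost-effective"


for value in (30_000, 120_000, 200_000):
    print(f"${value:,} per QALY: {verdict(value)}")

# The UK range from the text is in pounds, so the ICER passed in must be in pounds too.
print(verdict(25_000, low=20_000, high=30_000))
```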

(Confusingly, "ICER" is also the name of an organization, the Institute for Clinical and Economic Review. Same acronym. Different thing. We'll disambiguate as we go.)

That's the foundation. Now the interesting part.

By the numbers (the four GLP-1 cost-effectiveness analyses, all published in 2025):

$197,023 per QALY: Hwang and Kim, JAMA Health Forum, March 2025 (NOT cost-effective)

$53,400 per QALY: ICER Final Evidence Report, December 2025 (IS cost-effective)

$57,400 per QALY: Brigham, Annals of Internal Medicine, September 2025, knee osteoarthritis subgroup (IS cost-effective)

$10,817 per year value-based price: USC Schaeffer FAM/GRACE methodology, July 2025 (drugs may be UNDERVALUED)

Same drug. Same year. Four answers, with the per-QALY estimates alone nearly fourfold apart. Methodology drives the conclusion.

THE FOUR STUDIES

Study 1: Hwang and Kim, JAMA Health Forum, March 2025.

The paper Off Label profiled yesterday. Lead author Jennifer H. Hwang, DO. Senior author Dr. David Kim. Used the DOC-M simulation model on 4,823 NHANES individuals representing 126 million eligible US adults. Modeled lifetime outcomes. Found tirzepatide at $197,023 per QALY and semaglutide at $467,676 per QALY. Both drugs had a 0% probability of being cost-effective at any threshold between $100,000 and $200,000 per QALY. To reach the $100K threshold, tirzepatide would need a 30.5% price cut and semaglutide would need an 81.9% price cut. Conclusion: not cost-effective at current US prices.

Study 2: ICER Final Evidence Report, December 16, 2025.

Published by the Institute for Clinical and Economic Review (the organization). Tirzepatide ICER (the calculation) of $53,400 per QALY. Injectable semaglutide at $61,400 per QALY. Oral semaglutide at $69,300 per QALY. All three drugs cost-effective at the $100,000 per QALY threshold. ICER also calculated Health Benefit Price Benchmarks: target prices that would represent fair value. Their numbers: injectable semaglutide at $9,100 to $12,500 per year (current price $6,829, well under benchmark), tirzepatide at $11,700 to $16,100 per year (current price $7,973, way under benchmark). Translation: the drugs are cost-effective at current prices, possibly underpriced. But ICER also flagged a budget impact warning: at current prices, fewer than 1% of eligible patients could be treated before crossing their $880 million annual budget impact threshold. So cost-effective per individual, unaffordable at population scale.
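To see how a drug can pass the per-patient test and still trip a budget warning, here is a deliberately crude Python sketch using the prices and the $880 million threshold above. It ignores medical cost offsets and multi-year phasing, which a real budget impact analysis would include. The 126 million eligible-adult count is borrowed from Study 1 as an assumption, since the report's own eligibility count is not quoted here.

```python
def benchmark_position(current_price: float, benchmark_low: float, benchmark_high: float) -> str:
    """Say where a current price sits relative to a benchmark price range."""
    if current_price < benchmark_low:
        return "below benchmark range"
    if current_price > benchmark_high:
        return "above benchmark range"
    return "within benchmark range"


def patients_before_threshold(budget_threshold: int, annual_price: int) -> int:
    """Whole patients treatable for one year before gross drug spend crosses the threshold."""
    return budget_threshold // annual_price


BUDGET_THRESHOLD = 880_000_000  # annual budget impact threshold from the report
ELIGIBLE_ADULTS = 126_000_000   # assumption: borrowed from Study 1, not from the ICER report

drugs = [
    ("injectable semaglutide", 6_829, 9_100, 12_500),
    ("tirzepatide", 7_973, 11_700, 16_100),
]

for name, price, low, high in drugs:
    treated = patients_before_threshold(BUDGET_THRESHOLD, price)
    share = treated / ELIGIBLE_ADULTS
    print(f"{name}: {benchmark_position(price, low, high)}, "
          f"{treated:,} patients, {share:.2%} of eligible adults")
```

Even this gross-spend version lands well under 1% of eligible adults, which is the shape of the warning: the per-patient ratio and the population-level bill are separate questions.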

Study 3: Brigham, Annals of Internal Medicine, September 2025.

Lead author Elena Losina, PhD, at the Brigham PIVOT Center. Looked specifically at adults with knee osteoarthritis AND obesity. Used a different simulation model (the Osteoarthritis Policy Model). Found tirzepatide at $57,400 per QALY versus diet and exercise. Tirzepatide had a 64% probability of cost-effectiveness at $100,000 per QALY. Semaglutide at 34%. Conclusion: cost-effective in this subgroup.

Study 4: USC Schaeffer Center, July 2025.

Used a microsimulation called FAM (the Future Adult Model) with a GRACE adjustment. GRACE stands for Generalized Risk-Adjusted Cost-Effectiveness, a framework developed by Lakdawalla and Phelps in a series of papers beginning in 2020. The framework weights health gains in sicker patients more heavily than standard QALY does. Their result: a value-based price of $10,817 per year for tirzepatide for older Medicare populations. For younger cohorts (ages 44 to 46), $12,202 to $16,765 per year. Both figures are higher than the current price of $7,973 per year, suggesting the drugs may be undervalued.

So which one is right.

All of them.

The drugs didn't change. The methodologies did.
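One phrase from Studies 1 and 3 deserves a quick unpacking: "probability of cost-effectiveness." Modelers typically rerun the simulation many times with inputs varied within their uncertainty ranges, then report the share of runs that clear the threshold. Below is a minimal Python sketch of that counting step. The four draws are made up for illustration, and none of this is taken from the studies' actual code.

```python
def probability_cost_effective(draws: list[tuple[float, float]], threshold: float) -> float:
    """Share of simulation draws that are cost-effective at a given threshold.

    Each draw is (incremental_cost, incremental_qalys). A draw counts as
    cost-effective when its net monetary benefit is positive:
    threshold * incremental_qalys - incremental_cost > 0.
    """
    if not draws:
        raise ValueError("need at least one draw")
    favorable = sum(1 for cost, gain in draws if threshold * gain - cost > 0)
    return favorable / len(draws)


# Hypothetical draws: (extra dollars, extra QALYs) from four model runs.
example_draws = [(50_000, 1.0), (150_000, 1.0), (80_000, 1.0), (120_000, 1.0)]

print(probability_cost_effective(example_draws, threshold=100_000))
print(probability_cost_effective(example_draws, threshold=200_000))
```

Using net monetary benefit instead of dividing cost by QALYs avoids trouble when a draw has zero or negative QALY gain. A real analysis would use thousands of draws, not four.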

THE SIX THINGS TO LOOK FOR

These are the six dials each of these studies turned to land on different answers. Once you know the dials, you can read any cost-effectiveness study independently.

1. Population modeled.

Who is the study about. Hwang and Kim modeled the broad eligible population: 126 million US adults. The Brigham team studied adults with knee osteoarthritis AND obesity. Schaeffer ran separate models for older Medicare and younger cohorts. The narrower the population, the less generalizable the result. The narrower the population AND the sicker the included patients, the more QALY gain there is to capture per dollar spent. That's why Brigham's $57,400 per QALY makes sense and Hwang and Kim's $197,023 per QALY makes sense even though they're modeling the same drug.

2. Time horizon.

How long does the model run. All four studies used lifetime simulations, which sounds like it should produce comparable results. It doesn't, because lifetime models contain decades of assumptions, and the assumptions diverge. Schaeffer specifically tested both a 4-year and a lifetime diabetes prevention assumption. The longer you assume the drug stays effective, the more QALYs accumulate. The results swing dramatically.

3. Prices assumed.

Net price or list price. Net prices are after manufacturer rebates and PBM discounts. List prices are what you see on the bottle. Net prices are typically 30 to 50% lower than list prices. Hwang and Kim used 2023 net prices of $700 to $800 per month. ICER used 2024 net prices from SSR Health: $6,829 per year for semaglutide and $7,973 per year for tirzepatide. Different prices, different ICERs. If you ever read a cost-effectiveness study without checking the price assumption, you don't actually know what you read.
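A quick Python sketch for putting price assumptions on the same footing before comparing studies. The monthly and annual figures come from the paragraph above. The $1,000 per month list price at the end is hypothetical, used only to show what a 30% to 50% discount does.

```python
def annualize(monthly_price: float) -> float:
    """Convert a monthly price to an annual one."""
    return monthly_price * 12


def net_from_list(list_price: float, discount: float) -> float:
    """Net price after rebates and discounts, with discount as a fraction of list."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be a fraction between 0 and 1")
    return list_price * (1.0 - discount)


def percent_below(price: float, reference: float) -> float:
    """How far a price sits below a reference price, as a fraction of the reference."""
    return (reference - price) / reference


hwang_kim_low, hwang_kim_high = annualize(700), annualize(800)
icer_report_annual = {"semaglutide": 6_829, "tirzepatide": 7_973}

print(f"Hwang and Kim: ${hwang_kim_low:,.0f} to ${hwang_kim_high:,.0f} per year")
for drug, price in icer_report_annual.items():
    gap = percent_below(price, hwang_kim_low)
    print(f"ICER report, {drug}: ${price:,} per year, "
          f"{gap:.1%} below the low end of Hwang and Kim")

# Hypothetical list price of $1,000 per month with the 30% and 50% discounts from the text.
for discount in (0.30, 0.50):
    print(f"{discount:.0%} discount: ${annualize(net_from_list(1_000, discount)):,.0f} per year")
```

Annualizing first is the whole trick. A monthly figure in one paper and an annual figure in another look unrelated until they share a unit.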

4. Scope of benefits included.

What did the model count as a benefit. Hwang and Kim modeled cardiometabolic outcomes only. They did not include sleep apnea benefits, knee osteoarthritis pain reduction, or productivity gains. Brigham included knee pain reduction (a huge QALY boost in their population). ICER included cardiovascular events plus selected quality-of-life dimensions. Schaeffer included earnings, disability claims, and broader social value. Excluded benefits don't show up in the math. Period. So the same drug with the same trial data can produce different ICERs depending on what the modelers chose to count.
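Here is a toy Python illustration of why scope matters so much. Every number in it is invented: the incremental cost and the QALY gain attached to each benefit category are placeholders, not values from any of the four studies. The point is only the mechanism, that counting more benefits grows the denominator and shrinks the ratio.

```python
def icer_with_scope(incremental_cost: float,
                    qaly_gain_by_benefit: dict[str, float],
                    included: list[str]) -> float:
    """ICER when only the listed benefit categories are counted."""
    counted_gain = sum(qaly_gain_by_benefit[name] for name in included)
    if counted_gain <= 0:
        raise ValueError("at least one included benefit must add QALYs")
    return incremental_cost / counted_gain


# Hypothetical lifetime figures for one patient (illustrative only).
incremental_cost = 60_000
qaly_gain_by_benefit = {
    "cardiometabolic": 0.3,
    "knee_pain": 0.5,
    "sleep_apnea": 0.2,
}

narrow = icer_with_scope(incremental_cost, qaly_gain_by_benefit, ["cardiometabolic"])
broad = icer_with_scope(incremental_cost, qaly_gain_by_benefit, list(qaly_gain_by_benefit))

print(f"Narrow scope: ${narrow:,.0f} per QALY")
print(f"Broad scope:  ${broad:,.0f} per QALY")
```

A fuller version would also let broader scope change the cost side, since avoided surgeries or hospitalizations reduce incremental cost as well.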

5. Discount rate.

How much less are future costs and benefits worth over time. The standard is 3% per year, recommended by the Second Panel on Cost-Effectiveness in Health and Medicine (Sanders et al., 2016). All four GLP-1 studies use 3%. This is one of the rare points of agreement. Watch for studies that deviate.
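A small Python sketch of what 3% discounting does, which also shows why time horizon matters. It assumes end-of-year discounting and a constant benefit of 1 QALY per year, both simplifications of ours. The 30-year horizon is a stand-in for "lifetime."

```python
def discount_factor(year: int, rate: float = 0.03) -> float:
    """Present value today of one unit received at the end of the given year."""
    return 1.0 / (1.0 + rate) ** year


def discounted_total(annual_value: float, years: int, rate: float = 0.03) -> float:
    """Present value of the same annual amount received for a number of years."""
    return sum(annual_value * discount_factor(t, rate) for t in range(1, years + 1))


for horizon in (4, 30):
    undiscounted = 1.0 * horizon
    discounted = discounted_total(1.0, horizon)
    print(f"{horizon} years: {undiscounted:.1f} QALYs undiscounted, "
          f"{discounted:.2f} QALYs at 3%")
```

Discounting barely touches a 4-year horizon and takes a large bite out of a 30-year one. So a model that assumes decades of benefit is also the model most sensitive to the discount rate.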

6. Methodology framework.

QALY, evLY, or GRACE. This is the lever that flips the answer. Standard QALY treats one year of perfect health as equally valuable regardless of who is gaining it. ICER's evLY (equal value life-year) was created in 2019 specifically to comply with US anti-discrimination laws that bar Medicare from using QALYs. GRACE assigns more value to health gains for sicker patients. Same data, different framework, different conclusion. Hwang and Kim used standard QALY. Brigham used standard QALY. ICER used QALY plus evLY. Schaeffer used GRACE. The methodology choice is why Hwang and Kim got $197K and Schaeffer got drugs being undervalued. They are not contradicting each other. They are using different rulers.
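To make the "different rulers" idea concrete, here is a simplified Python sketch contrasting standard QALY with an equal-value approach. Treat it as our reading of the general idea, not ICER's published method: added years of life are valued at one fixed weight for everyone, while quality-of-life gains in years the patient would have lived anyway are counted the usual way. The 0.851 weight is an assumption for illustration. GRACE involves more machinery (risk aversion, severity weighting) than fits in a sketch, so it is left out.

```python
HEALTHY_WEIGHT = 0.851  # assumption for illustration: fixed weight for added life-years


def qaly_gain(shared_years: float, u_untreated: float, u_treated: float,
              extra_years: float) -> float:
    """Standard QALY gain: added years are valued at the patient's own utility."""
    quality_part = shared_years * (u_treated - u_untreated)
    length_part = extra_years * u_treated
    return quality_part + length_part


def evly_gain(shared_years: float, u_untreated: float, u_treated: float,
              extra_years: float, healthy_weight: float = HEALTHY_WEIGHT) -> float:
    """Equal-value gain: added years get the same fixed weight for every patient."""
    quality_part = shared_years * (u_treated - u_untreated)
    length_part = extra_years * healthy_weight
    return quality_part + length_part


# Two hypothetical patients each gain 2 extra years with no change in quality of life.
for label, utility in (("healthier patient", 0.8), ("sicker patient", 0.5)):
    q = qaly_gain(10, utility, utility, 2)
    e = evly_gain(10, utility, utility, 2)
    print(f"{label}: QALY gain {q:.3f}, equal-value gain {e:.3f}")
```

Under standard QALY the sicker patient's two extra years count for less. Under the equal-value version both patients get the same credit. Same data, different ruler.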

If you check these six things on any cost-effectiveness study, you can read it as well as the people writing it.

THE FINE PRINT

Three things most coverage of cost-effectiveness analysis won't tell you.

1. Medicare can't legally use QALYs. Your private insurance can.

The Affordable Care Act of 2010 prohibits Medicare from using metrics that "treat extending the life of an elderly, disabled, or terminally ill individual as of lower value than extending the life of an individual who is younger, nondisabled, or not terminally ill." The Inflation Reduction Act of 2022 doubled down on this prohibition. So Medicare cannot use QALY-based cost-effectiveness analysis in coverage decisions.

But the law applies only to Medicare and CMS. Private insurers, employer-sponsored plans, and pharmacy benefit managers face no such restriction. CVS Caremark started using ICER QALY data in August 2018, with a hard cap at $100,000 per QALY for formulary exclusion. Express Scripts (Cigna) and Optum Rx (UnitedHealth Group) followed. The same metric Congress decided was too discriminatory for federal Medicare is the metric your employer-sponsored insurance uses every day to decide whether your prescription is covered. The legal asymmetry is structural, not accidental.

2. CVS Caremark dropped Zepbound from its formulary in July 2025. The math was running upstream the whole time.

Effective July 1, 2025, CVS Caremark removed Zepbound (tirzepatide) from its formularies and now prefers only Wegovy. CVS framed this as a clinical decision. The timing aligns with the Hwang-Kim paper publication in March 2025 and the start of ICER's review process in May 2025. Tirzepatide had higher absolute net prices than semaglutide, making it harder to justify on a cost-effectiveness basis even within ICER's more favorable framework.

The patient sees "your drug is no longer covered." The math driving that decision had been running in the background for years. The infrastructure to do this has been in place since 2018. What's new is GLP-1 demand reaching the volumes that finally made the math impossible to ignore.

3. The same drug can be cost-effective in one population and not cost-effective in another. Both findings can be true.

The Brigham team found tirzepatide cost-effective at $57,400 per QALY in the knee osteoarthritis subgroup. Hwang and Kim found tirzepatide not cost-effective at $197,023 per QALY in the broad population. Both papers are correct. The Brigham analysis captured pain reduction benefits the Hwang-Kim model did not include. That's not academic dishonesty. That's a different scope.

This matters operationally. If you face an insurance denial, the cost-effectiveness study you cite in your appeal should match your subgroup. If you have knee osteoarthritis and obesity, citing Brigham's findings is more relevant than citing the broad-population analysis. If you have diabetes and cardiovascular risk factors, the Hwang-Kim subgroup analysis (which found different results within different BMI categories and comorbidity groups) is more relevant. Same drugs, different studies, different applicability.

Text your doctor this: "I've been reading the four major cost-effectiveness analyses of GLP-1s published in 2025. They reach different conclusions because they use different populations, time horizons, prices, scopes of benefits, and methodologies. Can we look at which subgroup my situation fits best, so that if I face an insurance denial we can reference the cost-effectiveness study that includes my comorbidities and best supports my prior authorization appeal?"

Copy, paste, send.

THE CULTURE BEAT

The Big Three pharmacy benefit managers (CVS Caremark, Express Scripts, Optum Rx) functionally decide which Americans get which drugs at the formulary level. All three use ICER QALY reports as a primary input. CVS Caremark made formulary exclusion explicit policy in 2018. Express Scripts and Optum Rx followed. The same infrastructure that has been quietly excluding hundreds of drugs each year for the past 8 years is now applied to GLP-1s. This is the largest invisible system in American healthcare, and almost nobody outside the industry knows the name of the metric driving it.

Methodology fights are now political battles. GRACE was created to comply with anti-discrimination rules. evLY was created to comply with anti-discrimination rules. The Pioneer Institute is preparing legal challenges to state Medicaid programs adopting ICER reviews. The methodology debate is not academic. It is the proxy fight for whether disabled patients, elderly patients, and chronic-disease patients get equal treatment access.

Insurance companies hide behind the math because the math is convenient. When CVS Caremark removed Zepbound, they did not say "we removed it because the cost-effectiveness analysis didn't favor it." They said it was a clinical decision. The math was the actual decision driver. The clinical framing was the public-facing language. Patients deserve to know which is which.

There is a counter-data point worth knowing. Aon released findings on January 13, 2026 from a multi-year study of over 192,000 GLP-1 users. Female users showed a 47% reduction in cardiovascular hospitalizations, a 50% lower incidence of ovarian cancer, and a 14% lower incidence of breast cancer compared to non-users. For users with diabetes and 80%+ adherence, medical cost growth was 9 percentage points lower. These are the kinds of benefits that didn't appear in the Hwang-Kim model. If they prove out in further research, every cost-effectiveness analysis on GLP-1s gets revised upward. Schaeffer's "drugs may be undervalued" framing starts looking less like an outlier and more like a leading indicator.

Watch this: the ICER (the organization) 2026 review schedule. Their December 2025 Final Evidence Report on obesity drugs is the document driving the next round of CVS Caremark, Express Scripts, and Optum Rx formulary decisions. ICER has flagged oral semaglutide (Foundayo) as a near-term review priority. Their next report on GLP-1s will land before the end of 2026 and will functionally decide which oral GLP-1s get on which formularies in 2027. State Medicaid programs adopting cost-effectiveness reviews under BALANCE Model implementation are the second wave. We're tracking both.

WHAT'S NEXT

Tomorrow is a contrarian beat in the GLP-1 culture war. We're looking at an unexpected industry quietly fighting back hard against these drugs and what their internal data actually shows about the impact GLP-1 adoption is having on their bottom line.

Reader Q: "if Medicare can't legally use QALYs, why does my private insurance get to?"

Because Congress wrote the QALY prohibition into the Affordable Care Act of 2010 and the Inflation Reduction Act of 2022 specifically for Medicare and CMS, but did not extend it to private insurers, employer plans, or pharmacy benefit managers. CVS Caremark started using ICER QALY data for formulary exclusion in 2018, with a $100,000 per QALY cap. Express Scripts and Optum Rx followed. So the same metric Congress banned for federal coverage decisions is the metric your employer-sponsored insurance uses every day to decide whether your prescription is covered. The legal asymmetry is structural, not accidental.

Hit reply and tell us what you want us to cover. Every reply is read. Story tips: [email protected]

See you in your inbox.

— Off Label

This is Off Label, Not Medical Advice. Content is for informational purposes only. Always consult a qualified healthcare provider before making medical decisions about starting, stopping, or modifying any prescription medication. Off Label is not anti-medication, not anti-research, and not anti-cost-effectiveness analysis. Cost-effectiveness analysis is a useful tool for making allocation decisions about limited healthcare resources. The thesis of this issue is that the math driving your insurance coverage decisions deserves to be visible and understandable to the patients whose access is shaped by it. The studies discussed in this issue are all peer-reviewed or published by recognized research institutions. Different methodological choices producing different conclusions is a feature of the field, not a flaw, and reading the studies critically is how patients can engage with their own coverage decisions on more even ground.