The Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute

Data Repository

The CHOICE Institute hosts a variety of healthcare datasets that are available to the UW community under varying permission structures. The available datasets and their access details are described below.

The Merative MarketScan® Commercial Claims and Encounters Research Databases contain longitudinal inpatient, outpatient, and pharmacy claims and insurance coverage data for patients across the U.S. who are covered by commercial insurance plans. The inpatient and outpatient claims databases include procedure- and visit-level details from medical claims, such as ICD-9-CM diagnosis and procedure codes, Current Procedural Terminology (CPT) medical procedure codes, dates of service, and variables describing financial expenditures. The pharmacy claims database provides details including the National Drug Codes (NDC) of the drugs dispensed, dates dispensed, quantity and days’ supply, and payments made for each claim. A separate eligibility and demographics file includes additional information about each subject, such as age, gender, insurance plan type, employment status and classification, geographic location, and enrollment status by month.

Dr. Douglas Barthold

Data availability:
2007 – (Current year – 2)

Data location:
HSERV & UW Data Collaborative (UWDC) Servers


Access Renewal Cycle: Annual

Accessibility: CHASE Alliance / CHOICE consortium members

SEER-Medicare linked database (2007-2014)

The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the Surveillance, Epidemiology and End Results (SEER) program of cancer registries, which collect clinical, demographic, and cause-of-death information for persons with cancer, and from the Medicare claims for covered health care services from the time of a person’s Medicare eligibility until death.

The linkage of these two data sources results in a unique population-based source of information that can be used for an array of epidemiological and health services research. For example, investigators using this combined dataset have conducted studies on patterns of care for persons with cancer before a cancer diagnosis, over the period of initial diagnosis and treatment, and during long-term follow-up. Investigators have also examined the use of cancer tests and procedures and the costs of cancer treatment.

Dr. Anirban Basu 

Data availability:
2007 – 2014 (breast, prostate, lung, colorectal, leukemia, melanoma, non-Hodgkin lymphoma, CML, endometrial, bladder, kidney, and liver)

Data location:
UWDC Servers




Access Renewal Cycle: Annual

Accessibility: UW researchers (with permission from Prof. Anirban Basu)

All investigators must be aware of the following:

Access to these data is permitted on an annual basis only. To renew access, if needed, a request must be submitted to a sponsoring CHOICE faculty member and UWDC.

All investigators must understand the requirements for publications, including manuscript review by SEER-Medicare before a manuscript is submitted to any journal for review; additionally, proper acknowledgements must be included in the manuscript.

Investigators must ensure that the proposed research analysis aligns with the aims stated in the CHOICE SEER-MEDICARE proposal.

All manuscript drafts must be submitted to the CHOICE faculty sponsor for review prior to publishing. Your CHOICE faculty sponsor will work directly with SEER-Medicare contacts.

No individual-level data can be exported from UWDC servers.

Requesting Data Access

To request access to these data via UW Data Collaborative (UWDC) remote desktops, please review and complete the steps below:

Fill out the UWDC Request for Data Access form:

  • Please note: you will need to provide a brief description of the proposed research.

Review and sign the UWDC DUA

Review and sign the SEER-Medicare DUA

  • Please return the SEER-MEDICARE DUA to Dr. Basu once completed.

National Council for Prescription Drug Programs (NCPDP) DataQ version 3.1

This dataset provides basic information for approximately 80,000 pharmacies throughout the United States, including address, NPI number, license number, Medicare/Medicaid numbers, 340B status, and immunization services provided. It is a one-time dataset, current as of April 2022.

For use and access information, please contact Jennifer Bacci.