Illustration: Green binary code on a black surface with an earthquake-like crack through the center.

Filling the gaps in U.S. health data

The COVID-19 pandemic demonstrated the urgent need for an information overhaul.
Written by
Michelle A. Williams
Gabriel Seidman
January 17, 2024

The COVID-19 pandemic highlighted huge and troubling gaps in the United States’ use of health data. Those well-documented failures in collecting, sharing, and analyzing data made it far harder than it should have been for local, state, and federal officials to understand where the virus was spreading, who was dying, and how vaccines were being distributed.

The federal government has taken several steps to nudge the fractured U.S. data system toward reform. The Centers for Disease Control and Prevention has launched a Data Modernization Initiative for local, state, and tribal jurisdictions, though the funding is far from adequate. And new federal rules require electronic health record systems to make bulk data available in a standardized format, simplifying efforts to monitor trends across health care systems.

That’s a start. But much more needs to be done.

A strong national health data ecosystem would securely link individual medical records, public health data, and information about the conditions affecting health in specific neighborhoods—giving officials a clear view of the risks affecting various communities and catalyzing smart strategies for mitigating those risks. It would be both an essential tool for handling future pandemics and a vital resource for improving day-to-day public health outside of emergencies like COVID.

Last year, we gathered ideas from two dozen experts from the public, private, and nonprofit sectors to sketch a vision for building this ecosystem, which we published in a recent article in the American Journal of Public Health. We came away with three concrete proposals.

Encourage states to designate entities for data collection, sharing, and use

One of the biggest challenges to building a strong national data ecosystem is that most states and jurisdictions are acting in isolation; there are no universal standards for collecting, sharing, or safeguarding data, nor guidelines to enable interoperability. In 2022 alone, 28 legislative bills pertaining to public health information or reporting were enacted in 17 states.

Even well-meaning laws can obstruct progress toward effective data sharing. New Jersey, for instance, has written into law the specific terms that need to be used when clinical labs collect data on race, ethnicity, sexual orientation, and gender identity. The goal—to improve visibility into health disparities—is laudable. But “hard coding” specific terminology into law makes it more difficult to compare and exchange data across states, or even across sectors within a state.
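To make the comparison problem concrete, here is a minimal sketch of what analysts must do when two jurisdictions record the same concept under different mandated labels: maintain a crosswalk from each local vocabulary to a shared code before any pooled count is possible. The label sets, shared codes, and function names below are hypothetical illustrations, not the terms in any actual statute or standard.

```python
from collections import Counter

# Hypothetical label sets for illustration only; these are NOT the terms
# written into any actual state law.
STATE_A_TO_SHARED = {
    "Hispanic or Latino": "hispanic_latino",
    "Not Hispanic or Latino": "not_hispanic_latino",
}
STATE_B_TO_SHARED = {
    "Latino/a/x": "hispanic_latino",
    "Non-Latino": "not_hispanic_latino",
}


def harmonize(records, crosswalk, field="ethnicity"):
    """Return copies of records with `field` mapped to a shared code.

    Values missing from the crosswalk are marked 'unmapped' rather than
    silently dropped, so gaps in the mapping stay visible.
    """
    harmonized = []
    for rec in records:
        new_rec = dict(rec)
        new_rec[field] = crosswalk.get(rec.get(field), "unmapped")
        harmonized.append(new_rec)
    return harmonized


def pooled_counts(*harmonized_sets, field="ethnicity"):
    """Count records per shared code across several harmonized datasets."""
    counts = Counter()
    for records in harmonized_sets:
        counts.update(rec[field] for rec in records)
    return dict(counts)


state_a = [
    {"id": 1, "ethnicity": "Hispanic or Latino"},
    {"id": 2, "ethnicity": "Not Hispanic or Latino"},
]
state_b = [
    {"id": 9, "ethnicity": "Latino/a/x"},
    {"id": 10, "ethnicity": "Declined"},
]

pooled = pooled_counts(
    harmonize(state_a, STATE_A_TO_SHARED),
    harmonize(state_b, STATE_B_TO_SHARED),
)
print(pooled)
```

Every pair of jurisdictions with different mandated terms needs a mapping like this, and each mapping has to be revised whenever a statute changes. A shared model vocabulary would remove that step.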

That’s why we recommend that every state designate an entity for collecting, sharing, and using clinical, public health, social determinants, and other administrative data. This approach is consistent with the “health data utility” model that some organizations have already adopted.

To support states in the creation of these entities, a nonpartisan commission led by the federal government should study and develop evidence-based guidelines for the collection, exchange, and analysis of health data while keeping patient privacy and data security at the forefront. Federal funding should encourage states to adopt these guidelines.

This commission should also draft model legislation and regulatory templates. A repository of evidence-based model legislation would be hugely helpful for local and state policy makers. As more jurisdictions adopted the shared model and its language, the U.S. would begin to build a more effective and interoperable data ecosystem. 

Expand regulatory “sandboxes” to promote innovation

Regulatory sandboxes allow entrepreneurs and health system innovators to try new concepts in a controlled environment. As one example, the Massachusetts Digital Health Sandbox offers a variety of programs that can be configured to emulate clinical and IT environments. Companies can request access to real health data to test their technology within these sandbox spaces.

Such environments can dramatically speed up the R&D process by enabling companies to test and iterate on their ideas in realistic conditions—without putting the operations of any real-world health systems at risk.

We’d like to see more such sandboxes nationwide, including environments that bring together data streams from many different agencies across multiple states.

Build the business case for a nationwide health data ecosystem

Modernizing the nation’s health data ecosystem will take a huge investment; estimates range from about $8 billion to nearly $37 billion. It’s essential to build the business case for that expenditure.

Some data points already exist. For instance, the failure to integrate and appropriately leverage health data costs the U.S. tens of billions each year in wasted medical expenditures. But there has been no comprehensive attempt to assess the return on investment from building a well-functioning health data system.

We contend that the benefits will be enormous. Knitting together clinical data from individual patients, population data from public health surveys, and data about the social, economic, and environmental factors affecting health will allow policy makers to craft highly targeted, and often highly cost-effective, interventions to save lives.

Joshua Sharfstein, Maryland’s former Secretary of Health & Mental Hygiene, gave us a powerful example: A robust health information exchange known as CRISP enables public health officials to spot patterns of risk. For example, during the COVID pandemic, CRISP used master patient indexing of electronic medical records to rapidly identify COVID cases and conduct outbreak investigations.
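For readers unfamiliar with master patient indexing, the idea is to recognize that records held by different providers belong to the same person. The sketch below shows the simplest deterministic version, matching on normalized name plus date of birth. It is an illustration under our own assumptions, not a description of how CRISP works; the field names and sample records are hypothetical, and production systems typically use more identifiers and probabilistic matching.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Lowercase a name and strip accents, punctuation, and spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Combining accent marks and punctuation are not alphabetic, so they drop out.
    return "".join(ch for ch in decomposed if ch.isalpha()).lower()


def match_key(record):
    """Build the deterministic key used to decide two records are one person."""
    return (
        normalize_name(record["last"]),
        normalize_name(record["first"]),
        record["dob"],
    )


def build_index(records):
    """Group source record IDs by match key (a toy master patient index)."""
    index = defaultdict(list)
    for rec in records:
        index[match_key(rec)].append(rec["source_id"])
    return dict(index)


# Hypothetical records from three different systems.
records = [
    {"source_id": "hospital:1", "first": "José", "last": "O'Brien", "dob": "1980-02-03"},
    {"source_id": "lab:77", "first": "jose", "last": "OBrien", "dob": "1980-02-03"},
    {"source_id": "clinic:5", "first": "Jose", "last": "O'Brien", "dob": "1981-02-03"},
]

index = build_index(records)
for key, source_ids in index.items():
    print(key, source_ids)
```

Here the hospital and lab records collapse into one patient despite differences in spelling, while the clinic record stays separate because the birth year differs. Once records are linked this way, a positive lab result can be connected to a patient's other encounters, which is what makes rapid case identification possible.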

Another example comes from Massachusetts, where a groundbreaking project united 23 data streams—from medical providers, social service agencies, prisons, shelters, first responders and more—in one data warehouse. For the first time, officials were able to gather a comprehensive picture of how the opioid epidemic affected specific populations. Monica Bharel, the state’s former commissioner of public health, told us that these insights allowed officials to tailor public policy and clinical guidance so they could finally halt what had seemed like an inexorable rise in opioid-related deaths.

Quantifying the benefits of such projects will help policy makers and taxpayers understand why a major investment in data modernization is warranted.

Modernizing the national health data ecosystem will be a long and laborious process. Yet we are confident it will pay huge dividends in preventing disease, protecting health, and promoting well-being in every community. The COVID-19 pandemic showed us how broken our current system is. Now is the time to start fixing it.

Source image: wildpixel / iStock

Michelle A. Williams
Michelle A. Williams is Joan and Julius Jacobson Professor of Epidemiology and Public Health at the Harvard T.H. Chan School of Public Health. She is the former dean of faculty at the Chan School.
Gabriel Seidman
Gabriel Seidman is the director of policy at the Ellison Institute of Technology and an alumnus of the Harvard T.H. Chan School of Public Health.
