Opinion
We can’t fix health disparities we don’t see
We know that the U.S. is plagued by health disparities that begin in the womb and continue throughout life. The life expectancy of Native Americans and Alaska Natives—67.9 years—is nearly 10 years less than the life expectancy of White Americans. Black infants are more than twice as likely to die as White infants. And we know all of this even though we actually have very little meaningful data about these populations.
Disaggregated data about racial and ethnic groups is often unavailable, not only in health but also in education, labor, and economics. According to my ongoing research, 49 state-level agencies do not publicly report data about their Middle Eastern and North African constituents (Michigan is the exception). Creating health equity requires more information: We cannot fix inequities we do not see.
Researchers often cite small sample size as an obstacle to collecting data from small minority groups in the United States. In reality, this is less a problem than a myth, one that raises the question: “Small compared to what?”
In fact, researchers have historically been good at doing science using small sample sizes. They’ve worked with community partners to create innovative recruitment strategies (for example, patient registries) and to develop health interventions for patients with rare diseases (including cancer and sickle cell disease). The Michigan Sickle Cell Data Collection (MiSCDC) Program, for instance, is a collaboration between the state department of public health and the University of Michigan, created to collect data and submit it to the Centers for Disease Control and Prevention. About 100,000 people in the U.S. live with sickle cell disease. More than 2.3 million people, meanwhile, identify as Vietnamese. If a condition affecting 100,000 people can be studied rigorously, a community more than 20 times that size is not too small to study.
Anne Schneider and Helen Ingram, in their journal article “Social Construction of Target Populations: Implications for Politics and Policy,” argue that groups that have little power and are often viewed negatively (mixed-race communities, for instance) are less likely to receive political favor. And yet other disenfranchised groups that are viewed more positively (people living with breast cancer, for instance) are more likely to have political advantages. Funding reflects this: The National Institutes of Health spends about $581 million on breast cancer research alone, compared with $525 million to operate the entire National Institute on Minority Health and Health Disparities.
Small sample sizes no longer justify avoiding the research necessary to eliminate disparities. Oversampling can yield statistically sound estimates for small groups, and it can be enhanced by, for instance, increasing survey outreach and accessibility for people who use languages other than English. Advanced statistical techniques such as imputation, Bayesian analysis, and hierarchical modeling allow researchers to make more efficient use of available data, providing more reliable estimates. Researchers can also supplement quantitative analyses with qualitative approaches or mixed methods to enrich the understanding of complex phenomena for minority groups.
Furthermore, interdisciplinary collaboration and data-sharing initiatives offer promising avenues for overcoming the constraints of small sample sizes: Meta-analyses, replication studies, and cross-validation exercises can all strengthen the robustness and generalizability of findings.
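To make the oversampling point concrete, here is a minimal sketch in Python. Every number in it is a hypothetical assumption chosen for illustration (a population of 1 million, a minority group making up 2 percent of it, a survey of 1,000 people, and made-up prevalence rates); none comes from a real survey. It compares the standard error of a prevalence estimate for the small group under proportional sampling and under oversampling, and shows how weighting each group by its population size recovers an overall estimate.

```python
import math


def standard_error(p, n):
    """Standard error of a proportion p estimated from a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)


def weighted_prevalence(strata):
    """Combine group-level prevalence estimates into one population estimate.

    Each item in `strata` is (population_size, estimated_prevalence).
    Weighting by population size undoes the distortion oversampling introduces.
    """
    total = sum(size for size, _ in strata)
    return sum(size * p for size, p in strata) / total


# Hypothetical population: a small group (2%) and everyone else (98%).
POP_SMALL, POP_LARGE = 20_000, 980_000
# Hypothetical true prevalence of some condition in each group.
P_SMALL, P_LARGE = 0.30, 0.10
SAMPLE = 1_000

# Proportional sampling: the small group gets its population share of the sample.
n_small_prop = round(SAMPLE * POP_SMALL / (POP_SMALL + POP_LARGE))
# Oversampling: deliberately recruit more members of the small group.
n_small_over = 300

se_prop = standard_error(P_SMALL, n_small_prop)
se_over = standard_error(P_SMALL, n_small_over)
overall = weighted_prevalence([(POP_SMALL, P_SMALL), (POP_LARGE, P_LARGE)])

print(f"Small-group respondents, proportional: {n_small_prop}")
print(f"Small-group respondents, oversampled:  {n_small_over}")
print(f"Standard error, proportional: {se_prop:.3f}")
print(f"Standard error, oversampled:  {se_over:.3f}")
print(f"Weighted overall prevalence:  {overall:.3f}")
```

Under these assumptions, the proportional design reaches only a handful of people in the small group, so its estimate is far noisier than the oversampled one. The sketch treats each group as a simple random sample and ignores nonresponse and other design effects that a real survey would need to address.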
In March 2024, the Biden administration took an important step toward addressing these problems by improving standards for maintaining, collecting, and presenting federal data on race and ethnicity, the first update in 27 years. But it’s not enough.
Federal agencies need to be held accountable for implementing these standards. One way is to establish an inventory of data disaggregation, which can be understood as a collection of data about minority groups in the population. Such an inventory would document the extent of agencies’ data collection efforts. We also need more federal agency research about populations within minority groups. Asian Americans, for instance, are not monolithic. People of Chinese origin have different health risks than people of Pakistani origin.
In short, we need to leave behind excuses for not doing more health research on minority groups. Better data can help us give all communities equal chances at good health.
There are new ways to gather data on marginalized groups. It’s time to use them.
Written by Tran T. Doan
This article originally appeared in Harvard Public Health magazine (https://harvardpublichealth.org/equity/small-sample-size-doesnt-justify-lack-of-health-disparity-data/).