Data Disaggregation and the Importance of the Statistical Policy Directive 15 Revisions
What is Data Disaggregation and Why is it Important?
Data disaggregation is the practice of separating compiled data into smaller categories to reveal trends and patterns that are invisible in the broader dataset. Data is typically disaggregated along key identity lines such as age, ethnicity, and gender, as well as broader classifications such as geography.
Data disaggregation was a pertinent issue during the COVID-19 pandemic because it was essential to quickly identify the factors that lowered or increased transmission among the most vulnerable segments of our population. In hindsight, factors like age were crucial to pandemic modeling because there was a clear correlation between age and hospitalizations stemming from COVID-19. Hospitals needed population-level age information that accurately depicted the size of each age group in order to model their estimated resource requirements. Better data makes interventions more efficient and effective because resources and solutions can be allocated in response to the patterns the data reveals.
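To make the idea concrete, here is a minimal Python sketch of how an overall rate can hide large differences between age groups. The counts are invented purely for illustration and are not real COVID-19 figures; the age bands and the `rate` helper are likewise assumptions made for this example.

```python
# Hypothetical counts, invented for illustration only (not real COVID-19 data).
records = [
    {"age_group": "0-17", "cases": 4000, "hospitalizations": 20},
    {"age_group": "18-64", "cases": 5000, "hospitalizations": 150},
    {"age_group": "65+", "cases": 1000, "hospitalizations": 200},
]


def rate(hospitalizations, cases):
    """Return hospitalizations per 100 cases."""
    return 100 * hospitalizations / cases


# Aggregated view: a single number for the whole population.
total_cases = sum(r["cases"] for r in records)
total_hospitalizations = sum(r["hospitalizations"] for r in records)
overall_rate = rate(total_hospitalizations, total_cases)

# Disaggregated view: the same data, split by age group.
rates_by_group = {
    r["age_group"]: rate(r["hospitalizations"], r["cases"]) for r in records
}

print(f"Overall: {overall_rate:.1f} per 100 cases")
for group, value in rates_by_group.items():
    print(f"{group}: {value:.1f} per 100 cases")
```

In this made-up dataset the overall rate is 3.7 hospitalizations per 100 cases, while the rate for the oldest group is 20 per 100. A planner looking only at the aggregate would badly underestimate the resources the oldest group needs.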
Data disaggregation can also benefit specific marginalized communities. For example, within the LGBTQ community, the Center for American Progress found that bisexual men and women were more likely to be impoverished than their straight counterparts, that lesbians and gay men were employed at higher rates, and that bisexual men and women were in poor health more often than their straight and gay counterparts. However, due to the lack of data disaggregation, many of these differences were not statistically significant, perhaps due in part to the small sample size. This issue largely stems from the fact that Sexual Orientation and Gender Identity data is not often gathered on surveys. Additionally, when the data is collected, bisexual men and women are typically lumped in with the lesbian and gay categories. The Center for American Progress report on health and employment outcomes indicates that there are differences between the bisexual community and the lesbian and gay communities, which makes combining the groups irresponsible. Without more comprehensive and clearer data collection, it will remain unknown whether these social and health outcome differences actually exist, leaving community-based organizations (CBOs) and federal government agencies unable to address either the symptoms or the underlying causes.
To expand on both of the above threads, data disaggregation for LGBTQ people was even more important in certain regions during the pandemic. For instance, in the Western Balkans, existing social inequities and discrimination were exacerbated during the pandemic, making it difficult for LGBTQ people to engage with critical services. On the whole, significantly fewer LGBTQ people rated their health as very good relative to the general population, with transgender people reporting very good health at the lowest rate. Relatedly, in data provided by the World Bank, 12% of LGBTQ people self-reported not seeking necessary care because of the threat of discrimination. If countries in the Balkans region want to address health inequities in the future, the visibility of this data will be a key factor in whether improvement is made, because without Sexual Orientation and Gender Identity data these differences are not apparent in survey data for the total population.
Office of Management and Budget’s Statistical Policy Directive 15
Data disaggregation recently took a big step forward within the Federal Government with the announcement of the Office of Management and Budget’s (OMB) revisions to Statistical Policy Directive 15 (SPD 15): Standards for Maintaining, Collecting, and Presenting Federal Data on Race and Ethnicity, the first set of revisions since 1997. The revision process was initiated in June of 2022, and the revisions were published in March of 2024. The revisions were driven by a working group of personnel spanning 35 federal agencies, which took a large volume of public input into account when devising its suggestions. The volume of public comments, listening sessions, town hall meetings, and tribal consultations marks the SPD 15 revisions as a change that will materially improve the visibility of our most marginalized communities. OMB has already begun spreading the word to community-based organizations (CBOs) through virtual webinars and other mediums, connecting with community leaders to share what SPD 15 means for their communities and what new possibilities will be open to them in the future.
The key revisions included in SPD 15 are as follows: using one combined question for race and ethnicity and encouraging respondents to select as many options as apply to how they identify; and adding Middle Eastern or North African as a new minimum category. The new set of minimum race and/or ethnicity categories is: American Indian or Alaska Native, Asian, Black or African American, Hispanic or Latino, Middle Eastern or North African, Native Hawaiian or Pacific Islander, and White.
SPD 15 also requires the collection of additional detail beyond the minimum required race and ethnicity categories in most situations, to ensure further disaggregation in the collection, tabulation, and presentation of data when useful and appropriate. In addition to the above standards, OMB has furnished federal agencies with additional guidance and with updated definitions and terminology to improve the presentation of data.
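As an illustration of what a combined, select-all-that-apply question means for tabulation, the following Python sketch counts responses two ways: people who chose a category alone, and people who chose it alone or in combination with others. The category names come from the list above; the responses, the `tabulate` function, and the choice of these two tallies are assumptions made for this example, not something the directive text quoted here prescribes.

```python
from collections import Counter

# The seven minimum categories named in the revised standard.
MINIMUM_CATEGORIES = [
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Hispanic or Latino",
    "Middle Eastern or North African",
    "Native Hawaiian or Pacific Islander",
    "White",
]

# Hypothetical responses: each set holds the categories one respondent selected.
responses = [
    {"Asian"},
    {"Hispanic or Latino", "White"},
    {"Middle Eastern or North African"},
    {"Black or African American", "Hispanic or Latino"},
    {"White"},
]


def tabulate(responses):
    """Count each category 'alone' and 'alone or in combination'."""
    valid = set(MINIMUM_CATEGORIES)
    alone = Counter()
    alone_or_in_combination = Counter()
    for selected in responses:
        unknown = selected - valid
        if unknown:
            raise ValueError(f"Unrecognized categories: {sorted(unknown)}")
        # Every selected category counts toward the combined tally.
        for category in selected:
            alone_or_in_combination[category] += 1
        # Only single-selection responses count toward the 'alone' tally.
        if len(selected) == 1:
            alone[next(iter(selected))] += 1
    return alone, alone_or_in_combination


alone, alone_or_in_combination = tabulate(responses)
for category in MINIMUM_CATEGORIES:
    print(category, alone[category], alone_or_in_combination[category])
```

Note that the "alone or in combination" counts sum to more than the number of respondents, because a person who selects two categories is counted once under each. Anyone presenting such tables should say which tally is shown.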
To see why OMB’s focus on consistent data matters in daily implementation, consider the inconsistencies among census data products. The US Census Bureau publishes two well-known sets of data: the Decennial Census, which occurs every ten years, and the American Community Survey, which occurs yearly. Both are crucial for many government agencies and CBOs when generating fact sheets, determining action plans, and communicating with stakeholders. However, the Decennial Census and American Community Survey data do not include more specific ethnicity information, so many minority-focused organizations need to use Public Use Microdata Sample (PUMS) data, which shows further disaggregated ethnicity data. A problem arises because PUMS data is a ⅔-sized sample rather than the full American Community Survey sample, so PUMS figures do not square with the publicly available American Community Survey data, which can lead to confusion or inaccurate data. Of course, process alone does not enable change within the federal government. Part of the reason the Census Bureau lacks more comprehensive data is a shortage of staff and resources stemming from budget constraints, but giving agencies long implementation runways should enable them to secure the necessary funds ahead of the deadline.
Top and Middle: Snapshots of the PUMS Language Spoken at Home table. Bottom: ACS NY State Language Spoken at Home table. Both tables are from the Census Bureau.
Circling back to the new SPD 15 revisions, the most impactful action is that the working group will continue to function beyond this year’s revisions. OMB has set out areas of concern to be researched and addressed in a future revision that will happen no later than 2029. This prevents the data standard from becoming antiquated again and leaves room for the standards to change and evolve to fit constituents’ needs. With accurate, wide-reaching data, the quality of life of the nation’s most vulnerable can only improve.
Further Reading
Information on SPD 15: https://spd15revision.gov/
Information on specific uses of data disaggregation:
https://civilrights.org/edfund/data-disaggregation-action-network/
http://www.educationnewyork.com/files/The%20importance%20of%20disaggregating_0.pdf
https://eab.com/resources/blog/student-success-blog/disaggregate-success-data/
https://www.oecd.org/gender/governance/toolkit/government/assessment-of-gender-impact/disaggregated-data/
https://www.urban.org/urban-wire/state-data-disaggregation-asian-american-native-hawaiian-and-pacific-islander-groups
https://www.americanprogress.org/article/disaggregating-data-bisexual-people/
https://www.ncbi.nlm.nih.gov/books/NBK589469/
https://blogs.worldbank.org/en/europeandcentralasia/closing-data-gap-lgbti-exclusion
Information on data disaggregation:
Shorter: https://iris.paho.org/bitstream/handle/10665.2/52002/Data-Disaggregation-Factsheet-eng.pdf?sequence=19
Longer: https://www.adb.org/sites/default/files/publication/698116/guidebook-data-disaggregation-sdgs.pdf