APDU Vice President Reviews APDU Symposium Session on Census Data Products: What Would Your Billboard Say?

By: Amy O’Hara, Research Professor, Massive Data Institute, Georgetown University

During the APDU spring symposium, I had the pleasure of moderating the Census Data Products Roundtable with Beth Jarosz, Terry Ao Minnis, Allison Plyer, Yvette Robideaux, and Steve Ruggles.

We discussed decennial census products – statistics from the 2020 Census, the American Community Survey, and population estimates – and how communities are faring with 2020 data disruptions and delays in data releases. These experts shared concerns about the accuracy and utility of all the data products. As the recently released Post-Enumeration Survey (PES) results revealed, the 2020 numbers were good in some places, less good in others. Fourteen states had statistically significant overcounts or undercounts. These PES results, along with the excellent work of the CNSTAT Panel to Evaluate the Quality of the 2020 Census, will be dissected by census watchers. But many broader audiences rely on Census Bureau data as well, and those groups may not be hearing messages about data availability and quality. To address this, I asked the panelists what information they would convey:

If I gave you a billboard that would be seen by people that aren’t part of the Census nerd community, what would you put on that billboard?

The panelists shared a lot of great insights.

Yvette Robideaux stressed the need for quality subnational and subcounty data. “I’ve heard a lot of people that work in data say, ‘Well, we’re not gonna get that [population subgroup or geographic detail], but we have really good data at the national level!’ It’s like, so you’re basically saying you don’t care about me and my people.” Yvette wanted her billboard to say “Data and Equity” and stress the needs of small groups and small communities.

Yvette also suggested a billboard that reads “Proceed with Caution,” because census data have issues. They always have. She noted that users need to reckon with the fact that the census is not the complete count the Constitution calls for, recognize that the data have problems, and try to prevent harm. “I think that there’s potential harm that can happen if people don’t really understand the challenges and limitations of the data. So we need more education broadly.” Other panelists (and the moderator!) agreed with this.

Terry Ao Minnis added to the theme, noting that people need help understanding what the different data sets are and how they should be used: the proper source for the proper purpose. “Once you start using things that aren’t supposed to be used for a particular purpose, then things really start to run amok I think. And it makes it really difficult to hone into what the actual issues are for the communities.”

During this conversation, I mentioned what I’d put on a billboard:  Don’t use the 2020 block data.  The Census Bureau told data users that higher geographies are better than lower ones.  Many census power users heard this message.  But many occasional users have not.  Not all blocks are bad, but without knowing which are good, treat them all as bad.

Beth Jarosz seconded the “don’t use the blocks” message. Beth and her PRB colleagues drafted the disclosure avoidance documentation for the Redistricting Data File, where they note that “Block-level data should be aggregated before use.” That report also references Census Bureau research stating that users should find reliable demographic characteristics when looking at places and minor civil divisions with populations over 200 and block groups with populations over 450.
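For readers who work with the files directly, here is a minimal sketch of what "aggregate before use" can look like in practice. It assumes block populations keyed by the standard 15-character block GEOID, whose first 12 characters identify the block group. The function names and the sample counts are hypothetical, made up for illustration, and the 450 cutoff is simply the block group figure cited above, not an official rule for every use.

```python
from collections import defaultdict

# Assumption: cutoff taken from the block group guidance cited in the text.
BLOCK_GROUP_MIN_POP = 450


def aggregate_blocks_to_block_groups(block_pops):
    """Sum block populations into block group totals.

    block_pops maps a 15-character block GEOID (state 2 + county 3 +
    tract 6 + block 4) to a population count. The first 12 characters
    of a block GEOID identify the block group that contains it.
    """
    totals = defaultdict(int)
    for geoid, pop in block_pops.items():
        if len(geoid) != 15:
            raise ValueError(f"expected a 15-character block GEOID, got {geoid!r}")
        totals[geoid[:12]] += pop
    return dict(totals)


def flag_large_enough(block_group_pops, min_pop=BLOCK_GROUP_MIN_POP):
    """Mark each block group True if its population is over min_pop."""
    return {bg: pop > min_pop for bg, pop in block_group_pops.items()}


# Hypothetical block counts, invented for this example.
blocks = {
    "010010201001000": 120,
    "010010201001001": 210,
    "010010201001002": 175,
    "010010201002000": 60,
    "010010201002001": 95,
}

block_groups = aggregate_blocks_to_block_groups(blocks)
usable = flag_large_enough(block_groups)
```

In this made-up example the first block group totals 505 people and clears the cutoff, while the second totals 155 and does not, so it would need to be combined with neighbors or analyzed at a higher geography.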

Yvette reminded us that block-level data are used in housing data, in funding formulas, and by local governments to make resource allocation decisions. “It’s created problems for a lot of smaller communities and rural communities to not have that data available. What’s the price of privacy? We don’t have block data.” Panelist Steve Ruggles expressed the frustration that many users share: “I just think that a lot of this is self-inflicted wounds, and it would be nice if it hadn’t happened.”

The Census Bureau is pursuing formal privacy for its data releases. Our panel discussed whether that strains its ability to produce the data that our communities need, particularly disaggregated data on race and ethnicity. Beth noted, “there’s a push from the White House to do better data disaggregation but at the same time there’s a push for more disclosure avoidance. How do we get both? And get both with good quality data which we can trust?”

Panelist Allison Plyer introduced another concern: “Obviously the data at small geographies for small characteristic groups is of very much concern and has a lot of problems with it. I think what concerns me way more than differential privacy is the persistent undercounts. Mathematically they are much more damaging.”

More from Allison: “Everything washes out in terms of differential privacy at the state level, right? The undercounts don’t though, and they have been going on for decades. I guess if I had a billboard it would say ‘Stop focusing on cost per household of the Census’ because we have never achieved equity in this country by focusing on the cost of something. When we had built the middle class it cost a whole lot of money. Everything we have done that built equity in this country was not cheap. So that’s what my billboard would say.”

The entire APDU spring symposium was filled with experts like these sharing their views, concerns, and lessons. I want to thank Allison, Beth, Terry, Steve, and Yvette again for their contributions. I hope we stop some traffic with these billboards.