2021 Annual Conference

 

With economic, public health, and governance challenges arising from COVID-19 and political polarization, trustworthy public data is vital to open and honest policy debates. Federal statistical data is used to understand the shifting American landscape, helping make sense of the new normal at work, in our communities, in governance – the list goes on. With trust in institutions waning among some, accurate public data can help restore trust and encourage cooperation.

Register today for the APDU Virtual Annual Conference to be a part of this important discussion. APDU is where users, producers, and disseminators of government statistical data come together to learn of changes in public data, provide feedback to statistical agencies, and share best practices in the use of data.

APDU Workshop Series: Making the Best of the 2020 Census

Virtual Workshop

Town Halls: April 14 and May 12, 2021

Trainings: June 16, August 18, and September 15, 2021

Office Hours: Biweekly beginning June 9, 2021

Price: Free

Register Here

Accurate statistics about 2020 will rely on much more than the decennial census data collection. Developing reliable data will require an understanding of challenges resulting from the pandemic, combined with greater use of non-traditional sources like administrative records. The solutions to these problems will impact how data is gathered going forward for a variety of purposes: education, housing, economic development, public health, and more.

Register today for this series of town hall events and trainings. During this workshop series you will learn more about the quality of the data that state and local leaders rely on and how you can improve and supplement it.

Town Hall #1: April 14, 2021 (3:00 – 4:00 PM ET)
2020 Census was “Different” – A Rundown of Issues

Facilitators:

  • Amy O’Hara, Research Professor, Massive Data Institute, Georgetown University
  • danah boyd, Principal Investigator, Microsoft Research & Founder, Data & Society

With the COVID-19 pandemic, political interference, and disclosure avoidance concerns, this census was deeply impacted. Amy and danah will discuss what happened with the census, where we are now, what researchers are hearing from the Census Bureau, the updated timeline, and what the Census Bureau can still fix.

Town Hall #2: May 12, 2021 (3:00 – 4:30 PM ET)
Solving Data “Differences” – Assessing the Use Cases

In this town hall, we will solicit your concerns and questions about upcoming census products – specifically about urban/rural, housing, workforce, health, and justice use cases. We will discuss data sources and methods for these different use cases. Since 2020 census products are delayed, we will discuss alternative data sources that may support population measurement.

Training #1: June 16, 2021 (1:00 – 3:30 PM ET)
Addressing the Census – Why Address Data is Crucial and How to Use It

In the first of a series of trainings focused on preparing data users to use the 2020 census data, we will begin by familiarizing the group with the types of address data that support a high-quality census enumeration, that help validate the census publications as they are released, and that may be needed to mount a Count Question Resolution challenge. In this session, we will review coverage and classification issues, how to evaluate data sources, and tools for assessing your data.

Training #2: August 18, 2021 (1:00 – 3:30 PM ET)
Age Bins – Where to Find More Data

In our second training, we will discuss the importance of obtaining accurate data on different age categories. The Census Bureau has released demonstration data on their disclosure avoidance system; however, age bins have not been a component. Accurate age bins are critical for urban planning, public health, social research, and funding, and we know that the census has traditionally undercounted very young children and overcounted the elderly. We will discuss how possible imprecision in published census results may affect the age distribution and consider how age bins can be smoothed. We will also explore other datasets that can be used to understand key population subgroups.

Training #3: September 15, 2021 (1:00 – 3:30 PM ET)
Beyond COVID – Identifying Public Health Data to Prevent Disaster

Whether it’s a global pandemic or an overdose crisis in your community, we want to empower you with the tools and resources to identify patterns and be prepared to respond. This training will go over the new administration’s Executive Order and the datasets that can drive insights around health, highlighting the differences between statistical and tactical data. We will also discuss measuring migration and service utilization. With these tools, we hope to prepare our attendees to identify the best data and methods to deal with future public health crises or natural disasters.

2021 APDU Conference Call for Proposals

Public Data: Making Sense of the New Normal

APDU is welcoming proposals on “making sense of the new normal” using public data. With economic, public health, and governance challenges arising from COVID-19 and political polarization, trustworthy public data is vital to open and honest policy debates. APDU is interested in proposals regarding:

  • Novel uses of public data to understand the shifting American landscape;
  • Ways that researchers and advocates are ensuring that public data is accurate and equitable;
  • How public data can help restore trust in institutions;
  • How to rebuild trust in public data; or
  • Other related and relevant topics.

Proposals may be for a single presentation or a full panel (three presenters, plus a moderator), whether based on a particular project, data practice, or formal paper. However, we may accept portions of a panel submission and combine them with other presenters. Submissions will be evaluated on the quality of work, relevance to APDU Conference attendees, uniqueness of topic and presenter, and thematic fit.

EXTENDED Deadline: March 26, 2021

Please submit your proposal using the Survey Monkey collection window below.  Proposals will need to be submitted by members of APDU, and all presenters in a panel must register for the conference (full conference registration comes with a free APDU membership).  Proposers will be notified of our decision by mid-April.

About APDU

The Association of Public Data Users (APDU) is a national network that links users, producers, and disseminators of government statistical data. APDU members share a vital concern about the collection, dissemination, preservation, and interpretation of public data.  The conference will be held virtually on July 26-29, 2021, and brings together data users and data producers for conversations and presentations on a wide variety of data and statistical topics.

APDU Member Post: Assessing the Use of Differential Privacy for the 2020 Census: Summary of What We Learned from the CNSTAT Workshop

By:

Joseph Hotz, Duke University

Joseph Salvo, New York City Department of City Planning

Background

The mission of the Census Bureau is to provide data that can be used to draw a picture of the nation, from the smallest towns and villages to the neighborhoods of the largest cities. Advances in computer science, better record linkage technology, and the proliferation of large public data sets have increased the risk of disclosing information about individuals in the census.

To assess these threats, the Census Bureau conducted a simulated attack, reconstructing person-level records from published 2010 Census tabulations. Those tabulations had been protected by the Bureau’s previous Disclosure Avoidance System (DAS), which was based in large part on swapping data records across households and localities. When combined with information in commercial and publicly available databases, these reconstructed data suggested that 18 percent of the U.S. population could be identified with a high level of certainty. The Census Bureau concluded that, if adopted for 2020, the 2010 confidentiality measures would lead to a high risk of disclosing individual responses, in violation of Title 13 of the U.S. Code, the law that prohibits such disclosures.

Thus, the Census Bureau was compelled to devise new methods to protect individual responses from disclosure. Nonetheless, such efforts – however well-intentioned – may pose a threat to the content, quality and usefulness of the very data that defines the Census Bureau’s mission and that demographers and statisticians rely on to draw a portrait of the nation’s communities.

The Census Bureau’s solution to protecting privacy is a new DAS based on a methodology referred to as Differential Privacy (DP). In brief, it functions by leveraging the same database reconstruction techniques that were used to diagnose the problem in the previous system: the 2020 DAS synthesizes a complete set of person- and household-level data records based on an extensive set of tabulations to which statistical noise has been added. Viewed as a continuum between total noise and total disclosure, the core of this method involves a determination regarding the amount of privacy loss, or ε, that can be accepted without compromising data privacy while ensuring the utility of the data. The key then becomes “where to set the dial”: set ε too low and privacy is ensured at the cost of utility; set ε too high and utility is ensured but privacy is compromised. In addition to the overall level of ε, its allocation over the content and detail of the census tabulations for 2020 is important. For example, specific block-level tabulations needed for redistricting may require a substantial allocation of the privacy-loss budget to achieve acceptable accuracy for this key use, but the cost is that accuracy of other important data (including for blocks, such as persons per household) will likely be compromised. Finding ways to resolve these difficult tradeoffs represents a serious challenge for the Census Bureau and users of its data.
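The “dial” and the budget allocation described above can be illustrated with a small sketch. It uses a textbook Laplace mechanism on a single count, which is an assumption made for illustration and not the Census Bureau’s actual DAS; the function names and the table labels are hypothetical. The point is only that the typical error on a count grows as ε shrinks, and that splitting a fixed total ε across tables gives the smaller share the larger error.

```python
import random

def laplace_noise(epsilon, rng):
    """Draw Laplace noise with scale 1/epsilon (a count with sensitivity 1 is assumed)."""
    # The difference of two independent Exp(rate=epsilon) draws is Laplace(0, 1/epsilon).
    return rng.expovariate(epsilon) - rng.expovariate(epsilon)

def mean_abs_error(epsilon, trials=20000, seed=0):
    """Average absolute noise added to a single count at a given epsilon."""
    rng = random.Random(seed)
    return sum(abs(laplace_noise(epsilon, rng)) for _ in range(trials)) / trials

def allocate_budget(total_epsilon, shares):
    """Split a total privacy-loss budget across tables.

    `shares` maps a (hypothetical) table name to its fraction of the budget.
    Returns each table's epsilon and its expected absolute error, which is
    1/epsilon for Laplace noise with scale 1/epsilon.
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    if any(share <= 0 for share in shares.values()):
        raise ValueError("every share must be positive")
    result = {}
    for table, share in shares.items():
        eps = total_epsilon * share
        result[table] = {"epsilon": eps, "expected_abs_error": 1.0 / eps}
    return result

if __name__ == "__main__":
    # Lower epsilon means more privacy and more noise; higher epsilon means the reverse.
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean absolute noise is about {mean_abs_error(eps):.2f}")
    # Giving most of the budget to one use leaves the other noticeably noisier.
    print(allocate_budget(1.0, {"redistricting": 0.8, "persons_per_household": 0.2}))
```

In this simplified setting, the table that receives 20 percent of the budget carries four times the expected error of the table that receives 80 percent, which is the tradeoff the paragraph above describes.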

The CNSTAT Workshop

In order to test how well this methodology worked in terms of the accuracy of noise-infused data, the Census Bureau issued special 2010 Census files subject to the 2020 DAS. The demonstration files applied the 2020 Census DAS to the 2010 Census confidential data, that is, the unprotected data from the 2010 Census that are not publicly available. The demonstration data permit scientific inquiry into the impact of DP. In addition, the Census Bureau commissioned the Committee on National Statistics (CNSTAT) of the National Academies of Sciences, Engineering and Medicine to host a 2-day Workshop on 2020 Census Data Products: Data Needs and Privacy Considerations, held in Washington, DC, on December 11-12, 2019. The two-fold purpose of the workshop was:

  • To assess the utility of the tabulations in the 2010 Demonstration Product for specific use cases/real-life data applications.
  • To generate constructive feedback for the Census Bureau that will be useful in setting the ultimate privacy loss budget and on the allocation of shares of that budget over the broad array of possible tables and geographic levels.

We both served as the co-chairs of the Committee that planned the Workshop. The Workshop brought together a diverse group of researchers who presented findings for a wide range of use cases that relied on data from past censuses.

These presentations, and the discussions surrounding them, provided a new set of evidence-based findings on the potential consequences of the Census Bureau’s new DAS. In what follows, we summarize “what we heard” or learned from the Workshop. This summary is ours alone; we do not speak for the Workshop’s Planning Committee, CNSTAT, or the Census Bureau. Nonetheless, we hope that the summary below provides the broader community of users of decennial census data with a better understanding of some of the potential consequences of the new DAS for the utility of the 2020 Census data products. Moreover, we hope it fosters an ongoing dialogue between the user community and the Census Bureau on ways to help ensure that data from the 2020 Census are of high quality, while still safeguarding the privacy and confidentiality of individual responses.

What We Heard

  • Population counts for some geographic units and demographic characteristics were not adversely affected by Differential Privacy (DP). Based on results presented at the Workshop, it appears that there were not, in general, differences in population counts between the 2010 demonstration file and the published 2010 Census data at some levels of geography. For the nation as a whole and for individual states, the Census Bureau’s algorithm ensured that counts were exact, i.e., counts at these levels were held invariant by design. Furthermore, the evidence presented also indicated that the counts in the demonstration products and those for actual 2010 data were not very different for geographic areas that received direct allocations of the privacy budget, including most counties, metro areas (aggregates of counties) and census tracts. Finally, for these geographic areas, the population counts by age in the demonstration products were fairly accurate when using broader age groupings (5-10 year groupings or broader ones), as well as for some demographic characteristics (e.g., for non-Hispanic whites, and sometimes for Hispanics).
  • Concerns with data for small geographic areas and units and certain population groups. At the same time, evidence presented at the Workshop indicated that most data for small geographic areas – especially census blocks – are not usable given the privacy-loss level used to produce the demonstration file. With some exceptions, applications demonstrated that the variability of small-area data (i.e., blocks, block groups, census tracts) compromised existing analyses. Many Workshop participants indicated that a larger privacy loss budget will be needed for the 2020 Census products to attain a minimum threshold of utility for small-area data. Alternatively, compromises in the content of the publicly-released products will be required to ensure greater accuracy for small areas.

In the 2010 demonstration file, the Census Bureau did not allocate the privacy-loss budget directly to all geographic areas, such as places and county subdivisions, or to detailed race groups, such as American Indians. As noted by numerous presenters, these units and groups are very important for many use cases, as they are the basis for political, legal, and administrative decision-making. Many of these cases involve small populations, and local officials rely on the census as a key benchmark; in many cases, it defines who they are.

  • Problems for temporal consistency of population counts. Several presentations highlighted the problem of temporal inconsistency of counts, i.e., from one census to the next using DP. The analyses presented at the Workshop suggested that comparisons of 2010 Census data under the old DAS to 2020 Census data under DP may well show inexplicable trends, up or down, for small geographic areas and population groups. (And comparisons of 2030 data under DP with 2020 data under DP may also show inconsistencies over time). For example, when using counts as denominators to monitor disease rates or mortality at finer levels of geography by race, by old vs young, etc., the concern is that it will be difficult to determine real changes in population counts, and, thus, real trends in disease or mortality rates, versus the impact of using DP.
  • Unexpected issues with the post-processing of the proposed DAS. The Top-Down algorithm (TDA) employed by the Census Bureau in constructing the 2010 demonstration data produced histograms at different levels of geography that are, by design, unbiased, but they are not integers and include negative counts. The post-processing required to produce a microdata file capable of generating tabulations of persons and housing units with non-negative integer counts produced biases that are responsible for many anomalies observed in the tabulations. These are both systematic and problematic for many use cases. Additional complications arise from the need to hold some data cells invariant to change (e.g., total population at the state level) and from the separate processing of person and housing unit tabulations.

The application of DP to raw census data (the Census Edited File [CEF]) produces estimates that can be used to model error, but the post-processing adds a layer of complexity that may be very difficult to model, making the creation of “confidence intervals” problematic.
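A small simulation can show why forcing noisy counts to be non-negative integers introduces bias. This is a deliberately simplified sketch: it assumes Laplace noise and a naive round-and-clamp step, whereas the Bureau’s actual post-processing is considerably more elaborate, and the function names and parameter values here are hypothetical.

```python
import random

def laplace_noise(epsilon, rng):
    """Draw Laplace noise with scale 1/epsilon (sensitivity 1 is assumed)."""
    # The difference of two independent Exp(rate=epsilon) draws is Laplace(0, 1/epsilon).
    return rng.expovariate(epsilon) - rng.expovariate(epsilon)

def simulate(true_count=0, epsilon=0.5, trials=20000, seed=1):
    """Compare the average raw noisy count with the average post-processed count.

    Post-processing here is a naive stand-in: round to the nearest integer,
    then replace any negative value with zero.
    """
    rng = random.Random(seed)
    raw_total = 0.0
    post_total = 0
    for _ in range(trials):
        noisy = true_count + laplace_noise(epsilon, rng)
        raw_total += noisy                      # unbiased, but fractional and sometimes negative
        post_total += max(0, round(noisy))      # non-negative integer, but biased upward
    return raw_total / trials, post_total / trials

if __name__ == "__main__":
    raw_mean, post_mean = simulate()
    print(f"true count: 0, mean raw noisy count: {raw_mean:.3f}, "
          f"mean post-processed count: {post_mean:.3f}")
```

For a cell whose true count is zero, the raw noisy values average out close to zero, while the clamped values cannot go below zero and therefore average well above it. Small cells are inflated in this way, which is one simple mechanism behind the kind of systematic anomalies described above.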

  • Implications for other Census Bureau data products. Important parts of the planned 2020 Census data products cannot be handled by the current 2020 DAS and TDA approach. They will be handled using different but as-yet-unspecified methods that will need to be consistent with the global privacy-loss budget for the 2020 Census. These products were not included in the demonstration files and were out of scope for the Workshop. Nonetheless, as noted by several presenters and participants in the Workshop, these decisions raise important issues for many users and use cases going forward. To what extent will content for detailed race/Hispanic/nationality groups be available, especially for American Indian and Alaska Native populations? To what degree will data on household-person combinations and within-household composition be possible under DAS?

For example, while the Census Bureau has stated that 2025 will be the target date for the possible application of DP to the ACS, it indicated that the population estimates program will be subject to DP immediately following 2020. These estimates would then be used for weighting and post-stratification adjustments to the ACS.

  • Need for a plan to educate and provide guidance for users of the 2020 Census products. Regardless of what the Census Bureau decides with respect to ε and how it is allocated across tables, the Workshop participants made clear that a major re-education plan for data users needs to be put in place, with a focus on how best to describe key data and the shortcomings imposed by privacy considerations and error in general. Furthermore, as many at the Workshop voiced, such plans must be in place when the 2020 Census products are released to minimize major disruptions to and problems with the myriad uses made of these data and the decisions based on them.
  • Challenging privacy concerns and their potential consequences for the success of the 2020 Census. Finally, the Workshop included a panel of experts on privacy. These experts highlighted the disclosure risks associated with advances in linking information in public data sources, like the decennial census, with commercial databases containing information on bankruptcies and credit card debt, driver licenses, and federal, state and local government databases on criminal offenses, public housing, and even citizenship status. While there are federal and state laws in place to protect against the misuse of these governmental databases as well as the census (i.e., Title 13), their adequacy is challenged by advances in data linkage technologies and algorithms. And, as several panelists noted, these potential disclosure risks may well undercut the willingness of members of various groups – including immigrants (whether citizens or not), individuals violating public housing codes, or those at risk of domestic violence – to participate in the 2020 Census.

The Census Bureau has recently stated that it plans to have CNSTAT organize a follow-up set of expert meetings to “document improvements and overcome remaining challenges in the 2020 DAS.” In our view, such efforts, however they are organized, need to ensure meaningful involvement and feedback from the user community. Many within that community remain skeptical of the Bureau’s adoption of Differential Privacy and its consequences for their use cases. So, not only is it important that the Census Bureau try to address the various problems identified by Workshop presenters and others who evaluated the 2010 demonstration products, it is also essential that follow-up activities are designed to involve a broader base of user communities in a meaningful way.

We encourage members of the census data user community to become engaged in this evaluation process, agreeing, if asked, to become involved in these follow-up efforts. Such efforts will be essential to help ensure that the Census Bureau meets its dual mandate of being the nation’s leading provider of quality information about its people and economy while safeguarding the privacy of those who provide this information.

FY20 Budget Moves from House to Senate

The House has passed its FY2020 appropriations bills and sent them to the Senate, and there are important developments for statistical agencies. The Census Bureau, Bureau of Labor Statistics (BLS), and Bureau of Economic Analysis (BEA) each received modest to substantial increases in their budgets.

With massive increases in spending by the Census Bureau needed to successfully complete the Decennial Census, Congress appropriated $7.558B for the Census Bureau, with $274M for Current Surveys and Programs and $7.284B for Periodic Censuses and Programs. Importantly, this provides $6.696B for the Decennial Census, which is the minimum requested to complete the count effectively.

BEA received $107.9M, which assumes full funding for efforts to produce annual GDP for Puerto Rico. In addition, Congress apportioned $1.5M to the Outdoor Recreation Satellite Account, and $1M to develop income growth indicators.

After several years of flat funding, the BLS operational budget has been increased to $655M. This includes $587M for necessary expenses for the Bureau of Labor Statistics, including advances or reimbursements to State, Federal, and local agencies and their employees for services rendered, with no more than $68M that may be expended from the Employment Security Administration account in the Unemployment Trust Fund. This number includes $27M for the relocation of the BLS headquarters to the Suitland Federal Center and $13M for investments in BLS such as an annual supplement to the Current Population Survey on contingent work, restoration of certain Local Area Unemployment Statistics data, and development of a National Longitudinal Survey of Youth.

APDU Board Member: Learn about new data sources at the APDU Annual Conference

By APDU Board Member Beth Jarosz

If you ask a data user to name public data sources, she might name the decennial census, American Community Survey, National Vital Statistics System, or Current Population Survey. Each of those sources provides robust, timely, accurate public data on important topics like population, housing, and employment. Yet the “big name” public data sources merely hint at the breadth and depth of data available, which includes information on consumer expenditures, healthcare access and utilization, and participation in the arts.

Do you know what share of the American public attends jazz concerts or reads poetry? Attendees will learn about trends in arts and leisure activities from the Survey of Public Participation in the Arts, which tracks arts attendance by detailed event type and state.

Do you know that the Consumer Expenditures Survey is one of the nation’s most complex surveys (by number of variables)? The Consumer Expenditures Survey captures a range of expenditures, incomes, and demographic characteristics. Attendees will learn about what types of questions can, and cannot, be answered with this dataset as well as the geographic detail currently available.

Did you know that the Medicare Current Beneficiary Survey now includes a de-identified public use file? With the public use file, analysts can answer questions about Medicare beneficiary insurance status, socio-demographic characteristics, access to care, health status, preventive behaviors, falls, housing characteristics, and experiences with Medicare Advantage.

Join us in Arlington, Va., on July 9th and 10th, 2019, at APDU’s Annual Conference to learn more about these, and other, under-the-radar public data sources.

APDU Board Member: Register for the conference today!

By APDU Board Member Sue Copella

Be sure to register for the 2019 APDU Annual Conference! The conference is being held at the Key Bridge Marriott in Arlington, VA on July 9-10. The agenda is set and there are some very exciting speakers. Along with the ever-popular Washington Briefing and Data Viz Awards, this year the conference has three main themes: the breadth of public data, diverse uses of public data, and strengthening and supporting the public data system.

I am also very excited to present this year on the panel discussing Best Ways to Access the Census Data You Need. Representing the State Data Center for Pennsylvania, I will be presenting on the National State Data Center program and how the states and territories serve information needs by disseminating demographic and economic data to academic institutions, businesses, non-profits and private citizens. For the session I will be partnering with the Census Bureau, which will be talking about the new dissemination platform and the Federal Statistical Research Data Center.

While I have been a member of APDU for years, last year was the first time I was able to attend the conference since it was held during the summer. It is a great place for both data users and producers of public data to meet and share information on what others are doing in the public data world. So, if you haven’t had a chance to register, please do so today.

APDU Conference Chair Invites You to the APDU Annual Conference

As 2019 APDU Annual Conference Committee Chair, I would like to cordially invite you to attend what is shaping up to be a fantastic 2019 APDU Annual Conference July 9 – 10 at the Key Bridge Marriott in Arlington, VA.  The 2019 Annual Conference “Wide World of Data” will focus on three main themes this year: the breadth of public data; diverse uses of public data; and what is being done to strengthen and support the public data system.  The conference will once again be an excellent opportunity for users and producers of public data to connect, share information, and learn.

As an economist at the U.S. Bureau of Economic Analysis (BEA), I am excited to attend the APDU Annual Conference as both a producer and user of public data. As a producer of public data, I am eagerly looking forward to discussions about the pros and cons of using differential privacy with public data as well as the use of machine learning and algorithms with public data. I want to hear an update on the federal statistical agency reorganization proposed by the administration. I am also looking forward to the opportunity to meet users of BEA data.

As a user of public data, I am excited to hear the latest developments with regard to data from colleagues at the U.S. Census Bureau, U.S. Bureau of Labor Statistics, the National Endowment for the Arts, the Centers for Medicare & Medicaid Services, and the Kentucky Center for Statistics. I am looking forward to learning how public data is being used to inform decision making from speakers from Georgetown University, Indiana University Purdue University Indianapolis (IUPUI), the George Washington Institute of Public Policy, the University of Alabama Institute for Rural Health Research, the American Association of State Highway and Transportation Officials, and more. Finally, I am especially thrilled by the opportunity to hear from statistical agency leadership, including the Directors of both the U.S. Census Bureau and the U.S. Bureau of Economic Analysis.

I look forward to seeing you at the 2019 APDU Annual Conference.

364 Days Until Census Day 2020!

APDU staff attended the Census 2020: Navigating the National and Local Challenges panel discussion hosted by the Brookings Institution to hear legal, demographic, and Census experts discuss a Decennial Census that has garnered interest both for its importance and for its controversies. Primary questions from the meeting revolved around the implications of including the citizenship question in the Decennial Census, cybersecurity, and how to encourage residents to respond.

Former Census Bureau Director John Thompson noted that there is “no basis for the citizenship question” and that agency research indicates that it will decrease the response rate. Brookings Senior Fellow William Frey supported Thompson’s statements by emphasizing the importance of gathering this community data and the impacts it will have on communities’ federal funding, private grant dollars, and resources to serve the right population.

Thompson shared that the Census Bureau was underfunded from 2012 – 2017, so the Bureau prioritized shifting from the traditional paper collection to an automated and online process. He noted the Bureau is constantly working on improving cybersecurity and is committed to keeping residents’ responses safe and confidential.

The second panel, facilitated by the National League of Cities’ CEO Clarence Anthony, focused on the implications and efforts at the local level to ensure the best data possible is collected. Beth Link, the Director of Census Counts, encouraged communities to educate their elected officials and noted that there will be questionnaire assistance centers to help make the necessary technology accessible to communities where it’s needed and to help answer questions as residents complete the forms.

APDU will continue to monitor 2020 Census preparations and will serve as a resource to our members moving forward. To learn more and hear directly from Census Bureau leadership, join us at the APDU Annual Conference in Arlington, VA on July 9-10, 2019.

Join APDU

APDU is a membership network of individuals and organizations that provide a voice for public data. We foster communication between data users and stakeholders regarding important issues of government information and statistical policy. APDU Members enjoy a number of benefits:

  • APDU Weekly, bringing you the latest news, trends, events, publications, and opportunities.
  • Public Data University, comprehensive training on public/open datasets, along with special topics.
  • Annual Conference, our premier networking and education event for public data users.
  • Member Area on APDU.org, with archived training, resources, tools, and publications.
  • Preferred submissions to APDU Job Board.
  • Member Network and committees, connecting you to federal agencies, academic institutions, state and local agencies, businesses, consultants, entrepreneurs, and students.

APDU offers four levels of membership:

Premium Organizational membership – $995

  • Full access to all member benefits and content
  • Up to 25 staff with access to member benefits
  • Unlimited free Public Data University webinars and resources
  • 10 annual conference registrations at Premium member rate
  • Training at member rate
  • Eligible for APDU Board of Directors and committees

Basic Organizational membership – $375 / $700

  • Full access to all member benefits and content
  • Up to 3 staff with access to member benefits (4-6 contacts, $700)
  • Unlimited free Public Data University webinars and resources
  • Conference and training registration at member rate
  • Eligible for APDU Board of Directors and committees

Individual membership – $200

  • Full access to all member benefits and content
  • One person (additional contacts activate the Basic Organization rate)
  • Unlimited free Public Data University webinars and resources
  • Conference and training registration at member rate
  • Can serve on APDU Board of Directors and committees

Affiliate membership – $75

  • APDU Weekly and access to member area content
  • One person (additional contacts activate the Basic Organization rate)
  • Free access to one live webinar; conference and training registration at full rate
  • Can serve on APDU committees
  • Students (w/ active student ID) eligible for special member and conference registration rates

For more information about becoming an APDU member, contact info@apdu.org

Download the membership form (PDF)

Send your completed form and dues to:
Association of Public Data Users
P. O. Box 12546
Arlington, VA 22209