All posts by Brendan Buff

APDU President’s Letter Inviting Members to the Annual Conference

Dear colleagues,

The 2021 Association of Public Data Users Annual Conference, “Public Data: Making Sense of the New Normal,” is only a few weeks away! Hopefully, you are making plans to attend, July 26-29. While we wish we could be together in person, we are confident that you will find this year’s virtual conference to be very relevant, addressing many issues important to public data users.

High-ranking federal officials, such as the directors of the Census Bureau, Bureau of Labor Statistics, and Institute of Education Sciences, are headlining one of the week’s plenary sessions, while Hansi Lo Wang, National Correspondent at NPR, is moderating another plenary session on diversity, equity, and inclusion in public data. In addition, attendees will hear from experts in the field regarding innovations in linked administrative data and trends in collaborative data sharing. The conference will also provide ample opportunities for data users to network and exchange information informally.

The complete schedule is posted on the APDU home page. Please register and encourage your colleagues to do the same. If you have any questions, please do not hesitate to contact the APDU staff at info@apdu.org.

We hope to see you there!

Best,

Mary Jo Hoeksema

2021-2022 APDU President

May 12 Workshop Notes: Discussion and Concerns

On May 12, the Association of Public Data Users and the Massive Data Institute at Georgetown University held a town hall session on Solving Data “Differences” – Assessing the Use Cases. 

For the panel, Amy O’Hara, Research Professor at the Massive Data Institute, joined Connie Citro, Former Director of the Committee on National Statistics (CNSTAT), Joe Salvo, Former Director of the NYC Department of City Planning, and Chris Dick, Founder of Demographic Analytics Advisors. The panel discussed implications of new methods employed in the 2020 Census – above all, the Disclosure Avoidance System (DAS) and differential privacy – on common use cases of census data. With an interactive format including breakout room discussions, the panel solicited questions and concerns from the audience on use cases including urban/rural, housing, workforce, health, and justice issues. The panel and attendees engaged in a fruitful conversation about the implications of these changes and what users would like to see given the need to balance privacy and utility for different data categories and use cases. 

During this event, facilitators invited attendees to breakout rooms to discuss concerns related to the quality of decennial census data being released this year. Those discussions surfaced two main themes, along with an assortment of other concerns.

Balancing Privacy and Utility

First, participants are concerned with balancing the privacy and utility of data. For example, the Census Bureau’s new privacy mechanism, known as the Disclosure Avoidance System (DAS), uses a process known as differential privacy to limit identification of individuals using granular data. There are currently no tools available that explain these changes in layman’s terms, leading to a lack of clarity among data users on fundamental questions such as whether new data from the Census Bureau will be comparable to non-DAS data. 

In particular, there is concern about whether data on smaller populations (and smaller sample sizes), such as American Indians, will be fit for use. Researchers wanting to conduct analyses of housing structure by race, for example, are uncertain if the data will be accurate for all groups. These concerns extend beyond the 2020 census planned releases.  Participants were confused and concerned with how the population base from the 2020 census will affect the American Community Survey and population estimates.  

Further, with the Disclosure Avoidance System and differential privacy there may be inconsistent household and population data, as person and household records will not be processed simultaneously and therefore not linked. This will prevent meaningful measures of persons-per-household. There are concerns related to the levels of noise in the data, and how that will affect the ability of local governments to serve their communities.
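The persons-per-household concern can be illustrated with a small sketch. It is not the Census Bureau's algorithm: it assumes simple Laplace noise added independently to person and household counts, and the block counts, the `epsilon` value, and all function names are made up for illustration. It only shows why a ratio of two separately noised counts can drift far from the true ratio in small geographies while staying close in large ones.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count, epsilon, rng):
    """Add Laplace noise with scale 1/epsilon, then round and clamp at zero."""
    noisy = true_count + laplace_noise(1 / epsilon, rng)
    return max(0, round(noisy))


def persons_per_household(persons, households):
    """Return the ratio, or None when the household count is zero."""
    return persons / households if households else None


rng = random.Random(2020)  # fixed seed so the sketch is repeatable

# Hypothetical (persons, households) counts; not real census figures.
blocks = {"small block": (12, 5), "large block": (2400, 1000)}

for name, (persons, households) in blocks.items():
    # Persons and households are noised separately, so they are no longer linked.
    noisy_p = noisy_count(persons, epsilon=0.5, rng=rng)
    noisy_h = noisy_count(households, epsilon=0.5, rng=rng)
    print(name,
          "true:", persons_per_household(persons, households),
          "noisy:", persons_per_household(noisy_p, noisy_h))
```

Because the noise has roughly the same size whatever the count, it is a much larger share of a 5-household block than of a 1,000-household block, which is why the ratio is less stable for small areas.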

Census data has a wide variety of use cases, and nearly all discussions of the DAS have focused on the Redistricting File to be published Summer 2021.  Participants question how all the use cases for the Demographic and Housing Characteristics File will be handled, and whether there will need to be different versions of datasets to fit different use cases or some other work-around. 

It is important that the Census Bureau find a way to communicate differential privacy to laypeople through educational trainings and tools. It must work with community groups to share information and build grassroots understanding. Story maps like On the Map or GIS visualizations may be able to supplement these trainings, showing how current statistics are affected by these developments.

With the need to balance privacy and data quality in mind, what are some compromises that were acceptable to attendees? The practice of “binning” data may be an option – for example, releasing three-group race data rather than four. For race and Hispanic origin, some attendees indicated that summary race data was usable, but that keeping block level data available is essential. Block level data in general has been helpful for cross-walking between tracts, which supports infrastructure planning. Some participants felt that detailed age data may be more important than race data. For some, it would be preferable to forfeit highly detailed tables in order to preserve publications for more geographies. Data accuracy was favored over granularity by many participants (though without consensus over which statistics to roll up or suppress).
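As a minimal sketch of what “binning” means in practice, the example below rolls four detailed categories up into three by summing counts. The group labels, counts, and the `bin_counts` helper are hypothetical and do not reflect any actual Census Bureau table design.

```python
from collections import defaultdict


def bin_counts(detailed_counts, bin_map):
    """Sum detailed category counts into the coarser bins named in bin_map."""
    binned = defaultdict(int)
    for category, count in detailed_counts.items():
        binned[bin_map[category]] += count
    return dict(binned)


# Made-up counts for one small geography with four detailed groups.
detailed = {"Group A": 140, "Group B": 35, "Group C": 4, "Group D": 3}

# Hypothetical roll-up: the two smallest groups share a single bin.
four_to_three = {
    "Group A": "Group A",
    "Group B": "Group B",
    "Group C": "Groups C and D",
    "Group D": "Groups C and D",
}

print(bin_counts(detailed, four_to_three))
# {'Group A': 140, 'Group B': 35, 'Groups C and D': 7}
```

The total population is unchanged, but the smallest published cell grows from 3 to 7, so any added noise is a smaller share of each published number. The cost is that Groups C and D can no longer be analyzed separately.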

Data Categories and Use Cases

Attendees had various concerns related to specific data categories and use cases. With regard to urban and rural geographies, it is helpful to minimize constraints on data. Attendees were concerned about definition changes that may be implemented, such as the change of the definition for metropolitan statistical areas and how this will affect funding allocations and metropolitan planning organizations. In addition, it is unclear how the differences in data collection and other characteristics between rural and urban areas would cause disproportional errors in imputation.

Group quarters also present unique challenges and opportunities for the census. As many group quarters are businesses or government facilities, extensive administrative data are often available on these facilities, and can play a role in producing more accurate group quarters population counts.

Data Collection

Large-scale changes have occurred in recent decennial censuses in the way the Census Bureau collects data, such as internet response and greater use of administrative data. Users are interested in more information about changes from prior decades and how those changes affected data quality. For example, it would be helpful to have a step-by-step guide to changes found in a single location that explains planned changes to the 2020 census and changes imposed on the Census Bureau due to COVID-19.

Attendees were also monitoring decennial changes unrelated to differential privacy. Housing and housing stock changes and internet self-response (especially in areas with poor internet connections such as rural areas or impoverished inner cities) will impact data in ways that are yet to be determined. Also, during pandemic lockdowns, people moved to unexpected places, exacerbating the typical springtime “snowbird effect.” Finally, there are concerns about duplication of entries due to non-ID submissions.

2021 APDU Data Viz Awards: Call for Visualizations

The Association of Public Data Users (APDU) is pleased to announce the 2021 Data Viz Awards. After a hiatus due to pandemic disruptions, we are again soliciting creative and meaningful graphic designs that use publicly-available data (for example, data from the Census Bureau or Bureau of Labor Statistics) to convey a compelling point or story.

APDU is particularly interested in data visualizations relevant to issues of 2021, such as:

  • Public health and COVID-19
  • Racial equity
  • Public engagement

About the Award

APDU started the Data Viz Awards in response to our members’ growing need to communicate their data and research to a variety of audiences using graphic technologies and cutting-edge techniques. APDU hopes to engage data users and help them understand and share data for analysis and decision making.

Nominees selected by an impartial expert committee from each category (listed below) will be invited to share their visualizations on the first day of the 2021 APDU Annual Conference: July 26, 2021, held virtually through Whova. Conference attendees will then vote on winners from each category using the Whova platform.

Winners in the “Researchers & Students” category will also receive a free APDU membership for 2021.

What We’re Looking For

APDU will select creative and compelling images in four categories:

  • State/Local government, including independent and quasi-independent agencies;
  • Federal government, including independent and quasi-independent agencies;
  • Private firms, which can include consultancies, advocacy groups, or any other private firms using public data; and
  • Researchers/Students, which can include any visuals published or formally presented by researchers or students in higher education, think tanks, research organizations, nonprofits, or similar.

Submissions must have been created after January 1, 2021.  All visualizations nominated for presentation at the Annual Conference will be eligible for the award provided that nominees register for the conference.

Deadline: Friday, May 28, 2021


APDU Workshop Series: Making the Best of the 2020 Census

Virtual Workshop

Town Halls: April 14 and May 12, 2021

Trainings: June 16, August 18, and September 15, 2021

Office Hours: Biweekly beginning June 9, 2021

Price: Free

Register Here

Accurate statistics about 2020 will rely on much more than the decennial census data collection. Developing reliable data will require an understanding of challenges resulting from the pandemic, combined with greater use of non-traditional sources like administrative records. The solutions to these problems will impact how data is gathered going forward for a variety of purposes: education, housing, economic development, public health, and more.

Register today for this series of town hall events and trainings. During this workshop series you will learn more about the quality of the data that state and local leaders rely on and how you can improve and supplement it.

Town Hall #1: April 14, 2021 (3:00 – 4:00 PM ET)
2020 Census was “Different” – A Rundown of Issues

Recording

Facilitators:

  • Amy O’Hara, Research Professor, Massive Data Institute, Georgetown University
  • danah boyd, Principal Investigator, Microsoft Research & Founder, Data & Society

With the COVID-19 pandemic, political interference, and disclosure avoidance concerns, this census was deeply impacted. Amy and danah will discuss what happened with the census, where we are now, what researchers are hearing from the Census Bureau, the updated timeline, and what the Census Bureau can still fix.

Town Hall #2: May 12, 2021 (3:00 – 4:30 PM ET)
Solving Data “Differences” – Assessing the Use Cases

Recording

Summary

Facilitators:

  • Amy O’Hara, Research Professor, Massive Data Institute, Georgetown University
  • Connie Citro, Former Director, Committee on National Statistics
  • Joe Salvo, Former Director, NYC Department of City Planning
  • Chris Dick, Founder, Demographic Analytics Advisors

In this town hall, we will solicit your concerns and questions about upcoming census products – specifically about urban/rural, housing, workforce, health, and justice use cases. We will discuss data sources and methods for these different use cases. Since 2020 census products are delayed, we will discuss alternative data sources that may support population measurement.

Training #1: June 16, 2021 (1:00 – 3:30 PM ET)
Addressing the Census – Why Address Data is Crucial and How to Use It

In the first of a series of trainings focused on preparing data users to use the 2020 census data, we will begin by familiarizing the group with the types of address data that lead to a high-quality census enumeration, that help validate census publications as they come out, and that may support mounting a Count Question Resolution challenge. In this session, we will review coverage and classification issues, how to evaluate data sources, and tools to assess your data.

Training #2: August 18, 2021 (1:00 – 3:30 PM ET)
Age Bins – Where to Find More Data

In our second training, we will discuss the importance of obtaining accurate data on different age categories. The Census Bureau has released demonstration data on their disclosure avoidance system; however, age bins have not been a component. Accurate age bins are critical for urban planning, public health, social research, and funding, and we know that the census has traditionally undercounted very young children and overcounted the elderly. We will discuss how possible imprecision in published census results may affect the age distribution and consider how age bins can be smoothed. We will also explore other datasets that can be used to understand key population subgroups.

Training #3: September 15, 2021 (1:00 – 3:30 PM ET)
Beyond COVID – Identifying Public Health Data to Prevent Disaster

Whether it’s a global pandemic or an overdose crisis in your community, we want to empower you with the tools and resources to identify patterns and be prepared to respond. This training will go over the new administration’s Executive Order and which datasets can drive insights around health, highlighting differences between statistical and tactical data. We will also discuss measuring migration and service utilization. With these tools, we are hoping to prepare our attendees to identify the best data and methods to deal with future public health crises or natural disasters.

Office Hours

The 2020 Decennial Census faces a number of potentially significant impacts to data quality for a variety of stakeholders with varying levels of data expertise. The Association of Public Data Users and the Massive Data Institute at Georgetown University are partnering to facilitate Office Hours for census stakeholders. These virtual meetings are dedicated spaces to speak with a team of experts who can answer questions related to census data quality. Please click on the link next to the expert you would like to schedule office hours with to be directed to a customizable calendar invite link. You can also send your questions or topic ideas to mdi-research@georgetown.edu, and we’ll be sure to find the answer or find an expert who knows the answer. All interested persons are welcome to attend.

Meet Our Census Data Experts:

Biweekly Open Office Hours:

Amy O’Hara: General Questions, Administrative Records, Data Linkage

Every other Wednesday at 5pm (June 9, 23; July 7, 21; August 4, 18; September 1)

https://georgetown.zoom.us/j/93504860306 

One-on-One Office Hours:

Claire Bowen: Data Privacy, Differential Privacy

Schedule with Claire 

Chris Dick: Population Estimates, Administrative Data, Data Use in State and Local Government

Schedule with Chris

Ron Prevost: Administrative Records, Population Estimates, Data Privacy

Schedule with Ron

2021 APDU Conference Call for Proposals

Public Data: Making Sense of the New Normal

APDU is welcoming proposals on “making sense of the new normal” using public data. With economic, public health, and governance challenges arising from COVID-19 and political polarization, trustworthy public data is vital to open and honest policy debates. APDU is interested in proposals regarding:

  • Novel uses of public data to understand the shifting American landscape;
  • Ways that researchers and advocates are ensuring that public data is accurate and equitable;
  • How public data can help restore trust in institutions;
  • How to rebuild trust in public data; or
  • Other related and relevant topics.

Proposals may be based on a particular project, data practice, or formal paper. You may submit ideas for a single presentation or a full panel (three presenters, plus a moderator). However, we may accept portions of panel submissions to combine with other presenters. Submissions will be evaluated on the quality of work, relevance to APDU Conference attendees, uniqueness of topic and presenter, and thematic fit.

EXTENDED Deadline: March 26, 2021

Please submit your proposal using the Survey Monkey collection window below.  Proposals will need to be submitted by members of APDU, and all presenters in a panel must register for the conference (full conference registration comes with a free APDU membership).  Proposers will be notified of our decision by mid-April.

About APDU

The Association of Public Data Users (APDU) is a national network that links users, producers, and disseminators of government statistical data. APDU members share a vital concern about the collection, dissemination, preservation, and interpretation of public data.  The conference will be held virtually on July 26-29, 2021, and brings together data users and data producers for conversations and presentations on a wide variety of data and statistical topics.


2020 APDU Candidate Statements

Candidate for President: Mary Jo Hoeksema

Since January 2004, Mary Jo Hoeksema has been the Director of Government Affairs for the Population Association of America and Association of Population Centers. In addition to representing PAA and APC, Ms. Hoeksema has co-directed The Census Project since 2008.  Prior to her position with PAA/APC, Ms. Hoeksema worked at the National Institutes of Health for approximately 10 years, as the Legislative Officer at the National Institute on Aging and as the Special Assistant to the Director of the NIH Office of Policy of Extramural Research Administration.  Ms. Hoeksema served as a Legislative Assistant for Congresswoman Rosa DeLauro and Legislative Correspondent for U.S. Senator Jeff Bingaman.  Ms. Hoeksema moved to Washington, DC from her home state of New Mexico to work at the Council for a Livable World as a 1990 Scoville Fellow.

Ms. Hoeksema has a Master of Public Administration from the George Washington University and is a former Presidential Management Fellow. She also has a bachelor’s degree in political science and history from the University of New Mexico.

Candidate Statement

I was introduced to APDU shortly after arriving at the Population Association of America (PAA). I was immediately drawn to the organization given its mission and the fellowship that I found with its members. I discovered that the annual meeting was a unique opportunity to meet data users outside of academia–especially those from federal, state, and local governments–and learn firsthand what issues were affecting their access to timely and accurate data.

I have served on the APDU board, as a member and previously as Vice President, for approximately four years. During this time, I’ve been involved in several initiatives, including revising the organization’s strategic plan, advising APDU’s advocacy agenda, and co-chairing the annual meeting. These experiences, combined with my frequent interactions with APDU members, have given me insight into the organization’s strengths and challenges. If elected president, I would build upon the work APDU has initiated to:

  • Increase APDU’s membership, particularly among young professionals entering the field;
  • Enhance the organization’s visibility inside and outside of the data user community;
  • Improve APDU’s education and training opportunities;
  • Strengthen communication with APDU members; and,
  • Seek opportunities to collaborate with similar organizations to advance the interests of the diverse data users APDU represents.

If elected president I will always be open to hearing ideas and discussing issues with members.

Candidate for Vice President: Amy O’Hara, Research Professor, Georgetown University

Amy O’Hara is a Research Professor in the Massive Data Institute and Executive Director of the Federal Statistical Research Data Center at the McCourt School for Public Policy. She also leads the Administrative Data Research Initiative, improving secure, responsible data access for research and evaluation. Her research focuses on population measurement, data quality, and record linkage. O’Hara has published on topics including the measurement of income, longitudinal linkages to measure economic mobility, and the data infrastructure necessary to support government and academic research.

Prior to joining Georgetown, O’Hara was a senior executive at the U.S. Census Bureau where she founded their administrative data curation and research unit. She received her Ph.D. in Economics from the University of Notre Dame.

Candidate Statement

Last year, I wanted to serve on the APDU board to improve data access and quality for members, researchers, and program administrators. This year has revealed the cracks in our measurement infrastructure and the dire need to explain and inform our decision makers.  2020 has been rough on everyone, but especially on institutions like CDC and the Census Bureau.  The impact of the pandemic continues to evolve in state and local governments, who face rising infection rates, battered economies, volatile budgets, and a great deal of uncertainty.  Data will not solve these problems, but none of these problems can be solved without data.

APDU can, and must, foster coordination between federal, state, and local data producers and data users. For APDU, I will work towards establishing standards and norms for secure and responsible data use. Our community needs to incorporate broader views of where data comes from and what it is needed for; emphasize data utility when designing privacy protections; and increase social license.

Candidate for At-Large Director: Bernie Langer, Senior Data Analyst, Center for Court Innovation

Bernie Langer’s expertise in public data comes from his previous work at PolicyMap. Mr. Langer has a deep and broad knowledge about federal statistical agencies and private data providers, as well as experience working with data and data users to solve problems. He worked with data from the Census Bureau, BLS, IRS, SSA, HUD, USDA, FDIC, FBI, FCC, FEMA, DOT, NCES, EPA, SBA, and CDC, just to name a few. Mr. Langer also led PolicyMap’s “Mapchats” webinar series, a forum for data providers and users to discuss their work.

Mr. Langer’s current work at the Center for Court Innovation deals with a very different type of data, regarding New York City’s criminal justice system. In his role as a senior data analyst, Mr. Langer works with the organization’s Supervised Release Program, a pre-trial alternative to bail.

Candidate Statement

I am excited to continue serving on the APDU Board of Directors. In my last term, I served on the conference committee, which put together APDU’s first ever virtual conference. The conference was a success, virtually bringing together people working in data from across the country at a crucial point during the 2020 Census and Covid crisis.

I find APDU’s conferences, webinars, and newsletters invaluable. As a board member, I would continue my commitment to maintaining the high quality of APDU’s services and events, finding additional ways for data providers and users to interact, and raising the profile of public data in society.

Candidate for At-Large Director: Michelle Riordan-Nold, Executive Director, Connecticut Data Collaborative

Michelle Riordan-Nold has served as Executive Director of the Connecticut Data Collaborative (CTData) since 2014. In her current role, Ms. Riordan-Nold leads CTData, whose mission seeks to democratize access to public data and build data literacy skills to increase data informed decision making in Connecticut. CTData is also the designated Census State Data Center for Connecticut. In addition, the organization holds monthly public data literacy workshops; creates maps and other visualization tools for community organizations to access and use data; and is building an integrated data system in Hartford. In 2020, the organization was the winner of the CT Entrepreneurial Award in Education.

Prior to leading CTData, Ms. Riordan-Nold worked as a research analyst for the CT Economic Resource Center and before that for the Connecticut Legislature in the Program Review and Investigations Committee. Ms. Riordan-Nold has a bachelor’s degree in Mathematics from Boston College and a master’s degree in Public Policy from the University of Chicago.

Candidate Statement

I have been both an attendee and a presenter at the APDU conferences for the past five years. It is great to be a part of a community that is working on improving public access to data and sharing new ways to access and improve its use. I am always amazed at the initiatives happening at the federal level and leave each conference with new ideas and data to share with the community of data users we serve in Connecticut.

If elected, I would be interested in finding ways to increase the membership to include more state level data users. Federal data is critical to much of the work at the state level and I see an opportunity for sharing and increasing the knowledge of both state and federal data users to help improve the work at all levels of government.

I also see an important role of the APDU in staying connected and informed about the evolving Disclosure Avoidance Policy implementation. I believe this should be at the forefront of all data discussions and was encouraged by the attention it received during this year’s conference. The APDU plays an important role in guiding the data user community on how to use the data but can also advocate to make sure the data is provided in such a way that it can be used for informed decision making at all levels of government. I would encourage the APDU to take a more active role in advocating for transparency around the implementation.

If elected, I hope to provide a state level perspective and contribute to the growth of the organization by helping to broaden the membership to include a more diverse group of data users.

Candidate for At-Large Director: Daniel Quigg, CEO, Public Insight Corporation

Dan Quigg is a serial entrepreneur focusing primarily on software analytics. Dan has served as CEO of Public Insight Data Corporation (Public Insight) since 2012, a business intelligence company that transforms public data into actionable insights with solutions in career and workforce development, staffing and recruiting, and higher education benchmarking. Public Insight leverages industry and government data in its self-service business intelligence applications such as Insight for Work and Insight for Higher Education.

Over his career of over thirty years Dan has founded or led eight early stage businesses. Dan is an Ernst & Young Entrepreneur of the Year finalist and winner of the Smart Business Rising Star Award. He successfully sold three businesses, two to public technology firms where he took a senior executive position. He has also served on the adjunct entrepreneurship faculty of Kent State University and has served on multiple corporate boards. Dan has also served as an advisor for micro-economic development in developing countries, primarily Rwanda and Peru. He currently is on the National Council of the Valparaiso University College of Business.

Dan received his B.S. from Valparaiso University in 1981 and his CPA in 1983.  He received his MBA from Case Western Reserve University Weatherhead School of Management in May 2007.  Dan was the inaugural winner of the Weatherhead Executive MBA Leadership Award as nominated by his peers.

Candidate Statement

I have always had a passion for data and am a self-described “data junkie”. I founded Public Insight in 2012 because I saw an asset in public data that was dramatically underutilized. Public Insight was built around that very concept.

I have been involved with APDU since starting Public Insight. My company and I have benefitted greatly from the research, webinars, and conferences. However, I feel that there is a large, untapped audience in the private sector that uses public data and is not being reached by APDU. I see it every day. Should you decide to accept my candidacy into APDU, I would advocate for outreach to the private sector. Given my startup experience, I can add a lot of value in extending APDU’s reach into the private sector.

I would advocate for more online education and training to the private sector. In the labor market particularly, there is a hunger for more information due to pandemic-induced volatility. I see courses like what is currently being offered through the Labor Market Institute (LMI) as a vehicle to reach a broader audience with minimal investment and risk.

My impressions of APDU suggest it is moving more and more to policy and advocacy. My interest is not in these areas, nor would I add any value there. I am a user of public data and want to see its value disseminated. This is where I can add value and where the mission is aligned with Public Insight.

Candidate for At-Large Director: Lori Turk-Bicakci, Ph.D., Director, Lucile Packard Foundation for Children’s Health

Lori Turk-Bicakci, Ph.D., is Director for Kidsdata, a program of the Lucile Packard Foundation for Children’s Health. She promotes data-based decision making and action to improve children’s health and well-being, and she contributes to the quality, relevance, and utility of the data and content on kidsdata.org.  She oversees the process of collecting, preparing, and releasing data from more than 35 federal and state data sources. Before joining the Foundation, Dr. Turk-Bicakci was a senior researcher at American Institutes for Research. She has extensive experience with data collection, analysis, and reporting for education, social services, and other research projects that support children’s long-term health and development. Prior to her work in research, Dr. Turk-Bicakci was a middle school social studies teacher.

Fundamentals of Data Science and Visualization

Virtual Training

November 9 – 19, 2020

Classes: Nov 9, 10, 16, 17, 19 from 2:00-4:00 pm Eastern

Office Hours: Nov 9, 10, 16, 17, 19 from 4:00-5:00 pm Eastern

DOWNLOAD AGENDA

PDF Registration

Online Registration

Data analysts can use a variety of methods and tools to accomplish their goals. With a deeper understanding of data visualization software packages, your organization can produce more intuitive data visualizations in less time and identify the best software solutions to optimize your team’s workflows.

In this course, we will review best practices in data visualization design and use cases for Excel, Tableau, and R (programming language).

Learn how to clean and format data in Excel, create interactive dashboards in Tableau, and clean and visualize data in R. This course will help participants identify use cases for each software package that maximize impact with minimal effort, expanding participants’ toolboxes as analysts.

Join us to learn about how your organization can better leverage data visualization software!

Meet Your Instructor:

Lee Winkler joined the Center for Regional Economic Competitiveness (CREC) in 2018 after graduating with a Master’s in Public Policy from the George Washington University. He currently supports projects analyzing state-level certification and license attainment and the prevalence of educational and workforce credentials. Lee regularly uses Tableau to clean data, mine insights and create interactive visualizations and is excited to help the class find how Tableau can add value to their workflow.

Registration:
APDU Members: $390
Non-Members: $715

Looking Back on the 2020 APDU Annual Conference

 

With the 2020 APDU Annual Conference in the rearview mirror, now is a good time to reflect on the week and look ahead to what’s next.

This year’s conference, like so many things in 2020, was disrupted but not diminished. While we didn’t have the opportunity to meet with each other in person, the virtual format enabled some of our friends from around the country to participate who might not have been able to otherwise.

Speakers like danah boyd of Microsoft Research and Data & Society Research Institute (excerpted above) brought a unique perspective to the conference, challenging our thinking on issues ranging from how we approach privacy and accuracy to the impacts that misinformation and data voids can have on our understanding of data quality and reliability.

Federal agency leaders such as Deborah Stempowski, Brian Moyer, Bill Beach, and Mary Bohman provided insider insights into their organizations.

Speakers from universities and research organizations across the country covered hot topics such as data on COVID-19, evictions, policing, and more.

Speakers from the Census Bureau, universities, and nonprofits discussed how the Disclosure Avoidance System will affect the quality of Census data.

Attendees met with APDU board members in a series of town hall conversations on a variety of topics – offering a promising way for APDU members to connect with one another.

This year’s conference was a success for a variety of reasons – but the biggest reason was the engagement of our attendees and speakers. Stay tuned for continued quality programming in Fall 2020!

Intermediate Data Visualization Techniques in Tableau

August 25-September 3, 2020

Virtual Training

AGENDA

A picture is worth a thousand words. Use data to state your case using easy-to-understand data visualization tools. Give your audience the freedom to adapt your data in new ways in interactive dashboards that answer immediate questions and uncover new insights. Data visualization tools can help you communicate better both internally and with your partners.

Tableau can help you produce more intuitive data visualizations, and we can show you how. In this course, you will build your skills in making appropriate graphics, but you will also incorporate complex calculations in ways that improve insights, make charts more relevant, and create the most impactful dashboard graphics.

Learn how to clean, shape, aggregate, and merge frequently used public data in Tableau Prep. Then, organize your visualizations into sleek dashboards in Tableau Desktop. We will provide helpful tips on how to analyze, design, and communicate these data in ways that will wow your supervisor and organization’s customers.

Training Prerequisites:

Skills: Participants must have a basic understanding of how Tableau works before attending this class, including knowledge of Tableau terminology, uploading data, editing data sources, and creating basic charts. Attendees should be familiar with all materials presented in the Pre-Session Videos: Overview of Charts and Calculated Fields.
Tools: Laptop, wired mouse, Tableau Desktop (personal, professional, or public version), and Tableau Prep.
• The public version of Tableau Desktop is available at:
https://public.tableau.com/s/download
• Tableau Prep Software can be downloaded here:
https://www.tableau.com/products/prep/download

**Zoom will be required for this training – if you have Zoom restrictions for a work laptop, we recommend using a personal laptop or desktop. We do not recommend using an iPad for this training.

Pricing:
APDU, C2ER, LMI Institute Premium Organizational Members: $495
APDU, C2ER, LMI Institute Individual & Organizational Members: $575
Non-Members: $715

CANCELLATION POLICY: APDU must confirm cancellation before 5:00 PM (Eastern Standard Time) on August 14, 2020, after which a $135 cancellation fee will apply. Substitute registrations will be accepted.

APDU Member Blog Post: It’s not too late to rebuild data-user trust in Census 2020 data products

By: Jan Vink, Cornell Program on Applied Demographics
Vice chair of the Federal State Cooperative on Population Estimates Steering Committee
Twitter: @JanVink18
Opinions are my own

The Census Bureau is rethinking the way it will produce the data published from Census 2020. The Bureau argues that the old way is no longer good enough in this day and age because, with enough computing power, someone could learn too many details about the respondents.

There are two separate but related aspects to this rethinking:

  1. The table shells: what tabulations to publish and what not to publish
  2. Disclosure Avoidance Systems (DAS) that add noise to the data before creating these tables

Both aspects have huge consequences for data users. A good place to start reading about this rethinking is the 2020 Census Data Products pages at the Census Bureau.
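
To make the second aspect more concrete, here is a minimal sketch of what "adding noise" to a count can look like and how the amount of noise depends on a privacy budget. It uses the textbook Laplace mechanism, which is an assumption chosen for illustration and is not the Bureau's actual DAS. The function names, the epsilon values, and the block of 25 residents are all hypothetical.

```python
import random
import statistics


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    """Return a count with Laplace noise of scale 1/epsilon added.

    A counting query changes by at most 1 when one person is added or
    removed, so its sensitivity is 1 and the noise scale is 1/epsilon.
    (Textbook mechanism, assumed here for illustration only.)
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=10_000, seed=0):
    """Estimate the average absolute error of the noisy count."""
    rng = random.Random(seed)
    errors = [
        abs(noisy_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)


if __name__ == "__main__":
    # Hypothetical census block with 25 residents; epsilon values are
    # illustrative and not taken from any Census Bureau publication.
    for epsilon in (0.1, 0.5, 1.0, 4.0):
        error = mean_abs_error(25, epsilon)
        print(f"epsilon={epsilon}: mean absolute error ~ {error:.2f}")
```

In this simplified setting, a smaller epsilon (a tighter privacy budget) means more noise, and the expected error does not depend on the size of the true count. That is why the same noise that is negligible for a large county can be substantial for a small block. A production system would involve further steps that this sketch does not model.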

The Census Bureau is aware that there will be this impact and has asked the data-user community for input in the decision process along the way. There were Federal Register Notices asking for use cases related to the 2010 tables and a request for feedback on a proposed set of tables. The Bureau also published applications of a DAS to 1940 Census data, to PL94-171 data from the 2018 test, and to the 2010 Demonstration Products. Currently the Census Bureau is asking for feedback on the measurement of progress of the DAS implementation they plan to use for the first set of products coming out of the Census.

The intentions behind the stakeholder involvement were good, but it didn’t lead to buy-in from those stakeholders. Many are afraid that the quantity and quality of the published data will severely impair the capability to make sound decisions and do sound research based on Census 2020 and on products that are directly or indirectly based on that data. Adding to this anxiety are the very difficult and unexpected circumstances the Census Bureau has to deal with while collecting the data.

From my perspective as one of those stakeholders who are wary about the quantity and quality of the data, there are a few things that could have gone better:

  • The need for rethinking has not been communicated clearly. For example, I cannot find a Census Bureau publication that plainly describes the re-identification process; all I can find are a few slides in a presentation. A layman’s explanation of the legal underpinning would be helpful as well, as some argue that there has been a drastic reinterpretation.
  • The asks for feedback were all very complicated and time consuming, and they reached only a small group of very dedicated data users who felt tasked to respond for many and stuck with the low-hanging fruit.
  • It is not clear what the Census Bureau did with the responses.
  • The quality of the 2010 Demonstration Products was very low and would have severely impacted my use of the data and many others’ uses.
  • Most Census Bureau communications about this rethinking consisted of a mention of a trade-off between privacy and accuracy, followed by a slew of arguments about the importance of privacy and hardly any mention of how important accuracy is for the mission of the Census Bureau. Many stakeholders walked away with the feeling that the Bureau feels responsible for privacy protection, but not as much for accuracy.

There is a hard deadline for the production of the PL94-171 data, although Congress has the power to extend that date because of the COVID-19 pandemic. Working back from that, I am afraid that decision time is not too far away. The Census Bureau is developing the DAS using an agile system with about 8 weeks between ‘sprints’. The Bureau published updated metrics from sprint II at the end of May, but had already started sprint IV by that time. If we keep the 8 weeks between sprints, this implies, in my estimation, that there is room on the schedule for 2 or 3 more sprints and very little time to rebuild trust with the data-user community.

Examples of actions that would help rebuild some trust are:

  • Appoint someone who is responsible for the stakeholder interaction. So far, my impression is that there is no big-picture communication plan and that two-way communication depends too much on who you happen to know within the Census Bureau. Otherwise the communication is impersonal and slow, and often without a possibility for back-and-forth. This person should also have the seniority to fast-track the publication review process so stakeholders are not constantly 2 steps behind.
  • Plan B. A chart often presented to us is a line that shows the trade-off between privacy and accuracy. The exact location of that line depends on the privacy budget and the implementation of the DAS, and the Census Bureau seems to hold the position that it can implement a DAS with a sweet spot between accuracy and privacy that would be an acceptable compromise. But what if there is no differential-privacy-based DAS implementation (yet?) that can satisfy a minimal required accuracy and a maximal allowed disclosure risk simultaneously? So far it is an unproven technique for such a complex application. It would be good to hear that the Census Bureau has a plan B and a set of criteria that would lead to a decision to go with plan B.
  • Promise another set of 2010 data similar to the 2010 Demonstration Products so data users can re-evaluate the implications of the DAS. This should be done in a time frame that allows for tweaks to the DAS. Results of these evaluations could be part of the decision whether to move to plan B.
  • Have a public quality assurance plan. The mission of the Census Bureau is to be the publisher of quality data, but I could not find anything on the Census Bureau website that indicates what is meant by data quality and what quality standards are used. Neither could I find who in the Census Bureau oversees and is responsible for data quality. For example: does the Bureau see accuracy and fitness for use as the same concepts? Others disagree. And what about consistency? Can inconsistent census data still be of high quality? Being open about data quality and having a clear set of quality standards would help show that quality is of similar priority as privacy.
  • Publish a timeline, with goals and decision points.
  • Feedback on the feedback: what did the Bureau do with the feedback? What criteria were used to implement some feedback but not other feedback?

Time is short and stakes are high, but I think there are still openings to regain the trust of the data community and have Census data products that will be of provably high quality and protect the privacy of the respondents at the same time.