Escaping the Vicious Cycle: Public Programs Must Invest in Data Quality

Jan 17, 2017
Scott Cody

Low-income families who live paycheck to paycheck are trapped in a vicious and frustrating cycle.

Consider, for example, a family spending hundreds of dollars a month on avoidable car expenses. You need a car to get to work. But because of a lack of savings and poor credit, you can only afford a used 20-year-old clunker financed with a high-interest loan. Each month you make inflated interest payments and pour money into endless repairs because the interest payments preclude you from getting a more reliable car. Another cracked ball joint? Just how many ball joints can one car have, anyway?

If you just had enough cash to afford a better car, you could avoid expensive repairs and high-interest payments. You could pay off high-interest credit card debt, begin buying daily necessities in bulk, and even start a nest egg. Instead, the bills keep coming, and you struggle to stretch each paycheck.


The frustrating reality is that the social programs that serve our neediest families operate the same way. I’m not just talking about services for families who live paycheck to paycheck; I’m talking about services for people for whom that would be a step up. People who can’t secure employment because of severe mental or physical limitations. Children who have been removed from abusive homes. Seniors who, after a lifetime of living paycheck to paycheck, must now survive on government benefits.

These programs don’t have enough funding to cover their monthly expenses. As any program director will tell you, the tough budgeting choices they must make are gut-wrenching and heartbreaking—reducing services to clients, increasing caseloads for workers, freezing salaries year over year. In this context, the ability to invest in efforts that yield savings in the future comes at the price of more hardships today. Maybe you could buy a car that won’t break down, but you won’t have enough money to feed your family for the next three months.

But it doesn’t have to be this way. Governments can make investments that increase program effectiveness and reduce costs at the same time.

The Costs of Bad Data

One of these investments is in data quality. In an era when Big Data and sophisticated algorithms drive everything from how retailers track every shopper in a store to which potholes get fixed, too many social programs suffer from poor data quality. Administrative records contain incomplete, incorrect, and inconsistent information. Over a career spent working with public program data, I have seen too many examples of these data problems: Multiple records for the same household. Impossible birthdates (I’m confident that a mother of two cannot herself be three years old). Addresses entered in the wrong data field. Data fields that, because workers are trying to expedite data entry on clunky systems, have the exact same information for every program participant.
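To make these problems concrete, here is a minimal sketch of automated checks that would flag each of them in a participant extract. It assumes a hypothetical table with household_id, role, birth_date, and address columns; the field names and the pandas-based approach are illustrative, not drawn from any particular program’s system.

    # A minimal sketch of automated quality checks for a hypothetical
    # participant extract. Field names (household_id, role, birth_date,
    # address) are illustrative, not taken from any actual program system.
    import pandas as pd

    def audit_participants(df: pd.DataFrame) -> dict:
        """Count records exhibiting the data problems described above."""
        today = pd.Timestamp.today()
        birth = pd.to_datetime(df["birth_date"], errors="coerce")  # NaT if unparseable
        age_years = (today - birth).dt.days / 365.25

        return {
            # Multiple records for the same household
            "duplicate_households": int(df["household_id"].duplicated().sum()),
            # Impossible birthdates: unparseable or in the future
            "bad_birth_dates": int((birth.isna() | (birth > today)).sum()),
            # A three-year-old mother would be caught here
            "implausibly_young_parents": int(((df["role"] == "parent") & (age_years < 12)).sum()),
            # Addresses with no digits at all often landed in the wrong field
            "suspect_addresses": int((~df["address"].str.contains(r"\d", na=False)).sum()),
            # Fields holding one value for everyone suggest rushed data entry
            "constant_fields": [c for c in df.columns if df[c].nunique(dropna=False) <= 1],
        }

A periodic run of checks like these turns “our data are bad” into a fixable work list: which records, which fields, and how many.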

These types of data problems undermine program effectiveness. Program administrators can’t monitor programs and target services if their data are unreliable—and more often than not, they are.

According to a recent article in Harvard Business Review, IBM estimates that bad data cost the commercial sector more than $3 trillion a year. This is a staggeringly large number, especially considering how advanced the commercial sector is when it comes to data. The costs of bad data come from the following:

  • The post-processing of bad data to clean them for analysis
  • The time that decision makers spend hunting for the right data to inform a decision
  • The effort necessary to fix problems from decisions based on incomplete or incorrect data

The IBM estimates reflect costs to the commercial sector, not the public sector. But I am confident that most public sector programs are burdened by similar costs. Public programs—from education to health care, from child welfare to vocational services—lack state-of-the-art tools for managing their data. Programs spend exorbitant resources cleaning what limited data they have. Administrators spend countless hours trying to track down answers to their most basic questions. Program staff lack the information necessary to target services to the places they will be most effective. And policymakers lack the basic analytics required to assess whether costly activities generate any benefit whatsoever.

I know of no efforts to quantify these costs across federal, state, and local programs. But even the crudest attempts to approximate them suggest they are substantial. Government spending makes up roughly one-third of U.S. gross domestic product (GDP). Let’s say that the government costs of bad data are therefore one-third of the commercial costs. That’s $1 trillion. Of course, this logic ignores how GDP is actually calculated (and would surely make my undergraduate macroeconomics professor cringe). Still, an estimate that government costs equal one-third of the commercial costs seems a reasonable starting point. Think $1 trillion is too high? Well, even if it’s one-quarter of that, we’re still talking about $250 billion. (Think that’s still too high? A quarter of that, $62.5 billion, is still significant, and at about 2 percent of the commercial costs it seems unrealistically low.) We can save money by addressing our data problems. And in doing so, we can make our programs more effective.
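For transparency, here is the same back-of-the-envelope arithmetic as a few lines of Python. The $3 trillion figure is IBM’s estimate cited above; everything else is just the rough scaling described in the text, not a real cost model.

    # Back-of-the-envelope scaling of IBM's $3 trillion commercial estimate.
    commercial_cost = 3_000_000_000_000     # dollars per year, per IBM
    naive_gov_cost = commercial_cost / 3    # government spending ~ 1/3 of GDP
    quarter = naive_gov_cost / 4            # "even if it's one-quarter of that"
    quarter_of_quarter = quarter / 4        # "a quarter of that is still significant"

    print(f"${naive_gov_cost / 1e12:.2f} trillion")   # $1.00 trillion
    print(f"${quarter / 1e9:.0f} billion")            # $250 billion
    print(f"${quarter_of_quarter / 1e9:.1f} billion, "
          f"{quarter_of_quarter / commercial_cost:.1%} of the commercial estimate")
    # $62.5 billion, 2.1% of the commercial estimate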

Better Data for a Better Future

What can we do? Legislative bodies can prioritize data quality investments for programs. These investments should not come at the cost of current services; rather, they should be seen as a way of ensuring lower costs and better results in the future.

Specifically, legislatures and policymakers should:

  • Ensure agencies can adopt effective data governance practices that designate responsibility for defining and maintaining high-quality information across the agency
  • Promote data quality assessments and support agencies’ abilities to investigate and remediate data quality problems
  • Support agencies’ ability to invest in streamlined data entry procedures and automated tools that ensure data quality (a brief sketch of one such check follows this list)
  • Address the silos, redundancies, and communication barriers that preclude multiple agencies from coordinating effectively toward the same goals
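On the third point, entry-time validation is far cheaper than downstream cleaning: an impossible birthdate rejected at the intake screen never needs to be found, investigated, and corrected later. Here is a minimal sketch of one such rule; the function and field are hypothetical, not a reference to any actual case-management system.

    # Entry-time validation: reject impossible values before they are saved.
    # Hypothetical intake rule; not drawn from any actual case-management system.
    from datetime import date

    MAX_PLAUSIBLE_AGE = 120  # years

    def validate_birth_date(raw: str) -> list[str]:
        """Return human-readable problems; an empty list means the value is acceptable."""
        try:
            birth = date.fromisoformat(raw)  # expects YYYY-MM-DD
        except ValueError:
            return [f"'{raw}' is not a valid date (expected YYYY-MM-DD)."]
        today = date.today()
        errors = []
        if birth > today:
            errors.append("Birth date is in the future.")
        elif (today - birth).days > MAX_PLAUSIBLE_AGE * 365.25:
            errors.append(f"Implied age exceeds {MAX_PLAUSIBLE_AGE} years.")
        return errors

    # Example: the intake screen blocks the save and shows the message instead.
    print(validate_birth_date("2030-05-01"))  # ["Birth date is in the future."]

The design point is that every dollar spent validating at entry displaces many dollars of downstream cleaning, hunting, and correcting.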

These principles seem straightforward, and recent developments such as the creation of the Commission on Evidence-Based Policymaking are an important step in the right direction. But implementing these principles effectively requires commitment on the part of funders and program administrators. Without this commitment, programs will continue to bear the steep costs of bad data today, and will never escape the vicious paycheck-to-paycheck cycle.

Learn more about Mathematica’s data analytics work.
