Researchers say the voluntary National Household Survey is unreliable.
In 2006, the small Canadian town of Snow Lake, Manitoba, had 837 residents, many of whom worked in the local mining industry. It was a prosperous community: The typical family earned 84,000 Canadian dollars a year, well above the national median of about 66,000 Canadian dollars, and the unemployment rate was just 5.1 percent even though only a small fraction of residents had a college degree.
Five years later, Snow Lake had lost more than a tenth of its population, shrinking to 723 residents. But the government doesn't know who those residents were - what they earned, how much school they'd completed, whether they were working. That's because in 2011, unlike five years earlier, filling out the government survey that collected the information wasn't mandatory - and nearly three-quarters of the Snow Lake residents who received it decided not to bother.
Snow Lake's experience is an extreme example of a challenge playing out across Canada. When Canadian Prime Minister Stephen Harper announced plans in 2010 to make the government's primary source of household data into a voluntary survey, researchers across Canada warned of dire consequences for the survey's reliability. Those predictions have largely come true: In 2006, nearly 94 percent of Canadian households that received the survey responded to it. In 2011, the response rate fell below 70 percent. As a result, Statistics Canada, the country's statistical agency, decided not to release detailed data on Snow Lake and more than 1,000 other communities, and researchers have called into question the validity of the data on other areas that was released.
Canada's experience with a voluntary household survey is now drawing attention in the United States. Republican lawmakers led by Texas Congressman Ted Poe are pushing to make a similar change to the American Community Survey - a comparable annual questionnaire that aims to measure national trends in dozens of areas such as education, housing, fertility and employment by surveying more than 3 million Americans each year. Both Poe and his Canadian counterparts consider the surveys an invasion of privacy, but researchers on both sides of the border say the Canadian experiment is a harsh lesson in what can happen when a country loses its commitment to collecting accurate information about its residents.
"There seem to be just uncanny and unfortunate parallels," said Terri Ann Lowenthal, who has written extensively about the issue for the Census Project, a U.S.-based network of census data users. "What Canada's parliament did is exactly what Congressman Poe and other ACS critics are proposing to do" to the ACS.
Some disclosure is probably in order here: My FiveThirtyEight colleagues and I aren't exactly neutral observers when it comes to the value of the American Community Survey. When I reported on the economic consequences of different college majors last year, that analysis was based on ACS data. So was my story from June on where people are killed by police. So was Mona Chalabi's analysis of the most cosmopolitan metropolises, Leah Libresco's story on domestic partnerships, and Jia Zhang's fantastic censusAmericans Twitter bot.
But government surveys are good for a lot more than data journalism and Twitter bots. They are used to evaluate the effectiveness of government programs, guide decisions on school construction and infrastructure upgrades, and make business investment decisions. When researchers in San Diego wanted to study the impact of the city's proposed minimum-wage hike, they turned to ACS data. So did Boston when it wanted to assess the progress of its programs to encourage cycling, and Cincinnati when it applied for a grant to build a new streetcar system.
Canada, like the U.S., conducts a regular census (every five years compared with every 10 years in the U.S.) that attempts to count every resident. The full census collects only very limited information such as age, sex and marital status. Until 2011, Canada supplemented that data with a broader, mandatory survey that was known as the long-form census and was sent to roughly 1 in 5 households along with the standard census form. But in 2010, Canada's Conservative government decided to eliminate the long-form census and replace it with a new survey that covered the same basic topics but was conducted separately from the census and was voluntary to complete.
That seemingly small change has had far-reaching consequences. Canadian researchers say the new, optional National Household Survey is less reliable, less comprehensive and significantly more expensive than the mandatory survey it replaced. And it isn't just researchers who are worried: A diverse set of groups, from local governments to business organizations, has criticized the shift to an optional survey as shortsighted.
Critics argue that the voluntary survey fell short in two crucial ways. First, because response rates were so much lower, the survey wasn't able to collect reliable data on smaller communities, including Canada's many sparsely populated areas. Second, surveyors struggled to collect sufficient data on certain groups - the poor, aboriginal populations, immigrants and others - that have always been among the hardest to reach. That means the resulting data could be biased, possibly in ways that could be difficult to detect.
"The replacement survey is a complete waste of money because the data are simply not reliable," said Munir Sheikh, who resigned in protest from his position as chief statistician of Canada in 2010 following the government's decision to make the survey voluntary. "The quality of the data that the census collects obviously is not good enough to be used as census data. That is the conclusion of any researcher who has looked at it."
A long list of Canadian organizations - research groups, nonprofits, local governments - joined Sheikh in objecting to the change. But Harper justified the move on civil liberties grounds, releasing a statement in response to the criticism that said citizens should not "be forced, under threat of fines, jail, or both, to disclose extensive private and personal information."
U.S. opponents of the ACS use much the same argument. In a May opinion piece, Poe described the survey, which is mandatory, as an infringement on Americans' liberty, calling it "another example of unnecessary and completely unwarranted government intrusion."
Poe has introduced a bill in each of the past two congresses that would make the ACS voluntary. Those efforts have failed, but in June, Poe successfully attached an amendment to an appropriations bill that would cut off any money for enforcement. The bill passed the House but looks unlikely to pass the Senate in the same form. Even if it did become law, its practical impact isn't clear. Americans who fail to answer the mandatory survey theoretically face fines, but there's no record of anyone being forced to pay one.
Still, there is evidence that moving away from a mandatory survey would have consequences in the U.S. similar to those in Canada. In 2003, at the request of Congress, the U.S. Census Bureau conducted a test to study the effect of making the ACS voluntary. The result, according to the bureau's report: "A dramatic decrease occurred in mail response rates when the survey was voluntary," which "adversely impacted" the reliability of estimates because sample sizes were too small in some areas. The overall response rate fell 12 percentage points, and the bureau estimated that it would cost an additional $59 million to conduct a voluntary survey, a 38 percent increase.
It might seem counterintuitive that conducting a voluntary survey costs more than conducting a mandatory one, but it's a simple question of math: Response rates are lower on voluntary surveys, so collecting a large enough sample requires conducting more interviews. Canada sent the voluntary household survey to 1 in 3 households instead of 1 in 5, at an added cost of 22 million Canadian dollars.
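For readers who want to see that math, here is a minimal sketch in Python. The target of 1,000 completed surveys is a hypothetical figure chosen for illustration, and the function name is made up for this example. The response rates are rounded from the figures in this story: about 94 percent for the mandatory 2006 survey and just under 70 percent for the voluntary 2011 survey.

```python
import math


def mailouts_needed(target_responses, response_rate):
    """Surveys that must be sent out to expect `target_responses` completed forms."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be between 0 and 1")
    return math.ceil(target_responses / response_rate)


# Hypothetical target: 1,000 completed surveys.
TARGET = 1_000

# Rounded response rates from the story (the 2011 rate was a bit below 0.70).
mandatory = mailouts_needed(TARGET, 0.94)
voluntary = mailouts_needed(TARGET, 0.70)

print("Mandatory survey mailouts:", mandatory)
print("Voluntary survey mailouts:", voluntary)
print("Extra mailouts needed:", voluntary - mandatory)
```

Under these assumptions, the voluntary survey needs roughly a third more mailouts to end up with the same number of completed forms. The sketch covers only the count of responses. It says nothing about who responds, which is a separate problem.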
But even the larger, more expensive survey couldn't collect enough information to produce reliable estimates for many small groups and less populated areas. Statistics Canada lowered the bar for the minimum acceptable response rate to 50 percent from 75 percent yet couldn't meet even that reduced threshold for hundreds of small communities representing about 3 percent of the population. It chose not to publish detailed data for those communities; in other cases, it flagged information as potentially unreliable after identifying possible flaws in the survey results.
Moreover, response rates weren't even across the country. In some areas, nearly all residents returned their surveys. In others, fewer than 1 in 5 did.
"Those who are high income, who are educated, who are native Canadians, their response rate is in the 90s," said Sheikh, the former chief statistician. "The response rate of people who are aboriginal, who are immigrants, who are unemployed, who are low-income, who are less educated, their response rates are quite low."
Statistics Canada acknowledges that response rates were much lower for the voluntary survey and that as a result, it had to stop publishing data on some communities and add warnings about data quality on some tables. But the agency denies that the results it did publish were unreliable. Marc Hamel, director general of Canada's census, said the agency concentrated its follow-up efforts on groups with low response rates and was able to use statistical techniques to adjust for missing data. He noted that Canada still conducted its main, short-form census as usual, allowing statisticians to identify groups that were undercounted in the household survey and adjust the survey's estimates accordingly. And he said his agency had compared the survey results to immigration and tax records to flag potential discrepancies.
"Many of the reports that we read in the media are not supported by more concrete examples," Hamel said. "All of the analyses that we've done and all of the approaches that we've used seem to show that the data are fairly solid."
But many researchers remain skeptical. They say their biggest fear is what they don't know. Small sample sizes are easy to identify, but other forms of error in the data can be harder to recognize. If low- or high-income households were particularly likely to skip the survey, for example, that could bias the results. Stephen Gordon, a Canadian economist who was among the first to highlight the risk of eliminating the mandatory survey, said he is working on research on the top 1 percent of Canadian earners. But he said he doesn't know whether he can trust the data.
"We really don't know how good it is," Gordon said. "Everyone's going to be putting a big asterisk on 2011."
At least with the 2011 survey, researchers can compare the numbers to 2006 to look for discrepancies. But the more time that passes since the last mandatory survey, the harder it will be to detect problems, said Frances Woolley, another economist who has spoken out about the change. She likened it to listening to the radio as you're driving out of range of the station.
"It just gets worse and worse and worse," Woolley said, "and there is some point where you just say, ‘I can't listen to this anymore,' and you change the station."