Industrious data probers at the Financial Times have raised good, tough questions about some of the wealth data in Thomas Piketty’s influential book (specifically, the material in his Chapter 10). Though we’ll need to wait for the author’s full response to the questions they raise, their careful forensics are a must-read for anyone who relies on Cap21’s wealth data.
A few responses:
–The key question is whether the apparent mistakes and inadequately explained adjustments uncovered by the FT change the trends in Cap21 in a way that would alter fundamental conclusions. According to an initial response from TP, studies using “…more systematic data than I used in my book, especially for the recent period,” and a “completely different data source and methodology” find similar trends.
OK, but I haven’t taken the time to scrutinize those data either, and they too are constructed using a series of imputations and adjustments that, while necessary, involve judgments by the authors. More on this problem below.
More supportive of TP’s original work are the trends in the FT charts themselves, which, for much of the data they’ve corrected, look a lot like TP’s original figures, though the UK trends are an exception (with considerably less wealth concentration evident in the FT version). Here, for example, is the figure for France, where the FT’s trends are about the same as TP’s.
Source: Financial Times
–TP’s income data, which come from tax records, are more reliable than the wealth data. It’s true that the income data, as I and others have stressed, generally leave out taxes and transfer payments, so they’re incomplete too. And TP and colleagues make adjustments to these data as well (something called “Pareto imputations,” used to estimate the distribution at the top of the income scale). But I trust those adjustments and believe their market income inequality trends are reliable. In fact, they’ve been around for a while now and have held up well to forensic scrutiny.
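For readers curious what a Pareto imputation involves, here’s a minimal sketch of the general technique, using made-up tabulated tax data (the thresholds and shares below are purely illustrative, and this is not TP’s actual procedure): because a Pareto tail is linear in log-log space, you can fit its coefficient from bracket counts and then recover average income above any threshold.

```python
import numpy as np

# Hypothetical tabulated tax data: income thresholds and the fraction
# of tax units above each threshold (invented numbers for illustration).
thresholds = np.array([100_000, 200_000, 500_000])
frac_above = np.array([0.10, 0.03, 0.005])

# Under a Pareto tail, P(income > y) = (k / y)^a, so the log of the
# fraction above y is linear in log(y) with slope -a. Fit by least squares.
slope, intercept = np.polyfit(np.log(thresholds), np.log(frac_above), 1)
a = -slope  # the Pareto coefficient

# For a Pareto tail, mean income above a threshold y0 is a/(a-1) * y0.
# This is how tabulated brackets get turned into top-income estimates.
mean_above_100k = a / (a - 1) * 100_000
```

The point of the sketch is that the whole top-of-the-distribution estimate hangs on one fitted parameter, which is exactly why the quality of the underlying tabulations matters so much.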
–There’s other solid evidence showing the increase in US wealth inequality over the past few decades, presented in the wealth chapter of EPI’s State of Working America, such as the figure below from the work of economist Ed Wolff. Also, especially in recent years, various national data sources, and not just in the US, reveal a shift in “factor incomes” from labor to capital, and capital ownership is more highly concentrated among the wealthy. I’m confident that recent trends in wealth accumulation are reliable.
Source: State of Working America.
So while the FT is absolutely correct to raise these questions, I’m quite certain their findings will not change the fundamental conclusions of Cap21. Specifically, the mechanics he emphasizes regarding wealth accumulation look to me like they will stand once these errors are fixed. But, pending TP’s response, they may be significant enough to change some of what those of us who read the book but applied less scrutiny to its findings believe to be true about certain countries’ wealth trends.
Which leads me to some closing thoughts about data work in empirical economics.
In reading a work like Cap21, I did what most people do. One, you make sure the trends comport with your understanding of other sources and developments, like Wolff’s wealth work, which I know well, and the fact that years of growing income concentration and shifting factor shares contribute to growing wealth inequality. Two, you try to gauge the scholarship of the author, which you correlate with getting the data right. Piketty is a well-established economics scholar, so let’s see his response.
But really, especially if you’re in the business of researching and explaining these trends, there should always be a lot more digging than that before folks like me accept a lot of new evidence in this work. I generally trust what the scientists tell me regarding their work on stuff like the Higgs boson. I’m less certain about what economists say about their data work.
Even if I’m right that, when all’s said and done, Cap21 will stand just about as strong as it did before, this episode, combined with other stuff going on in my work, has triggered these reactions:
–Generally speaking, I like simple, understandable data of the type produced by government agencies with large statistical staffs and many layers of scrutiny, like the BLS, BEA, SSA, Statistics Canada, and so on. I’m not saying that the unemployment rate, for example, or the hourly wage, median earnings—all BLS statistics I know well—are always exact. By the definition of a “statistic,” they’re not.
But I know how they’re derived, in no small part because the agencies print reams of explanations. Also, while they too undergo adjustments (weighting, seasonal adjustment, birth/death estimates for firms when counting jobs), I know how those work and can handily gauge their impacts. Even here, though, the deeper you dig, the more you run into uncertainty-increasing imputations (e.g., the BLS local labor market variables are largely model-based, as opposed to survey-based like the national statistics).
–Though I use such data a lot, I’m increasingly suspicious of analyses that layer a bunch of different sources onto a base data set and then crunch it back down into small bits. For example, CBO and many researchers add lots of other sources to income to get “comprehensive incomes.” They adjust incomes for different family sizes, and they value publicly provided health care, typically at market values, and add it onto microdata-level incomes. Some researchers even try to impute unrealized wealth gains, like home or stock appreciation.
Don’t get me wrong. As Gary Burtless convincingly stresses, a lot of that spending makes the families who receive it better off (though even here, it’s easy to get it wrong: assigning inflated US health care spending at market values to families is a clear over-valuation of its worth to them). But the simulated income distributions that result from these add-ons can end up yielding questionable results once you start breaking things down by quintiles and narrower percentiles. That’s “questionable,” not wrong. My point is that the more you’re simulating and adding on, the more you may be drifting from the truth.
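To make those add-ons concrete, here’s a minimal sketch with invented numbers, using a square-root equivalence scale (one common choice; I’m not asserting it’s the one any particular study uses): valuing in-kind benefits at market prices and adjusting for family size can reorder households relative to their raw cash incomes, which is the kind of sensitivity these simulated distributions introduce.

```python
import math

# Hypothetical micro-records: cash income, household size, and the
# market value of publicly provided health coverage (invented numbers).
households = [
    {"cash_income": 60_000, "size": 1, "health_benefit": 0},
    {"cash_income": 70_000, "size": 4, "health_benefit": 20_000},
]

def comprehensive_income(hh):
    """Add in-kind benefits at market value, then adjust for household
    size with a square-root equivalence scale."""
    total = hh["cash_income"] + hh["health_benefit"]
    return total / math.sqrt(hh["size"])

adjusted = [comprehensive_income(hh) for hh in households]
# By raw cash income the second household is richer; after the add-on
# and the equivalence adjustment, the first household ranks higher.
```

Each step is defensible on its own, but stack several of them and the quintile you land in can depend as much on the analyst’s choices as on the underlying survey responses.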
I don’t know what the right answer is…I’m working on it. On the one hand, I’m increasingly drawn to Matt Yglesias’s point that “empirical evidence is overrated.” On the other, I’m much more convinced by data like the first chart here, which uses administrative IRS data, right off W2 filings, to generate what I consider a highly reliable series on real annual earnings (worth noting that these data are a core component of TP’s US income data). And you’ll note both the inequality pattern and the earnings stagnation for the bottom 90% of earners. These data and the story they tell strike me as about as close to the truth as you’re going to get in this work.