In order to get an expert look into the controversy over Thomas Piketty’s work and the Financial Times’ exposé of the errors and “constructs,” I asked Ricochet writer and my good friend King Banaian to break it down for us. King also has a Saturday morning show on KYCR in the Twin Cities on economics and business policy, and is a professor of economics at St. Cloud State University. King is also a senior fellow at the Center for the American Experiment.
A small bombshell exploded Friday when the Financial Times published an article by Chris Giles and Ferdinando Giugliano detailing data errors in Prof. Thomas Piketty’s unlikely bestseller Capital in the 21st Century. As noted there, almost every reader of the book, fans and critics alike, had praised the detailed data Prof. Piketty provided. The argument heretofore has been entirely over his theory that, because the return on capital inexorably exceeds the growth rate of GDP, wealth increasingly concentrates in fewer and fewer hands.
Like many people who have bought the bestseller, I haven’t yet been able to wade through all 577 pages. I expect many copies are sitting on coffee tables next to unopened books of art or architecture that help us make statements in our homes. (Mine is on my iPad, like so many others these days.) So I will refrain from discussing the theory and focus only on the issues Giles and Giugliano (hereinafter GG) raise.
GG first show examples of transcription and “fat-finger” errors in the datasets that Piketty put out in an online annex. This kind of error is every scientist’s nightmare. I produce many datasets for local and regional policy makers and scholars, and even on a third or fourth read you sometimes find gremlins in the data. Errors of this nature resemble those raised in the criticism of Ken Rogoff and Carmen Reinhart’s book This Time is Different. In that book, which also draws on centuries of data, Rogoff and Reinhart had at one point improperly filtered their Excel spreadsheet. The result, it turns out, wasn’t greatly changed by the properly filtered data. If this were the total of what GG found, there’d be little to report here, and I’d be inclined to give Piketty the benefit of the doubt: “Happens to the best of us, Thomas.”
(Steve Hayward wonders why Harvard University Press didn’t find these errors. The simple answer, from my own 30 years of publishing experience, is that there is almost no review process in academic book publishing, and not much more in the vaunted “peer-review” process. It’s worth noting that Harvard U.P. doesn’t keep the dataset on its website; Piketty does that himself. For that he should be praised: science is only science when your results are transparent and replicable.)
There is a second similarity between Piketty and Rogoff/Reinhart. GG say that Piketty didn’t weight the observations of Europe properly, saying “when averaging different countries to estimate wealth in Europe, Prof Piketty gives the same weight to Sweden as to France and the UK – even though it only has one-seventh of the population.” Well, that’s something I might change myself, but that doesn’t make what Piketty did an error or dishonest. He just picked a different averaging method than you would have. The critics of Rogoff and Reinhart made the same claim, and found that the averaging method made a difference in the claims those authors make about the impact of government debt on economic growth. It isn’t obvious to me or anyone else that one way is right and the other wrong. They are just different, and it will not surprise you to find people cheering the averaging method that comes up with their preferred result. Fun for econo-bloggers but not many others.
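To see why the choice of averaging method can matter, here is a minimal sketch in Python comparing an equal-weight average with a population-weighted one. The wealth shares are made-up numbers for illustration only, not figures from Piketty or the FT, and the 7:7:1 weights are just one reading of the “one-seventh of the population” remark in the FT quote.

```python
def simple_average(values):
    """Equal-weight average: every country counts the same."""
    return sum(values) / len(values)


def weighted_average(values, weights):
    """Weighted average: each value counts in proportion to its weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical top-10% wealth shares in percent (illustrative, not real data).
countries = ["France", "UK", "Sweden"]
shares = [60.0, 70.0, 50.0]

# Assumed relative populations: Sweden at one-seventh of each larger country.
populations = [7.0, 7.0, 1.0]

equal_weight = simple_average(shares)
population_weight = weighted_average(shares, populations)

print(f"Equal-weight average:        {equal_weight:.1f}%")
print(f"Population-weighted average: {population_weight:.1f}%")
```

With these invented inputs the two methods come out four points apart (60.0% versus 64.0%). Neither number is “wrong”; they answer slightly different questions, which is why the choice should be stated and explained.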
The next criticism, however, is more serious and troubling. Let me quote GG once more:
A second class of problems relates to unexplained alterations of the original source data. Prof Piketty adjusts his own French data on wealth inequality at death to obtain inequality among the living. However, he used a larger adjustment scale for 1910 than for all the other years, without explaining why.
In the UK data, instead of using his source for the wealth of the top 10 per cent population during the 19th century, Prof Piketty inexplicably adds 26 percentage points to the wealth share of the top 1 per cent for 1870 and 28 percentage points for 1810.
It turns out the differences are huge. In a companion blog that details what they found, Giles shows that the figure Piketty cited for the wealth share of the top 10% in the UK differed from the official source: 71% for Piketty versus the official number of 44%. In some places Piketty’s spreadsheets had random digits added that seemed to make the data smoother and fit better than they otherwise would have. Some data appeared to be cherry-picked and other data simply “constructed” (what the non-economist would probably call “made up”). It’s one thing to interpolate or average over two data points when the middle one is not available, but what Piketty has done appears to be more aggressive in constructing data than that.
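For readers unfamiliar with the term, the benign kind of gap-filling I have in mind looks roughly like the sketch below: a straight line drawn between two observed points to estimate a missing one in between. The years and values are hypothetical and are not taken from Piketty’s spreadsheets.

```python
def linear_interpolate(x0, y0, x1, y1, x):
    """Estimate y at x on the straight line through (x0, y0) and (x1, y1)."""
    if x1 == x0:
        raise ValueError("the two known points must have different x values")
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical observations: a wealth share of 60% in 1850 and 70% in 1870,
# with no observation for 1860 (illustrative numbers, not real data).
estimate_1860 = linear_interpolate(1850, 60.0, 1870, 70.0, 1860)

print(f"Interpolated 1860 value: {estimate_1860:.1f}%")
```

The point of the sketch is that an interpolated value is fully determined by the observed points on either side, so anyone with the same sources can reproduce it. Adjustments that cannot be traced back to the source data that way are the ones that need an explanation.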
Piketty responded to the FT with an explanation that is not altogether satisfying:
For the time being, we have to do with what we have, that is, a very diverse and heterogeneous set of data sources on wealth: historical inheritance declarations and estate tax statistics, scarce property and wealth tax data, and household surveys with self-reported data on wealth (with typically a lot of under-reporting at the top). As I make clear in the book, in the on-line appendix, and in the many technical papers I have published on this topic, one needs to make a number of adjustments to the raw data sources so as to make them more homogenous over time and across countries. I have tried in the context of this book to make the most justified choices and arbitrages about data sources and adjustments. I have no doubt that my historical data series can be improved and will be improved in the future (this is why I put everything on line). [Emphasis mine.]
But how do you do that without doing things that could be construed as biased towards your outcome? It is a basis of scientific analysis that we agree on what the data are, but it seems in some cases Piketty is saying “this is my data, there isn’t a perfect dataset out there, if you think you can do better go ahead.” Well, that is hardly a defense of your own data as definitive, Professor.
No single graph or statistical analysis will ever convince someone that a theory he didn’t believe is in fact true. Scientists value results that are robust, meaning the results are the same across different times and places. This last error appears to make Piketty’s results less robust and so less likely to change anyone’s mind about what they already thought of inequality, and more likely to make Capital in the 21st Century an expensive, unread coffee-table book.