Evidence: Is It Really Overrated?

July 4th, 2014 at 11:44 am

A few weeks ago, during the evidentiary dustup between Piketty and the FT, I quasi-favorably quoted a Matt Yglesias line re empirical evidence being overrated.  A number of readers were understandably unhappy with that assertion, arguing that they come here to OTE for fact-based analysis based on empirical evidence (with, admittedly, a fair bit of heated, if not overheated, commentary).  If facts all of a sudden don’t matter anymore, why not just call it a day and join the Tea Party?

So let me add a bit more nuance.  The statement is about the quality and durability of evidence, which is not only varied, but, at least in the economic policy world, increasingly problematic.  A number of developments have significantly lowered the signal-to-noise ratio.

I’d divide the evidence problem into two separable categories.  First, statistical issues about what’s “true” and what’s not, and second, ideological ways in which the noise factor is amplified at the expense of the signal.  It’s this latter bit that’s arguably gotten worse.

“Variable coefficients”: Sounds wonky, but all’s I’m saying is that relationships change over time.  As economies and societies evolve, as globalization increases, as technologies change, as cultural norms ebb and flow, we shouldn’t expect the relationships between inflation and unemployment, minimum wages and jobs, growth and inequality, education and pay, or pretty much anything else to stay the same.  It’s rare for the “sign to flip,” meaning I expect education and pay to remain positively correlated and inflation and unemployment to remain negatively correlated.  But magnitudes change a lot.  No elasticity is etched in stone!  So be skeptical when someone tells you “this leads to that and therefore this is a good or bad idea!”

That’s “skeptical,” not cynical.  Judicious use of data by unbiased practitioners can help inform our understanding of the relationship between this and that, at this juncture in time.
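
To make the “variable coefficients” point concrete, here is a minimal sketch in Python.  Everything in it is an assumption for illustration: the data are simulated, the two “true” slopes (2.0 and 0.5) are made up, and the helper names `ols_slope` and `simulate` are hypothetical.  It shows only that a single pooled estimate can hide a relationship whose magnitude changed between periods while its sign stayed the same.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least squares slope of y on x (with an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def simulate(true_slope, n=200):
    """Simulated (made-up) data: y = true_slope * x + noise."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [true_slope * x + random.gauss(0, 1) for x in xs]
    return xs, ys


random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical "early" and "late" periods with different true magnitudes.
early_x, early_y = simulate(2.0)
late_x, late_y = simulate(0.5)

early = ols_slope(early_x, early_y)
late = ols_slope(late_x, late_y)
pooled = ols_slope(early_x + late_x, early_y + late_y)

print(f"early-period slope: {early:.2f}")
print(f"late-period slope:  {late:.2f}")
print(f"pooled slope:       {pooled:.2f}")
```

Under these assumptions, the pooled slope lands between the two period slopes, so it describes neither period well.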

Complexity: As I stressed in the first link above, it is hard to know what to make of empirical evidence when it is based on simulated data, meaning data that the analyst has changed in ways that she believes are necessary (remember, we’re not talking ideological thumb on the scale yet).  In my world, this happens a lot with income and wealth data (it was also the basis of the Piketty/FT argument).

Such data are often incomplete in ways that matter.  They might, for example, leave out the value of government-provided benefits.  So analysts are clearly justified in tacking on such things as the value of food stamps or medical benefits.  But how to do so can be deceptively tricky and there’s often no obvious right answer.

Health care benefits in the US provide the best example.  If you just append the market value of coverage to the income of say, Medicaid recipients, as CBO does in some widely used tabulations, you’re inflating the incomes of the poor in ways that reflect not their buying power, but the widely documented excesses in the US health care system, including inflated salaries and over-priced medicines.

One study that employed this and other even more dubious methods ended up publishing results showing that between 1989 and 2007, when every other study showed income inequality growing, the income of the poorest 20% rose the most in real terms while that of the top 5% fell the most (see figure here).

With such analysis proliferating, you’ve really got to know what you’re looking at.  I’m also increasingly drawn towards simple data that didn’t involve a lot of researcher manipulation, like these annual wage trends, for example.

Noisy data: I wrote about this yesterday re first quarter GDP (a huge negative outlier), but the fact that statistical evidence always has uncertainty bands around it gets quickly lost in the debate.  I try to remind everyone, most notably myself, that the payroll jobs number upon which we all obsess every month has a 90% confidence interval of 90,000 jobs.  Imagine how differently yesterday’s big jobs report (a 288K pop on payrolls) would have gone if the result was 200K, but in fact, that result is within the confidence interval (the CI means there’s a 90% chance that the true change in payrolls was 90K below or above the point estimate).

Again, not a fundamental critique of empirics at all.  Just a call to know and respect their limits.
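
The interval arithmetic is simple enough to sketch.  The +/- 90,000 half-width and the 288K and 200K figures come from the payroll discussion above; the function names are hypothetical, and this is an illustration of the arithmetic, not of how BLS computes its intervals.

```python
def confidence_interval(point_estimate, half_width=90_000):
    """Return (low, high) bounds of an interval centered on the estimate.

    The default half-width is the +/- 90,000 figure quoted for the
    monthly payroll change; treat it as an input, not a constant of nature.
    """
    return point_estimate - half_width, point_estimate + half_width


def within_interval(value, point_estimate, half_width=90_000):
    """True if `value` falls inside the interval around the estimate."""
    low, high = confidence_interval(point_estimate, half_width)
    return low <= value <= high


low, high = confidence_interval(288_000)
print(low, high)                           # 198000 378000
print(within_interval(200_000, 288_000))   # True
```

So a reported gain of 288K and a hypothetical gain of 200K are not distinguishable at this level of confidence, even though the headlines would read very differently.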

OK, that’s the benign part.  Here’s the other part.

More so than in the past, we live in a world where money doesn’t just buy power and political access—that’s always been the case.  It also buys the answers it wants, from faux-climate-scientists and “think tanks” that can generate whatever result you need.  Recall, for example, the Heritage Foundation’s initial projections, later revised (or deep-sixed) as I recall, that “dynamic effects” of one of Paul Ryan’s budgets would drive the unemployment rate down to 2%.

This problem seriously boosts the noise in the already hard-to-decipher signal-to-noise ratio.

You could just say, “well, I’m not going to believe evidence that comes from people or institutions with an ideological bias.”  But that would be wrong.  Paul Krugman is a “liberal” but in countless empirical columns and blogs, his analysis has been careful and accurate.  The Center on Budget, EPI, CEPR too.  Me too.  That’s not to say they or I don’t make mistakes, of course.  It’s that we’re almost always careful to know the data pretty well and not go beyond it.  (I know I should balance the above with conservative examples—I think Chuck Lane at the Post often uses evidence effectively, also Jimmy P and others (Hassett, Strain) at AEI; also Doug Holtz-Eakin on a good day).

So what is a thinking person, who’s not a statistician, to do?  Perhaps the answer comes from movie criticism.  Before I had kids I used to really enjoy the movies (now I click on Netflix, immediately fall asleep, hopefully waking up in time to catch my laptop before it crashes to the floor).  Figuring out which films to go see was vastly aided by knowing which critics to trust.

Same with evidence.  I hesitate to start naming names or I’ll leave out some valued assets, but I already named some wonky types above.  I’d add VOX, Upshot, Wonkbook, all of which I’d argue exist in no small part for this very reason—as go-to places for quality evidence (now that I’m writing for PostEverything, I’d shamelessly add them too).  Dean Baker’s Beat the Press is must reading in this space.

In sum, let me rephrase the original point: careful evidence based on transparent data with respect for confidence intervals and variable coefficients is great.  A lot of the rest is overrated.


24 comments in reply to "Evidence: Is It Really Overrated?"

  1. Paul says:

    Nice timing! As you mention Upshot, this article “When Beliefs and Facts Collide” http://www.nytimes.com/2014/07/06/upshot/when-beliefs-and-facts-collide.html?rref=upshot has an interesting point of view that adds to your point.

    Thanks for your interesting and well written blog!

  2. Robert Buttons says:

    When I took biostatistics we got an interesting assignment: Pick a journal article and critique it. As a naive young student I initially thought this would be difficult: “These are famous researchers publishing in major peer-reviewed journals, how could the studies be faulty?” It didn’t take long for me to figure out how wrong I was. When I started studying economics (18 months ago) I realized that economics students face a MUCH, MUCH easier task than my own: A random cloud of data points with a line through it becomes a “trend”. Creating fantastical measurements like “socioeconomic indexes” because using raw income data isn’t statistically significant. Using arbitrary baselines.

    I would get a creepy feeling when I found a drug study whose results hinged more on the parameters chosen than the clinical efficacy, but it seems that situation is de rigueur for econ.

  3. Robert Buttons says:

    A critique of Krugman:

    Drawing a trendline from a single point:

    Truncating an unemployment chart at 1924, because extending it beyond would disprove his thesis:

    • Jonathan says:

      I clicked through to see your examples. Your assertion that Krugman was misusing data is incorrect. The claim that he drew a trend through a single data point shows that you don’t bother to read the text (or possibly that you can’t comprehend it).

  4. Greg Byshenk says:

    I realize that I may be the one mistaken here, but it seems to me that the statement about confidence intervals is misleading.

    A “90% confidence interval of 90,000 jobs” means that we have 90% confidence that the actual number is within 90K of the stated number, right? Which in turn means that it is misleading (at least) to say “the CI means there’s a 90% chance that the true change in payrolls was 90K below or above the point estimate”. Given that the point is precisely -at- the limit of the confidence interval, it would be much closer to correct to say that “the CI means there’s a _10%_ chance that the true change in payrolls was 90K below or above the point estimate”.

    (In fact, neither is completely correct, as the CI doesn’t actually say -anything- about specific numbers within the CI. But, if the statement of the CI is correct, then there is only a 10% chance that the stated number is off by 90,001, which means that the chance of 90,000 is as close to 10% as makes no difference.)

    • Jared Bernstein says:

      Here’s how BLS describes the 90% CI (my bold): “For example, the confidence interval for the monthly change in total nonfarm employment from the establishment survey is on the order of plus or minus 90,000. Suppose the estimate of nonfarm employment increases by 50,000 from one month to the next. The 90-percent confidence interval on the monthly change would range from -40,000 to +140,000 (50,000 +/- 90,000). These figures do not mean that the sample results are off by these magnitudes, but rather that there is about a 90-percent chance that the true over-the-month change lies within this interval.”

      • Greg Byshenk says:

        That’s exactly my point: the chance is 90% that it is somewhere within the interval. That is not at all the same thing as a 90% chance that it is at one endpoint of the interval.

        If I select a random number between 1 and 10, then there is a 90% chance that it is between 1 and 9. There is -not- a 90% chance that it is 9.

  5. Robert Salzberg says:

    Robert Redford said that when he started acting, he believed that if the truth could be revealed to a wide audience, that would create critical mass for positive political change. Redford said he soon realized how naive he was.

    Former Representative Barney Frank said that he learned in Congress that truth was a necessary but not sufficient reason to be included in any argument or negotiation.

    The deeper problem is that Democrats in general and progressives in particular rely too heavily on facts and too little on emotion while Republicans in general and Tea Party Conservatives in particular rely too heavily on emotion and too little on facts.

    The harsh truth is that truth is an insufficient tool when it comes to policy making.

    The trick is making good policy exciting and emotionally charged enough to be implemented.

    • Robert Buttons says:

      That’s just not true. Head Start’s existence is based solely on emotions (“but it’s for the children”); the data show Head Start is not helpful.

      Another example: Leftist Harvard Crimson writer seeks to censor science in the name of social justice.


      Multiple surveys have shown that Libertarians have the highest IQs (myself being the glaring exception)

      • Jonathan says:

        Would you please cite a source for this? My understanding of the evidence is that while Head Start does not raise IQ it has important beneficial effects, and that earlier intervention in addition to Head Start would be possibly even more effective.

        • Robert Buttons says:


          Don’t fall for the trap: Spending more on secondary school didn’t work; OBVIOUSLY we need to spend more on primary school. Primary school upgrades didn’t work; we OBVIOUSLY need to spend more on pre-school. Pre-school didn’t work; we OBVIOUSLY need earlier intervention. Actually, I agree with that last point. Since IQ is mostly genetic (in the range of 75%), we need to start interventions 21 months before the child’s first birthday.

  6. Smith says:

    The argument over evidence is crucial. But there are ways to address the difficulty in ascertaining the truth which are being ignored.
    1) Reading multiple sources
    2) Seeing primary data
    3) Understanding statistics, controlled experiments vs. observational studies, correlation vs. causation, types of errors, confirmation bias
    4) Being familiar with a field of study (e.g. economics)
    5) Being exposed to opposing views
    6) Questioning assumptions and rhetorical devices
    7) Examining underlying causes

    Everyone knows economic recovery sputters along slowly and inequality grows. Democrats do not propose anything to address the situation because the top 10% who run the country and control the Democratic agenda are doing fine. The prime political concern of Democrats is to hold on to marginally progressive Democratic Senate seats http://www.latimes.com/nation/politics/politicsnow/la-pn-senate-midterm-races-to-watch-20140703-story.html
    Why are Democrats struggling? The underlying problem is that the economy isn’t that good. The Republicans have succeeded in thwarting a robust recovery for political gain. The Democrats acquiesced for financial gain, because they benefit from lower taxes, campaign contributions, and incumbency (it appears only 4 of 53 Senators face a difficult reelection in 2014)

    Economists who know better ask us not to trust our future to the childless.
    “(Keynes himself was childless.)”
    “I am a parent of three, and as far as I know, Paul does not have any children.”

    Meanwhile 80% of the Bush tax cuts remain.
    Even Krugman sweeps aside the gift. How so?
    The crucial passage comes in framing and making an assumption.
    “Obama wasn’t going to let all the Bush tax cuts go away in any case; only the high-end cuts were on the table.”
    I’m guessing he means the tax cuts on those making between $250,000 and $450,000 or .1% of GDP. Thus left undocumented is the enduring legacy of Bush II to permanently reward mostly the upper 20% and 10% of households.

    This could backfire if the revenue loss is made up by taking from the 1%. Not happening. Meanwhile…

    “That is a dividing point,” says Mark Rank, a professor at Washington University in St. Louis. People in the top 20 percent of income — roughly $100,000 in 2013 — have taken nearly all the economic gains of the past 40 years.

    • Robert Buttons says:

      Either what you said is true or the truth is that stimulus and QE just aren’t effective. How would you design an experiment to decide between the two choices?

      • Smith says:

        Krugman says we got the experiment:
        “Even more important, I’d argue, is the huge natural experiment Europe has provided on the effects of sharp changes in government spending.”

        • Robert Buttons says:

          For once, I agree with Krugman: the sharp upswing in public debt in Greece, facilitated by easy credit from EU connections, shows you are worse off when the bill comes due.

          Japan has oodles and oodles of public spending—they aren’t doing so well.

          • Smith says:

            Is Japan really spending oodles on stimulus to revive the economy? Or has the national pension system increased government expenditures without promoting economic expansion? Do you think newly retired 65-year-old Japanese run out to buy new cars knowing their life expectancy is another 20 years?
            “Containing social security spending is a key fiscal policy challenge in Japan. Social security spending (mostly pension, medical, and old-aged care spending) has been rising steadily and now takes up nearly 55 percent of the total non-interest spending by the general government, reflecting the rapid population aging (Figure 2)”

            There is no upswing in public spending in post recession Greece. The U.S. sharply increased public debt after the recession hit. Which country is doing better?
            “The Greek economy has been kept afloat for the past three years by rescue loans from Europe and the IMF in exchange for harsh austerity measures that have worsened the recession, currently in its sixth year.”
            “The central bank said this month that the country’s economy is likely to contract by a further 4.6 percent in 2013, with unemployment set to reach 28 percent.”

            If you think deficit spending in a thriving economy is the same as deficit spending for stimulus in a recession, you’d best look at the record.

            Again, U.S. has less austerity than Europe and look who is doing better.

          • Robert buttons says:

            It really sounds like your thesis is: “Greece would be doing just fine if they didn’t run out of someone else’s money”

          • Smith says:

            My theory, which is common sense, Keynesian, and Krugmanian, is that deficit spending in a full employment economy can cause problems, but during a recession, is often necessary for a successful recovery.

            Greece had plenty of other people’s money during the boom, which led to the recession. Greece has a lot of other people’s money now, but they’re faring poorly.

          • Robert Buttons says:

            Greece is an example of what happens when you produce too little and consume too much.

            Your theory is quite elegant and does well in a textbook. I ask you:
            1. The recession ended in 2009, why is deficit spending still necessary?
            1a. More generally, what specific levels of economic metrics tell you when deficit spending goes from being helpful to “not so much”?
            2. To spend we either need to tax or borrow or print money. Both taxation and (domestic) borrowing remove money from the general economy, which is contrary to what you are trying to accomplish. BTW, multipliers >1.0 are not real (Cogan, Cwik, et al 2009)
            3. How do you deploy the money after you remove it from the economy? Digging holes and filling them up is not helpful.

  7. WRJ says:

    Please judge the validity of my statistics. I took all the years from 1950 to 2007 and grouped the years with similar top income tax rates. I picked those years to avoid disruptions from WWII and the current Great Recession. All the years in a group are contiguous. 1987 was an outlier and was calculated by itself. I did not even look at which party held the Presidency, the House, or the Senate in these time periods. The result shows that GDP growth slowed down every time we lowered the top tax rate. Here is the table listed from highest to lowest top tax rate.

    Period          Average Top Tax Rate    Real GDP Growth Rate
    1950 to 1963    91.1%                   4.03%
    1964 to 1981    71.1%                   3.53%
    1982 to 1986    50.0%                   3.43%
    1993 to 2002    39.5%                   3.38%
    1987            38.5%                   3.20%
    2003 to 2007    35.0%                   2.79%
    1988 to 1992    29.2%                   2.53%

    I think this table proves the fallacy of Republican Economic ideas.


  8. Erick Cobb says:

    > So what is a thinking person, who’s not a statistician, to do? Perhaps the answer comes from movie criticism.
    So that means we need a Meta-Wonk rating system the same way we have Meta-critic rating systems (where several critics/the public get compiled together).