The other day I posted a gripe about a plot in a recent Nature article. I argued that it was misleading because its vertical range amplified the size of the effect, and I went so far as to say that Tufte would not approve. This morning, I found the following reply from Dan Eisenberg, who is now an Assistant Professor in the Department of Anthropology at the University of Washington, where I am pursuing my PhD:

http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=00003q

(It's a link to Edward Tufte saying the exact opposite of what I was saying. That is, you should plot your data, not zero.)

So, I was wrong that Tufte would disapprove. And, thinking about it, I was wrong that it is always bad to leave zero out of a plot. In fact, I technically went against my own advice in constructing the plots for my new website, Malark-O-Meter. There, I do include zero on the y-axis, but it's the x-axis that shows the size of the effect!

Now, the question at hand: am I a blowhard? Nah! I think there's still something to what I was saying. The plot does inflate the size of the effect, which looks about 133% the size that it really is. So while the plot's purpose is to show the direction of the effect, its glossy, brightly colored depiction leaves people with a 133% inflated impression of the effect. And Tufte does agree that we want to try not to do that.

You might ask, "Is 133% effect size inflation that big of a deal?" I guess it ultimately depends on the research question. But I also think that scientists should by default pay as much attention to effect size as direction when reporting their results.
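To make the arithmetic concrete, here is a minimal Python sketch of how raising the bottom of an axis inflates the apparent effect in a bar plot. The numbers are made up for illustration and are not taken from the Nature plot, and the function names are my own. It uses Tufte's "lie factor", which is the size of the effect shown in the graphic divided by the size of the effect in the data, with effect size measured as relative change.

```python
def relative_change(low, high):
    """Effect size as a fraction of the lower value."""
    return (high - low) / low


def lie_factor(low, high, axis_min):
    """Tufte's lie factor for two bars drawn on an axis starting at axis_min.

    Bars are drawn from axis_min upward, so their visible heights are
    (value - axis_min). A lie factor of 1.0 means no distortion.
    """
    if axis_min >= low:
        raise ValueError("axis_min must be below the smaller value")
    shown = relative_change(low - axis_min, high - axis_min)
    actual = relative_change(low, high)
    return shown / actual


if __name__ == "__main__":
    low, high = 50.0, 60.0  # hypothetical group values
    for axis_min in (0.0, 40.0):
        factor = lie_factor(low, high, axis_min)
        print(f"axis starts at {axis_min:5.1f}: lie factor = {factor:.2f}")
```

With these hypothetical values, an axis that starts at zero gives a lie factor of 1, while starting the axis at 40 makes a 20% difference look like a 100% difference.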

How can we guard against leaving an inflated impression of effect size while still showing the direction of the effect?

Maybe it would help to report the effect size itself as prominently as the effect direction, or more so. One way to do this is to plot a histogram of the distribution of the ratio of the two values, which is what I do at Malark-O-Meter to reality check the inflated effect sizes in my own plots (which have a range from 0 to 100, but are often truncated to just above and below the highest and lowest values of the distributions). This histogram could have been inset in the plot showing the direction of the effect, and the mean and confidence bounds of the ratio could have been labeled. Another way to do it would be to somehow label the effect size within the bar plot itself.
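As a rough illustration of the ratio idea, here is a sketch in plain Python. It is not the Malark-O-Meter code: the bootstrap resampling, the function names, and the sample data are all assumptions chosen for the example, and it assumes both groups have positive means. It builds a distribution of the ratio of two group means, reports the mean and a percentile interval, and prints a crude text histogram that could stand in for the inset.

```python
import random
from statistics import fmean


def bootstrap_ratios(a, b, n_boot=2000, seed=0):
    """Resample each group with replacement and return the ratio of
    resampled means, mean(b) / mean(a), for each replicate."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        mean_a = fmean(rng.choices(a, k=len(a)))
        mean_b = fmean(rng.choices(b, k=len(b)))
        ratios.append(mean_b / mean_a)
    return ratios


def summarize(ratios, level=0.95):
    """Return (mean, lower, upper) using a simple percentile interval."""
    ordered = sorted(ratios)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[min(int(tail * n), n - 1)]
    upper = ordered[max(int((1.0 - tail) * n) - 1, 0)]
    return fmean(ordered), lower, upper


def text_histogram(values, bins=10, width=40):
    """Return one line per bin: the bin's left edge and a bar of '#'."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [f"{lo:8.3f} | " + "#" * width]
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / step), bins - 1)] += 1
    peak = max(counts)
    return [
        f"{lo + i * step:8.3f} | " + "#" * round(width * c / peak)
        for i, c in enumerate(counts)
    ]


if __name__ == "__main__":
    # Hypothetical measurements for two groups (not real data).
    group_a = [48.0, 52.0, 50.0, 47.0, 53.0, 49.0, 51.0]
    group_b = [58.0, 63.0, 60.0, 57.0, 62.0, 61.0, 59.0]
    ratios = bootstrap_ratios(group_a, group_b)
    mean, lower, upper = summarize(ratios)
    print(f"ratio b/a: mean {mean:.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
    print("\n".join(text_histogram(ratios)))
```

Labeling the mean ratio and its interval next to the truncated plot lets a reader see the direction and the actual size of the effect at the same time.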

So, in sum, okay, there are good reasons to truncate the range of your plots. But there are also good reasons to compensate for the misleading effects of doing so.

In any case, thank you, Dan Eisenberg, for the link, which got me thinking. I'd totally love it if you responded to this post so that we could have some further discussion. I haven't had the chance to meet you, and yet you teach in the department where I learn!
 


Comments

Dan Eisenberg
10/26/2012 22:38

Hello,

Thanks for getting me thinking as well. I've heard the rule of thumb to always include zero, but never quite understood why some insist on it (I still only partially understand).

It seems we are mostly on the same page here. I often find it is far from obvious what the relevant scale of presentation should be, and I agree with you that scientists should be focusing more on effect sizes and confidence intervals. What worries me more are multiple testing and publication biases, which often make individually calculated confidence intervals bogus to begin with. On this tangent, I've been particularly intrigued by the idea behind the Open Science Framework (http://openscienceframework.org/): that we all start to register our hypotheses more clearly before conducting our research (like clinical trials are supposed to do).

Looking forward to meeting in person before too long.

Dan

Brash Equilibrium
10/27/2012 04:30

Whoa. OSF is pretty awesome. Together with the knowledgeblogs movement, maybe someday the scientific publication process will become faster, leaner, and less prone to false positives and cheating. Thanks for sharing!
