Monday, 13 April 2009

Peering in from the outside

So here's a thing: It seems an awful lot of people don't get what peer review is for, and how it works. Keeping the standard-issue "I may not know what I'm talking about" disclaimer in mind, I thought I'd take a look at it since I've been both the reviewer and the reviewee in the past. And the perception that some people have seems to bear little relation to my experience of it. So, let's take a look at the strange idea of peer review that people have.

As far as I can tell, this is how the people that've bothered me with this think the process works:
1) A scientist writes a paper to prove their theory is the new theory of awesome
2) Some other scientists read said paper and then vote on whether or not it's the new theory of awesome
3) Depending upon what these other scientists vote, it's either an official New Theory of Awesome (and hence Is Science), or it's not (and hence is either junk or One Of The Great Ideas The Establishment Have Turned Their Back On).

There's a weird misunderstanding of the process of scientific endeavour in there, and it's one that is most likely linked to the way the media cover scientific research (with all the talk of "breakthroughs" and scientists "challenging" each other when they disagree). I think it's fundamentally leaning towards seeing science as received wisdom, where Experts who Know snipe at each other until the ones who win the most points get to have their idea exalted as Science and the losers are... I dunno exactly. I suspect that people think this is where the nutjob theorists who claim to have been rejected by The Establishment come from.

The funny thing is that this is almost exactly not what happens. Although the majority of scientists within a field will tend to share a consensus view of many aspects of that field, they didn't vote for it. There's not an organisation that neatly divides theories and empirical data into "science" and "not science" based on the votes of scientists. So what happens to form this consensus? Well, in my experience, something like this:

A researcher, working in some field or other, will, to begin with, spend many, many hours trying to find every paper which might possibly have something to do with what they're working on. Why? Well, for a start, how do you think researchers decide what to research? And how do you think they decide how to go about it? And how do they figure out what might already have been done many times before, so they can choose whether or not to do it again (this bit's actually quite important)? By trawling through huge numbers of papers and reading bits of them.

Hold on a moment.

Bits of them?

Yes. Bits of them. Right now, they've got an epic stack of research papers on their desk/hard drive. They've pulled everything they can find which might be relevant. Most of it won't be. So now, they read through the abstracts and maybe some other bits of the papers, and throw out everything that was likely looking, but turned out to be irrelevant. Note irrelevant. Not "inconsistent with what they want to read". I mean stuff like, you were looking for papers on face recognition because you're doing a study on how people recognise faces. Unless you're planning to also look at how automated face recognition works, any papers you might have picked up which deal purely with automatic face recognition probably aren't relevant to you at all. If you find a paper which inconveniently seems to show exactly the opposite of what you hope to find - well, that's not only relevant, that's gold dust. Put that on the top of the pile. You'll be needing to read that.

Right, we've managed to shrink our pile of papers to something manageable. Now we're going to go through it all in detail and figure out what each paper says, what the researchers did, how they did it, and if they did it sensibly. Once we've done that we can decide if we need to replicate any of their results before we carry on.

This is the important bit I mentioned earlier.

Now, if you're going to be directly building on the work other people have done, it's a good idea to (a) make sure their theoretical work holds water (b) make sure their maths isn't squiffy and (c) make sure that you can empirically demonstrate what they said they demonstrated. After all, best will in the world, they could've just got (un)lucky and fluked it. They may have messed up their experimentation. Probably best if you verify that you can pretty much do what they say you can do with their technique, right?

Obviously, you can't always do this. Sometimes you don't even want to. But in these cases, you can really rely on their results only when you have lots of other papers which have been independently produced by other people who've been in your position but actually done the verification work. If fifty research groups have already demonstrated a paper to be reliable, you're not adding much by re-running the experiment. It may still be helpful for you as a learning exercise or a baseline - in which case you'll run it anyway. Sometimes, of course, you simply can't reproduce the experiment. You don't have the resources. So you do the best you can. You check their methodology. You make sure you can reproduce any derivations. You take a look at their data and verify that what they claim it shows is, indeed, what it shows. And you keep in your mind - and likely note in any publication - that you're basing this on work that you couldn't verify and that no one else has. It'll be a worry. And it'll mean your work has less weight to it because it's built on a shakier foundation.

OK, so, you've done all your work and you write up your paper detailing what you did, how and why, what your results were and what you think they mean. And you reference all the papers you found which were relevant to your work. So, anyone really interested in what you did can go back and see the work that your work relies on.

Now, multiply this up by ALL THE RESEARCHERS IN THE WORLD.

What you see is that useful, reliable work is being checked all the time by being used. Scientists aren't voting on what they believe to be right, they're just reporting back on what they tried out, and what seemed to be reliable because they used it themselves, or because they wanted to verify that someone else's work ...well... worked. And what they're doing the whole time is criticising the work that came before - pointing out experimental flaws, errors in reasoning, over-stated conclusions. And that all gets fed back round the system. On top of that, you have to remember that scientists aren't trying to prove themselves right - you take your hypothesis and you try to prove it wrong. So, all those papers are trying to show that their central idea is false, and then reporting back on their failure to do so. (Before anyone points out this isn't always the case - I know, ideal world. But the scientific process is fundamentally based on this type of negative feedback. Even if you're trying to prove something is the case, you do this by trying to demonstrate that it isn't, if you get what I mean.)

OK, so, that's science. And personally, I wouldn't consider any of the above to be "voting". And at no point does anyone decide what is or isn't science, you just have models which have proven to be useful and ones which haven't. Where does peer review come in? And why is it so important?

To answer the second question first: It isn't all that. Peer review is a very, very poor approximation to the above (which happens after publication). Just because something's appeared in a peer reviewed journal doesn't make it right. Independent verification is the thing. So what's the point of peer review?

As far as I can see, it comes down to basic quality control. It's an attempt to weed out the papers that are nonsensical, unreadably badly written, patently fraudulent, misleading, blatantly poorly designed, have obviously fudged their results, etc. It doesn't pick out the absolute best papers, it filters out the garbage that no one would (or could) ever apply the above process to in the first place. And why do we need it? Partly to reduce the expense of publishing, though I have little sympathy for that argument, but also partly to cut down on the volume of crap that researchers would have to wade through to get to the well-written papers about well-constructed experiments that are of any value at all. It's a blunt instrument, and it's far from ideal.

But don't be misled by the fact that something is peer reviewed. It's a start - but it doesn't make it valuable, reliable or right. Read the paper. See if it makes sense. If you have the resources, reproduce the work. If you don't, find other papers that have. Build an evidence base for its reliability. Build an evidence base for its unreliability. Compare the two. That's science.