impact — Mola Mola — sheffieldmeme

The editor of a well-respected ecological journal told me recently, “I am… very down on analyses that use citation or bibliographic databases as sources of data; I'm actually quite concerned that the statistical rigor most people learn in the context of analysing biological data is thrown out completely in an attempt to show usage of a particular term has been increasing in the literature!” I think he has a point, and in fact I feel the same about much that I read on bibliometrics more generally: there’s some really insightful, thoughtful and well-reasoned text, but as soon as people attempt to bring some data to the party all usual standards of analytical excellence go out the window. I see absolutely no reason to buck that trend here.

So…

The old chestnut of Journal Impact Factors has been doing the rounds again, thanks mainly to a nice post from Stephen Curry which has elicited a load of responses in the comments and on Twitter. To simplify massively: everyone agrees that IFs are a terrible way to assess individual papers (and by inference, researchers), but there’s less agreement on whether they tell you anything useful when comparing journals within a field. Go read Stephen’s post if you want the full debate.

But what’s sparked my post was a response from Peter Coles (@telescoper), called The Impact X-Factor, which proposed an idea I’d had a while back about judging papers against the IF of the journal in which they’re published. Are your papers holding up or weighing down your favourite journal? Let’s be clear from the outset: I don’t think this tells us anything especially interesting, but that needn’t put us off. So I have bitten the bullet, and present to you here my own personal impact factor. (The fact I come out of it OK in no way influenced my decision to go public.)

The IF of a journal, remember, is simply the mean number of citations to papers published in that journal over a two-year period (various fudgings and complications make it rather more opaque than that, but that’s it in essence). So for each of my papers (fortunately there aren’t too many) I’ve simply obtained (from my Google Scholar page, as it’s more open than ISI) the number of citations they accrued in the two years after publication. I’ve then compared this to the relevant journal IF for that period, or as close as I could get. Here are the results:

OK, bit of explanation. This simply plots the number of citations my papers got in the two years post-publication, against the relevant IF of the journal in which they were published. (The red points are papers published in the last year or so, and I’ve down-weighted IF to take account of this; I’ve excluded a couple of very recently-published papers.) The dashed line is the 1:1 line, so if my papers exactly matched the journal mean they would all fall on this line. Anything above the line is good for me, anything below it bad – the histogram in the bottom right shows the distribution of differences of my papers from this line.
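For anyone wanting to try the same comparison on their own papers, here is a minimal sketch of the "distance from the 1:1 line" calculation. The numbers are made up purely for illustration (they are not my data), and the function name `differences_from_parity` is just a label I've chosen here.

```python
# Hypothetical (citations_in_two_years, journal_impact_factor) pairs.
# These are invented numbers for illustration only, not real data.
papers = [(3, 4.5), (12, 5.0), (7, 3.2), (1, 2.8), (9, 9.0)]


def differences_from_parity(pairs):
    """Return citations minus journal IF for each paper.

    Positive values sit above the 1:1 line (the paper beat the
    journal mean); negative values sit below it.
    """
    return [cites - jif for cites, jif in pairs]


diffs = differences_from_parity(papers)
above = sum(d > 0 for d in diffs)  # papers holding the journal up
below = sum(d < 0 for d in diffs)  # papers weighing the journal down

print("differences:", diffs)
print("above the line:", above, "below the line:", below)
```

A histogram of `diffs` would give something like the inset described above; a paper whose citations exactly equal the journal IF counts as neither above nor below.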

I’ve fitted a simple Poisson model to the points, with and without the outlier in the top right – neither does an especially good job of explaining citations to my work, so we might as well take a mean, giving me my own personal IF of around 6.
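Taking the mean is easy to sketch. The snippet below assumes a plain list of two-year citation counts (again invented numbers, not mine) and computes the mean with and without the largest value, treating that largest value as the outlier. It does not reproduce the Poisson fit against journal IF; it only relies on the fact that, for a Poisson model with no predictors, the maximum-likelihood estimate of the rate is the sample mean.

```python
from statistics import mean

# Hypothetical two-year citation counts, one per paper (made-up numbers).
citations = [3, 12, 7, 1, 9, 40]


def personal_impact_factor(counts, drop_largest=False):
    """Mean citations per paper, optionally ignoring the single largest count.

    Assumption: the "outlier" is simply the paper with the most citations.
    """
    ordered = sorted(counts)
    if drop_largest:
        ordered = ordered[:-1]
    return mean(ordered)


with_outlier = personal_impact_factor(citations)
without_outlier = personal_impact_factor(citations, drop_largest=True)

print("personal IF, all papers:", with_outlier)
print("personal IF, outlier dropped:", without_outlier)
```

How far apart the two values sit gives a rough feel for how much a single well-cited paper drives the mean, which is one of the standard complaints about journal IFs too.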

As my editor friend suggested, there’s a whole lot wrong with this analysis. For instance, I haven’t taken account of year of publication, or any other potential contributing factors (coauthors, publicity, etc. etc.). Another obvious caveat is the lack of papers in journals with IF > 10 (I can assure you that this has not been a deliberate strategy). But back in the peloton of points which represent the ecology journals in which I’ve published most regularly, I’m reasonably confident in stating that citations to my work are unrelated to journal IF. Gratifyingly too, the papers that I rate as my best typically fall above the 1:1 line.

So there we have it. My own personal impact factor.

What does it take to have a real impact on the development of your field? Those charged with assessing UK research have taken the view that a small number of exceptional papers is a better indicator of quality than a mass of ‘lesser’ papers. Now we can quibble (indeed, I have done) about the way that ‘exceptional’ papers are identified (in particular by risk-averse departments and institutions). Furthermore, I’ve argued that setting out with the intention of writing a ‘high impact paper’ is often antithetical to doing good science. However, that’s beside the point. Regardless of the nuts and bolts of measurement, the idea that one should be judged on quality not quantity seems to be reasonably widely accepted. But if we’re taking a retrospective view, is it always the case that you can trace the development of a field back to one or two highly influential papers? I’ve been pondering this since the first meeting last month of the British Ecological Society’s Macroecology Special Interest Group. (Macroecology is ecology at large spatial scales, by the way, and is what I do. There’s a brief but useful Wikipedia page here.)

As is the nature of such inaugural meetings, first our committee chair Nick Isaac, then our opening keynote speaker, Ian Owens from the Natural History Museum, provided a potted history of the discipline. In so doing, it is common practice to pick a significant publication and trace its subsequent influence. But in macroecology, that’s tricky...

OK, you could pick the 1989 Science paper by James Brown and Brian Maurer which originally coined the term ‘macroecology’. This paper has accrued a satisfying-but-not-stellar 329 cites, but my suspicion is that it’s not actually been that widely read, and that many of the citations run something like “the term ‘macroecology’ was first coined by Brown & Maurer (1989)...”

Or we could focus on the pioneers of UK macroecology, Kevin Gaston and Tim Blackburn. They have forged a formidable partnership, coauthoring 87 papers over a 20-year period, with a phenomenal burst of productivity in the mid to late 1990s which saw them publish as coauthors around 10 papers a year, papers which provided the foundation for much subsequent macroecological work. (Both have been prolific independently of each other too. Sickening, isn’t it?) The point is, though, that it is difficult to pick a single one of these 50 or so papers as being suitable for the ‘what happened since the publication of x’ rhetorical device. Although their work in aggregate has been well cited – 8 papers from that period 1993-2000 have picked up >100 cites – the maximum number of cites for a single work is <300. Which is good, no doubt, but not spectacular.

So what Nick and Ian both did was to pick as their milestones three books, two by Kevin and Tim (Pattern & Process in Macroecology from 2000, which summarised much of their previous five years of work, and the edited volume Macroecology: Concepts & Consequences) and one by Brown (Macroecology, a single-word title which has amusingly been cited in 20 different ways according to ISI WoK, including the antonymous Microecology!). Rich Grenyer Tweeted at the time that this had interesting implications for the high-impact-paper-obsessed REF, but I think it also tells us something about the development of scientific fields more generally.

Of course there are occasions when one or two landmark publications define decades of subsequent research (think Einstein, or Crick & Watson, even the occasional book like Hubbell’s Unified Neutral Theory of Biodiversity). But often it is steady accumulation, the gradual assembly of a body of work which counts. This recognises that it is not always possible – or if possible, not desirable – to force everything of value that you have to say into the strict limits of some of the higher profile journals (and you may not wish to see what you consider to be important analyses buried, probably unread, in supplementary material). This view of science essentially treats the literature as a kind of open notebook – a record of thought processes and incremental progress, rather than a single statement of ultimate truth. And in the case of macroecology, this broad foundation has served us very well.

Perhaps I can make a (non-Olympic) sporting analogy. Cricket exists in several formats, with the extremes being the smash-bang-wallop of Twenty20 (matches last about 3 hours) and the rather more sedate five-day Test matches. A Test match batsman will steadily accumulate, and won’t try to hit every ball out of the ground – although if a ball is tempting enough, of course he won’t turn down the opportunity for the big hit. This mix of accumulation and opportunism seems to me to be a much better strategy for ensuring that a field is built on solid foundations than the headline-grabbing, REF-driven, try-to-hit-everything-straight-into-Nature T20 style.

As any cricket fan will tell you, test matches are more substantial and ultimately far more satisfying than any limited overs jamboree. And the occasional 6 is made all the sweeter for its scarcity.

My own personal Impact Factor

Defining a Field