My published negative result…
IMAGINE your excitement as a budding young researcher taking on your first piece of research as part of an undergraduate summer studentship; you're working on a gene that makes a type of medically important bacteria resistant to a key group of antibiotics, the tetracyclines. The gene in question is described in a peer-reviewed specialist journal, but no-one is quite sure how the gene works. If we're to understand and address the problem of antibiotic resistance, one of the many things we need to do is understand the mechanisms by which bacteria resist antibiotics. This gene appears very different from any other gene that performs a similar function, so it has been classed into its own 'family' of resistance determinant, which appears in reviews and textbooks. It has also been screened for - and found - when looking under the genetic skirt of a notable 'superbug', VRSA (the vancomycin-resistant big brother of MRSA).
The only problem is, the gene in question is not an antibiotic resistance gene, but you won't know this until you have spent the summer working on it; indeed, it won't be until the project is inherited as a pet project by a postdoc that we'll know this. The fact is, the gene had already been recognised for what it really was over a decade earlier, not that this was ever reported. Ironically, the person who immediately dismissed the gene's published function was in fact the one-time PhD supervisor of the said postdoc, and a world expert on the type of protein the gene really encodes, but I'll come back to that.
My post last week, 'On publishing negative results..', briefly described the issue of positive publication bias in scientific and medical literature, and was a preamble to the story of my own experience publishing negative results. So let me now tell you about how I tried, and eventually managed, to get ostensibly negative results published.
Obviously I was the postdoc I describe above, and the project to study this 'tetracycline resistance gene' burned through the summer of both an undergraduate and a graduate student, and then through another six+ months of my time (on and off). It is not uncommon, as a postdoc, to undertake several smaller projects in parallel to the major project for which the postdoc is employed; sometimes you win, sometimes you lose. Despite having accumulated plenty of 'file drawer' datasets and potential papers over the years, this was one we tried - out of principle - to follow through to publication.
In 1996 a paper from the laboratory of Dr Lolita Daneo-Moore, Temple University PA, reported the isolation of a novel tetracycline resistance gene on a plasmid found within the gut bacterium Enterococcus faecium:
Ridenhour et al. (1996) A novel tetracycline-resistant determinant, tet(U), is encoded on the plasmid pKq10 in Enterococcus faecium. Plasmid 35: 71-80 [DOI]
For information, plasmids are rings of DNA that exist within bacterial cells, but are distinct from the cell's own DNA; they encode a library of weird and wonderful (though sometimes cryptic) utilities that bacteria can inherit or share between themselves - these utilities often include antibiotic- and antiseptic-resistance, amongst others.
Whilst I cannot hope to give a full run down of all the details of both this paper and my own paper, in essence their work represents a classic piece of reductive scientific investigation, trying to find out what it was that made several E. faecium isolates resistant to the antibiotic tetracycline. They homed in on a single small plasmid, 'pKq10', present in all the resistant isolates, and having determined the DNA sequence of the plasmid, they found a candidate gene that they called tet(U). As is the norm, once you have a DNA sequence for something, you can compare either this sequence, or the amino acid sequence that the DNA ultimately translates to, with others in databases such as GenBank - to see if anything like it already exists. Based on this sort of computer-based analysis, the authors suggested that it shared some similarity with other types of tetracycline resistance genes, but was different enough to be assigned its own class, Tet(U).
It was on this basis, many years later, that our lab decided to figure out how the protein produced by tet(U) actually worked. Our lab has had some success picking apart resistance mechanisms in this way, using a biochemical approach to purify the protein away from the cells and then test its activity. That takes a sentence to say, but took several months to do! As tetracycline is an antibiotic that targets the protein-making machinery of bacteria, it is possible to add our purified Tet(U) protein to a small tube that contains an active protein-making soup extracted from bacteria, and see whether its presence protects this protein production from the effects of tetracycline.
It didn't. Nor, in fact, could any kind of resistance be conferred by this gene in any other context in our hands.
Repeating the original experiments
When I inherited the project, the first thing I did was go back to the original paper and scrutinise both the report, and the DNA sequences they'd submitted to the database, in great detail. The first thing that struck me was that the gene in question, tet(U), looked very familiar. I spent all of 5 minutes double-checking the DNA sequence against GenBank and it became clear why - in terms of familiarity tet(U) was as obvious in function as a double-decker bus - only, it wasn't the whole bus, it was just the back half. The fact that the whole 'bus' wasn't obvious was perhaps because somewhere along the line, when the original lab determined the DNA sequence, two DNA mutations (or sequencing errors) crept in that split a single obvious gene into two smaller, more puzzling genes (at least to those less familiar with them).
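For anyone who wants to see how a tiny sequence change can split one gene into two, here is a minimal sketch in Python. To be clear about the assumptions: the sequence is invented for illustration (it is not the real pKQ10 sequence), the `find_orfs` helper is hypothetical and only scans the forward strand, and for simplicity it uses a single base change rather than the two differences seen in the real case.

```python
# Toy illustration only: the sequence below is invented, not the real plasmid.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return sorted (start, end) pairs for ATG...stop open reading frames.

    Only the forward strand is scanned, in all three reading frames.
    'end' is exclusive and includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate gene
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next start codon
    return sorted(orfs)


# One continuous 'gene': ATG ... CAA ... ATG ... TAA
intact = "ATGGCTGCTGCTCAAGCTATGGCTGCTGCTTAA"

# A single C->T change at index 12 turns the CAA codon into a TAA stop codon.
mutated = intact[:12] + "T" + intact[13:]

print(find_orfs(intact))   # one long reading frame
print(find_orfs(mutated))  # two shorter reading frames
```

With the intact sequence the scan reports a single reading frame spanning the whole thing; after the one-letter change it reports two short ones, the second of which starts at an internal ATG and looks, to an unsuspecting annotator, like a small gene in its own right.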
We have to remember that in 1996 sequencing was rather more error prone than it is today. Also, for the sake of ease (and I can't blame them), the way the authors had chosen to make enough DNA for the sequencing was also error prone. It's also true that the GenBank database has had a hell of a lot more submissions in the past 15 years, thus making the identification of 'tet(U)' far more obvious now than back then. It also didn't help that their bioinformatic analysis stretched the imagination into the patently bizarre; the similarities they reported between tet(U) and other tetracycline resistance genes seemed only possible with wanton desire, rather than empiricism.
So what is tet(U)? It's actually the back end of a replication gene that codes for a protein that the plasmid produces so that the plasmid can make more of itself. It's a pretty important protein, and not one that - in this family of proteins - you would find in two pieces and still be functional. It's also from the same family of replication proteins with which my former PhD supervisor has a world class expertise.
Given that other researchers were continuing to screen for the presence of tet(U), we felt it prudent to inform the community that it should not be considered a tetracycline resistance determinant. However, with myself and the lab head being self-doubting and meticulous chaps, we thought we'd run through their original experiments, just in case there was something obvious we were missing.
The start of the real problems
The real problem with repeating this work was the fact that Dr Daneo-Moore - of the original paper - is sadly deceased, and the lab and materials have long since dispersed. No-one I contacted had any of the original bacterial isolates, or indeed the plasmid! All we had was the DNA sequence of their plasmid deposited in the database, and the knowledge that tet(U) has cropped up time and again where people have screened for it, albeit without looking to see whether it was a distinct gene or in fact part of a (much larger) replication gene. In fact, the saving grace was recent mass DNA sequencing projects that inadvertently delivered up the sequences I'd need, but I'll come back to that.
For the time being, I did what I could; I had the whole plasmid synthesised according to their sequence, and then dutifully performed all the cuttings and splicings as originally described, yet saw none of the tetracycline resistance that they did. So this was the point at which we decided we'd write a short article to update the literature, and have done with it.
The peer review hurdle
I have been a reviewer for six scientific publications for several years, and have performed my duties in as rigorously scientific a manner as possible, looking for sound logic, methodology and substantiated interpretation. The peer review of our paper was, I would admit, a little less than this; one reviewer saw what we were saying, and barring a few stylistic points, was happy with it. The other reviewer, well, cost me another five months' work.
You see, the thing is, despite pointing at the gene in question and shouting very loudly, "LOOK, IT'S THE BACK END OF A BUS ☞" (i.e. it's a replication protein, it's identical to other replication proteins, this is just a small plasmid with one gene, and that gene is a replication protein, end of...), the burden of proof was put upon us to disprove that tet(U) was a tetracycline resistance gene. This is of course the essence of Popperism, and we felt we had essentially done this, but it'd been kind of hard to draw a line and say, 'Does it not confer tetracycline resistance because it is not a tetracycline resistance gene; or does it not confer tetracycline resistance because I am crap and can't make it do what it's supposed to do'.
This, in essence, is the big reviewer 'fob off' - who has time to work through so many petty experiments to try to show that the gene isn't the one thing, whilst ignoring the fact that there's fairly strong evidence that it's actually another?
Sadly, one of the reviewers wasn't a biochemist, so proceeded to 'inform' me (a biochemist) that I couldn't use the sorts of techniques I'd been using my entire career, despite the fact that they've been used successfully in biochemistry for over 15 years, including by me, and by others to do just the sort of experiments we'd described. That reviewer also wanted another control experiment, namely to do the same experiments, but on a totally different tetracycline resistance gene, Tet(M); the one that the original authors said shared some similarity, but actually showed no such thing. Comically, I was also told - in a clairvoyant-like manner - that the biochemical approach I'd taken wouldn't work for Tet(M) either, but of course it did; and it did everything that Tet(U) didn't do, and thus vindicated the approach.
I was also told that the only proper experiment I could do was to work with the original material used by the lab, and ignore the sequence they had reported in their paper. So I was being told something that would make the hypothesis untestable by science, because I could not go back in time and work with exactly whatever it was the original authors were working with. Obviously this is ridiculous - the sequence in their paper is synonymous with tet(U), and is the basis on which it has been accepted in the literature. If they failed to include some other important factor, then they did not in fact describe the role of tet(U) (or the true source of resistance) as it stood.
Having performed many more experiments and permutations to generate yet more negative results, I returned to the original tet(U) DNA sequence and compared it against the database again, just for good measure, as it'd been 6 months since I last did it. Wouldn't you know, an identical match appeared. I got excited and then inputted the whole sequence of the plasmid, and got an almost identical match with just a few differences. Those differences would be the ones I had predicted would turn an apparent plasmid replication gene into two smaller genes. The gene that our tet(U) was a mere constituent of was in fact labelled as a replication gene on this new matching sequence, and had been recently submitted as part of a partial genome sequencing of Enterococcus faecium (the same species as the original host of the plasmid) from a laboratory at University Medical Center Utrecht in the Netherlands.
I just assumed that a plasmid similar to mine had gotten sick of its independent life and inserted itself into the genome of the bacterium - this often happens. I intended to include it as a mere footnote. However, a few days later I found a scientist with a familiar name following me on twitter, Dr Willem van Schaik (@WvSchaik). It took me a while to twig, but I realised that coincidentally it was his lab that was doing this sequencing. I sent @WvSchaik a tweet to say that I'd been poring through his sequence data (as you do), but had a question about one contig (the unsorted section of sequence where I'd seen my plasmid). He told me it was in fact a separate plasmid - not part of the genome as I'd thought.
This is where I got more excited and asked @WvSchaik if he wouldn't mind sending me the bacterial strain this sequence came from. As promised, Willem sent the strain; I grew it up, extracted just the plasmid DNA, and found there were actually three plasmids, one of which looked about the right size. A few quick tests later and I had the plasmid, but not just any plasmid... pKQ10, the original plasmid. It had first been studied 16 years previously in a lab in Pennsylvania, and been lost to science, and then it turned up in Utrecht, from a strain that had once caused an infection in an old man.
And do you know what was interesting about pKQ10? Absolutely nothing. All it does is replicate itself. It is an end unto itself, it exists merely to exist, and does not confer resistance to tetracycline.
Caryl et al. (2012) "tet(U)" is not a tetracycline resistance determinant. Antimicrob Agents Chemother 56: 3378-3379 [DOI]
The enterococcal plasmid pKQ10 has been reported to carry a poorly characterized tetracycline resistance determinant designated tet(U). However, in a series of studies intended to further characterize this determinant, we have been unable to substantiate the claim that tet(U) confers resistance to tetracyclines. In line with these results, bioinformatic analysis provides compelling evidence that "tet(U)" is in fact the misannotated 3' end of a gene encoding a rolling-circle replication initiator (Rep) protein.
If only you knew how careful we had to be with every.choice.of.word throughout the manuscript so that we weren't accused of being 'judgemental'.
N.B. No disrespect is intended, or should be ascribed, to the authors of the original paper. This is simply the iterative nature of science as it should happen, and is a product of progress in molecular and bioinformatic tools, and some stubborn determination to publish negative results.