When the Virology blog gets around to directly addressing the recombination/contamination argument, I want to be prepared, and I want others to be prepared. To this end, I'm putting together some thoughts on the huge hole in all the denier arguments I have seen.

I am not addressing contamination by mouse cells, as in Huber's work. The question of contamination from infected patients, which I have discussed in a posted comment on the Virology blog, enters this argument only indirectly. (I think two integration sites do show evidence of laboratory contamination, but from infected humans.) Critics have frequently conflated different meanings of contamination.

N.B. I am not claiming originality. Nor am I acting as a mouthpiece for someone invisible. I am simply restating arguments made in public which do not seem to have been noticed or connected. The original observation, as Mikovits pointed out, goes back to a paper by Paprotka et al., which found XMRV sequences "extensively hypermutated". Other parts of the argument can be checked independently, so we need not treat Judy Mikovits as an authority to make them. Anyone with sequence data can check that many codons altered by hypermutation will produce the same amino acid. This is not a perfect defense, but it is more than enough to yield a significant survival advantage.

The central point is that this virus has evolved to exploit mechanisms human genes use to escape damage by hypermutation, a natural defense against retroviruses. The virus uses codons which hypermutation converts to synonyms for the same amino acid. This allows enough sequence variation for latent virus to escape detection, whether by the immune system or by PCR, while still producing replication-competent virions when replication is stimulated by hormones matching response elements in the initial long terminal repeat (LTR). Even in cases where APOBEC enzymes are not active, this also gives the latent proviral sequence some ability to survive point mutations.
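The codon check described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not anyone's published method: it assumes the standard genetic code, models hypermutation as single G-to-A changes on the coding strand, and by default counts only a G that is followed by another G (the context usually attributed to APOBEC3G). The function name g_to_a_resilience and the reading frame starting at the first base are illustrative choices.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base outermost).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def g_to_a_resilience(seq, context_gg=True):
    """Count single G->A changes in a coding sequence and how many are synonymous.

    Assumptions (illustrative only): the reading frame starts at index 0,
    each G is mutated on its own, and with context_gg=True only a G that is
    immediately followed by another G is treated as a hypermutation target.
    Returns (synonymous, total).
    """
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    synonymous = total = 0
    for i in range(usable):
        if seq[i] != "G":
            continue
        if context_gg and (i + 1 >= len(seq) or seq[i + 1] != "G"):
            continue
        start = i - i % 3
        codon = seq[start:start + 3]
        if codon not in CODON_TABLE:  # skip codons with ambiguous bases
            continue
        offset = i % 3
        mutated = codon[:offset] + "A" + codon[offset + 1:]
        total += 1
        if CODON_TABLE[codon] == CODON_TABLE[mutated]:
            synonymous += 1
    return synonymous, total
```

For example, g_to_a_resilience("TTGGCC") finds one eligible G and reports its change as synonymous, since TTG and TTA both code for leucine. The fraction synonymous/total gives a rough resilience score that can be compared between a proposed retroviral sequence and shuffled or random sequences of the same composition.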
It is very likely we do not currently know all such mechanisms active against retroviral infection. There is abundant evidence retroviruses have been around for a long time, and hypermutation is a powerful evolutionary force on them. Most random sequences will not have this resilience.

The process by which a cell line is created is highly selective, so the virus in a new cell line may be expected to show a narrow range of variability at first. Cell lines free from selection by hypermutation will diverge quickly thereafter. Together these two processes give the appearance of de novo creation of a novel pathogen, if you ignore data from patients. One clue to this selection/divergence is an original sequence with resilience to hypermutation.

Arguments being marshaled to suggest contamination use probability, but their primary resonance is with prejudices common among virologists. PCR has been such a powerful tool that any suggestion it can be defeated by viruses makes life much harder for them. Also, working on things which even might be contaminated is not generally a wise career move. A virus cares nothing for the convenience of the host, let alone the convenience or careers of virologists.

The brilliance of this 'strategy' for evading defenses argues for a long evolution, not recent origin in a one-off laboratory event. Finding this strategy is itself far more important than either XMRV or ME/CFS. Nothing prevents other retroviruses from operating the same way, even if their sequences are substantially different. The strategy need not be confined to hypermutation by APOBEC3G either. Once you have identified a pattern of hypermutation, you can use a computer to check proposed retroviral sequences for resilience to such changes. Pathogens which take a slow strategy (as opposed to the fast strategy of a disease like cholera) will need this resilience to survive long latent periods.
Provirus which can survive as long as the host need not replicate rapidly to persist. Diseases where such a retrovirus would be likely include breast cancer, ALS, MS, RA and lupus.

Statistical arguments can reach astonishing heights of sophistication, but most of the advanced techniques I once learned have fallen by the wayside in practice. The shocking truth is that data going into statistical reasoning is seldom free from selection effects which destroy the validity of those arguments. Hue's argument on the common source of these sequences is a prime example. The problem is not the use of powerful (Bayesian) techniques, but that the input data came mostly from cell lines. If all data were drawn from studies of cell lines, there is no question a phylogenetic tree constructed from that data would be rooted there. If you have a large amount of data from cell lines, and only a tiny amount from virus in the wild, algorithms for constructing phylogenetic trees will most likely construct a similar tree.

Aside: Switzer's paper on extensive recombination in XMRV actually undermines the idea of trusting phylogenetic tree models. Recombination turns inheritance from a tree into a directed acyclic graph. Viable virus emerging from extensive recombination also argues for a long evolution.

The recombination-contamination argument shows great similarity to a famous cartoon by Sidney Harris ("Then a miracle occurs"). Even with the assumed minor miracle of recombination producing a potent pathogen, we have a missing wild-type ancestor for those pre-XMRV sequences some 99% homologous to XMRV. Who is looking for this?

The second large selection effect is PCR itself, which is capable of incredible selectivity. Latent proviral sequences carrying tiny mutations which defeat PCR can still produce replication-competent virions at a later date. There don't have to be many survivor sequences to cause persistent infection.
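To make the PCR selection effect concrete, here is a minimal sketch that compares a primer with the site it is supposed to bind and counts mismatches, separating out those near the primer's 3' end, where mismatches are generally the most damaging to amplification. Everything here is an illustrative assumption: the function name primer_mismatches, the 5-base window, and the simplification that primer and site are already the same length and written in the same orientation. It says nothing about any particular published primer set.

```python
def primer_mismatches(primer, site, three_prime_window=5):
    """Compare a primer with an equal-length target site.

    Both strings are assumed to be written 5'->3' in the same orientation
    (an illustrative simplification). Returns a tuple:
    (total mismatches, mismatches within the last `three_prime_window` bases).
    """
    primer, site = primer.upper(), site.upper()
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    # Positions where the primer and the target site disagree.
    mismatched = [i for i, (p, s) in enumerate(zip(primer, site)) if p != s]
    cutoff = len(primer) - three_prime_window
    near_three_prime = sum(1 for i in mismatched if i >= cutoff)
    return len(mismatched), near_three_prime
```

Running this over variant sequences against each primer in an assay gives a crude picture of which variants the assay is likely to miss. How many mismatches actually abolish amplification depends on the assay conditions, so any threshold applied to these counts would itself be an assumption.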
The irony here is that sequence variation is being used to suggest origin, while PCR eliminates the sequences showing the largest variations. PCR can be used differently, as Lo and Alter demonstrated. Their results were dismissed as coming from an unrelated putative virus.

You can't seem to win if you argue on these terms. Prejudices and arguments from authority will triumph. More sequence data from outside cell lines is the answer, but this runs into another roadblock. The most important effect of convenience may be that administrators and authorities who have ignored this disease, and denigrated or disparaged patients, find the existence of this virus highly inconvenient. They just might have some control over funding and publication.