Another possible explanation for the discrepant BWG results (one that has probably been mentioned, though I don't recall seeing it) is a failure in the coding/decoding of samples. Because the entire post hoc analysis depends on accurate coding, a failure here would produce meaningless results that might appear meaningfully damaging to a hypothesis, even if the remainder of the experimental process is sound.
I've always found it odd that such a crucial element of these studies is generally glossed over. There seems to be an assumption that the coding is competent. Yet we know nothing about how it was done, who did it, or how its integrity was ensured and secured. As far as I can tell from the supplemental materials, the final coding was done at BSRI. Call me skeptical, but all the political fanfare they engaged in around publication didn't exactly inspire much trust in me.
There are a few peculiarities in the BWG data that certainly don't prove anything but seem to line up with this possibility. Imagine, for the sake of argument, that the positives found within each panel were actually from positive patients but mis-coded as coming from a random selection of patients:
Ruscetti and WPI each found 22 positives/intermediates by serology (where each had 30 positive patient samples and 72 total samples). That works out to a 73% detection rate among positives and 31% overall. Strange that whatever they were detecting was being found at the same overall percentage in both labs. Even if they weren't finding "XMRV," and even if the coding was correct, why was no concern given to the fact that they were finding something at equal percentages?
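To make the arithmetic explicit, here is a minimal sketch in Python. The counts are taken from the figures quoted above; I haven't independently verified them against the paper's tables:

```python
# Serology panel counts as quoted above
positives_found = 22    # positives/intermediates reported by each lab
positive_samples = 30   # positive patient samples per lab
total_samples = 72      # total samples per lab

print(f"Detection among positives: {positives_found / positive_samples:.0%}")  # 73%
print(f"Detection overall:         {positives_found / total_samples:.0%}")     # 31%
```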
Ruscetti found 9 non-control positives by culture (where they had 15 non-control positive patient samples). That works out to a 60% detection rate among non-control positives and 30% of non-control samples overall (70% and 40% respectively if positive controls are included). So the aggregate detection percentages roughly concord with serology, even though the coding-dependent source-level results do not. (Another tangential question here: why weren't the results of any of the positive cultures sequenced? This seems like an appalling lack of investigative curiosity.)
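The same arithmetic for the culture panel, sketched below. Note that the quoted 70% and 40% figures are consistent with five positive controls, all detected; that count is my back-calculation, not a number taken from the paper:

```python
# Culture panel counts as quoted above
noncontrol_positives = 9      # non-control positives reported by Ruscetti
noncontrol_pos_samples = 15   # non-control positive patient samples
noncontrol_total = 30         # implied by the 30% overall figure (9/30)

print(f"Among non-control positives: {noncontrol_positives / noncontrol_pos_samples:.0%}")  # 60%
print(f"Of non-control samples:      {noncontrol_positives / noncontrol_total:.0%}")        # 30%

# Including positive controls: 5 controls, all detected, reproduces the
# parenthetical figures (my inference, not verified against the paper)
pos_controls = 5
print(f"Among all positives: {(noncontrol_positives + pos_controls) / (noncontrol_pos_samples + pos_controls):.0%}")  # 70%
print(f"Of all samples:      {(noncontrol_positives + pos_controls) / (noncontrol_total + pos_controls):.0%}")        # 40%
```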
In each PCR panel, the WPI reported exactly the same number of non-control positives as control negatives. This too could be explained by mis-coding.
In summary: looked at in aggregate, without concern for the patient source of individual samples, the serology and culture findings show positivity percentages that concord not only with one another but also with what could reasonably be expected given the underlying counts of positive and negative samples. It is only when these findings are mapped to the underlying sources, a process entirely dependent upon the integrity of the coding, that this concordance breaks down.
The entirety of the paper's analysis was conducted on the basis of this latter sample-source discordance, and it is on that basis that the results are deemed damaging to the HGRV hypothesis. Yet there is not an ounce of data about the coding process, the essential step that transforms potentially meaningful aggregate results into apparently meaningless source-level results. This hypothesis also has the advantage that it would make the results of BWG 3 make sense in light of the results produced by BWG 1 and 2; as things stand, the BWG 3 results make no sense when viewed in light of BWG 1 and 2.