@RustyJ
Your question raises the issue of selection effects. If there weren't so many blatant problems in other aspects, I think this would be a major point of contention. Permit me to rant a little about just how pervasive these were.
The original intake of patients referred for treatment for "CFS" by NHS doctors, who have long had material generated by the study authors to guide them, was 3158. Some of the same authors have argued that 30% of CFS diagnoses are in error, while implicitly assuming they are infallible, but we will overlook that. (Who said "we never make mistakes", the KGB perhaps?)
Out of that intake, the authors chose nearly 900 whom they wished to participate, but got only 641. The authors assume those who declined simply had a perverse desire to remain ill. If we had some idea of the change in activity needed to participate, we might be able to judge how many declined because they lacked the marginal energy required. If such a reason had any influence on the process, it would mean that patients without the ability to offload or displace activity were underrepresented in the trial. They would probably have been near the lower bound for entry. We already know about selection problems near the upper bound, due to the change in entrance criteria.
We then divide the participants into four different arms of the study, yielding 150-160 in each. After that we simply omit the roughly 1/3 who don't supply complete objective data from which we might measure improvement or decline. This means the only remaining objective measure is based on something like 100-120 individuals in each arm. At this point the effect of about 3 individuals is enough to substantially shift mean values. In terms of initial intake this is about 0.1%, and even pathologists admit to such an error rate in diagnoses.
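To make that sensitivity concrete, here is a minimal sketch. All numbers are illustrative only, not actual trial data: I'm assuming an arm of 110 participants with scores loosely scaled like 6-minute walk distances in metres, then moving just 3 of them.

```python
import random

random.seed(1)

# Hypothetical arm of 110 participants; values are illustrative only,
# loosely scaled like 6-minute walk distances in metres.
arm = [random.gauss(330, 70) for _ in range(110)]
baseline_mean = sum(arm) / len(arm)

# Shift just 3 participants' scores up by 150 m each
# (e.g. 3 unusually strong responders, or 3 diagnostic errors).
shifted = arm[:]
for i in range(3):
    shifted[i] += 150
shifted_mean = sum(shifted) / len(shifted)

print(f"mean before: {baseline_mean:.1f} m")
print(f"mean after:  {shifted_mean:.1f} m")
print(f"shift:       {shifted_mean - baseline_mean:.1f} m")
```

The shift in the group mean is exactly 3 × 150 / 110 ≈ 4.1 m, regardless of the other 107 values: a handful of individuals is enough to move a headline number by an amount comparable to some reported between-group differences.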
If these samples are normally distributed you can predict the number of outliers you might see without falsifying the assumption of a normal distribution. This is where individual data, though not necessarily identifiable individuals, become important. If the distributions violate the assumption of normality then those reported measures of significance don't mean much.
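That prediction is easy to write down. Under the normality assumption, the expected count of outliers beyond z standard deviations in a group of n is just n·P(|Z| > z); the sketch below uses n = 110 as an assumed per-arm count with complete objective data.

```python
import math

def two_sided_tail(z: float) -> float:
    """P(|Z| > z) for a standard normal variable, via the
    complementary error function (no external libraries needed)."""
    return math.erfc(z / math.sqrt(2))

n = 110  # assumed per-arm count with complete objective data
for z in (2.0, 2.5, 3.0):
    p = two_sided_tail(z)
    print(f"beyond {z} SD: P = {p:.4f}, expected count in n={n}: {n * p:.1f}")
```

So roughly five participants per arm beyond 2 SD would be consistent with normality, while a surplus or deficit of extreme values in the individual data would falsify it. Without that individual data, the check can't be run.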
It happens that the population distribution being sampled for comparison was far from normal, even being one-sided. The assumption that selection was independent enough to create normally-distributed sample groups is untested. As far as I can tell selection boiled down to the same people saying "we want this individual in the study" or "we don't want that individual in the study" before we get to the randomization in assigning them to particular arms of the study. If you weight all the dice you don't have to worry about any particular die.
A normal distribution is a stable distribution completely characterized by two parameters. All study groups including the control showed an increase in variance/standard deviation over time, which argues against stability. The idea that these group distributions were completely determined by only two parameters remains an untested and unproven assumption. Detailed anonymized data would make it possible to test this. Such data have not been provided.
In arguing that this study is suitable for setting national policy the authors implicitly assumed that the same illness found in a small percentage of that intake of 3158 affected all of them. They also assumed all these patients had enough marginal activity available to participate in therapy, despite the real possibility that around 1/3 of those they wanted in the study might not have met this unstated criterion.
If I can use the effect on 3 individuals in a study, who may or may not have the same disease as others, to set national policy I can accomplish all kinds of things of dubious scientific validity.