http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665909
It is somewhat confusing which thresholds were actually used:
We defined clinically useful improvements within each person as having a difference of 2 points on the total Chalder Fatigue Scale and a difference of 11 points on the SF-36 physical function subscale between the baseline and follow-up measurements. We chose these cut-offs because they equated to 0.5 SD of the distribution of the baseline measurements.
As Dolphin pointed out earlier, the nearest whole number to 0.5 SD of the baseline score (SD 5.2, so 0.5 SD = 2.6) is 3, not 2. Furthermore, they later use the > symbol, which should mean "greater than":
About 74% (620 of 834) of patients had a decreased Chalder Fatigue score at follow-up and 64% (534 of 834) had improved by >2 points (our definition of a clinically useful improvement). In contrast, only 50% (416 of 834) of patients had an increased SF-36 physical function score at follow-up and only 16% (131 of 834) had improved by >22* points.
* The 22 is a typo, it should be 11.
Usage of the ">" symbol makes me wonder what thresholds were actually used. Did the NOD study have a higher threshold than PACE for clinical improvement due to higher baseline SDs, i.e. >2 or >=3 points in fatigue compared to PACE's >=2 points, and >11 or >=12 points in physical function compared to PACE's >=8 points?
If so, it may help to partly explain some of the difference in 'response' rates, but it still does nothing to explain the much lower group average improvements in physical function score. Irrespective of the above confusion, the physical function score ranges from 0 to 100 in 5-point increments, so an individual cannot improve by exactly 8 points (PACE) or 11 or 12 points (NOD), but must improve by 10 points (PACE) or 15 points (NOD) to reach the threshold.
So it would appear that the NOD threshold for an individual clinical response in physical function (>=15 points) was higher than the PACE equivalent of >=10 points. Again, this does not explain the much smaller improvement in physical function score for NOD than for PACE.
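To make the arithmetic explicit, here is a rough Python sketch. The helper `effective_threshold` is my own illustration, not anything taken from either paper. It assumes the SF-36 physical function subscale moves in 5-point steps (as noted above) and that the Chalder Fatigue total moves in 1-point steps, which is my assumption.

```python
def effective_threshold(cutoff: int, step: int, strict: bool = False) -> int:
    """Smallest change that a patient can actually achieve on a scale that
    moves in multiples of `step`, such that change >= cutoff
    (or change > cutoff when `strict` is True).

    Hypothetical helper for illustration only.
    """
    # Integer ceiling division: round the cutoff up to the next multiple of step.
    smallest = -(-cutoff // step) * step
    # A strict ">" excludes a change exactly equal to the cutoff.
    if strict and smallest == cutoff:
        smallest += step
    return smallest


# SF-36 physical function (5-point steps)
print(effective_threshold(8, 5))                # PACE >=8  -> 10
print(effective_threshold(11, 5, strict=True))  # NOD  >11  -> 15
print(effective_threshold(12, 5))               # NOD  >=12 -> 15

# Chalder Fatigue total (assumed 1-point steps)
print(effective_threshold(2, 1))                # PACE >=2  -> 2
print(effective_threshold(2, 1, strict=True))   # NOD  >2   -> 3
```

On these assumptions, it makes no practical difference whether NOD used >11 or >=12 for physical function, since both come out at 15 points. For fatigue, however, >2 versus >=2 is a real one-point difference.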
The reported baseline scores were very similar between NOD and PACE, but the NOD study lost half of its patients to follow-up and had neither a natural-course control group nor an SMC-alone group.
Bob. I agree that a direct comparison cannot reliably be made to the PACE Trial, but if a comparison is made, then it would be with the treatment plus the control.
However, just as it seems plausible that the application of CBT/GET in NOD was not as good as in PACE, it also seems plausible that the equivalent of SMC (if given at all) was also not as good. The efficacy-effectiveness gap between RCTs and routine clinical practice is a well-known phenomenon and there is evidence this applies here as well. The patient population and staff competence and quality control are often different between these two settings.