PACE Trial and PACE Trial Protocol

anciendaze · Mar 14, 2011

wild idea

This idea is scarcely half-baked, but maybe it will stimulate useful thoughts. If we can't directly demonstrate the absurdity of some assumptions going into this study, maybe we can produce absurdities coming out. For the moment, forget the lower bounds. If you take the numbers in the study as meaningful, what do they tell you about the healthy population, including allegedly fully-recovered ME/CFS patients? My first thought is that those experiencing recovery to full health become outliers rare enough to ignore.

oceanblue · Mar 14, 2011

Dolphin said:
This partly builds from something Anciendaze said. Maybe it has been said by others.

As people probably sat the questionnaires altogether, another way of doing it would to compare the figures of those who did the 6-minute walk test versus those who did the questionnaires (a mean value of 587):

Analyze a 2x2 contingency table
Outcome 1 Outcome 2 Total
Did questionnaires (mean) 587 53 640
Did 6-min walking test 462 178 640
Total 1049 231 1280

Fisher's exact test
The two-tailed P value is less than 0.0001
The association between rows (groups) and columns (outcomes)
is considered to be extremely statistically significant.

Click to expand...

What sort of person would be more likely not to do a 6 minute walking test? My guess is on average a person who either felt iller at the time of the test (than the average person in the group) or felt iller (than the average person in the group) after doing this test earlier in the trial (at baseline and/or 24 weeks). So the final figures for the group as a whole might be lower again if everyone had been included. They could calculate such a figure by using "last value carried forward" (which has often been used in previous CBT and GET studies).

Wow, that looks a very important finding; 6MWT is the one test that you would expect people to opt out of because they didn't feel up to it. The PACE group almost have a flair for not picking up on odd findings like this; I'm sure they would have noticed it.

oceanblue · Mar 14, 2011

Hi Dolphin, I'd like to get to the bottom of this as I think it could provide another good way to challenge the PACE findings.

Dolphin said:
The primary outcomes are what are reported in the abstract.

Primary outcomes reported in the abstract were mean fatigue and physical function scores compared with the SMC group, i.e. differences between means. And this is the only outcome measure relating directly to these primary outcome:

A clinically useful difference between the means of the primary outcomes was defined as...

The secondary outcome measures you mention are all simply using sf36 and cfq, but not to specifically look at the difference between means.
My point is they have set out their stall on evaluating primary outcomes and we shouldn't now let them off this hook.

They are merging together two concepts in my mind: average differences and "clinically useful differences" which are more commonly thought of in terms of the percentage of people who actually have them, I would have thought.

However, it is interesting to know the size of the difference of the means... So they could argue they are giving information by discussing this difference.

As far as I can tell, clinically useful differences are used both to qualify the size of differences between means as well as the benefit to individuals - and I think the authors are doing exactly what you say.

Statistically significant & clinical difference
When I talked to a life science professor at the weekend about this paper he stressed the importance at looking at the confidence intervals rather than just the means. Obviously, if the confidence intervals of the means overlap, they are not significantly different. Likewise, if the difference between the confidence intervals of the lower mean and the higher mean is less than the targeted difference (eg CUD of 8 for SF36) then that targeted difference is not met.

So I wonder if what we can safely say now is this:

1. GET and CBT were significantly better on primary outcomes for fatigue and physical function than SMC.
2. However, at 95% confidence levels, neither GET nor CBT achieved a clinically useful difference compared with SMC.

I think it's telling that when the authors say:

Mean differences between groups on primary
outcomes almost always exceeded predefined clinically
useful diff erences for CBT and GET when compared
with APT and SMC

they do this in the discussion not the results, which gives them a more leeway to do little things like ignore statistical significance.

Why this matters
The authors say:

We suggest
that these findings show that either CBT or GET, when
added to SMC, is an effective treatment for chronic
fatigue syndrome, and that the size of this effect is
moderate

(Note 'suggest', but this suggestion still finds it's way into the absract under 'interpretation'.) However, if the point about the clinically useful difference not being statistically significant falls, then so too does the assertion that the effectiveness of CBT/GET is moderate.

That would leave the conclusion that CBT/GET have only a small effect. I could live with that.

Angela Kennedy · Mar 14, 2011

oceanblue said:
That would leave the conclusion that CBT/GET have only a small effect. I could live with that.

But at least it's 'safe'?

Dolphin · Mar 14, 2011

Well, you seem to have summed up the issue fairly well, oceanblue.

Like you point out, they do bring in "moderately" into the abstract.

I just had the Guyatt et al. paper lined up to read - whether that gives extra information on the issue, I don't know.

I know researchers these days are encouraged to report effect sizes and the more common effect sizes I have seen in the past in the ME/CFS literature are cohen's d and the like.
So basically I have seen few papers that I can recall using "clinical useful differences" so am not best qualified to comment.

I know one person sent in a letter which said a "clinical useful difference" - I can't remember the reason they give now - I could look back but they might prefer it wasn't mentioned in case their letter gets published.

But you could be right that "moderately" is vulnerable/very vulnerable.

oceanblue · Mar 14, 2011

Dolphin said:
I'm saying if we had the actual means and standard deviations for figures 2F and 2G that there might not be a statistically significant difference between the SMC score and the CBT score and the SMC score and the GET scores in 2F and 2G. Somebody said to me just by looking at it one can tell there is no difference

I guess that's because the error bars for CBT/GET at 52 weeks greatly overlap the error bars for SMC (fig 2).

However, they may be using some fancy (but valid) statistical technique that looks at the whole time series from baseline to 52 weeks and perhaps that shows statistically significant differences - I'm not sure.

Dolphin · Mar 14, 2011

oceanblue said:
I guess that's because the error bars for CBT/GET at 52 weeks greatly overlap the error bars for SMC (fig 2).

However, they may be using some fancy (but valid) statistical technique that looks at the whole time series from baseline to 52 weeks and perhaps that shows statistically significant differences - I'm not sure.

Yes, but it would be good to see the figures. What they may have done is like you say (and I accepted initially), they look at the slope of the graphs (say) and say they're similar enough to the overall figures so there is no statistical difference. However, such a method might not be sensitive enough to check for the small differences in particular items (what they use may involve the APT for example which might decrease the apparent changes) which means that SMC is no longer different.

I never like it when researchers only give data in graph form and don't give the figures.

Somebody I talked to on the day the paper came out (if I picked it up correctly) said that one would get several figures/p-values, not just one figure, if looking at this problem normally (I think it was looking at the group (i.e. intervention) x Time x criteria interaction).

oceanblue · Mar 14, 2011

biophile said:
Another fine post, Biophile.

Selected quotes from the GET therapist manual:
the PACE results paper does not mention how many people in the GET group actually managed to increase their activity (correct?), it is quite possible that most people in the GET group did not. Without actigraphy we may never know for sure, but perhaps the 6 minute walking distance is a smoking gun.[/B]

I think this and several of your other points highlights a general issue with the trial results, not just that we don't know if anyone acutally recovered (since they ditched the recovery measure) but that we don't know the distribution of the data generally.

Eg, for each measure, how many changed:

a lot better, a little better, minimal change, a little worse, a lot worse

and this makes it hard to interpret the data.
(we have it for CGI, but not in such detail and it relies on particpant recall of how they were at the start of the trial).

More from the GET therapist manual: "By week 4, most participants will be able to commence aerobic exercise." I find it very unusual that a group of people who on average are allegedly ready for (light) "aerobic exercise" after only 4 weeks of GET with the gradual aim of several sessions a week of moderate exercise, cannot even break the 400m barrier on a single 6 minute walking distance after 52-weeks of GET when healthy people (including sedentary people) are scoring 600-700m!

I completely agree.

Rates of fully recovery were not given in the results (correct?), but you can be sure that if there were an impressive rate of "recoveries" the authors would be proudly announcing them.

I'm sure the authors wouldn't withhold data unless there was a very good reason and I've no doubt it's for our own good.

At this stage it does seem that the GET rationale of deconditioning has been thoroughly discredited or at least massively exaggerated. ...may be why biopsychosocialists (like those who wrote the editorial and conducted the meta-analysis) while not admitting a major chunk of their hypothesis has been debunked (fear-avoidance and deconditioning) are now focusing more on "cognitions" and "perceptions" about symptoms.

Yes, debunked, but you wouldn't guess it from reading any coverage of this trial.

As I understand it, pacing it not about avoiding all exacerbations. PACE has set up a false dichotomy between GET (allegedly as a gentle approach to pushing the boundaries) vs a strawmanned version of pacing (APT avoiding all exacerbations). Avoiding all activity related exacerbation as encouraged by APT is impractical, it is no surprise that PF/SF-36 scores were the worst in the APT group.

Love your straw man analogy. Yes, their version of APT with participants told to stay within 70% of their limits is bizarre: particpants were practically doomed to make no progress.

Dolphin · Mar 14, 2011

oceanblue and others,

And to repeat something that has been said at least once, if not more, the changes figures of 8 (SF-36 PF) and 2 (CFQ) used for clinically useful difference are artificially small because they are based on the SD which was artificially small because the same items were used for entry criteria.

oceanblue · Mar 14, 2011

Dolphin said:
They didn't report the "recovery" figures; I agree with you - I think if they had got good rates of recovery we would have heard it.

Given the model for GET (i.e. symptoms/deconditioning are temporary and reversible), I think there should be an obligation on them to report the figures or otherwise one doesn't know if the model has been tested:

I think it would be a good idea if someone - ideally one of the ME charities - formally wrote to the authors asking for publication of data promised in the protocol but curiously absent in the paper e.g. recovery rates. If that fails, there would then be the option of going to the MRC who funded to the trail to a massive extent and whose Trial Steering Group approved the protocol. It might be tricky for the MRC to turn down such a request.

oceanblue · Mar 14, 2011

Dolphin said:
oceanblue and others,

And to repeat something that has been said at least once, if not more, the changes figures of 8 (SF-36 PF) and 2 (CFQ) used for clinically useful difference are artificially small because they are based on the SD which was artificially small because the same items were used for entry criteria.

I know, it was said by me, amongst others, if I recall correctly!

Dolphin · Mar 14, 2011

oceanblue said:
I think this and several of your other points highlights a general issue with the trial results, not just that we don't know if anyone acutally recovered (since they ditched the recovery measure) but that we don't know the distribution of the data generally.

Eg, for each measure, how many changed:

a lot better, a little better, minimal change, a little worse, a lot worse
and this makes it hard to interpret the data.
(we have it for CGI, but not in such detail and it relies on particpant recall of how they were at the start of the trial).

In Fulcher & White (1997), they gave the complete data: http://www.bmj.com/content/314/7095/1647/T1.expansion.html

Dolphin · Mar 14, 2011

oceanblue said:
I know, it was said by me, amongst others, if I recall correctly!

Yes, just that I believe it was generally mentioned in this thread in the context of the percentages getting improvement so I thought it was worth highlighting now given as you point out, it is also used in the abstract "moderately improve" in a different context i.e. covering all the participants.

oceanblue · Mar 14, 2011

Dolphin said:
In Fulcher & White (1997), they gave the complete data: http://www.bmj.com/content/314/7095/1647/T1.expansion.html

Ah, now that looks like the right way to present data.

anciendaze · Mar 14, 2011

Outliers

Another inference from that assumed normal distribution is that a few ME/CFS patients should be running marathons, while those at the other extreme exhibit negative physical activity. Considering the estimated total of sufferers, there shouldn't be any problem finding a few of each in the UK.

Angela Kennedy · Mar 15, 2011

Should be repeated here: The authors reply to David Tuller.

Note their claim of SAFETY proven for ALL groups. It doesn't matter how efficacious really- at least CBT/GET is SAFE, that's their message.

http://forums.aboutmecfs.org/showthread.php?10601-NYT-letter-to-editor-re-David-Tuller-article

Marco · Mar 15, 2011

Clinically useful improvement.

If you were running a trial for example on a drug intervention for diabetes you may be able to monitor blood sugar levels and if you're lucky you may also be able to measure other objective changes like insulin dosage required etc (apologies if I misunderstand the mechanics of diabetes - its unimportant for the purposes here).

You may also be able to prove that any improvements are statistically significant compared to placebo.

However the decision on whether these changes are clinically significant or clinically useful usually depends on the overall impact on health and this is often measured by patients' subjective responses on how they feel their overall perception of health and function has changed. Subjective experiences therefore help validate objective physiological measures.

In the case of PACE, validation is difficult if not impossible.

The primary outcome measures, SF 36 pf and CFQ, are both based on subjective responses of how people 'feel' while three of the interventions tested are specifically aimed at changing patients' subjective feelings about their symptoms and levels of physical function/fatigue; either positively in the case of CBT and GET (symptoms are benign and the condition reversible) or negatively in the case of APT (symptoms should be heeded and the condition is not reversible).

These results alone cannot be considered as clinically useful without external validation, regardless of the degree of improvement.

External validation by an objective measures could be taken as support for a clinically useful improvement however the only objective measure reported was the 6MWT where marginal improvements were found.

Even if these improvements, while small, were statistically significant, they still cannot be validated by the two primary outcome measures for the reasons stated above and therefore cannot be considered as clinically useful.

Ergo, none of the results reported can be proven to be clinically useful.

oceanblue · Mar 15, 2011

wdb said:
This is a very interesting paper and hugely important I think. It looks to me like Wessely's own data indicates that placebo response alone in CFS trials is typically 15.4%–23.7%
Does this not mean that in this context the PACE trials 15% response above no treatment is completely meaningless.

The Placebo Response in the Treatment of Chronic Fatigue Syndrome: A Systematic Review and Meta-Analysis

[...

Results: The pooled placebo response was 19.6% (95% confidence interval, 15.4–23.7), lower than predicted and lower than in some other medical conditions. The meta-regression revealed that intervention type significantly contributed to the heterogeneity of placebo response (p = .03).

Conclusion: In contrast with the conventional wisdom, the placebo response in CFS is low. Psychological-psychiatric interventions were shown to have a lower placebo response, perhaps linked to patient expectations.

Thanks for posting this, wdb. It looked very promising but unfortunately on reading the paper it turns to dust (like much CFS research).

First, it looks at placebo on its own, not compared to a 'no treatment' group, so the placebo effect will include any natural recovery as well as problems like regression of the mean. So the 'true' placebo effect would have been lower.

More importantly, as the authors say:

The major limitation of the review was the heterogeneity of
the outcome measurement systems across the trials. Different
scales and instruments were used to define and measure the
endpoint, clinical improvement.

They are comparing apples with pears which undermines the whole effort, especially when you are trying to quote figures like a 19% typical effect.

Also they looked at a very mixed bag of mainly quite small studies that are very hard to compare properly in a meta-analysis. You can tell the authors are struggling when they say things like:

In this sense, the result of this meta-analysis seems to be meaningful.[!]

Sadly it looks like we can't rely on this study to question the limited effect of CBT etc in PACE.

wdb · Mar 15, 2011

oceanblue said:
Thanks for posting this, wdb. It looked very promising but unfortunately on reading the paper it turns to dust (like much CFS research).

First, it looks at placebo on its own, not compared to a 'no treatment' group, so the placebo effect will include any natural recovery as well as problems like regression of the mean. So the 'true' placebo effect would have been lower.

More importantly, as the authors say:
They are comparing apples with pears which undermines the whole effort, especially when you are trying to quote figures like a 19% typical effect.

Also they looked at a very mixed bag of mainly quite small studies that are very hard to compare properly in a meta-analysis. You can tell the authors are struggling when they say things like:

Sadly it looks like we can't rely on this study to question the limited effect of CBT etc in PACE.

Having read through the whole study a bit more I tend to agree with you. They seem to have used some odd methods to analyze the data, I'm not even sure exactly what the 19% figure is referring to, I think it may actually be the percentage of people that they consider met some recovery criteria through placebo treatment alone.

On the plus side though the list of studies included may still be useful, some of those individually may provide good evidence that even sham treatments with no therapeutic value when delivered convincingly can still lead to significant reported subjective improvements in patients, possibly comparable in magnitude to CBT/GET results.

Full text is here if anyone is interested http://www.psychosomaticmedicine.org/cgi/reprint/67/2/301

anciendaze · Mar 15, 2011

oceanblue said:
Thanks for posting this, wdb. It looked very promising but unfortunately on reading the paper it turns to dust (like much CFS research)....

Don't expect experienced psychobabblers to print any falsifiable hypotheses. Somehow an entire branch of medicine has lost sight of a central feature of science.

The level of reasoning reminds me of an acquaintance who was explaining how Scorpios were unusually sexually active. When I countered with reference to one we both knew who was conspicuously abstinent, he said it was either one thing -- or the exact opposite!

PACE Trial and PACE Trial Protocol

Senior Member

Guest

Guest

Senior Member

Senior Member

Guest

Senior Member

Guest

Senior Member

Guest

Guest

Senior Member

Senior Member

Guest

Senior Member

Senior Member

Grrrrrrr!

Guest

Senior Member

Senior Member