'PACE-GATE: An alternative view on a study with a poor trial protocol' by Bart Stouten in JHP

Yogi · May 12, 2017

http://journals.sagepub.com/doi/abs/10.1177/1359105317707531?journalCode=hpqa

Abstract
The controversies surrounding the effectiveness of cognitive behavioural therapy and graded exercise therapy for chronic fatigue syndrome are explained using Cohen’s d effect sizes rather than arbitrary thresholds for ‘success’. This article shows that the treatment effects vanish when switching to objective outcomes. The preference for subjective outcomes by the PACE trial team leads to false hope. This article provides a more realistic view, which will help patients and their doctors to evaluate the pros and cons.
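For anyone unfamiliar with the term, here is a minimal sketch of how a Cohen's d effect size is usually computed, assuming the common pooled-standard-deviation form (the article may use a different variant). The function name `cohens_d` and the scores are invented for illustration and are not PACE data.

```python
import math
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardised mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Invented scores, for illustration only (not PACE data).
treatment_scores = [2, 4, 6]
control_scores = [1, 3, 5]
print(cohens_d(treatment_scores, control_scores))
```

The point of d is that it expresses the difference between groups in units of spread, so outcomes measured on different scales (a fatigue questionnaire, a walking distance) can be put side by side.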

AndyPR · May 12, 2017

Paywall at moment but MEA Facebook page has stated that it will go to open access soon.

Yogi · May 12, 2017

I didn't understand the bit about influenced by the funding source??

I presented three other cases where the trial protocols have been questioned. In the first example, the trial protocol was influenced by reviewers of the funding source. In the second example, the final analysis seems inconsistent with the trial protocol.

I like the last bit:

We are living in the era of Internet and big data, where information is more accessible than ever before. It is refreshing to see patients asking critical questions and claim access to data that are generated by publicly funded studies. I hope they will use parts of my contribution to further investigate PACE-gate and other CFS studies. I admire their perseverance and look forward to see their upcoming publications.

trishrhymes · May 12, 2017

@Yogi, I assume your quotes are from behind the paywall. Thanks for giving us a flavour of the article.

alex3619 · May 12, 2017

Yogi said:
I didn't understand the bit about influenced by the funding source??

My best guess, without further information, is that the people paying for the study were involved in reviewing the study.

By way of example, with the PACE trial's first two Cochrane reviews something similar appears to have happened. The first one had the PACE authors as reviewers, the second had PD White draft the analysis protocol, at least from my recollection of what came out last year.

Snow Leopard · May 12, 2017

This article shows that the treatment effects vanish when switching to objective outcomes. The preference for subjective outcomes by the PACE trial team leads to false hope.

Exactly.

A.B. · May 12, 2017

One could say that the treatment effects are all in the mind

(of the PACE authors).

panckage · May 13, 2017

Yogi said:
I didn't understand the bit about influenced by the funding source?

There was a discussion about an insurance organization being involved. Patients considered cured can be kicked off of disability to save the organization money.

Anybody remember the source for this?

Esther12 · May 18, 2017

This is now open access: http://journals.sagepub.com/doi/full/10.1177/1359105317707531

This piece is critical of PACE, but I wasn't so sure about some of it.

They defended the use of these therapies with arguments based on a series of false dilemmas: treatments are either effective or ineffective; the result is either black or white; the opponents are wrong and they are right. Unfortunately, they have not shown how effective CBT and GET are. I believe this is the crucial point in the debate between Geraghty and White et al. Let us consider the shades of grey by studying Cohen’s d effect sizes.

Have they shown CBT and GET are genuinely effective, if only slightly? I don't think that they have.

For pragmatic reasons, I decided to use the 0123 coding scheme in my effect size analysis: the data are readily available from White et al., and it produces more precise results for fatigue than the 0011 scheme.

Likert and bimodal scoring of the Chalder Fatigue questionnaire are measuring different things, and I don't think it's right to just assume that likert is a more precise measure of fatigue than bimodal.

edit: While the Chalder Fatigue Scale is so rubbish it doesn't really matter, I'd suspect that bimodal scoring might be better at mitigating some of the problems with it than Likert, and so would be a 'more precise' approach. Maybe looking at PACE data and seeing how Likert/bimodal scoring correlates with more objective outcomes could provide some evidence on this?

This leads to the interesting hypothesis that the effect size of CBT and GET reduces as the objectiveness of the outcome increases.

To investigate this hypothesis, I added to the analysis the only objective test which I could find in White et al.’s study: the distance covered in a 6-minute walking test after 12 months.

I don't really understand why he didn't also use the fitness and employment data that has been released.

This has been discussed elsewhere, eg: http://www.bmj.com/content/350/bmj.h227/rr-10

Stulemeijer et al. studied the effects of CBT on subjective fatigue, subjective functional impairment and school attendance in young people with CFS. Their control group consisted of patients on a waiting list for receiving CBT. To deal with issues around missing data, they carried forward the last observations for all variables, except for school attendance. Their rapid response reveals that the final choice of the method for analysing school attendance was made after inspecting the trial data (Stulemeijer et al., 2005). This suggests that their analysis was not in line with the trial protocol. If they had carried forward the last observations for the missing school attendance data too, the results for CBT would have shown that it was not an effective treatment for this primary outcome (Stouten, 2004).

I thought that this was interesting, and I don't think I'd read the earlier BMJ RR he'd written on this:
http://www.bmj.com/rapid-response/2011/10/30/question-statistical-advisors-bmj
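For readers who haven't met it, 'carrying forward the last observation' (LOCF) just means filling a missing follow-up value with the most recent value recorded for that participant. A minimal sketch, assuming missing values are represented as `None`; the function name `locf` and the attendance figures are invented for illustration and are not taken from the Stulemeijer et al. data.

```python
def locf(observations):
    """Fill each missing value (None) with the last value observed before it."""
    filled = []
    last_seen = None  # stays None until a first real observation appears
    for value in observations:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Invented school attendance (hours per week) at successive assessments.
attendance = [10, None, 12, None, None]
print(locf(attendance))
```

Because a drop-out keeps their last recorded value, LOCF assumes no further change for that person, which is why applying it to one outcome but not another can alter the result.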

After publication, the authors agreed that, according to my suggestion, recoding the Chalder fatigue scale from 0011 to 0123 gives more precise results (Stouten, 2010; Wearden et al., 2010b). Wearden et al. (2010b) subsequently demonstrated a modest improvement in fatigue that is statistically significant in favour of pragmatic rehabilitation.

Yet this seems to be contradicted by the results released in the Larun Cochrane review, supposedly calculated from raw FINE data. It would have been good if this had been mentioned.

Maybe I'm being overly critical on this, as my expectation is that this issue should be an easy win for us, but I feel like this paper unnecessarily muddies the water by omitting some important pieces of information. As the author cites some of his own comments from 2004, I wondered if maybe he has been paying less attention to issues recently, and is relying a bit on old knowledge? If I was speaking with the author I would express my gratitude for him writing it, but I didn't think that this was great tbh, and I have some concerns that it gives the PACE authors some valuable tools for their own response.

trishrhymes · May 18, 2017

I skimmed through it. I think it's a valuable contribution in that it makes clear that the more objective the data is, the more the so-called improvements disappear, and cites studies that seemed to show improvement, but once they looked at the actometer data, there was none. This points the finger at PACE for abandoning actometers.

Esther12 · May 18, 2017

trishrhymes said:
I skimmed through it. I think it's a valuable contribution in that it makes clear that the more objective the data is, the more the so-called improvements disappear, and cites studies that seemed to show improvement, but once they looked at the actometer data, there was none. This points the finger at PACE for abandoning actometers.

Yeah, that's true. It's just that this is a point I've seen so many people make more persuasively before that it's a little frustrating that this is the version published in a medical journal.

The use of only 6mwt data from PACE, omitting the two other objective outcomes that were null results for CBT and GET, is pretty odd.

I felt that if a draft of this had been posted on PR for comment we could have helped it be a lot better than it was.

trishrhymes · May 18, 2017

Esther12 said:
The use of only 6mwt data from PACE, omitting the two other objective outcomes that were null results for CBT and GET, is pretty odd.

True, but we only have the raw data for the walk, not the other two. And only a graph, no figures for the step test. I haven't read the paper closely enough to work out whether the author of this paper used the raw data.

user9876 · May 18, 2017

Esther12 said:
Likert and bimodal scoring of the Chalder Fatigue questionnaire are measuring different things, and I don't think it's right to just assume that likert is a more precise measure of fatigue than bimodal.

I think you are right. 'More precise' suggests it is the same measure, just with more accuracy. However, some people got worse under one marking scheme and improved under the other, so I would say it cannot be called more precise.
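To illustrate how the two schemes can disagree, here is a minimal sketch assuming the 11-item Chalder Fatigue questionnaire with each answer coded 0 to 3, where Likert (0123) scoring sums the codes and bimodal (0011) scoring counts answers coded 2 or 3. The function names and the respondent's answers are invented for illustration, not taken from any trial data.

```python
def likert_score(responses):
    """0123 scoring: sum the raw item codes (range 0-33 for 11 items)."""
    return sum(responses)


def bimodal_score(responses):
    """0011 scoring: codes 0 and 1 count as 0, codes 2 and 3 count as 1 (range 0-11)."""
    return sum(1 for r in responses if r >= 2)


# Hypothetical respondent; higher scores mean more fatigue on both schemes.
baseline = [3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1]
follow_up = [2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1]

print(likert_score(baseline), likert_score(follow_up))    # Likert falls
print(bimodal_score(baseline), bimodal_score(follow_up))  # bimodal rises
```

Five items easing from 3 to 2 lower the Likert total, while a single item moving from 1 to 2 raises the bimodal total, so the same person 'improves' on one scheme and 'worsens' on the other.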

user9876 · May 18, 2017

Esther12 said:
The use of only 6mwt data from PACE, omitting the two other objective outcomes that were null results for CBT and GET, is pretty odd.

Given we have the 6mwt data but not the other measures (with the step test only being reported as a graph) then I'm not surprised he omits them.

Esther12 · May 18, 2017

user9876 said:
Given we have the 6mwt data but not the other measures (with the step test only being reported as a graph) then I'm not surprised he omits them.

He does refer to other (non-PACE) results that we do not have the raw data for, so I'd have thought it would make sense to refer to these results too. Ah well.

I think I may have been expecting too much when I sat down to read this. It did have some interesting info in it that I was not aware of before.

BurnA · May 18, 2017

I am not a PACE expert but as time goes by I realise I don't need to be, I know it was unblinded and subjective therefore it's unreliable. I know patient selection criteria was dubious which calls the whole trial into question.

Sure it's interesting to see just how many different ways they messed it up and twisted things to suit the results they wanted to get, but the more in depth an argument gets, the more it seems like minor details that an outside person might just assume is a difference of opinion.

Who is to say six minute walking tests 12 months apart is anything to do with ME. Is that question asked anywhere?

Articles like this are useful to have in our armory but I would be hoping never to need to refer to this paper to convince someone that PACE was flawed.

RogerBlack · May 19, 2017

BurnA said:
Who is to say six minute walking tests 12 months apart is anything to do with ME. Is that question asked anywhere?

The amount you can do once, at a picked point in time, is relevant.
Two day walk test (or CPET) would of course be better.

BurnA · May 19, 2017

RogerBlack said:
The amount you can do once, at a picked point in time, is relevant.
Two day walk test (or CPET) would of course be better.

Relevant to what though.

Good days and bad days, natural fluctuations in severity - how are they accounted for.
How much effort was expended prior to test that day etc.

RogerBlack · May 20, 2017

BurnA said:
Relevant to what though.

Good days and bad days, natural fluctuations in severity - how are they accounted for.
How much effort was expended prior to test that day etc.

For a whole group, with enough people, the averages of both tests and the difference between them provide robust evidence.

Dolphin · Jun 18, 2017

For CBT, the beneficial effect over SMC vanished when using the objective outcome measure. In other words, though patients think they are able to walk more after CBT, they fail to actually do so.
