The PACE Trial – The Results

Bob · Aug 25, 2012

Number Needed to Treat - scientific consensus

I've just discovered that there seems to be a scientific consensus to round up the "number needed to treat" figures whereas, in my opening post, I rounded to the nearest integer.

So, in my opening post, my NNT figures place CBT and GET in a slightly more favourable light than they deserve.

I think I will change them when I make a revision.

When rounded up, the 'number needed to treat' figures, in my opening post, would be as follows (I've bolded the changes):

CBT physical function 13% (NNT = 1 in 8)
CBT fatigue 11% (NNT = 1 in 10)

GET physical function 12% (NNT = 1 in 9)
GET fatigue 15% (NNT = 1 in 7)

Average for CBT/GET = 13% (NNT = 1 in 8)
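
For anyone who wants to check the rounding, here is a minimal sketch of the calculation, assuming the NNT is simply 100 divided by the net improvement percentage. The function name `nnt` and the `round_up` flag are just illustrative. The percentages are the rounded ones listed above, so results from unrounded trial data could differ slightly.

```python
import math

def nnt(net_improvement_pct: float, round_up: bool = True) -> int:
    """Number needed to treat from a net improvement rate given in percent.

    Assumes NNT = 100 / net improvement; rounds up by default,
    or to the nearest integer when round_up is False.
    """
    if net_improvement_pct <= 0:
        raise ValueError("net improvement must be positive")
    raw = 100 / net_improvement_pct
    return math.ceil(raw) if round_up else round(raw)

# Net improvement over SMC, in percent, as listed in the post above.
figures = {
    "CBT physical function": 13,
    "CBT fatigue": 11,
    "GET physical function": 12,
    "GET fatigue": 15,
}

for label, pct in figures.items():
    nearest = nnt(pct, round_up=False)
    up = nnt(pct)
    print(f"{label}: {pct}% -> nearest: 1 in {nearest}, rounded up: 1 in {up}")
```

With these inputs, rounding up instead of rounding to the nearest integer changes two of the four figures (CBT fatigue and GET physical function).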

Bob · Aug 25, 2012

Simon said:
I'm pretty sure all CUD thresholds are secondary, even those in Table 3:

A secondary post-hoc analysis compared the
proportions of participants who had improved between
baseline and 52 weeks by 2 or more points of the Chalder
fatigue questionnaire, 8 or more points of the short
form-36, and improved on both.

The 0.5 SD was used to assess if there was a clinically useful difference between the means of the primary outcomes.

Oh gosh, there is so little clarity in the published paper! It's muddled and confusing, partly because they dropped the proposed primary outcome measure of a 'positive outcome', and they changed the main measure of clinical effectiveness (CID), and they altered the proposed methodology (bimodal to Likert scoring, and actometers etc.), and also because they have obfuscated some of the results in their discussions.

Table 3, of the published paper, is labelled as displaying the 'primary outcomes' and it shows the 'numbers improved from baseline'. This is where my NNT figures come from.

I had always interpreted Table 3 as meaning that the redefined CUD was used as a primary outcome measure in the final published paper.

But I've now done some more reading of the paper, and thinking, and I think I agree with you. Please see my post to biophile, below.

Even so, it is important to point out that these are the main measures of clinical effectiveness used in the paper.

To keep our perspective, I agree that we should always remember that the CUD is a post-hoc definition, and what we really need is the results for the original proposed primary outcome measure of a "positive outcome".

They have deprived us of proper well-planned data, and have successfully reduced us to picking over the scraps!

(Note to self: add "positive outcomes" to my list of 'unpublished data'.)

biophile · Aug 25, 2012

Hi Bob. If you are rounding up for the NNT of CBT and GET for a clinical response in both fatigue and physical function, it should be 7.14 (or 8) for CBT and 6.25 (or 7) for GET. You are correct that the average for both is 7 when rounded up (even when accounting for slightly different total group numbers between CBT and GET). Also, note that the NNT for a "positive change" in CGI is also 6.25 (or 7).

The primary outcomes for fatigue and physical function were changed, from dichotomous scores ie meeting 50% relative improvement or meeting an absolute threshold score (which were stricter than the post-hoc thresholds of CUD), to continuous scores as shown in Figure 2. Everything else is additional or "secondary". Also, as you would be aware, fatigue scoring on the CFQ was changed from bimodal to Likert scoring, but note that the latter was already a secondary outcome in the 2007 protocol and became primary. The authors should still publish the results based on the original goalposts, it would literally only take up one paragraph.

The authors predicted response rates of 10% for SMC, 60% for CBT and 50% for GET. However, their definition of a "response" was watered-down. CBT and GET could only approach those rates of response when the threshold was low, but this inflated the SMC response rate. Conversely, the predicted response rate for SMC was probably accurate when using the original thresholds, but would demonstrate that CBT and GET performed relatively poorly compared to expectations. The authors cannot have it both ways, and the original thresholds would not have allowed news articles to parrot the supposed "60%" response rate for CBT/GET while failing to mention the 45% response rate for SMC.

Bob · Aug 25, 2012

biophile said:
Hi Bob. If you are rounding up for the NNT of CBT and GET for a clinical response in both fatigue and physical function, it should be 7.14 (or 8) for CBT and 6.25 (or 7) for GET. You are correct that the average for both is 7 when rounded up (even when accounting for slightly different total group numbers between CBT and GET).

Thanks for that, biophile. T'was an oversight. Have corrected.

biophile said:
Also, note that the NNT for a "positive change" in CGI is also 6.25 (or 7).

I'm not sure what you mean by "CGI"?

biophile said:
The primary outcomes for fatigue and physical function were changed, from dichotomous scores ie meeting 50% relative improvement or meeting an absolute threshold score (which were stricter than the post-hoc thresholds of CUD), to continuous scores as shown in Figure 2. Everything else is additional or "secondary". Also, as you would be aware, fatigue scoring on the CFQ was changed from bimodal to Likert scoring, but note that the latter was already a secondary outcome in the 2007 protocol and became primary.

From the wording of the paper, it seems that the Likert scoring method for Chalder is at least intended to be a primary outcome measure.
It can be legitimate to change the protocol or design of a study, after the protocol is published.
Whether it is a legitimate change, or not, I think depends on the contractual rules of the MRC.
For an MRC-funded study, there are certain procedures that should be carried out, in order for changes to be legitimate, and I don't know if these were met or not.

The 'improvement rates', based on the CUD, were included in Table 3, under the heading "primary outcomes."
But I agree that it does not state that it is a primary outcome, except where it is included as a measure in Table 3. But it does not state that it is a secondary outcome measure either, whereas it does clarify this for at least one other post-hoc measure.

But, studying it further, I think I agree that the primary outcome measures were just the participant scores for Chalder Fatigue and SF-36 physical function. (And in the protocol, "positive outcomes" were also included as primary measures.)

The post-hoc definition of a CUD was used for the main analyses in the published paper, such as the 'mean difference from SMC', and the 'improvement rates'.
So it would have been better for me to describe it as the 'main primary outcome analysis'.
I will change the wording in my revision when I get around to revising it.

There are lots of question marks hanging over the methodology used to determine the CUD.
But my opening post is just intended to clearly demonstrate the main published results, as published.
It's not a critique of the methodology.

biophile said:
The authors should still publish the results based on the original goalposts, it would literally only take up one paragraph.

Yes, agreed. We might need more FOI requests to get the info out of them.

biophile said:
The authors predicted response rates of 10% for SMC, 60% for CBT and 50% for GET. However, their definition of a "response" was watered-down. CBT and GET could only approach those rates of response when the threshold was low, but this inflated the SMC response rate. Conversely, the predicted response rate for SMC was probably accurate when using the original thresholds, but would demonstrate that CBT and GET performed relatively poorly compared to expectations. The authors cannot have it both ways, and the original thresholds would not have allowed news articles to parrot the supposed "60%" response rate for CBT/GET while failing to mention the 45% response rate for SMC.

Good points.

biophile · Aug 25, 2012

Quick response to ...

Bob said:
I'm not sure what you mean by "CGI"?

Clinical Global Impression.

Bob said:
From the wording of the paper, it seems that the Likert scoring method for Chalder is at least intended to be a primary outcome measure.

White et al (2011) state: "The two participant-rated primary outcome measures were the Chalder fatigue questionnaire (Likert scoring 0, 1, 2, 3; range 0–33; lowest score is least fatigue) and the short form-36 physical function subscale (version 2; range 0–100; highest score is best function). Before outcome data were examined, we changed the original bimodal scoring of the Chalder fatigue questionnaire (range 0–11) to Likert scoring to more sensitively test our hypotheses of effectiveness."

Bob said:
It can be legitimate to change the protocol or design of a study, after the protocol is published. Whether it is a legitimate change, or not, I think depends on the contractual rules of the MRC. For an MRC-funded study, there are certain procedures that should be carried out, in order for changes to be legitimate, and I don't know if these were met or not.

Hooper may know. I haven't really looked into that side of it yet. To me, maybe it was all technically above board, but that doesn't mean the changes were "good". I also wonder how "independent" the independent bodies are, and within them there could be several "fellow travelers" on the cognitive-behavioural road.

Bob said:
The 'improvement rates', based on the CUD, were included in Table 3, under the heading "primary outcomes." But I agree that it does not specifically state that it is a primary outcome, except where it is included as a measure in Table 3. But it does not state that it is a secondary outcome measure, whereas it does for other measures. A measure can be a post-hoc and a primary measure. But, studying it further, I think I agree that the primary outcome measures were just the participant scores for Chalder Fatigue and SF-36 physical function. (And in the protocol, "positive outcomes" were also included as primary measures.)

Most of the data in Table 3 is visually presented in Figure 2. These are the "continuous" scores in primary measures. Perhaps you're correct that applying CUD to the group average is post-hoc primary. However, White et al (2011) do suggest that the application of CUD to individual scores is secondary: "A clinically useful difference between the means of the primary outcomes was defined as 0·5 of the SD of these measures at baseline, equating to 2 points for Chalder fatigue questionnaire and 8 points for short form-36. A secondary post-hoc analysis compared the proportions of participants who had improved between baseline and 52 weeks by 2 or more points of the Chalder fatigue questionnaire, 8 or more points of the short form-36, and improved on both."
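
To make the individual-level CUD analysis concrete, here is a rough sketch of how the quoted thresholds (2 points on Chalder fatigue, 8 points on SF-36 physical function) could be applied to one participant's scores. The function name and the example scores are hypothetical and not taken from the paper. It assumes, as the quoted text says, that lower is better on Chalder fatigue and higher is better on SF-36.

```python
# Thresholds from the quoted passage: 0.5 SD at baseline,
# equating to 2 points (Chalder fatigue) and 8 points (SF-36).
CFQ_THRESHOLD = 2
SF36_THRESHOLD = 8

def classify_improvement(cfq_baseline, cfq_52wk, sf36_baseline, sf36_52wk):
    """Apply the post-hoc CUD thresholds to one participant's scores."""
    # Chalder fatigue (Likert, 0-33): a lower score means less fatigue,
    # so improvement is a DROP of at least 2 points.
    fatigue = (cfq_baseline - cfq_52wk) >= CFQ_THRESHOLD
    # SF-36 physical function (0-100): a higher score is better,
    # so improvement is a RISE of at least 8 points.
    function = (sf36_52wk - sf36_baseline) >= SF36_THRESHOLD
    return {
        "fatigue": fatigue,
        "physical_function": function,
        "both": fatigue and function,
    }

# Hypothetical participant: fatigue 28 -> 25, physical function 40 -> 45.
print(classify_improvement(28, 25, 40, 45))
```

This hypothetical participant would count as improved on fatigue but not on physical function, and so not on "both".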

Bob · Aug 25, 2012

It's best that my online posts are read, or responded to, rather than any email notifications, because I often edit my posts extensively soon after posting them.

biophile, thanks for your response. I'd edited the post that you responded to above, before you responded to it.

Firestormm · Aug 26, 2012

Gods. You HAVE to luv statistics don't you?!

Soooo transparent.

I think we need to be cautious with the way in which any 'defence' is mounted. Simple is always best in my experience and whilst the calculations used in PACE (esp. the headline attracting figures) might not be straightforward, anything 'we' produce should be as clearly explained as possible.

Now I'm recovering from my meeting, I'd be happy to help try and gain some endorsement from any one of the main charities. I'll try and establish if they even want to go down this route first I think. I'm actually not sure what the charities are all doing in preparation for the NICE Review at the end of 2013 but it seems clear that at least the MEA are preparing themselves for some defence of the PACE findings, and of the provision for the pertinent 'management strategies' contained therein (or not).

alex3619 · Aug 26, 2012

A reply to some of the criticisms to this paper is here:
http://www.plosone.org/annotation/listThread.action?root=53411

Thanks go to Co-Cure for alerting me to this.

Bye, Alex

PS The comments in reply to this are interesting.

Bob · Aug 26, 2012

Firestormm said:
Gods. You HAVE to luv statistics don't you?! Soooo transparent.

Yes, if I was a cynical type, then I might think that the PACE Trial paper had been made as muddled as possible, so that it is almost impossible to understand the results!

If even the Lancet doesn't understand the results, then how is everyone else supposed to? Maybe that's why it got published.... Because the Lancet didn't understand it.

Firestormm said:
I think we need to be cautious with the way in which any 'defence' is mounted. Simple is always best in my experience and whilst the calculations used in PACE (esp. the headline attracting figures) might not be straightforward, anything 'we' produce should be as clearly explained as possible.

Yes, there is a balance to be struck between simplicity, and absolutely scientifically correct wording.
I've gone for simplicity in my explanation, which I think is most helpful for patient advocates and for most political purposes.
For example, to say that only 13% of patients improved as a result of CBT/GET, is slightly over-simplistic, as has been discussed, but to express it like that is useful for its simplicity and impact. And it cannot be refuted, as it is as good as accurate.

For a scientific audience, a more nuanced and detailed explanation is necessary. (Shame that the PACE Trial authors don't understand that!)

What I should have made clear when I made my post, was that it was intended for a patient audience, so I simplified the explanations to avoid scientific jargon and heavy detail. Saying that 13% improved, sets out the results clearly and unambiguously for a non-scientific reader, although of course, for a scientific readership, more detail would be helpful.

Firestormm said:
Now I'm recovering from my meeting, I'd be happy to help try and gain some endorsement from any one of the main charities. I'll try and establish if they even want to go down this route first I think. I'm actually not sure what the charities are all doing in preparation for the NICE Review at the end of 2013 but it seems clear that at least the MEA are preparing themselves for some defence of the PACE findings, and of the provision for the pertinent 'management strategies' contained therein (or not).

If you are thinking of using any of my text, then hold off for a week or two, Firestormm.
I'm in the process of creating a revised version, which will hopefully add more clarity, and will use slightly more scientifically oriented wording, whilst trying not to add complexity.

MEA is preparing for the NICE guidelines review with quite a sophisticated patient survey into CBT, GET, and pacing etc.
From reading their magazine, they seem to think that the PACE Trial was a raving success, and that their survey will be necessary to rebut the results. This seems a bit misguided to me, as I think that all the past patient surveys (into CBT and GET etc.) place CBT and GET in a better light than the PACE Trial has done, if we ignore those who report harm and only look at those who reported benefit.

Bob · Aug 26, 2012

Firestormm said:
In my wee meeting yesterday, one of my colleagues, actually said at one point (quite out of the blue and it made me rather angry) that 'only 65% of those in PACE improved' thinking her comment was a positive point in her argument.

By the time I had thought about it she was on to something else. I tell you Bob the message hasn't even gotten through to even other patients.

She also came up with a statistic (out of the blue) that '13% of diagnoses were reported as wrong' citing some UK data storage bank. The name of which escapes me. BANE? Is that it? I'd never heard of it of course and it rather flies in the face of the two studies last year that I had built in to our written submission (the 40 and 50% alternate diagnosis ones).

I tell you. There are too many people singing from too many hymn sheets. Gods only know what anyone else thinks of the way in which we present ourselves. Damn worrying. And I'm still in a mood.

Yes, I know exactly what you mean, Firestormm!
It is soooo frustrating seeing patients unknowingly misrepresenting the results, and quoting 30% or 60%. (It means that our community is doing the work of the psychiatrists for them!)
That's exactly what provoked me into writing my post...
But it's not the patients' fault, and so we do need to try to get the correct info out into the public's awareness.

Bob · Aug 26, 2012

Simon said:
Technically, this is more or less accurate, but it does give a slightly skewed picture, which I will try to illustrate with an example. Let's say 60% of the SMC group improved and instead of GET/CBT, the trial was of NewWonderDrug. And let's say 95% of patients on NewWonderDrug improved, a pretty decent result. The approach in quotes would present that as "65% of patients did not benefit from NewWonderDrug/Only 35% improved. By contrast 60% of the SMC group showed a clinically useful outcome". I'm not sure that's a sound way of presenting the data.

Bob said:
So, your example is exactly how I would present the results if that's the only data that was presented, and if that's how the study was designed.
As far as we know, 95% of patients did not respond to the NewWonderDrug, and it would be misleading to say that they did. Only 35% improved as a result of the drug, as far as the data tells us, and 60% in the control group improved.

Hi Simon,
I don't know why I was arguing with you...
I think I sometimes get into a very inflexible mind-set when I'm concentrating on something.
Of course I agree with you, in terms of absolute scientific accuracy, and I wouldn't use such an explanation if I was aiming my explanation at scientific readers.
But I designed my post to explain the results in the most straightforward and simple terms possible, for a patient advocate audience, whilst maintaining accuracy.
I think it needed to be simple, direct, and to have impact.
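
Simon's hypothetical can be worked through numerically. This is only a sketch, assuming the benefit attributable to a treatment is the treatment group's improvement rate minus the control (SMC) group's rate. The function names are made up for illustration, and the 95% and 60% figures are Simon's invented example, not trial data.

```python
import math

def net_benefit(treatment_pct: float, control_pct: float) -> float:
    """Improvement attributable to the treatment, in percentage points."""
    return treatment_pct - control_pct

def nnt_from_rates(treatment_pct: float, control_pct: float) -> int:
    """Number needed to treat, rounded up, from two improvement rates."""
    diff = net_benefit(treatment_pct, control_pct)
    if diff <= 0:
        raise ValueError("treatment must outperform control")
    return math.ceil(100 / diff)

# Simon's example: 95% improve on NewWonderDrug, 60% improve on SMC alone.
print(net_benefit(95, 60))     # percentage points attributable to the drug
print(nnt_from_rates(95, 60))  # patients treated per additional improver
```

The same 35-point difference can be described either as "95% improved" or as "only 35% improved as a result of the drug", which is why the wording needs to suit the audience.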

Bob · Aug 27, 2012

Version 3.

Thanks to everyone for the encouraging feedback.
And thanks to Simon and biophile for the helpful constructive criticism.

I've now created a revised version 3 which can be viewed and downloaded here, which is more suitable for patient advocates to present to third parties.

I think that I've incorporated nearly all of the feedback on this thread, except that some of my phrasing might still be a little loose, in order to make an impact. As this is intended for political use, I think that simplicity and impact are helpful and that technical exactness is not necessary in every sentence, throughout the text. But all the results are explained in precise terms repeatedly in the text. I think the results, as I've presented them, cannot be refuted, and I think I have presented the results with far more clarity and accuracy than I believe is present in the published paper and elsewhere.

I don't expect anyone to read through it all again, but if anyone is inclined to read any of it, then any feedback would be very welcome. I will wait for any comments (if there are any), and I'll read it through again myself, afresh, before I replace my opening post with version 3.

Nothing of substance has changed. I've just tidied it up, improved the wording, and added some short sections. Except, I've rounded up two of the 'NNT' figures, as I've only just found out that it is common practice to always round up.

If I decide that I'm still happy with it, pending re-reading it, and any feedback, then I might send it around to patient organisations etc., just to make sure that they are aware of the results.

I had only intended to write a couple of pages, to explain the basic results, and I've just counted, and it's 19 pages long!!! How the heck did that happen???!!!

It's probably not best to publish and then edit afterwards, but it's the only way to get things done sometimes! It's a very motivating way to get a project finished!

Firestormm · Aug 27, 2012

Nice one Mr B. Will take me a while to go through it all from the start. I shall make this now a priority I think. If I can understand it, anyone can.

Bob · Aug 27, 2012

Firestormm said:
Nice one Mr B. Will take me a while to go through it all from the start. I shall make this now a priority I think. If I can understand it, anyone can.

Thanks Firestormm. It's not massively different to the original version. Don't wear yourself out with it!

Mark · Aug 29, 2012

I've moved the posts on the NICE Guidelines Review to a new thread:
http://forums.phoenixrising.me/index.php?threads/nice-guidelines-review-aug-2013.19105/

I've also deleted some posts on this thread about the creation of that new thread.

Let me know if there are any that I've missed, or any other tidying up required.

Bob · Sep 10, 2012

This is the first time I've ever seen the correct figures published anywhere...
It's in the latest issue of ME Research UK's 'Breakthrough' magazine.
Talking about CBT, it says: "... benefiting around 10 to 15% of patients over and above the benefit of standard medical care, as shown in the results of the 2011 PACE Trial..."
They've been doing their homework!

Firestormm · Sep 10, 2012

Bob said:
This is the first time I've ever seen the correct figures published anywhere...
It's in the latest issue of ME Research UK's 'Breakthrough' magazine.
Talking about CBT, it says: "... benefiting around 10 to 15% of patients over and above the benefit of standard medical care, as shown in the results of the 2011 PACE Trial..."
They've been doing their homework!

Bob, have they agreed with your own work? Can I access a copy of Breakthrough? Cheers

Bob · Sep 10, 2012

Firestormm said:
Bob, have they agreed with your own work? Can I access a copy of Breakthrough? Cheers

Yes, their conclusions are exactly the same as what I've written in my analysis.
(But it's their own work - it was nothing to do with me.)

Breakthrough magazine is published online, but the latest issue has not been placed on there yet:
http://www.meresearch.org.uk/information/breakthrough/index.html

But they don't say much more than what I've quoted in my previous post... They only mention it in passing, when discussing something else.

Bob · Sep 10, 2012

This is the whole quote that's relevant to the PACE Trial results:

"Today there is an emerging consensus that CBT can moderately improve outcomes in a minority of people with ME/CFS, benefiting around 10 to 15% of patients over and above the benefit of standard medical care, as shown in the results of the 2011 PACE Trial, and in the finding of ME charities' surveys of their members."

My analysis shows an 11 to 15% 'benefit', so I think their figure of "10%" is a minor error.

I don't agree that CBT moderately improves outcomes though, based on the results of the PACE Trial.

Sean · Sep 10, 2012

Great work, Bob, and everybody who helped.
