PACE Trial and PACE Trial Protocol

anciendaze

Senior Member
Messages
1,841
I see. I'm pretty sure the right figures to use - in terms of statistical validity - are the ones comparing the difference in gains between baseline and 52 weeks, not the difference in values at 52 weeks.
I would agree, if I thought the numbers had any significance.

Unfortunately, measures of statistical significance used here depend on accurate values for parameters describing a particular distribution. That distribution has known characteristics: 1) it is symmetrical; 2) mean, median and mode are very close together; 3) a predictably small number of outliers exceed a given number of standard deviations from the mean. None of these characteristics is true for the population used to establish "normal" as defined by this study. Some gross departures are readily apparent.
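
To make those three checks concrete, here is a minimal sketch in Python (illustrative only - it uses simulated scores, not the trial's data) of how each property can be tested on a sample:

# Minimal sketch: testing the three textbook properties of a normal
# distribution on a sample. Illustrative only - simulated scores, not
# the trial's data.
import numpy as np

rng = np.random.default_rng(0)
scores = rng.normal(loc=50, scale=10, size=1000)  # stand-in sample

mean, median = scores.mean(), np.median(scores)
sd = scores.std(ddof=1)

# 1) symmetry: sample skewness should be near zero
skewness = np.mean(((scores - mean) / sd) ** 3)

# 2) mean and median (and, for a unimodal sample, the mode) close together
gap = abs(mean - median)

# 3) a predictably small fraction of outliers: ~4.6% of a true normal
#    lies beyond 2 SD, ~0.3% beyond 3 SD
beyond_2sd = np.mean(np.abs(scores - mean) > 2 * sd)
beyond_3sd = np.mean(np.abs(scores - mean) > 3 * sd)

print(f"skewness {skewness:.3f}, |mean - median| {gap:.3f}")
print(f"beyond 2 SD: {beyond_2sd:.1%} (normal ~4.6%)")
print(f"beyond 3 SD: {beyond_3sd:.1%} (normal ~0.3%)")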

One measure of the validity of an assumed standard deviation is stability in a control group. This also fails here. Standard deviations increased over time. Considering the central role of standard deviation in establishing criteria for "recovery", as well as statistical significance, weakness in this pillar of the statistics slips by with surprisingly little concern voiced by investigators.

The mean value from which deviations should be measured also seems uncertain. If the two parameters carrying all possible information about an assumed distribution are as indefinite as these appear, no confidence in calculated values is warranted.

What the numbers released show is a heterogeneous cohort created by entry requirements dispersing over time. Random walk models would show similar changes.
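
As a toy illustration of that last point, here is a minimal sketch (simulated data only) of a cohort whose members each follow an unbiased random walk: the group mean barely moves, but the standard deviation grows over time, much like the control group's did:

# Toy model: each participant's score follows an unbiased random walk.
# No treatment effect exists, yet the cohort disperses - the SD grows.
# Simulated data only.
import numpy as np

rng = np.random.default_rng(1)
n, weeks = 160, 52
baseline = rng.normal(50, 10, n)            # heterogeneous cohort at entry
steps = rng.normal(0, 2, (n, weeks))        # unbiased weekly fluctuation
paths = np.hstack([baseline[:, None],
                   baseline[:, None] + np.cumsum(steps, axis=1)])

for w in (0, 12, 24, 52):
    print(f"week {w:2d}: mean {paths[:, w].mean():6.2f}  "
          f"SD {paths[:, w].std(ddof=1):5.2f}")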

Selection effects are apparent at multiple stages. Roughly half of those they attempted to enroll declined. The classic bias created by dropouts is hidden by treating those who did not complete the only objective measure presented at the end of the trial as though they had fully completed. Actometer data was simply dropped. With some 30% of those 'completing' having no objective data whatsoever for improvement, plus the dispersal of a meaningless cohort, there should be no problem getting numbers in the range claimed as "moderate improvement".

Lack of selection shows up in one area: adverse events. There were more than 3,000 adverse events distributed over 600 participants - roughly five per participant. Any drug trial with this rate of adverse events would be halted. An attempt to redefine adverse events during a trial, to make them less likely, as was done here, will typically get you investigated. We are told few were "serious adverse events", based on review by three independent assessors who turn out to be less than independent.

Acceptance of this study by a medical establishment does more to undermine confidence in mainstream medicine than to validate the results. If medical professionals can't tell when they are getting bamboozled, great savings are possible in funding medical research. Legislators take note.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
I see. I'm pretty sure the right figures to use - in terms of statistical validity - are the ones comparing the difference in gains between baseline and 52 weeks, not the difference in values at 52 weeks.

That's not the way I see it...
If the additional average improvement in the GET or CBT groups, over and above the SMC-alone group, is insignificant (or it's a negative change), then the GET and CBT can be considered to be useless.
So, for example, if the SMC+GET group had a worse overall improvement after 52 weeks than the SMC-alone group, then it would be reasonable to assume that the GET was harmful.
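
To put numbers on that, here's a minimal sketch with purely invented values of the two quantities in question: each group's gain from baseline to 52 weeks, and the additional gain of SMC+GET over SMC alone:

# Sketch of 'difference in gains', using invented example values.
smc_baseline, smc_52wk = 326.0, 348.0    # SMC alone (invented numbers)
get_baseline, get_52wk = 333.0, 379.0    # SMC+GET  (invented numbers)

smc_gain = smc_52wk - smc_baseline       # improvement under SMC alone
get_gain = get_52wk - get_baseline       # improvement under SMC+GET
additional_gain = get_gain - smc_gain    # what GET adds over SMC

print(f"SMC gain: {smc_gain}m, GET gain: {get_gain}m, "
      f"additional gain from GET: {additional_gain}m")
# If the additional gain is ~0 (or negative), GET added nothing over
# SMC alone - and if clearly negative, it may have been harmful.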
 

oceanblue

Guest
Messages
1,383
Location
UK
That's not the way I see it...
If the additional average improvement in the GET or CBT groups, over and above the SMC-alone group, is insignificant (or it's a negative change), then the GET and CBT can be considered to be useless.
So, for example, if the SMC+GET group had a worse overall improvement after 52 weeks than the SMC-alone group, then it would be reasonable to assume that the GET was harmful.
Maybe you could spell this out. As far as I can see, the SMC+GET group improved more over 52 weeks - relative to its baseline - than the SMC-alone group, and that's the mean difference. If what you have said is correct then your conclusions are correct too - but I don't think that's the case.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Maybe you could spell this out. As far as I can see, the SMC+GET group improved more over 52 weeks - relative to its baseline - than the SMC-alone group, and that's the mean difference. If what you have said is correct then your conclusions are correct too - but I don't think that's the case.

Sorry, my last post just added confusion to the discussion... I was just giving an example of why I thought the mean difference was an important measurement, after you said that the right figures to use, in terms of statistical validity, are the ones comparing the difference in gains between baseline and 52 weeks. I wasn't saying that the actual mean differences were negative. It was just an example of why they are important.
As you said, the only statistically insignificant improvement, according to the mean differences stated in the paper, was the CBT for physical function.

I think I might have missed the meaning of your earlier post anyway... I'm not sure what you meant by saying "in terms of statistical validity"?
 

Dolphin

Senior Member
Messages
17,567
But the biggest mistake was in publishing the paper in the first place. That was the Lancet editor's major mistake.
Not to mention the myriad of 'mistakes' in the methodology of the paper itself!
The editor of the Lancet seems to be pretty ignorant and uninformed about ME and science, so what he says is pretty meaningless.
He's backed himself into a corner, and seems to be saying all sorts to defend his indefensible position, including lashing out at the entire ME community.
Hooper's paper seems like a pretty heavyweight and substantial piece of work to me.
I think we can allow him one or two errors, if there are any, and I'm not aware of any yet (please see my next post.)

(Esther, if this post comes across as a bit ranty, it isn't aimed at you, but just at the Lancet and the authors!)
Maybe my expectations are low because a large percentage of the hundreds of full papers I have read are on ME or CFS, but I'm not sure this paper is bad enough not to be published somewhere, and I'm not convinced it's bad enough to be retracted (not that I'm an expert on what gets retracted). ETA: Maybe I'm being over-generous and just taking the paper at face value, rather than taking into account the fact that they changed outcome measures.

However if being published in a high impact journal is supposed to mean that a paper is of very high quality, then I don't think the Lancet should have published it.
 

oceanblue

Guest
Messages
1,383
Location
UK
Another one from Cella in the pipeline that could well be using PACE data.
Cella, M., Chalder, T., White, P. Does the heterogeneity of chronic fatigue syndrome moderate the response to cognitive behaviour therapy? An exploratory study. Psychotherapy and Psychosomatics [Accepted]
 

Dolphin

Senior Member
Messages
17,567
You guys know that more PACE data has been published here right:
http://www.sciencedirect.com/scienc...75601ba2a26d16e20b97fd93de38a844&searchtype=a
It looks like they might be running away (in all papers so far*) from this measure that was in the protocol paper:
11. The Client Service Receipt Inventory (CSRI), adapted for use in CFS/ME [31], will measure hours of employment/study, wages and benefits received, allowing another more objective measure of function.
E.g. one would think they would look for correlations between that and the WSAS.

Also, I wonder is there any significance to the fact that they only give us the data on change for CBT rather than across all the groups or looking to see if there were different patterns for different groups. I'm afraid I don't trust PACE trial authors with much.

* I think it will get mentioned somewhere eventually.
 

Dolphin

Senior Member
Messages
17,567
Comment from a PACE Trial participant

FWIW, a comment from a PACE Trial participant:

http://meagenda.wordpress.com/2007/...tement-nhs-collaborative-conference/#comments

dot tritschler Says:
December 23, 2008 at 12:48 am

I am most disappointed that AFME has endorsed the Pace Trial. I was randomly selected to CBT via the trial, and it was quite apparent that the treatment was flawed from the outset.

a) The therapist misled me by saying he had a 99% recovery rate.

b) He could not answer basic questions as to how he measured recovery.

c) I had been told by Dr. Andrews (the doctor I see at the WGH) that the therapist was a clinical psychologist, only to find out he is only a psychiatric nurse who has then done a diploma in psychotherapy; I received a letter of apology re this only after bringing it to her attention and pointing out the discrepancy via Edinburgh University Pace Trial Website.

d) After I told the therapist that I was disengaging from the trial, he phoned me 3 times to attend a meeting with him - although it states that you can leave the trial at any time and don't even have to give a reason.
Although the therapist had said the purpose of the meeting was to wish me well for my future, he was very angry and defensive at the meeting due to me disengaging; he obviously had pressure on him to keep his numbers up - but that was no reason to treat me in such a way.

e) It was quite apparent during the 6 sessions I had with the therapist that he was more interested in his research findings than genuinely helping me and my CFS. All in all I found the whole experience to be quite damaging, particularly as my expectations were falsely raised and the therapist behaved quite unethically at the last meeting - no doubt due to pressure upon him to get the desired results via his research subjects.
I think it is incorrect for Action for ME to support and endorse such a trial, and am most disappointed that it does so.
 

Esther12

Senior Member
Messages
13,774
I started reading this new paper, and had trouble getting much useful info out of it. Should I print it out and give it a proper go when I'm feeling sharper, or is it just lacking in info that would let us judge the efficacy of CBT/GET?

It looks like table 2 might be worthwhile for understanding the correlation between different outcome measures... but it looks like all treatment groups are just lumped in together, so a lot of info is lost.
 

Dolphin

Senior Member
Messages
17,567
What would the CBT PACE Trial results look like if we stripped out response biases

This is of course speculative. But it is transparent so people can decide if any of the jumps are too big.

I have been reading:
Measuring disability in patients with chronic fatigue syndrome: reliability and validity of the Work and Social Adjustment Scale.

Journal of Psychosomatic Research xx (2011) xxxxxx

Matteo Cella, Michael Sharpe, and Trudie Chalder

This contains the baseline data for all of the PACE Trial divided into quartiles based on WSAS scores.
The second quartile has a mean (SD) 6MWD of 334.9 (82.1).
This is very like the mean for the CBT group at baseline, 333 (86) (I think this is only for those who did the 6MWT at 52 weeks).
Of course, some of the individuals are the same, but it is probably only around 20% (the baseline figure of 333 appears to be for 123 people, though possibly for 161).

Anyway, the untreated CBT group is quite similar in other ways to this 2nd quartile (WSAS) group:
CBT vs Everyone (who make up 2nd quartile)

CDC total: 4.6 (1.8) vs 4.5 (1.7)

Jenkins Sleep scale: 12.5 (4.9) vs 11.9 (4.3)

Fatigue: 27.7 (3.7) vs 27.6 (3.6)

SF-36 PF: 39.0 (15.3) vs 41.2 (13.9)

HADS-D: 8.3 (3.7) vs 7.6 (3.4)

HADS-A: 8.1 (4.3) vs 7.8 (3.8)
-------------

Many of us think that we can't trust the subjective data in the PACE Trial paper because of all sorts of response biases - people wanting to please their therapist, wanting to believe that all the effort was worthwhile, being more optimistic, etc.

The nearest we have to an objective outcome measure is the 6MWT.

In the CBT group, for those we have data for, this went up from 333m to 354m - an increase actually a tiny bit smaller than that of the specialist medical care group.

One could assume that there was no real change whatsoever and that this was completely down to response bias, e.g. being willing to push oneself more, as well as having more practice at the test.
In which case, the "real scores" would be the baseline scores.

However, it is possible that the patients did actually improve a small bit. This could be due to the passage of time: for example, quite a large percentage of people whose illness started with EBV get better over time.

The Cella, Sharpe & Chalder (2011) paper doesn't give us data on what people are like without treatment if their baseline score is 354m.
What it does give us is what people's scores are like if their baseline score is 349.8m. Maybe we could say that would be a more accurate "baseline" for the post-treatment group, as after CBT people might possibly be willing to push themselves a tiny bit harder, or there might be a training effect from having done the test before. So all we're doing is allotting 4.2m (354m - 349.8m) for this.

We then have information about what the final score for the CBT participants would represent, compared to what the CBT participants reported:
Chalder Fatigue score: 26.6 (4.1) vs 20.3 (8.0)
SF-36 PF: 45.2 (14.5) vs 58.2 (24.1)
HADS-A: 7.2 (3.7) vs 6.8 (4.2)
HADS-D: 6.6 (2.8) vs 6.2 (3.7)
PHQ-15 (physical symptoms): 12.5 (4.2) vs <not reported in Lancet paper>
Jenkins Sleep scale: 11.5 (4.8) vs 9.9 (5.3)
CDC total number of symptoms: 4.2 (1.8) vs 3.4 (2.3)
I think the first two measures are probably the most interesting. I'd be "willing to let them keep" the slightly better HADS-A and HADS-D scores, as SMC and/or CBT may help those a little more than a similar improvement in fitness due to the passage of time without CBT would. Perhaps I'd also "give them" the improvement in sleep, as SMC and/or CBT may help there too.
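
To lay the comparison out explicitly, here is a minimal sketch in Python using only the published group means quoted above (the proxy matching is my own device, not anything from the papers):

# Proxy comparison: the WSAS 1st quartile walks ~349.8m at baseline,
# untreated; the CBT group walks ~354m at 52 weeks. Taking 6MWD as the
# anchor, the gap between each pair of subjective scores is a candidate
# estimate of response bias. Published means only; the matching is a
# rough device, not anything from the papers.
q1_baseline = {"Chalder Fatigue": 26.6, "SF-36 PF": 45.2}   # Cella et al.
cbt_52_weeks = {"Chalder Fatigue": 20.3, "SF-36 PF": 58.2}  # Lancet paper

for measure, proxy in q1_baseline.items():
    reported = cbt_52_weeks[measure]
    print(f"{measure}: proxy {proxy} vs reported {reported} "
          f"(gap {reported - proxy:+.1f})")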
 

Dolphin

Senior Member
Messages
17,567
6MWT - looks like those who did it at 52 weeks were the same as average at baseline

This paper:
Measuring disability in patients with chronic fatigue syndrome: reliability and validity of the Work and Social Adjustment Scale.

Journal of Psychosomatic Research xx (2011) xxxxxx

Matteo Cella, Michael Sharpe, and Trudie Chalder
contains the baseline data for all but one of the PACE participants (the WSAS score was missing for this one person).

From Table 2, one can calculate that the average 6 Minute Walking Distance (6MWD) for the participants at baseline was 320.7683881m.

Table 6 in the Lancet PACE Trial papers gives data on the 6MWD at baseline and at 52 weeks.
If one calculates the average of the figures given, it is 321.2796875m. This may not be exactly accurate, as the published numbers were rounded to the nearest whole number, so the true value could be slightly higher or lower.
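
For anyone wanting to reproduce this sort of figure, the calculation is just a weighted mean of the published subgroup means. A minimal sketch (the group sizes and the SMC mean below are placeholders, since the exact figures aren't all published):

# Recomputing an overall mean from published subgroup means. The group
# sizes and the SMC mean are placeholders - the exact figures are not
# all published - so this shows the method rather than the real result.
def weighted_mean(means, ns):
    return sum(m * n for m, n in zip(means, ns)) / sum(ns)

group_means = [333.0, 312.0, 314.0, 326.0]  # CBT, GET, APT, SMC (SMC invented)
group_ns = [123, 123, 123, 123]             # placeholder group sizes

print(f"overall baseline 6MWD ~ {weighted_mean(group_means, group_ns):.1f}m")
# Because the published means are rounded to the nearest metre, a figure
# recomputed this way can be slightly off in either direction.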

I presume the baseline data in Table 6 is just for those who completed the test at 52 weeks and not for everyone.

Of course, the baseline data still doesn't tell us whether it was the people who were not doing as well at 52 weeks who didn't do the repeat test - which seems perfectly plausible.
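
That worry is easy to illustrate. A minimal sketch (simulated data only) of how the 52-week mean among completers gets inflated if the people doing worst tend to skip the repeat test:

# Toy illustration of selective non-completion: nobody improves, but the
# worse-off participants are less likely to do the repeat test, so the
# mean among 'completers' rises anyway. Simulated data only.
import numpy as np

rng = np.random.default_rng(2)
walk_52wk = rng.normal(321, 80, 600)   # true scores: no change from baseline
# probability of doing the repeat test rises with how well one is doing
p_repeat = 1 / (1 + np.exp(-(walk_52wk - 321) / 60))
did_repeat = rng.random(600) < p_repeat

print(f"whole cohort: mean {walk_52wk.mean():.0f}m")
print(f"completers:   mean {walk_52wk[did_repeat].mean():.0f}m "
      f"({did_repeat.mean():.0%} of cohort)")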
 

oceanblue

Guest
Messages
1,383
Location
UK
Re possible evidence of response bias in PACE

This is of course speculative. But it is transparent so people can decide if any of the jumps are too big.

I have been reading: Measuring disability in patients with chronic fatigue syndrome: reliability and validity of the Work and Social Adjustment Scale.

This contains the baseline data for all of the PACE Trial divided into quartiles based on WSAS scores.
The second quartile has a mean (SD) 6MWD of 334.9 (82.1).
This is very like the mean for the CBT group at baseline, 333 (86) (I think this is only for those who did the 6MWT at 52 weeks).
...

Many of us think that we can't trust the subjective data in the PACE Trial paper because of all sorts of response biases - people wanting to please their therapist, believe that all the effort was worthwhile, are more optimistic, etc.

The nearest we have to an objective outcome measure is the 6MWT.

We then have information about what the final score for the CBT participants would represent, compared to what the CBT participants reported:
This is very interesting. As I understand it what you're saying is this:
  • By chance, the WSAS 2nd quartile has very similar figures to the overall CBT group at baseline, including the one objective measure, 6MWT, of 335m
  • Similarly, by chance, the WSAS 1st quartile has a 6MWT of 350m, very similar to the 6MWT of the CBT group at 52 weeks.
  • So you're proposing we use Q2 as a proxy for the CBT baseline group and Q1 as a proxy for the CBT group at 52 weeks...
  • ... then compare the subjective scores, esp SF36 and CFQ with the objective 6MWT scores
  • and we see that the SF36/CFQ scores associated with the Q1 baseline group (proxy for CBT at 52 weeks) are much worse than those seen in the real CBT @ 52 weeks group, despite having the same 6MWT.
  • providing some evidence that the CBT figures for SF36 & CFQ at 52 weeks have been artificially inflated by response bias.
Is that a fair summary?
It's fascinating stuff and possibly quite brilliant :D. I have a hunch there may be some flaw, but I can't see what it is and on the face of it this is evidence of response bias. Bit complicated to explain though, as you and I have shown.
 

Esther12

Senior Member
Messages
13,774
That is interesting... and all the more reason to want to have access to this data divided up in to the different treatment groups.
 

oceanblue

Guest
Messages
1,383
Location
UK
CBT results for cohort 2 similar to PACE on fatigue

This study gives us results for 384 patients from a secondary care specialist CFS clinic, which must be one of the largest groups reported on, though I'm not sure how many of these had CFQ fatigue data for both before and after. The follow-up looks very similar to PACE at 1 year, and there was a similar number of CBT sessions: 10-15. From Fig 1 there was a roughly 6 point fall in mean fatigue scores, from 24 to 18, compared with a 7.4 point fall for PACE, from 27.7 to 20.3. The PACE group were more fatigued to start with and may have been ill for less time, but the results look broadly similar. NB these are pre/post results, with no control group. The point is that the results are hardly spectacular after a year, and leave the average participant far from healthy (with no objective measure of progress reported). Yet more data indicating that CBT does not seem to be the answer.
 

Dolphin

Senior Member
Messages
17,567
oceanblue said:
Dolphin said:
This is of course speculative. But it is transparent so people can decide if any of the jumps are too big.
...
This is very interesting. As I understand it what you're saying is this:
...
Is that a fair summary?
It's fascinating stuff and possibly quite brilliant :D. I have a hunch there may be some flaw, but I can't see what it is and on the face of it this is evidence of response bias. Bit complicated to explain though, as you and I have shown.
Thanks. Yes, you've explained it well.

I think it works better for some measures than others: six minute walking distance is probably measuring something quite different from depression and anxiety, so unfortunately, just because the 6MWD in quartile one (pre-treatment) is similar to the 6MWD at the end of CBT, that doesn't mean one can say the anxiety and depression scores should be the same.

However, I find it quite unsatisfactory that the CBT group can have a 6MWD of 333m before treatment, which only goes up to 354m, while their SF-36 Physical Functioning scores jump from 39.0 to 58.2 - whereas another group of CFS patients whose 6MWD is 349.8m report an SF-36 PF score of only 45.2.

SF-36 Physical Functioning results are often referred to in the CFS world (in the literature, in talks, etc.) as patients' level of physical functioning. Often, from the way it is said, people might think it is the result of some objective test like an exercise test, actometers, energy usage measured using doubly labelled water, or whatever.

It is similar to the result we keep pointing to in the Wiborg et al. (2010) review, which found that the CBT and control groups both increased by only a small amount on the actometer readings, but the CBT groups reported much larger decreases in fatigue (and, when one looks at the same studies, in two of them (as I recall) the CBT groups had much better physical functioning scores).
 

Dolphin

Senior Member
Messages
17,567
The SMC group are on 348m at 52 weeks, so I thought I'd add in their results too (with the CBT group at 52 weeks included to highlight the difference). The order below is: first quartile (as measured by the WSAS, pre-treatment) vs SMC (52 weeks) vs CBT (52 weeks):

6MWD: 349.8m vs 348m vs 354m

SF-36 PF: 45.2 vs 50.8 vs 58.2
 

Dolphin

Senior Member
Messages
17,567
Technically, there is a slight flaw in this use of data from the Cella et al. (2011) paper: it's not based on all the data. Ideally, one would like a regression to be performed on all the baseline data, to see what a given 6MWD represents. Although, as I suggested, the distance walked at 52 weeks might be slightly inflated due to issues like people perhaps being willing to push themselves more (this perhaps only applies to those who did GET and, maybe to a lesser extent, CBT) and perhaps a bit of a learning effect from doing the test at 0, 24 and 52 weeks. Also, people may be more used to going for continuous walks, so might know better what pace they can cope with than when they first did the test.

Anyway, if anything, the value of 45.2 might be a slightly inflated value for a 6MWD of 349.8m at baseline, because these are the people who scored in the top quartile of the WSAS, so they may have a slight tendency to portray themselves as more able than some others with similar 6MWD scores.

The group in the second lowest quartile of WSAS scores (who might have a tendency to mark themselves a bit lower on questionnaires) had a 6MWD of 309.1m and an SF-36 PF score of 35.5. The baseline scores of the APT and GET groups were respectively 314m & 312m and 37.2 & 36.7.
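
Following up my own suggestion about regression, here's a minimal sketch that fits a straight line through the three published quartile means above (baseline 6MWD vs SF-36 PF) and reads off the SF-36 PF one would expect at the CBT group's 52-week 6MWD of 354m. It is only a stand-in for a proper regression on individual-level data, which we don't have; on these three points it predicts a figure in the mid-40s, far below the 58.2 the CBT group reported:

# Crude stand-in for the suggested regression: a line through the three
# published quartile means (baseline 6MWD vs SF-36 PF), used to predict
# the SF-36 PF 'expected' at the CBT group's 52-week 6MWD of 354m.
# Quartile means only - individual-level data would be needed to do
# this properly.
import numpy as np

six_mwd = np.array([349.8, 334.9, 309.1])  # published quartile means (m)
sf36_pf = np.array([45.2, 41.2, 35.5])     # corresponding SF-36 PF means

slope, intercept = np.polyfit(six_mwd, sf36_pf, 1)
expected = slope * 354 + intercept

print(f"expected SF-36 PF at 354m: ~{expected:.1f} "
      f"(the CBT group reported 58.2)")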