PACE Trial and PACE Trial Protocol

Dolphin

Senior Member
Messages
17,567

"Psychiatry update
9.35 am
Chronic Fatigue Syndrome: Credible and treatable
...
Professor Peter White, Professor of Psychological Medicine, Barts and the London"

---
By treatable, I think we can presume he is referring to GET and CBT. Except that, as we generally agree here, the results of the objective tests in the PACE Trial don't show this.
And even the subjective results don't show that it is treatable in the sense of "reversible", which is what the treatment manuals claim.

And "great" to be in the psychiatry section. :(
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Does anyone know if the PACE Trial protocols proposed to measure clinical 'effect size' of the treatments, and if so, what the proposed methodology was? I've just had a search of the long and short protocols, and I don't think it is proposed.

Does anyone know if the omission of a proposal to measure clinical effect size suggests that it wasn't intended, or is it such an obvious measure of clinical efficacy, that it wasn't necessary to mention it?
 

biophile

Places I'd rather be.
Messages
8,977
This is what the authors expected for response rates and effect sizes (from the 2007 protocol):

A "positive outcome" for about 60% of patients undergoing CBT, 50% for GET, 25% for APT, and 10% for SMC. A "positive outcome" was defined as one of the three following goalposts/thresholds: (a) for fatigue either a 50% reduction in fatigue score or a score of 3 or less on the CFQ (bimodal scoring); (b) for physical function either a 50% increase in physical function score or a score of 75/100 points or more on the PF/SF-36 subscale; (c) a self-rating of 1 or 2 ("very much better" or "much better") on the Clinical Global Impression (CGI, range:1-7). Participants who improved in both primary outcome measures (fatigue and physical function) were to be regarded as "overall improvers". A "clinically important difference" was defined as between 2 and 3 times the improvement rate of SMC.

"'Recovery' will be defined by meeting all four of the following criteria: (i) a Chalder Fatigue Questionnaire score of 3 or less, (ii) SF-36 physical Function score of 85 or above, (iii) a CGI score of 1, and (iv) the participant no longer meets Oxford criteria for CFS, CDC criteria for CFS or the London criteria for ME."

The only original goalpost they have published to date is for the CGI ("positive change", same as the "positive outcome" for CGI above), and the response rate on it was obviously lower than expected. When using the actual trial data on the original definitions, the goalposts were about 9-14/33 points in CFQ (Likert) and 20-30/100 points in PF/SF-36 (although the 50% reduction was for individual scores). The post-hoc "clinically useful difference" (0.5 SD of baseline score) instead gave 2 points in CFQ (Likert) and 8 points in PF/SF-36.

I do not recall the reasoning for those goalposts, but the response rates were based on previous studies, so perhaps the effect sizes were too, and were more impressive in the earlier studies.
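
For anyone who wants to apply the protocol thresholds quoted above to individual scores, here is a minimal sketch. The class and function names are hypothetical, and it assumes the "50% reduction" in fatigue is applied to the bimodal CFQ score (0-11), which the protocol wording quoted here does not make explicit.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """One participant's scores (hypothetical container, not from the protocol)."""
    cfq_baseline: int   # Chalder Fatigue Questionnaire, bimodal scoring (0-11), assumed
    cfq_followup: int
    pf_baseline: int    # SF-36 physical function subscale (0-100)
    pf_followup: int
    cgi: int            # Clinical Global Impression (1-7)


def fatigue_positive(s: Scores) -> bool:
    # (a) a 50% reduction in fatigue score, or a score of 3 or less
    halved = s.cfq_followup <= 0.5 * s.cfq_baseline
    return halved or s.cfq_followup <= 3


def function_positive(s: Scores) -> bool:
    # (b) a 50% increase in physical function score, or 75/100 or more
    # A baseline of 0 is excluded here because a 50% increase is undefined for it.
    increased = s.pf_baseline > 0 and s.pf_followup >= 1.5 * s.pf_baseline
    return increased or s.pf_followup >= 75


def cgi_positive(s: Scores) -> bool:
    # (c) "very much better" (1) or "much better" (2)
    return s.cgi in (1, 2)


def overall_improver(s: Scores) -> bool:
    # Improved on both primary outcome measures
    return fatigue_positive(s) and function_positive(s)


def recovered(s: Scores, still_meets_case_criteria: bool) -> bool:
    # All four protocol criteria; the last one is passed in as a flag
    return (
        s.cfq_followup <= 3
        and s.pf_followup >= 85
        and s.cgi == 1
        and not still_meets_case_criteria
    )


# Made-up example participant, for illustration only
example = Scores(cfq_baseline=9, cfq_followup=4, pf_baseline=40, pf_followup=60, cgi=2)
print(overall_improver(example), recovered(example, still_meets_case_criteria=False))
```

Note how a participant can count as an "overall improver" through the 50% rules alone while sitting well below the 75-point physical function threshold.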
 

Dolphin

Senior Member
Messages
17,567
I do not recall the reasoning for those goalposts, but the response rates were based on previous studies, so perhaps the effect sizes were too, and were more impressive in the earlier studies.
This one:
"a Chalder Fatigue Questionnaire score of 3 or less"
is based on a validated definition of fatigueness based on previous research of theirs:
Chalder T, Berelowitz G, Pawlikowska T, Watts L, Wessely S, Wright D, Wallace EP. Development of a fatigue scale. J Psychosom Res. 1993;37(2):147. Free full text: http://wwwcache1.kcl.ac.uk/content/1/c6/01/47/68/PDF-109.pdf
 

Dolphin

Senior Member
Messages
17,567
I don't have the time to read/follow this too closely but as I recall the PACE Trial Identifier (which was the protocol in the early/mid 2000s e.g. when they were looking for funding) https://listserv.nodak.edu/cgi-bin/wa.exe?A2=ind0404B&L=CO-CURE&P=R3461&I=-3 (scroll down) gives some extra information on the rationale for the figures, cut-offs, etc.
 

biophile

Places I'd rather be.
Messages
8,977
... so there was no proposal to measure average clinical effect size, as far as you know? Isn't it usually an essential part of a medical trial like this, to assess the effect size?

It does not seem clear to me. White et al "proposed" that CID would be between 2 and 3 times the improvement rate of SMC, which sounds vague and could be problematic if the SMC improvement rates were very small. I'm not sure if they intended to apply it to both individual scores and group scores. However, if "improvement rate" was based on dichotomous outcomes or response rate for the group, then this could be the equivalent of measuring average clinical effect size between the groups. Also, if the authors published the original goalposts (either 50% improvement or meeting a threshold), effect sizes such as the odds ratio and number needed to treat (NNT) could be calculated, although these are not mentioned. Maybe they were initially so confident about the superiority of CBT and GET that they did not feel the need to have more sensitive measures.
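
As a rough illustration of how an odds ratio and NNT fall out of dichotomous response rates, here is a small sketch. It uses the response rates the protocol expected (60% CBT, 50% GET, 25% APT, 10% SMC), not actual trial results, and the function names are my own.

```python
def odds(p: float) -> float:
    """Convert a response rate (0 < p < 1) into odds."""
    return p / (1.0 - p)


def odds_ratio(p_treatment: float, p_control: float) -> float:
    """Odds of a positive outcome on treatment relative to control."""
    return odds(p_treatment) / odds(p_control)


def number_needed_to_treat(p_treatment: float, p_control: float) -> float:
    """Reciprocal of the absolute difference in response rates."""
    return 1.0 / (p_treatment - p_control)


# Expected "positive outcome" rates from the 2007 protocol (not results)
expected = {"CBT": 0.60, "GET": 0.50, "APT": 0.25, "SMC": 0.10}

for arm in ("CBT", "GET", "APT"):
    ratio_to_smc = expected[arm] / expected["SMC"]
    print(
        f"{arm}: {ratio_to_smc:.1f}x the SMC rate, "
        f"OR={odds_ratio(expected[arm], expected['SMC']):.1f}, "
        f"NNT={number_needed_to_treat(expected[arm], expected['SMC']):.1f}"
    )
```

On those expected rates, CBT and GET would sit above the "2 to 3 times the improvement rate of SMC" range and APT inside it, which shows how sensitive that definition is to the SMC rate.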

Dolphin. Thanks. I think the PACE Trial identifier is very similar to the 2007 protocol in that respect, except the identifier cites only Jenkins et al 1993 to claim that 75 or more points in physical function is [mean minus SD] and therefore "normal", while the protocol also cites Bowling et al 1999 to claim [mean minus SD] is 70 depending on the study, but still uses 75 as the threshold for "positive outcome". I also noticed (was reminded) that the original inclusion criterion for physical function was once 75 or less. IIRC, that was later changed to 65 so that there would be a difference between eligibility and "positive outcome" (then changed again to 60 to increase recruitment). Of course, the post-hoc definition of "normal" made a mockery of having a difference and went in the opposite direction, creating an overlap instead!
 

Simon

Senior Member
Messages
3,789
Location
Monmouth, UK
Estimates of distribution of the SF36 PF change scores

One of the criticisms of the PACE results paper is that they give mean change scores but not much idea of the distribution/spread of scores, not least the proportion of those who either got much worse or much better (or even recovered). So I've tried to estimate the distribution of the SF36 Physical Function change-from-baseline scores. For the SMC control group, the mean change from baseline is 11.6 but more information comes from 3 sources:
  1. % improving in Physical Function by more than the Clinically Useful Difference (CUD). Although the CUD was 8, individuals can only score in 5 point increments, so this will capture people improving by 10 or more points from baseline.
  2. % improving by CUD in both PF and the Chalder Fatigue scale. I've assumed that on average those that improve in both PF & Fatigue will have a higher PF gain than those that improved on PF alone. The CUD for Chalder Fatigue is 2 (0-33 scale, effectively 11-33, excluding the superhuman)
  3. % rating themselves better or worse overall at 52 weeks on the CGI scale. Same/little better/little worse are all counted as 'minimum change' in the results. As the smallest groups, I have assumed that 'Worse' have the lowest PF change scores of all (and negative) while 'Better' have the highest.
This gives 5 different categories for change, which is easiest to see in pictures. Graph on left shows data for SMC, on right is a comparison of the spread among the 5 categories between SMC and CBT.
SF36bands.jpg


Next, I estimated the average score for each of the 5 categories. The constraints I used were that the combined category scores had to give the correct mean of 11.6 for SMC; they were ranked as above, plus some logic, e.g. those who didn't improve by the PF CUD of 10 or more had to improve by 5 or less. I came up with this:
SF36scores.jpg
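
The two bits of arithmetic behind this can be sketched as follows. The band proportions and scores below are purely hypothetical placeholders chosen to satisfy the constraints (they are not the figures in the image, apart from the 20% / 28-point band); only the CUD of 8, the 5-point step and the SMC mean of 11.6 come from the post.

```python
import math


def smallest_observable_change(threshold: float, step: int) -> int:
    """Smallest multiple of `step` that is at least `threshold`."""
    return math.ceil(threshold / step) * step


def weighted_mean(proportions, scores) -> float:
    """Mean change implied by band proportions and per-band average scores."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("band proportions must sum to 1")
    return sum(p * s for p, s in zip(proportions, scores))


# A CUD of 8 on a scale scored in 5-point steps
print(smallest_observable_change(8, 5))

# HYPOTHETICAL bands, ordered from worst to best change score
proportions = [0.10, 0.30, 0.20, 0.20, 0.20]
band_scores = [-10, 2, 12, 20, 28]

target_mean = 11.6  # SMC mean change from baseline
implied = weighted_mean(proportions, band_scores)
print(f"implied mean {implied:.1f} vs target {target_mean}")
```

Any set of guessed band scores can be checked the same way: if the implied mean drifts from 11.6, one or more of the guesses has to move.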


I'd welcome comments/suggestions on my guessed scores, and you can play with the spreadsheet if you pm me for a copy.

A couple of things strike me:
  • A minority - 20% for the SMC group - improve by an average of 28 points, ie really quite a lot
  • A good chunk of those who improve in both Physical Function and Fatigue (see 1st graph) don't rate themselves as better/much better. This suggests that the 'clinically useful difference' set by the researchers isn't aligned with patients' view of improvement. Of course, the 'better' figures exclude 'a little better', but that's on the grounds that such change is likely to be an unreliable measure, especially compared to one year earlier.
Note that the Mean of 11.6 falls roughly at the 50th percentile, which is where it should be if the scores were normally distributed (I suspect change scores are, but we don't know for sure). And all these figures are, of course, self-reports.

Feedback welcome: other scoring schemes are possible, this is just my guesstimate.
 

Dolphin

Senior Member
Messages
17,567
Well done, Simon. Interesting.
----------------------
Not sure if this has been highlighted before on this thread or not:

(From 2009)

Free full text:
http://www.psychologytoday.com/files/attachments/51945/most-false.pdf

Are most positive findings in health psychology false.... or at least somewhat exaggerated?

James C. Coyne*1, 2
------------
This is only 2.5 pages and I thought it contained a lot of good tips (for researchers) and astute observations. However, if you are not familiar with some of the concepts, some bits might not be straightforward.
It talks about reporting bias where some things aren't reported, while multivariate calculations are done to get positive results.

Started a thread on it at:
http://forums.phoenixrising.me/inde...-or-at-least-somewhat-exaggerated-2009.19120/
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
I think I might have seen some people mention recently that the 6MWDT failed to show a significant improvement for GET, or failed to demonstrate clinical effectiveness for GET. I was just wondering if anyone has come to that opinion using a statistical methodology?
 

biophile

Places I'd rather be.
Messages
8,977
GET demonstrated an (adjusted) advantage of 35.3m over SMC, which was highly statistically significant (p=0.0002). However, if a "clinically useful difference" (CUD) is defined as 0.5 SD of the baseline score, then the 6MWD did not reach the CUD, as the SD at baseline was 87m for the GET group and the pooled SD for all groups was 90m (giving a CUD of 43.5m or 45m respectively). 35.3m is about 10% more than what the SMC group scored at 52 weeks (348+/-108m).
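
The arithmetic is simple enough to check in a few lines. This is only a sketch: it assumes the 0.5 SD rule used for the primary outcomes can be carried over to the walking test, which the PACE authors did not do themselves, and the function name is mine.

```python
def clinically_useful_difference(baseline_sd: float, fraction: float = 0.5) -> float:
    """CUD defined as a fraction (default 0.5) of the baseline standard deviation."""
    return fraction * baseline_sd


get_advantage_m = 35.3      # adjusted GET vs SMC difference in metres
smc_52_week_mean_m = 348.0  # SMC mean distance at 52 weeks

for label, sd in (("GET baseline SD", 87.0), ("pooled baseline SD", 90.0)):
    threshold = clinically_useful_difference(sd)
    reached = get_advantage_m >= threshold
    print(f"{label}: CUD = {threshold:.1f}m, reached = {reached}")

relative_gain = get_advantage_m / smc_52_week_mean_m
print(f"advantage relative to SMC at 52 weeks: {relative_gain:.1%}")
```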

White responded to Hooper's criticism on this issue:
"Walking test (page 33) - The interpretation of the walking test results seems to be one of scientific debate. Statistical testing takes into account variability. The GET group were still significantly different than the SMC and APT groups despite a large amount of variability in the measure. In addition, one cannot focus solely on absolute metres walked for individual trial arms as these may or may not be influenced by treatment. The valid comparisons are between trial arms. We did not ask participants to undertake a practice walking test for the reason mentioned in the complaint; post-exertional fatigue being a characteristic feature of CFS."

http://www.meactionuk.org.uk/whitereply.htm
 

Esther12

Senior Member
Messages
13,774
If CBT had led to the same improvements in 6mwdt, then that would have been more 'significant' (in non-mathematical terms) than it was with GET imo. A programme which encourages people to devote their limited time and energy to exercise will lead to some improvements in capacity to exercise... but that doesn't mean it should be seen as a 'treatment': it can just as easily be explained as the imposition of a particular set of preferences upon the sick.

I'm surprised by how ineffective GET was for 6mwdt, even if we assume that deconditioning plays almost no role in disability levels seen with CFS. Even if people were just substituting activities they had decided to do themselves for the exercises recommended by GET, surely that would have led to some improvement in ability to perform those sorts of exercises? Maybe people tended to be doing very different activities to walking? Didn't they have quite a bit of missing data for this test too? I'd tend to assume that those patients who did least well with the treatments would also be those most likely not to turn up for testing, but who knows?
 

biophile

Places I'd rather be.
Messages
8,977
On GET, the 2011 Lancet paper states that "The most commonly chosen exercise was walking." Participation rates for the pre/post 6MWT were lowest for the GET group (n=110) and highest in the CBT group (n=123). 12 months is more than enough time to make major improvements. 35.3m is pathetic when considering the underlying rationale of deconditioning and what healthy people are scoring.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Thanks for that biophile. I don't know how I managed to miss that.
 

user9876

Senior Member
Messages
4,556
I assume they had a method of testing all participants on the 6MWT and then, after 12 months, tested those who would agree to be tested. I would assume that the results they quote are for those who completed both tests. The reliability of the results of this test is determined by why participants dropped out. In quoting the results as they do in the PACE paper, they are making the assumption that those performing the test represent a good sample of the whole set. There will be an error on the result based on how valid this assumption is. It would be interesting to look at drop-outs against serious reactions and survey results. It's hard to say any more with no data.

I also think it would be interesting to see the average (median) change in distance walked (as well as the distribution of changes).

I came across this abstract where a group had carefully tried to look at patients having CBT where they conclude results are ambiguous.

http://ukpmc.ac.uk/abstract/MED/19213007/reload=0;jsessionid=UoZMPW2olRj0vKff5Nfx.0

Cognitive-behavior therapy in chronic fatigue syndrome: is improvement related to increased physical activity?

(PMID:19213007)
Friedberg F, Sohl S

Stony Brook University. Fred.Friedberg@stonybrook.edu
Journal of Clinical Psychology [2009, 65(4):423-442]
Type: Journal Article, Research Support, Non-U.S. Gov't, Research Support, N.I.H., Extramural
DOI: 10.1002/jclp.20551

This multiple case study of cognitive-behavioral treatment (CBT) for chronic fatigue syndrome (CFS) compared self-report and behavioral outcomes. Eleven relatively high-functioning participants with CFS received 6-32 sessions of outpatient graded-activity oriented CBT. Self-report outcomes included measures of fatigue impact, physical function, depression, anxiety, and global change. Behavioral outcomes included actigraphy and the 6-minute walking test. Global change ratings were very much improved (n=2), much improved (n=2), improved (n=5), and no change (n=2). Of those reporting improvement, clinically significant actigraphy increases (n=3) and decreases (n=4) were found, as well as no significant change (n=2). The nature of clinical improvement in CBT trials for high-functioning CFS patients may be more ambiguous than that postulated by the cognitive-behavioral model.
 

Dolphin

Senior Member
Messages
17,567
A few of the comments on the PACE Trial protocol quote that study: http://www.biomedcentral.com/1471-2377/7/6/comments/
e.g.

Friedberg and Sohl [1] have just published the results of a study on an intervention involving Cognitive Behavior Therapy (CBT) which included encouraging patients to go for longer walks. It found that on the SF-36 Physical Functioning (PF) scale, patients improved from a pre-treatment mean (SD) of 49.44 (25.19) to 58.18 (26.48) post-treatment, equivalent to a Cohen's d value of 0.35. On the Fatigue Severity Scale (FSS), the improvement as measured by the Cohen's d value was even greater (0.78), from an initial pre-treatment mean (SD) of 5.93 (0.93) to 5.20 (0.95) post-treatment.

However, on actigraphy there was actually a numerical decrease from a pre-treatment mean (SD) of 224696.90 (158389.64) to 203916.67 (122585.92) post-treatment (Cohen's d: -0.13). So just because patients report lower fatigue and better scores on the SF-36 PF scale doesn't mean they're doing more, which is what GET and CBT based on GET claim to bring about. These results seem particularly pertinent for this study given the primary outcome measures are the SF-36 PF scale and a fatigue scale.
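
For anyone wanting to check those Cohen's d values, here is a minimal sketch. It assumes the change in means is divided by the pre-treatment SD; that choice is my assumption, but it does reproduce the three quoted figures (a pooled SD gives slightly different values).

```python
def cohens_d(pre_mean: float, post_mean: float, pre_sd: float) -> float:
    """Standardised pre/post change, using the pre-treatment SD as the denominator."""
    return (post_mean - pre_mean) / pre_sd


# SF-36 physical function: higher is better
pf_d = cohens_d(49.44, 58.18, 25.19)

# Fatigue Severity Scale: lower is better, so the sign is flipped
# to express the drop in fatigue as a positive improvement
fss_d = -cohens_d(5.93, 5.20, 0.93)

# Actigraphy counts: higher means more activity
actigraphy_d = cohens_d(224696.90, 203916.67, 158389.64)

print(f"SF-36 PF d = {pf_d:.2f}")
print(f"FSS d = {fss_d:.2f}")
print(f"Actigraphy d = {actigraphy_d:.2f}")
```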
 