PACE Trial and PACE Trial Protocol

Graham

Senior Moment
Messages
5,188
Location
Sussex, UK
Actually Bob is deceiving you all here, folks. He has been editing my posts from the very start. Haven't you noticed that there is a mix of relevant comment and utter rubbish? As he was responsible for me joining, he has been trying to improve my postings so that you don't all turn around and blame him. And what sort of a person puts on nearly two thousand posts, then complains that he can't edit the first thousand, if it isn't someone worried about his standing in this elite community? I rest my case (well it was getting heavy with so many books in it).
 

Graham

Senior Moment
Messages
5,188
Location
Sussex, UK
I'm just so slow on the uptake, aren't I? There is only Bob on this forum, because I am the only one daft enough to succumb to his persuasive manner and join up. Mind you, he really fooled me, pretending to be Esther and Willow.
 

Graham

Senior Moment
Messages
5,188
Location
Sussex, UK
I would like to conduct a survey on members of this group in order to get some parameters for a simulation I am trying out to replicate the effect of good patches and bad patches on the PACE patients. Essentially, I would like to post up the Chalder questions with the Likert/continuous scoring and ask people to tell me how many of the 11 topics are relevant to their illness (the bimodal score), to tell me their current Likert score, and whether they are considered mild, moderate or severe (I'll put in the definitions). Then I would ask them to consider the last year, and give me the Likert score if they had hit a bad patch (for at least a couple of weeks), and again if they had hit a good patch.

What I am trying to do is to assess whether the multiple ceilings on the Chalder Fatigue score actually force a significant improvement through random effects, but I need some better parameters than my guesswork. Do you think it would be reasonable to post this, and if so, where?
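A minimal sketch (in Python) of the sort of simulation being described, just to make the ceiling mechanism concrete. Every number in it, the baseline item scores and the size of the good/bad-patch fluctuation, is a guess, which is exactly the gap the proposed survey is meant to fill:

```python
# Toy simulation of ceiling effects on the Chalder Fatigue Questionnaire.
# All parameters (baseline levels, size of fluctuations) are guesses, not
# survey data -- they are what the proposed survey would pin down.
import numpy as np

rng = np.random.default_rng(42)

N_PATIENTS = 1000
N_ITEMS = 11      # the 11 Chalder items
ITEM_MAX = 3      # Likert scoring: each item 0-3, so totals run 0-33

def clip_items(items):
    """Force every item back onto the allowed 0-3 range (the ceiling/floor)."""
    return np.clip(items, 0, ITEM_MAX)

# Hypothetical baseline: patients sitting close to the per-item ceiling.
baseline = clip_items(rng.normal(loc=2.6, scale=0.6, size=(N_PATIENTS, N_ITEMS)).round())

# Random good/bad-patch fluctuation at follow-up, centred on zero (no true change).
followup = clip_items(baseline + rng.normal(loc=0.0, scale=0.8, size=(N_PATIENTS, N_ITEMS)).round())

def likert_total(items):
    return items.sum(axis=1)            # continuous (Likert) scoring, 0-33

def bimodal_total(items):
    return (items >= 2).sum(axis=1)     # 0,0,1,1 scoring, 0-11

for name, total in [("Likert", likert_total), ("bimodal", bimodal_total)]:
    change = total(followup) - total(baseline)
    print(f"{name}: mean change = {change.mean():+.2f} "
          "(negative = apparent improvement despite zero true change)")
```

Because items already at 3 can only move down, purely random fluctuation produces a net fall in the score; the survey parameters would quantify how large that artefact could plausibly be.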
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
I would like to conduct a survey on members of this group in order to get some parameters for a simulation I am trying out to replicate the effect of good patches and bad patches on the PACE patients. Essentially, I would like to post up the Chalder questions with the Likert/continuous scoring and ask people to tell me how many of the 11 topics are relevant to their illness (the bimodal score), to tell me their current Likert score, and whether they are considered mild, moderate or severe (I'll put in the definitions). Then I would ask them to consider the last year, and give me the Likert score if they had hit a bad patch (for at least a couple of weeks), and again if they had hit a good patch.

What I am trying to do is to assess whether the multiple ceilings on the Chalder Fatigue score actually force a significant improvement through random effects, but I need some better parameters than my guesswork. Do you think it would be reasonable to post this, and if so, where?

Totally reasonable.

I suggest you start a new thread and place a link to it in this thread.

I think it would be most appropriate in the Action Alerts! and Advocacy section:
http://phoenixrising.me/forums/forumdisplay.php?14-Action-Alerts!-and-Advocacy

Or in the symptoms section:
http://phoenixrising.me/forums/forumdisplay.php?37-Symptoms
 

Graham

Senior Moment
Messages
5,188
Location
Sussex, UK
Thanks Bob! I'll wait and see if there are any reasons against, and if not, do as you suggest. In the meantime you might like to hear that I sent an email to Ellen Goudsmit asking whether I had matched up the bimodal and Likert scores correctly, and have received a very supportive reply from a colleague, Bart, who has even seen and liked our blog! They want us to let them know when the project is finished.
 

Dolphin

Senior Member
Messages
17,567
Sounds good, Graham. There is probably only a small following left on this thread, so a new thread would be best, as filling in a survey doesn't require any specific knowledge of PACE.
 

ukxmrv

Senior Member
Messages
4,413
Location
London
I did see and read your other thread, Graham.

Currently I'm sitting and holding my head going "it hurts to think" (how many points does that score me?).

Promise to fill it in when my brain comes online, or if my husband helps me.
 

Dolphin

Senior Member
Messages
17,567
(Not important) (and probably a repeat)

For my sins, I was looking again at the Lancet paper. Some of the data is given in the form median (IQR). IQR=Interquartile Range i.e. the 25th and 75th percentiles.

If one presented such data for the SF-36 PF, it would be something like 95 (75*-100) for the whole population. Presented in that way, people would very much question any claim that 60+ is normal. The figures for the working age population would be higher again.

*I'm not sure of the exact figure of the 25th percentile - is it out there? I imagine it's in the 70s anyway.
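For anyone not used to the median (IQR) format, a quick illustration; the scores below are made up for the example, not the Bowling or Jenkinson data:

```python
# Median (IQR): the 50th percentile reported alongside the 25th and 75th percentiles.
import numpy as np

# Made-up SF-36 physical function scores (the scale runs 0-100 in steps of 5);
# these are NOT the Bowling or Jenkinson figures, just an illustration.
scores = np.array([100, 100, 95, 95, 95, 90, 85, 80, 70, 55])

q25, median, q75 = np.percentile(scores, [25, 50, 75])
print(f"median (IQR) = {median:.0f} ({q25:.0f}-{q75:.0f})")
```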
 

oceanblue

Guest
Messages
1,383
Location
UK
For my sins, I was looking again at the Lancet paper. Some of the data is given in the form median (IQR). IQR=Interquartile Range i.e. the 25th and 75th percentiles.

*I'm not sure of the exact figure of the 25th percentile - is it out there? I imagine it's in the 70s anyway.
Good point. IQR is 85-100 for the Bowling data; for working age it would be 95-100 ("educated guess").
 

Dolphin

Senior Member
Messages
17,567
I think this was said before by somebody, but anyway: I was reading the paper again and noticed that when they made their sample size calculations, they assumed that 10% of the SMC group would improve. In the final paper, 45% improved. This doesn't mean their initial calculations were wrong: all or most of it is presumably to do with how much they watered down the change required for improvement.
 

Esther12

Senior Member
Messages
13,774
Didn't they also expect 60% to 'improve' (old, strict definition) and 30% to recover (old, strict definition)? So they got their predictions right by totally changing the definitions - surely that shows their expertise is not quite as great as they had believed. (I could be wrong about this, I've not re-read the document recently).
 

Dolphin

Senior Member
Messages
17,567
Didn't they also expect 60% to 'improve' (old, strict definition) and 30% to recover (old, strict definition)? So they got their predictions right by totally changing the definitions - surely that shows their expertise is not quite as great as they had believed. (I could be wrong about this, I've not re-read the document recently).
Did they make estimates on recovery?

I'm just pasting some extracts - I don't expect people to read them all

Here's what we have (I hadn't looked at these before posting):

PACE Trial Identifier:
3.12 What is the proposed sample size and what is the justification for the assumptions underlying the power calculations?

Assumptions: At one year we assume that 60% will improve with CBT, 50% with GET, 25% with APT and 10% with UMC. The existing evidence suggests that at one year follow up, 50 to 63% of subjects with CFS had a positive outcome, by intention to treat, in the three RCTs of rehabilitative CBT,15,17,18 with 69% improved after an educational rehabilitation that closely resembled CBT.22 This compares to 18 to 63% improved in the two RCTs of GET,19,20 and 47% improvement in a clinical audit of GET.38 For usual medical care 6% to 17% improved by one year in two RCTs.15,22 There are no previous RCTs of APT to guide us, but we estimate that APT will be at least as effective as the control treatments of relaxation and flexibility used in previous RCTs, with 26% to 27% improved on primary outcomes.18,19 We propose that a clinically important difference would be between 2 and 3 times the improvement rate of UMC.

Power analyses: Our planned intention to treat analyses will compare APT against UMC, and both CBT and GET against APT. Assuming α = 5% and a power of 90%, we require a minimum of 135 subjects in the UMC and APT groups, 80 subjects in the GET group and 40 in the CBT group.39 However these last two numbers are insufficient to study predictors, process, or cost-effectiveness. We will not be able to get a precise estimate of the difference between CBT and GET, though our estimates will be useful in planning future trials. As an example, to detect a difference in response rates of 50% and 60%, with 90% power, would require 520 subjects per group; numbers beyond a realistic two-arm trial. Therefore, we will study equal numbers of 135 subjects in each of the four arms, which gives us greater than 90% power to study differences in efficacy between APT and both CBT and GET. We will adjust our numbers for dropouts, at the same time as designing the trial and its management to minimise dropouts. Dropout rates were 12 and 33% in the two studies of GET and 3, 10, and 40% in the three studies of rehabilitative CBT.12,14 On the basis of our own previous trials, we estimate a dropout rate of 10%. We therefore require approximately 150 subjects in each treatment group, or 600 subjects in all. Calculation of the sample size required to detect economic differences between treatment groups requires data of cost per change in outcome, which is not currently available. Since costs are not expected to vary significantly between or within groups, the treatment-determined number of 150 per arm is likely to find significant differences in cost-effectiveness.

What are the proposed outcome measures?

Primary efficacy measures:

Since we are interested in changes in both symptoms and disability we have chosen to make both fatigue and physical function primary outcomes. This is because it is possible that a specific treatment may relieve symptoms without reducing disability, or vice versa. Both these measures will be self-rated. The 11 item Chalder fatigue questionnaire measures the severity of symptomatic fatigue,23 and has been the most frequently used measure of fatigue in most previous trials of these interventions. We will use the 0,0,1,1 item scores to allow a categorical threshold measure of "abnormal" fatigue with a score of 4 having been previously shown to indicate abnormal fatigue.23 A Likert scoring (0,1,2,3) will also be used, as a secondary outcome measure, to better measure response to treatment. The SF-36 physical function sub-scale24 measures physical function, and has often been used as an important outcome measure in trials of CBT and GET. We will count a score of 75 (out of a maximum of 100) or more as indicating normal function, this score being one standard deviation below the mean score (90) for the UK working age population.29

Jenkinson C et al. Short form 36 (SF-36) Health Survey questionnaire: normative data from a large random sample of working age adults. BMJ 1993; 306: 1437-40.

(Aside: Interesting how it was the working-age population from Jenkinson initially, then Jenkinson and Bowling in the protocol, and by the end Jenkinson was dropped.)

Protocol:

http://www.biomedcentral.com/1471-2377/7/6

Assumptions

The existing evidence does not allow precise estimates of improvement with the trial treatments. However the available data suggests that at one year follow up, 50 to 63% of participants with CFS/ME had a positive outcome, by intention to treat, in the three RCTs of rehabilitative CBT [18,25,26], with 69% improved after an educational rehabilitation that closely resembled CBT [43]. This compares to 18 and 63% improved in the two RCTs of GET [23,24], and 47% improvement in a clinical audit of GET [56]. Having usual rather than specialist medical care allowed 6% to 17% to improve by one year in two RCTs [18,25]. There are no previous RCTs of APT to guide us [11,12], but we estimate that APT will be at least as effective as the control treatments of relaxation and flexibility used in previous RCTs, with 26% to 27% improved on primary outcomes [23,26]. We propose that a clinically important difference would be between 2 and 3 times the improvement rate of SSMC.

Power analyses

Our planned intention to treat analyses will compare APT against SSMC alone, and both CBT and GET against APT. Assuming α = 5% and a power of 90%, we require a minimum of 135 participants in the SSMC alone and APT groups, 80 participants in the GET group and 40 in the CBT group [57]. However these last two numbers are insufficient to study predictors, process, or cost-effectiveness. We will not be able to get a precise estimate of the difference between CBT and GET, though our estimates will be useful in planning future trials. As an example, to detect a difference in response rates of 50% and 60%, with 90% power, would require 520 participants per group; numbers beyond a realistic two-arm trial. Therefore, we will study equal numbers of 135 participants in each of the four arms, which gives us greater than 90% power to study differences in efficacy between APT and both CBT and GET. We will adjust our numbers for dropouts, at the same time as designing the trial and its management to minimise dropouts. Dropout rates were 12 and 33% in the two studies of GET [23,24] and 3, 10, and 40% in the three studies of rehabilitative CBT [18,25,26]. On the basis of our own previous trials, we estimate a dropout rate of 10%. We therefore require approximately 150 participants in each treatment group, or 600 participants in all. Calculation of the sample size required to detect economic differences between treatment groups requires data of cost per change in outcome, which is not currently available.

Primary outcome measures

Primary efficacy measures

Since we are interested in changes in both symptoms and disability we have chosen to designate both the symptoms of fatigue and physical function as primary outcomes. This is because it is possible that a specific treatment may relieve symptoms without reducing disability, or vice versa. Both these measures will be self-rated.

The 11 item Chalder Fatigue Questionnaire measures the severity of symptomatic fatigue [27], and has been the most frequently used measure of fatigue in most previous trials of these interventions. We will use the 0,0,1,1 item scores to allow a possible score of between 0 and 11. A positive outcome will be a 50% reduction in fatigue score, or a score of 3 or less, this threshold having been previously shown to indicate normal fatigue [27].

The SF-36 physical function sub-scale [29] measures physical function, and has often been used as a primary outcome measure in trials of CBT and GET. We will count a score of 75 (out of a maximum of 100) or more, or a 50% increase from baseline in SF-36 sub-scale score as a positive outcome. A score of 70 is about one standard deviation below the mean score (about 85, depending on the study) for the UK adult population [51,52].

Those participants who improve in both primary outcome measures will be regarded as overall improvers.
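A sketch of those protocol definitions written out as code, with the thresholds exactly as quoted above (the function and argument names are mine, purely for illustration):

```python
# The protocol's "positive outcome" definitions expressed as plain functions.
# Thresholds are as quoted above; names are illustrative only.

def fatigue_positive(baseline_fatigue: int, followup_fatigue: int) -> bool:
    """Chalder fatigue, 0,0,1,1 scoring (0-11): 50% reduction OR a score of 3 or less."""
    return followup_fatigue <= 3 or followup_fatigue <= 0.5 * baseline_fatigue

def function_positive(baseline_pf: int, followup_pf: int) -> bool:
    """SF-36 physical function (0-100): a score of 75 or more OR a 50% increase from baseline."""
    return followup_pf >= 75 or followup_pf >= 1.5 * baseline_pf

def overall_improver(baseline_fatigue, followup_fatigue, baseline_pf, followup_pf) -> bool:
    """Per the protocol, an overall improver meets BOTH primary-outcome criteria."""
    return (fatigue_positive(baseline_fatigue, followup_fatigue)
            and function_positive(baseline_pf, followup_pf))

# Example: entry at fatigue 10/11 and SF-36 PF 40, follow-up at 4 and 65.
print(overall_improver(10, 4, 40, 65))   # True: fatigue at least halved, PF up by more than 50%
```

These are the original protocol criteria that the thread contrasts with the looser definitions used in the final paper.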


Final paper:
We calculated sample sizes assuming 60% response to CBT at 52 weeks, 50% response to GET, 25% response to APT, and 10% response to SMC.10 We assumed APT to be at least as effective as in previous trials of relaxation and flexibility therapies.20,22 For a two-sided test with 5% significance level and 90% power, we calculated that the number of participants needed to compare SMC with APT was 135, SMC with GET was 80, and SMC with CBT was 40. We increased group size to 150 per group to allow for 10% dropout, to provide equality between groups, and for secondary analyses.
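For what it's worth, the 135/80/40 figures can be roughly reproduced from the assumed improvement rates with the standard normal-approximation formula for comparing two proportions, using the comparisons laid out in the protocol (APT against SSMC, and both CBT and GET against APT). A sketch follows; this is the textbook calculation, not the trial team's own code, and their exact method may have differed slightly:

```python
# Roughly reproducing the quoted per-group sample sizes from the assumed
# improvement rates (10% SSMC, 25% APT, 50% GET, 60% CBT), using the standard
# normal-approximation formula for two independent proportions.
from math import ceil, sqrt
from scipy.stats import norm

def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.90) -> int:
    """Participants per group for a two-sided test of two proportions, equal arms."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

print(n_per_group(0.10, 0.25))   # ~133, close to the quoted 135
print(n_per_group(0.25, 0.50))   # ~77, close to the quoted 80
print(n_per_group(0.25, 0.60))   # ~40, matching the quoted 40
```

As noted earlier in the thread, the arithmetic rests on the assumed 10% improvement with SSMC, whereas the final paper reported 45% improving on SMC against a much looser definition of improvement.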
 

Esther12

Senior Member
Messages
13,774
I was thinking of the identifier section you quoted, so I must have imagined the reference to recovery. Whoops. Thanks for pulling the quotes out though.
 

Dolphin

Senior Member
Messages
17,567
Again maybe this point was made before:

http://web.archive.org/web/20030611...led-trials.com/isrctn/trial/|/0/54285094.html
Sequential outpatients attending six chronic fatigue clinics in secondary care, who meet the Oxford criteria for chronic fatigue syndrome (CFS). We will operationalise CFS in terms of fatigue severity and disability as follows: a Chalder fatigue score of 4 or more and an SF36 physical function score of less than 75.
So they considered that scores under 75 would qualify as a CFS case, yet they tried to claim that scores of 60+ indicated normal functioning.
 