PACE Trial and PACE Trial Protocol

Dolphin

Senior Member
Messages
17,567
Further work by Malcolm Hooper.

Magical Medicine - the PACE Trial
UPDATE ON THE PACE TRIAL

Professor Malcolm Hooper
11th July 2012
http://www.investinme.org/Article 432 - PACE Trial Update July 2012.htm

(I haven't read it yet)
I just got around to reading this.

For me, because of reading so much before, not much was new, but a bit of revision is probably no harm.

Here are two bits that perhaps contain some data that hasn't been posted before (perhaps not):


To demonstrate further how the PIs’ inappropriate use of a standard deviation has led to an unrepresentative “normal range” for SF-36 physical function scores in the PACE Trial, consider the scores for other disorders: stable congestive heart failure patients have a mean SF-36 physical function score of 79.2; hepatitis C patients have a mean score of 79.3; and patients with osteoarthritis of the hip have a mean score of 62.4. Thus patients with serious health conditions have mean SF-36 scores above the threshold of 60 that the PIs designated as the lower bound of the “normal range” for healthy adults.
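The arithmetic behind that “normal range” floor is easy to check. Critiques of the trial report that the threshold was derived as the population mean minus one standard deviation of SF-36 physical function scores in adults; the mean of 84 and SD of 24 used below are the figures commonly cited in those critiques, taken here as assumptions rather than verified values:

```python
# Sketch: how "mean minus 1 SD" produces the PACE "normal range" floor.
# The population mean (84) and SD (24) are assumptions based on figures
# commonly cited in critiques of the trial, not verified values.

population_mean = 84.0
population_sd = 24.0

threshold = population_mean - population_sd  # the "normal range" floor

# Mean SF-36 physical function scores quoted above for serious illnesses
ill_group_means = {
    "stable congestive heart failure": 79.2,
    "hepatitis C": 79.3,
    "osteoarthritis of the hip": 62.4,
}

print(f"'Normal range' floor: {threshold:.0f}")
for condition, mean in ill_group_means.items():
    status = "within" if mean >= threshold else "below"
    print(f"{condition}: mean {mean} -> {status} the 'normal range'")
```

On these assumed figures, every one of the seriously ill groups quoted above falls inside the trial's “normal range”.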

---


The mean distance recorded by PACE participants who had undergone CBT was 354 metres (a 1.5 metre decrease compared with the SMC control group), meaning that CBT was ineffective.

Significantly, the CBT group managed a smaller increase in walking distance than those who received nothing more than SMC (standard medical care).

CBT failed to improve average six minute walking distances and participants in all the intervention groups had, on average, significant disability at the end of the PACE Trial.
For those who had undergone GET, the mean distance was 379 metres (an increase of 67 metres from baseline).

In the six minute walking test, a normal healthy walking score is 500 metres; on brisk walking the average score is 650 metres, and on fast walking the score is 800 – 1,000 metres. The mean walking distance for healthy people aged 50 to 85 years is 631 metres (a score of 518 metres is deemed abnormally low for healthy but elderly people).

Patients with chronic obstructive pulmonary disorder (including those needing supplemental oxygen) are able to walk on average 60 metres further during the 6 minute walking test compared with those in the PACE Trial who had received GET plus SMC.

On average, PACE participants were able to walk less distance during the 6 minute walking test than people with traumatic brain injury.

PACE participants’ average 6 minute walking distance test scores were also lower than scores documented in many other serious diseases: patients awaiting lung transplantation, where a six minute walking test of less than 400 metres is regarded as a marker for placing a patient on the transplant list; those in chronic heart failure (whose mean score is 682 metres); those in heart failure class II (mean score 558 metres); and those in heart failure class III, whose mean score is 402 metres in six minutes.

After CBT or GET, PACE Trial participants (whose average age was under 40) did not even achieve a six minute walking distance of 518 metres, the score deemed abnormally low for healthy people aged 50-85 years.

There will likely be some more papers published on the PACE Trial in the next year or two, so we will need the sort of people who know the stuff in this document/who have followed this thread to write letters to the editor, as it could be hard for "beginners".
 

Esther12

Senior Member
Messages
13,774
Somebody sent me this and said I could post it. They said I didn't need to credit them, etc.

Ta D and uncredited worker.

White and Chalder both had experience with these measures and the issues that surround their use, which makes it all the harder to come up with an innocent excuse for their use in the presentation of results from PACE.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
This is a paper that I've posted about previously:

Individual-patient monitoring in clinical practice: are available health status surveys adequate?
C A McHorney and A R Tarlov
This is found on pages 293-307 inside a journal:
Quality of Life Research volume 4 1995
(Publisher = Oxford: Rapid Communications of Oxford)


The paper gives a 'statistically significant' change in SF-36 PF score for an individual patient's health (for example, if the questionnaire is used in a clinical setting) as 23 points. (In practice, this would mean a change of 25 points, as SF-36 PF scores move in steps of 5, so a change of exactly 23 points is not possible.)

The paper says:
"Using physical functioning as an example, an individual patient's score would have to change by at least ... 23 points on the SF-36 ... for the change to be statistically significant."

Originally, I said that this would not apply to the PACE Trial, because the PACE Trial was looking at the average results for a large number of patients. It's only relevant when looking at one individual patient to assess their scores, such as in a clinical setting.

But now I can't work out if I was being stupid or not...

I was correct to say that it wouldn't apply to average scores, but would it apply to the 'improvement rates' in the PACE Trial? I think it would, as each individual participant must be assessed to see if they achieved a clinically significant improvement.

Any thoughts please?

I'm in a bit of a muddle with it, because I don't understand why I previously thought that it was not useful in relation to the PACE Trial. If it is usable, then it is a handy reference that we can use against the PACE Trial results.



Here are more details from the paper:

The 'statistically significant' score seems to be specifically based on the: "test-retest reliability estimates and the associated 95% CI of the SEM".

Referring to all the scales compared, the paper says: "the reliabilities of all scales fall far below the 0.90 to 0.95 standard, and the 95% CIs are very wide"

Table A.9 Reliability estimates and standard errors of measurement for the SF-36 scales.
SF-36 Physical Functioning:
Internal consistency reliability = 0.93 95% CI of the SEM = 13.8
Test-retest reliability = 0.81 95% CI of the SEM = 22.7


"Using physical functioning as an example, an individual patient's score would have to change by at least ... 23 points on the SF-36 ... for the change to be statistically significant."
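The two '95% CI of the SEM' figures in the table can be reproduced from the reliabilities with the standard formula SEM = SD × √(1 − r), then 1.96 × SEM. A sketch, assuming a scale SD of about 26.6 — that value is my inference, chosen because it reproduces both table rows; the paper itself should be checked for the exact figure:

```python
import math

def ci95_of_sem(sd, reliability):
    """95% confidence half-width around an individual score:
    1.96 * SEM, where SEM = SD * sqrt(1 - reliability)."""
    sem = sd * math.sqrt(1.0 - reliability)
    return 1.96 * sem

# SD ~26.6 is an assumption, chosen because it reproduces both table rows
sd = 26.6
print(round(ci95_of_sem(sd, 0.93), 1))  # internal consistency -> 13.8
print(round(ci95_of_sem(sd, 0.81), 1))  # test-retest -> 22.7
```

The test-retest figure of 22.7 is what the paper rounds to "at least ... 23 points" for a statistically significant individual change.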

In this paper, the analysis for the statistically significant score is based on this paper:
The MOS 36-Item Short Form Survey (SF-36). III. Tests of data quality, scaling assumptions and reliability across diverse patient groups.
McHorney CA, Ware JE, Lu JFR, et al.,
Med Care 1994; 32: 40-66
 

Dolphin

Senior Member
Messages
17,567
This is a paper that I've posted about previously:

Individual-patient monitoring in clinical practice: are available health status surveys adequate?
C A McHorney and A R Tarlov

I was correct to say that it wouldn't apply to average scores, but would it apply to the 'improvement rates' in the PACE Trial? I think it would, as each individual participant must be assessed to see if they achieved a clinically significant improvement.
Good you're still looking into things, Bob.

The figure of 23 comes from the fact that people's scores vary a bit if you ask them to fill in the questionnaire more than once.
While one could argue that one should feel sure a result isn't due to chance, I don't recall Guyatt using test-retest scores in any way.
If one just gave an individual patient one treatment, one would sometimes want to be sure a change wasn't due to chance.
However, with a group of patients trying different treatments in a trial, one might be willing to set lower thresholds.

I'm just thinking off the top of my head. I have no idea how often, if ever, test-retest 95% CI figures are used.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
If one just gave an individual patient one treatment, one would sometimes want to be sure a change wasn't due to chance.
However, with a group of patients trying different treatments in a trial, one might be willing to set lower thresholds.

Yes, that's a very good point, thanks for that Dolphin.
That might be the reason why I thought it wasn't relevant to the PACE Trial, but I just can't remember.
I think I'll have to read the paper again! My stupid memory! It seems to be getting worse these days! :( :cry: :ill:
 

Esther12

Senior Member
Messages
13,774
Maybe that's why I thought it wasn't relevant to the PACE Trial, but I just can't remember.

lol - I know that feeling. It would be so much easier if we could spend a large amount of time in one month on it, rather than having to fit little bits of reading in around illness and other demands.

I think I agree with dolphin, and that the paper is only likely to be vaguely relevant to the presentation of results from PACE. It's still good to have comments being posted here though, to save others doing similar digging.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
I think I agree with dolphin, and that the paper is only likely to be vaguely relevant to the presentation of results from PACE. It's still good to have comments being posted here though, to save others doing similar digging.

I'll read the paper again (it might stimulate some of my stupid memory cells into action), and I'll report back.
I have been searching research papers (especially looking for 'reviews') for a higher figure for a clinically useful outcome for SF-36 physical function scores, and this is the only paper that I've found that comes close to it.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
The PACE Trial paper determined the 'clinically useful difference' by using a common methodology for calculating effect sizes. (The CUD was a 'moderate' effect size.) But 'effect sizes' relate to average changes, so are 'effect sizes' usually used for working out improvement rates? Does anyone know if that is a common methodology? If it is, then I think that explains why the paper that I've been talking about is irrelevant for the PACE Trial.

I'm still re-reading it. I can't see anything in the paper, yet, in relation to determining improvement rates. I've only got a hard copy, so I have to read the whole thing, as I can't search for key words. :(
 

Snow Leopard

Hibernating
Messages
5,902
Location
South Australia
However, from anecdotal reports, pacing can decrease the prevalence and severity of relapses. It's amazing nobody has actually done the research to investigate this using objective methods. You would think this would be an obvious thing to do.

The point is that a large majority of patients learn to pace themselves within a few months of illness onset.
 

Dolphin

Senior Member
Messages
17,567
The PACE Trial paper determined the 'clinical useful difference' by using common methodology for calculating effect sizes. (The CUD was a 'moderate' effect size.) But 'effect sizes' relate to average changes, so are 'effect sizes' usually used to working out improvement rates? Does anyone know if that is a common methodology? If it is, then I think that explains why the paper that I've been talking about is irrelevant for the PACE Trial. I'm still re-reading it. I can't see anything in the paper, yet, in relation to determining improvement rates. I've only got a hard copy, so I have to read the whole thing. I can't search for key words. :(
CUDs/MIDs refer to individuals. Effect sizes are based on the mean change relative to standard deviations; e.g. Cohen's d uses baseline standard deviations. Sometimes I would see effect sizes where the control group wasn't used (so just one group's baseline standard deviation); other times I would see a pooled baseline figure, which would then differ from a single group's s.d. Cohen's d effect sizes are usually classed as .20 for small, .50 for medium (or moderate), and .80 for large.
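As a sketch of the convention Dolphin describes, Cohen's d divides the between-group difference in mean change by a baseline standard deviation (here a pooled one); the numbers below are purely illustrative, not PACE data:

```python
import math

def cohens_d(mean_diff, sd1, n1, sd2, n2):
    """Cohen's d using a pooled baseline standard deviation."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    return mean_diff / pooled_sd

# Illustrative numbers only: a 10-point difference in mean change,
# both groups with baseline SD 20 and n = 150.
d = cohens_d(10.0, 20.0, 150, 20.0, 150)
print(round(d, 2))  # 0.5 -> a "medium"/"moderate" effect by the usual labels
```

Note that this describes a difference between group averages; it says nothing about what proportion of individual patients improved, which is the distinction being discussed in the thread.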
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
However, from anecdotal reports, pacing can decrease the prevalence and severity of relapses. It's amazing nobody has actually done the research to investigate this using objective methods. You would think this would be an obvious thing to do. Bye, Alex

The point is that a large majority of patients learn to pace themselves within a few months of illness onset.

I suppose it might be impossible to carry out an objective investigation into pacing, because everybody automatically paces themselves anyway. Even children pace themselves, consciously or unconsciously. And babies sleep when they are tired! It's built into us. It's a genetically built-in intuitive self-protection system. ME patients just learn to pace more carefully, more systematically, and to a greater degree.

That's also why it's impossible to carry out an objective study into GET: Because everybody paces themselves, and patients will automatically slow down, and avoid other activities, if the GET activities have made them more tired or exhausted.

I don't see why a study into pacing couldn't be carried out though. But its usefulness might be limited.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
CUDs/MIDs refer to individuals.

Thank you Dolphin. I think that answers all of my questions then, if that's a common methodology.

Maybe the 'statistically significant' improvement only refers to isolated individual scores, such as in a clinical setting.

Time for bed now. :sleep:

Thanks everyone for all the discussion and feedback. :thumbsup:
 

Graham

Senior Moment
Messages
5,188
Location
Sussex, UK
Hi Bob, can I add a bit to the "statistically significant" discussion?

Too late, you had your chance to object. Here it is.

Bob has his own lucky old-style penny, and bets on it coming down heads. He thinks it is weighted in his favour. If he tosses it 10 times and it comes down heads 6 times, there's nothing unusual about that: it's just the spin of the coin. So how weird can it be before you start to get suspicious? Well, if you want 95% weird (which means that 1 person in 20 is weird), 8 heads would just about persuade you (94.5%), and 9 would almost crack it at the 99% level (where 1 person in 100 is weird). In other words he would need to get 80% heads (8 out of 10) before you started suspecting, at the 95% level, that the coin was biased. But if he tossed the coin 100 times, he would only need around 58 heads (58% heads) to approach the 95% mark, and with 1,000 tosses, about 526 would be enough (52.6% heads). In other words, with a big enough sample, very small differences become significant.
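Those thresholds can be checked with an exact one-sided binomial tail for a fair coin; for 10 tosses, 8 heads is where the chance of the result falls to about 5.5% (roughly 94.5% confidence):

```python
from math import comb

def p_at_least(k, n):
    """Probability of k or more heads in n tosses of a fair coin."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

print(p_at_least(8, 10))    # 0.0546875 -> ~94.5% confidence
print(p_at_least(9, 10))    # ~0.0107   -> ~99% confidence
print(round(p_at_least(58, 100), 3))    # near the 5% mark
print(round(p_at_least(526, 1000), 3))  # near the 5% mark
```

The same function shows the general point: as n grows, the proportion of heads needed for "significance" creeps ever closer to 50%.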

I don't like 95% confidence levels: I don't think 1 in 20 people are weird.

For an ME treatment, I would want to know what the odds were of me experiencing a significant improvement. Trial it with big enough patient numbers, and even trivial changes become statistically significant, but that doesn't make them worthwhile. That is why the PACE criteria should have been determined medically rather than statistically.

Here's a challenge for you. I used to tell my class that we would try an experiment. I would keep tossing a coin, they were to write down the results, then we would analyse them. No matter how the coin turned up, I would call out Heads, Heads, Heads, until someone objected. Normally I would get to around 6 or 7. The class would all agree that I had been cheating, so I would ask them when, and that was when the fun began. But the real problem started when I wrote on the board H H T H T T H and told them that was really what I got. They all believed me. But then I showed them that the chance of getting H H T H T T H is exactly the same as getting H H H H H H H (1 in 128) so why should they believe me this time?

Sorry, it's in the blood. 40 years worth! I can't give it up.
 

user9876

Senior Member
Messages
4,556
Hi Bob, can I add a bit to the "statistically significant" discussion?

In other words, with a big enough sample, very small differences become significant.
Of course, if you do enough CBT and GET trials, some will by chance come up with statistically significant results!
 

WillowJ

คภภเє ɠรค๓թєl
Messages
4,940
Location
WA, USA
I am of the opinion that they always planned to do a bait-and-switch. Deviation from the protocol is the norm, not the exception in science after all. The only difference here is the cover-up.

So, you think they've always known they are practicing junk science and they are *not* self-deceived at all, but rather sadists?

I had rather thought they'd managed to deceive themselves: because they'd had some patients (with whatever condition) who, for whatever reason, improved while under their care (not necessarily as a result of CBT/GET, but at that point in time), they concluded that this could or should work for more people, even though there's plenty of evidence, particularly if you take care to define the diseases, that it doesn't and shouldn't work in us and in the other patients they doubtless have in their clinics and studies.
 

WillowJ

คภภเє ɠรค๓թєl
Messages
4,940
Location
WA, USA
Claims can only be dismissed as a crazy conspiracy theory if they're critical of people in positions of power... it's a conspiracy!


That is an interesting point....

However, claims of discrimination or hate crime can only be made when directed at those in a position without power....
 

WillowJ

คภภเє ɠรค๓թєl
Messages
4,940
Location
WA, USA
Yes, that's what I was thinking, Enid.
I thought that the SMC advice might conflict with the instructions given to the GET group.
But actually, the GET group were instructed to avoid extremes of activity, and only increased their activity levels very carefully and incrementally. It was a very controlled and carefully monitored environment, and if there was a flare up after an incremental increase in activity, then the activity levels would be reduced.

Unfortunately, the media, doctors, and therapists in clinical settings often don't understand this, and think that telling patients to "get out and exercise", or to push themselves to their limits, will be good for them, based on the GET results.

For example, based on GET, the Daily Mail said:
"...scientists have found encouraging people with ME to push themselves to their limits gives the best hope of recovery"

My memory fails me with regards to how much is written about testing 'limits' in the PACE literature.
But Peter White has said the following, which I think was unhelpful:
"They [the results] imply that testing the limits of the illness is more effective than staying within them."
(The EACLPP / European Association for Consultation Liaison Psychiatry and Psychosomatics: Abstracts, oral presentations)

He evidently means to challenge Lenny Jason's envelope theory (PDF), and to imply that PWME are passive, wimpy creatures who never attempt to see what more they can do than they are already doing (instead, as we know, he thinks we spiral downward in a maladaptive pattern of seeing boogeyman symptoms and reacting with unnecessary rest, leading to deconditioning, leading to additional "threatening" symptoms...).

Rather than what we actually do: managing life as best we can and testing our limits (sometimes unintentionally, lol, leaving us thinking, 'right, that's why I'm supposed to remember to pace more carefully'...)
 