• Welcome to Phoenix Rising!

    Created in 2008, Phoenix Rising is the largest and oldest forum dedicated to furthering the understanding of, and finding treatments for, complex chronic illnesses such as chronic fatigue syndrome (ME/CFS), fibromyalgia, long COVID, postural orthostatic tachycardia syndrome (POTS), mast cell activation syndrome (MCAS), and allied diseases.

Cost effectiveness of the PACE trial

Simon

Senior Member
Messages
3,789
Location
Monmouth, UK
Sorry, not following this thread, but I just noticed that SMC press briefing. Ignore if it's already been mentioned:


01 August 2012
Do the best treatments for CFS cost more?
Speakers:
Professor Michael Sharpe, Professor of Psychological Medicine, University of Oxford
Dr Paul McCrone, Professor of Health Economics, Kings College London
...The PACE group in 2011 demonstrated that Cognitive Behavioural Therapy (CBT) and Graded Exercise Therapy (GET) were the most effective therapies for CFS. Authors of a follow-up paper came to the SMC [Science & Media Centre] to discuss a new paper which explores whether the therapies that are most effective at treating CFS are also the most cost-effective.

http://www.sciencemediacentre.org/pages/press_briefings/index.php?&showArticle=676
Unfortunately there is no info about what they said, but in answer to the question "Do the best treatments for CFS cost more?" [edit: they mean CBT & GET]
  1. Yes, according to direct healthcare costs
  2. Yes, for GET according to societal costs while CBT is cost-neutral - if informal care is valued at the minimum wage
 

user9876

Senior Member
Messages
4,556
Unfortunately there is no info about what they said, but in answer to the question "Do the best treatments for CFS cost more?"
  1. Yes, according to direct healthcare costs
  2. Yes, for GET according to societal costs while CBT is cost-neutral - if informal care is valued at the minimum wage
I particularly object to their use of "best" or "most effective" when they have not compared their treatments with others such as Rituximab or Ampligen. In their paper they don't even acknowledge the existence of other possible treatments.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Figures 1 & 2 come from a statistical technique called bootstrapping. An educational article on it is at: http://www.stat.rutgers.edu/home/mxie/RCPapers/bootstrap.pdf . Basically, it means one doesn't have to know what type of distribution the data takes (e.g. normal distribution, Poisson distribution, etc.): one just uses a computer to sample randomly from the data one has, gets the computer to do this numerous times, and collates the results. This means one doesn't have to do any fancy mathematical calculations to work out what a model suggests the results should be.

I have come across the technique a few times (i.e. am no expert); however, my impression is that the problems advocates are likely to have with the paper are not here. [Although, as I pointed out before, if one removed CBT from the figures, SMC wouldn't have such low percentages (for a lot of the graph)].

Thanks for that, Dolphin.
 

Simon

Senior Member
Messages
3,789
Location
Monmouth, UK
Figures 1 & 2 come from a statistical technique called bootstrapping. An educational article on it is at: http://www.stat.rutgers.edu/home/mxie/RCPapers/bootstrap.pdf . Basically, it means one doesn't have to know what type of distribution the data takes (e.g. normal distribution, Poisson distribution, etc.): one just uses a computer to sample randomly from the data one has, gets the computer to do this numerous times, and collates the results. This means one doesn't have to do any fancy mathematical calculations to work out what a model suggests the results should be.

I have come across the technique a few times (i.e. am no expert); however, my impression is that the problems advocates are likely to have with the paper are not here. [Although, as I pointed out before, if one removed CBT from the figures, SMC wouldn't have such low percentages (for a lot of the graph)].
Thanks. For those who prefer a less mathematical explanation of bootstrapping, try this PowerPoint one (2001) - it still makes sense if you skip the maths slides, and you only need the first 10 or so slides, plus the summary/conclusion.


Idiot's guide to bootstrapping (or guide by an Idiot)
I think this is right, someone please put me right if not.

Take a data sample (570 data points, i.e. patients, in the case of this paper) and randomly resample the exact same number of data points, i.e. 570.

Crucially, each point is resampled independently, so some data points will be sampled several times and others not at all; thus all resamples will be different. Or in simpler terms, imagine 570 different balls in a bag. You take one out, note which one it is, then replace the ball in the bag and repeat 570 times. That gives you the first resample. Note that some balls will be taken out several times, others not at all. Now repeat this whole process many times.
Applying bootstrapping to this paper
In this paper, 1,000 resamples (each of 570) were made of the original data for net QALY benefit. Figure 1 shows the results. So, where healthcare providers are willing to spend £30,000 to gain one QALY (the threshold usually used in the UK), they found that for 100 of those 1,000 resamples APT came out as the most effective, in about 250 of the resamples GET came out top, and in almost all the rest CBT came out top. Note that each resample has slightly different data, which is why CBT 'won' in some resamples while APT won in others.

Is Bootstrapping a reliable way to evaluate data?
This from the powerpoint presentation above:
  • A very very good question !
  • Jury still out on how far it can be applied, but for now nobody is going to shoot you down for using it.
  • Good agreement for normal (Gaussian) distributions; skewed distributions tend to be more problematic, particularly for the tails (the bootstrap underestimates the errors).
I think that's a 'maybe'.
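The ball-in-bag procedure above can be sketched in a few lines of Python. This is a toy example with made-up data, purely to illustrate resampling with replacement; it is not the paper's actual analysis:

```python
import random

random.seed(42)  # fixed seed so the example is reproducible

# Toy stand-in for 570 patients' net-benefit values (made-up numbers)
data = [random.gauss(500, 2000) for _ in range(570)]

def bootstrap_means(values, n_resamples=1000):
    """Draw n_resamples resamples WITH replacement (each the same size
    as the original sample) and return the mean of each resample."""
    n = len(values)
    means = []
    for _ in range(n_resamples):
        resample = [random.choice(values) for _ in range(n)]  # balls back in the bag
        means.append(sum(resample) / n)
    return means

means = sorted(bootstrap_means(data))
# Crude 95% interval: the 2.5th and 97.5th percentiles of the resample means
low, high = means[25], means[974]
print(f"Bootstrap 95% interval for the mean: ({low:.0f}, {high:.0f})")
```

Because every resample differs slightly, a statistic computed on each one (here the mean) varies from resample to resample, and that spread is what the confidence intervals, and the 'which therapy wins' percentages in Figures 1 and 2, are read off from.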
 

Sam Carter

Guest
Messages
435
Someone asked me how the percentages in Table 4 had been calculated, because they're not equal to N/(total n)*100.

Have they been adjusted somehow? (Or are we both misreading the paper?)
 

Valentijn

Senior Member
Messages
15,786
Regarding bootstrapping - isn't there a fundamental problem with having the ability to "bootstrap" repeatedly until you randomly get the results you want? And why would bootstrapping ever be a satisfactory replacement for doing the actual statistical analysis?
 

user9876

Senior Member
Messages
4,556
Thanks. For those who prefer a less mathematical explanation of bootstrapping, try this PowerPoint one (2001) - it still makes sense if you skip the maths slides, and you only need the first 10 or so slides, plus the summary/conclusion.



Applying bootstrapping to this paper
In this paper, 1,000 resamples (each of 570) were made of the original data for net QALY benefit. Figure 1 shows the results. So, where healthcare providers are willing to spend £30,000 to gain one QALY (the threshold usually used in the UK), they found that for 100 of those 1,000 resamples APT came out as the most effective, in about 250 of the resamples GET came out top, and in almost all the rest CBT came out top. Note that each resample has slightly different data, which is why CBT 'won' in some resamples while APT won in others.

Is Bootstrapping a reliable way to evaluate data?
This from the powerpoint presentation above:
  • A very very good question !
  • Jury still out on how far it can be applied, but for now nobody is going to shoot you down for using it.
  • Good agreement for normal (Gaussian) distributions; skewed distributions tend to be more problematic, particularly for the tails (the bootstrap underestimates the errors).
I think that's a 'maybe'.

Thanks for the explanation.

My guess is that the number of samples you need will depend on the complexity of the distributions that they are drawn from; hence it works well for a normal distribution. I seem to remember that if you fit a distribution to a set of samples, the amount of data you need grows exponentially with each additional variable. In the document that Dolphin pointed to (message #214) there are references to papers looking at sample size.

One of the things that worried me about the PACE trial results was the increased standard deviation for each group. My first thought was that the result pdfs were multimodal. They haven't published that kind of information, so it's impossible to tell.

However, as Dolphin suggested, I don't think this is the major issue with the work. To me the major issue is the gap between actual and perceived function/fatigue. It's also interesting to look at the technicalities of the 3 different scales reported. The designers of the EQ-5D scale are very clear that the coding they use has no arithmetic meaning, hence you can't add up the scores, yet this is the mistake that both the fatigue scale and the SF36-PF scale fall into. I have slightly more sympathy for the SF36-PF scale in that it is measuring a single factor; however, it suffers from edge effects and non-linearities, hence using means and standard deviations is not valid. Hence the clinically useful difference they use is also not valid.
 

Dolphin

Senior Member
Messages
17,567
Someone asked me how the percentages in Table 4 had been calculated, because they're not equal to N/(total n)*100.

Have they been adjusted somehow? (Or are we both misreading the paper?)
Can you point out which ones are out? Although the top figure gives sample sizes, it's still possible the sample size could be a little smaller for individual questions which were incomplete/unclear/spoiled in some way.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
Regarding bootstrapping - isn't there a fundamental problem with having the ability to "bootstrap" repeatedly until you randomly get the results you want? And why would bootstrapping ever be a satisfactory replacement for doing the actual statistical analysis?

Hi Valentijn, potentially I think you are right. Keep retesting till you get a sequence of data with the result you want. Or, if this doesn't work and you are very unethical, scrap it and start again. Repeat until you get a sequence you like. Then stop testing. Whether this happens in any particular case might be very hard to judge however.

Bye, Alex
 

Dolphin

Senior Member
Messages
17,567
Applying bootstrapping to this paper
In this paper, 1,000 resamples (each of 570) were made of the original data for net QALY benefit. Figure 1 shows the results. So, where healthcare providers are willing to spend £30,000 to gain one QALY (the threshold usually used in the UK), they found that for 100 of those 1,000 resamples APT came out as the most effective, in about 250 of the resamples GET came out top, and in almost all the rest CBT came out top. Note that each resample has slightly different data, which is why CBT 'won' in some resamples while APT won in others.
Thanks for that. However, the solid line is SMC alone, not APT (see Figure 2), i.e. they found that for (around) 100 of those 1,000 resamples, SMC came out as the most effective.
 

Sam Carter

Guest
Messages
435
Can you point out ones which are out. Although the top figure gives sample sizes, it's still possible the sample size could be a little smaller for individual questions which were incomplete/unclear/spoiled in some way.

All the ones I've checked are incorrect; as an example, for APT (n=141)

Income benefits N (%)

6-month pre-randomisation period 28 (18) but 28/141*100=19.86

12-month post-randomisation period 33 (22) but 33/141*100=23.40

Assuming they've rounded percentages roughly +- 0.5 in either direction, then to derive the percentages shown, you would need n∈{152, 153, 154, 155, 156, 157, 158, 159, 160} for the first calculation, and n∈{147, 148, 149, 150, 151, 152, 153} for the second calculation.

I think you're right about the sample size varying as a consequence of incomplete data.

ETA: But if n (total n) varies, is it not somewhat misleading to present data in the form of N (%), when the table implies that n does not vary? Would it not be more accurate, and consistent, simply to give the percent rather than an absolute number, since if the absolute number is taken from a smaller sample it will, in this instance, potentially understate the numbers receiving welfare/income benefits, as those for whom data is missing might also be claiming some kind of benefit? In short, the absolute numbers given set a lower bound on the numbers of claimants; in reality it could be higher.
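Sam's reverse-engineering of the denominator can be checked mechanically: given an absolute count N and a rounded percentage, list every n consistent with it. A quick sketch in Python, using the APT figures quoted above and Sam's assumption that percentages were rounded to within ±0.5:

```python
def consistent_ns(count, pct, n_min=100, n_max=200):
    """All denominators n for which count/n*100 is within +-0.5 of pct,
    i.e. could have been rounded to pct (Sam's rounding assumption)."""
    return [n for n in range(n_min, n_max + 1)
            if abs(count / n * 100 - pct) <= 0.5]

# APT income benefits, 6-month pre-randomisation: 28 (18%)
print(consistent_ns(28, 18))   # 152..160; n=141 is not consistent
# APT income benefits, 12-month post-randomisation: 33 (22%)
print(consistent_ns(33, 22))   # 147..153
```

This reproduces Sam's two candidate sets, and confirms that the stated n=141 is consistent with neither percentage.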
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Someone asked me how the percentages in Table 4 had been calculated, because they're not equal to N/(total n)*100.

Have they been adjusted somehow? (Or are we both misreading the paper?)


Edit: Sorry everyone, it looks like I've got this wrong.
These details are incorrect. See my later post, and biophile's post, for an explanation.

Hi Sam,
Eagle eyes! (Good find.)
It looks like the percentages relate to the original PACE Trial study numbers.
You can see the relevant numbers in Table 1 of this paper. (159, 161, 160, 160)

It's not very helpful.
I think it makes the percentages that they've given slightly lower than they should be.
So the actual percentages are slightly higher.

I'm sure it's not an intentional 'error', to make the figures for benefit claimants, and insurance claimants, look better than they are.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Regarding Lost Employment, the paper says:
"There was no clear difference between treatments in terms of lost employment."
And yet the unadjusted differences between therapies and SMC are as follows:
Difference from SMC: Changes in Lost Employment: APT = 62, CBT = -1,157, GET = -711, SMC = 0

I'm not complaining about this, but it is a bit confusing.
I'm surprised that a £1157 unadjusted difference can be adjusted down to 'no clear difference'.
 

Simon

Senior Member
Messages
3,789
Location
Monmouth, UK
Regarding bootstrapping - isn't there a fundamental problem with having the ability to "bootstrap" repeatedly until you randomly get the results you want? And why would be bootstrapping ever be a satisfactory replacement for doing the actual statistical analysis?
Hi Valentijn
With bootstrapping, the more resampling (more bootstrapping), the more reliable the estimate is likely to be, so the problem would be with stopping early when you get a result you like, not keeping going until you like the results. 1,000 resamples (as in this paper) is a decent size, so it's unlikely they stopped early to get the 'right' results in this case.
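The stopping-early point can be seen by running the bootstrap with different numbers of resamples. A toy Python sketch (made-up per-patient values, not the PACE data) tracking the fraction of resamples whose mean net benefit is positive:

```python
import random

random.seed(7)  # reproducible toy example

# Made-up per-patient net benefits; NOT the trial's actual data
data = [random.gauss(80, 1500) for _ in range(570)]

def share_positive(values, n_resamples):
    """Fraction of bootstrap resamples whose mean net benefit is positive."""
    n = len(values)
    wins = 0
    for _ in range(n_resamples):
        mean = sum(random.choice(values) for _ in range(n)) / n
        wins += mean > 0
    return wins / n_resamples

# With only a handful of resamples the estimate bounces around,
# so stopping at a flattering moment is possible; by ~1,000 resamples
# it has largely settled.
for k in (10, 50, 1000):
    print(k, share_positive(data, k))
```

This is why a decent number of resamples, run to completion, leaves little room for cherry-picking: the estimate converges rather than drifting towards a desired answer.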
 

Simon

Senior Member
Messages
3,789
Location
Monmouth, UK
This is my understanding so far (corrections always welcome):

Table 6 shows figures for the 'difference' between (the changes in) each of the therapy groups, and the SMC control group... The 'difference' between (the changes in) each of the three therapies and the SMC control group, gives the 'incremental' costs, savings and improvements for the three therapies (as laid out in Table 6). The 'difference' between the groups effectively nullifies the effects of SMC, in order to only consider the changes that three therapy groups are responsible for. This is the correct way of going about it, as the control group is designed to adjust the results for natural fluctuations, etc.

I think that Figures 1 & 2 use this data (the 'differences', or the 'incremental changes' in Table 6) to plot the graphs. (Although I don't understand how they work out Figures 1 & 2.)
Hi Bob
I think your interpretation of Table 6 is spot on but I think Figures 1 & 2 are computed on a different basis, looking at costs/benefits vs baseline, rather than vs SMC. This from the method:
Interpretation of the cost-effectiveness results was made using cost-effectiveness acceptability curves [18]. Net benefit values were computed for each study participant, defined as the value of a QALY multiplied by the number of QALYs gained minus the cost (from both healthcare and societal perspectives). We used QALY values ranging from £0 to £60,000 in increments of £5000. For each QALY value, regression models were used to determine the difference in net benefit between the four treatment arms, controlling for baseline utility and costs. Bootstrapping with 1000 resamples allowed the proportion of resamples showing APT, CBT, GET and SMC as having the highest net benefit
So to me it looks like they compute QALY gains for each individual vs baseline, rather than vs SMC. But I'm not entirely sure about that.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Thanks very much to everyone for the info about Figure 1.

I think I've finally figured out how to work out the QALY-based net benefit per individual, that Figure 1 is based on.
I was making a basic error before.

Here is how they work out the QALY-based net benefit per individual:

"Net benefit values were computed for each study participant, defined as:
the value of a QALY
multiplied by the number of QALYs gained
minus the cost (from both healthcare and societal perspectives)."


So to make the calculations, you take the proposed value of a QALY (e.g. £30,000); multiply it by the incremental number of QALYs gained per individual for each therapy, given in Table 6 (this gives you the QALY-based gross total cost benefit per individual for each therapy); and then subtract the QALY-based individual healthcare cost for each therapy (this is calculated by taking the cost per QALY for each therapy, given in Table 6, multiplied by the numbers of QALYs gained for each individual, given in Table 6.)

This gives the QALY-based net benefit values for each individual for each therapy, relative to SMC, which I think Figure 1 is based on. My figures seem to correspond to Figure 1 anyway.

And here are my calculations for three different QALY values (£30,000, £20,000, £0):

(Negative values are net costs. Positive values are net savings.)


£30,000 QALY value
SMC 0
APT (30000 x 0.0149) [447] - (55235 x 0.0149) [823] = -376 (net cost)
CBT (30000 x 0.0492) [1476] - (18374 x 0.0492) [904] = 572 (net saving)
GET (30000 x 0.0343) [1029] - (23615 x 0.0343) [810] = 219 (net saving)



£20,000 QALY value
SMC 0
APT (20000 x 0.0149) [298] - (55235 x 0.0149) [823] = -525 (net cost)
CBT (20000 x 0.0492) [984] - (18374 x 0.0492) [904] = 80 (net saving)
GET (20000 x 0.0343) [686] - (23615 x 0.0343) [810] = -124 (net cost)



£0 QALY value
SMC 0
APT 0 - 823 = -823 (net cost)
CBT 0 - 904 = -904 (net cost)
GET 0 - 810 = -810 (net cost)



At a proposed QALY value of £20,000 (see above), the net benefit values of CBT and GET are pretty close to zero (i.e. crossing from negative values to positive values). This is near to the (relative) 'zero' value of SMC. So this is why the CBT/GET/SMC lines cross over near £20,000, on Figure 1. This makes me think that I've got these calculations right this time.


Edit: Except, my numbers don't seem exact enough, so there is room for improvement here.
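Bob's three-line arithmetic above can be automated directly. The incremental QALYs gained and cost-per-QALY figures are the Table 6 numbers quoted in the post; the script just repeats the same calculation for each therapy and QALY value:

```python
# Incremental QALYs gained vs SMC, and cost per QALY gained (pounds),
# as quoted from Table 6 in the post above
therapies = {
    "APT": (0.0149, 55235),
    "CBT": (0.0492, 18374),
    "GET": (0.0343, 23615),
}

def net_benefit(qaly_value, qalys_gained, cost_per_qaly):
    """Net benefit = (value of a QALY x QALYs gained) - cost of gaining them.
    Positive means a net saving relative to SMC, negative a net cost."""
    return qaly_value * qalys_gained - cost_per_qaly * qalys_gained

for qaly_value in (30000, 20000, 0):
    print(f"QALY valued at £{qaly_value}:")
    for name, (qalys, cost_per_qaly) in therapies.items():
        print(f"  {name}: £{net_benefit(qaly_value, qalys, cost_per_qaly):+.0f}")
```

At £30,000 this reproduces the -376 (APT), +572 (CBT) and +219 (GET) above, and the CBT and GET values do indeed cross zero near the £20,000 mark, matching where the lines cross in Figure 1.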
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Hi Bob
I think your interpretation of Table 6 is spot on but I think Figures 1 & 2 are computed on a different basis, looking at costs/benefits vs baseline, rather than vs SMC. This from the method:

Interpretation of the cost-effectiveness results was made using cost-effectiveness acceptability curves [18]. Net benefit values were computed for each study participant, defined as the value of a QALY multiplied by the number of QALYs gained minus the cost (from both healthcare and societal perspectives). We used QALY values ranging from £0 to £60,000 in increments of £5000. For each QALY value, regression models were used to determine the difference in net benefit between the four treatment arms, controlling for baseline utility and costs. Bootstrapping with 1000 resamples allowed the proportion of resamples showing APT, CBT, GET and SMC as having the highest net benefit

So to me it looks like they compute QALY gains for each individual vs baseline, rather than vs SMC. But I'm not entirely sure about that.

Hi Simon,
Thank you very much for that.
Our posts crossed over.
If you are interested in this, then see if you agree with what I've done with my previous post.
Bob

Edit: Table 6 seems to be based on the changes (from pre-randomisation to post-randomisation) over and above SMC, which they indicate as 'incremental' costs and effects.
 

user9876

Senior Member
Messages
4,556
Hi Bob
I think your interpretation of Table 6 is spot on but I think Figures 1 & 2 are computed on a different basis, looking at costs/benefits vs baseline, rather than vs SMC. This from the method:

So to me it looks like they compute QALY gains for each individual vs baseline, rather than vs SMC. But I'm not entirely sure about that.

They seem to be using a regression model.
For each QALY value, regression models were used to determine the difference in net benefit between the four treatment arms, controlling for baseline utility and costs.
 

Simon

Senior Member
Messages
3,789
Location
Monmouth, UK
This gives the QALY-based net benefit values for each individual for each therapy, relative to SMC, which I think Figure 1 is based on. My figures seem to correspond to Figure 1 anyway.

And here are my calculations for three different different QALY values (£30,000, £20,000, £0):

(Negative values are net costs. Positive values are net savings.)


At a proposed QALY value of £20,000 (see above), the net benefit values of CBT and GET are pretty close to zero (i.e. crossing from negative values to positive values). This is near to the (relative) 'zero' value of SMC. So this is why the CBT/GET/SMC lines cross over near £20,000, on Figure 1. This makes me think that I've got these calculations right this time.
They seem to be using a regression model.
Thanks User & Bob
Based on our discussions here, my best/final guess is that Figs 1 & 2 were constructed as follows:
  • Net benefit (QALY value x change from baseline - cost) is calculated for each patient based on changes from baseline
  • 1,000 resamples are created in the bootstrapping process, and for each resample regression analysis is used to compute which therapy is best (giving the percentage likelihood of each therapy being best at each QALY value).
Bob, your figures are calculated using average data for the CBT, GET etc. groups, rather than the per-individual calculation (vs baseline, not SMC) that I think is used in Figs 1 & 2. But of course results based on averages should be broadly similar to results based on individuals, which is, I think, why your calculations give similar answers to theirs, i.e. they have done the calculations right, and so have you.
 

Sam Carter

Guest
Messages
435
Hi Sam,
Eagle eyes! (Good find.)
It looks like the percentages relate to the original PACE Trial study numbers.
You can see the relevant numbers in Table 1 of this paper. (159, 161, 160, 160)

It's not very helpful.
I think it makes the percentages that they've given, slightly lower than they should be.
So the actual percentages are slightly higher.

I'm sure it's not an intentional 'error', to make the figures for benefit claimants, and insurance claimants, look better than they are.

Thanks, Bob - but not my eagle eyes!

I don't want to labour this point because it may be of no importance (and my calculations could well be wrong), but looking at the figures in the APT column, the (small) n of 141 gives the wrong percentage for all of the (absolute) Ns provided; for the other columns it's possible to find one, and only one, percentage (p) such that N/n*100=p, but it isn't clear to me what the (small) n denotes in this context.