BBC Radio 4 - Ben Goldacre's - Bad Evidence

Firestormm

Senior Member
Messages
5,055
Location
Cornwall England
http://en.wikipedia.org/wiki/Blind_experiment

"A blind or blinded experiment is a scientific experiment where some of the people involved are prevented from knowing certain information that might lead to conscious or subconscious bias on their part, thus invalidating the results."

If you know what treatment you are getting then it's not blinded. Period. To be double blinded then even the researchers cannot know.

SMC was a control. That's something else. So far as I recall the patients were randomly allocated too. So it's a randomized controlled experiment, but not blinded. There is however doubt about the value of the control group chosen and other important aspects of the experimental design and implementation.

OK. So PACE was a Randomised Controlled Trial. But blinding is a preference that is not always adopted? Your preference would be double blinding. When it comes to drugs then blinding is, I believe, a requirement. I still think - as we were looking at safety and clinical treatments - PACE could have been blinded using SMC or patients receiving no care.
 

user9876

Senior Member
Messages
4,556
Why not undertake an RCT to explore the e.g. financial costs and personal costs involved in reassessing someone who has been on invalidity benefit for X number of years; in migrating them to a new benefit; in assessing the effectiveness and fairness of the process; of following former claimants who are found 'fit for work' to see if they can find a job and hold onto it and/or if they are forced to return to benefits due to their health; to see what effect all of this upheaval has on a person's life before during and after etc. etc.

These kinds of studies are rarely undertaken. This kind of consideration rarely happens. One reason is probably time. Each Government has (now) 5 years. Each Government wants to stamp its mark and reward its supporters by at least trying to follow through on what it proclaims in its manifesto. And yet we have a civil service that exists largely intact throughout the change in Government. So such studies are possible, I would suggest.

The question is whether you can generalise the results from the cases you study to other cases. The first step is to carefully specify the conditions under which the RCT took place. You can then start to examine what assumptions you are making as you generalise from the specifics in your trial to the general case. Without taking this step, and I would argue doing it in a formal manner (i.e. with an underlying language with well specified semantics), you are making arbitrary and hidden generalisations and there is no discussion over the validity. I've not seen anyone trying to go in this direction, yet it seems essential to a scientific approach when doing things in a complex world.

Then there are all the measurement issues. If you can't measure you can't do science. Your conclusions need to be supported by an understanding of the accuracy of your measurement system: can something be statistically significant if it is within the error range of the measurement equipment? (It probably depends on the error distributions.) Surveys have many issues and are quite inaccurate, especially when done badly. We should take real care.
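To make the point about error distributions concrete, here is a rough sketch rather than a definitive analysis. It assumes measurement errors are random, zero-mean, and independent of the true scores; the function name and all the numbers are made up for illustration. Under those assumptions, noise in the instrument inflates the standard error of a difference between two group means, but large groups can still average it out. A systematic bias in the instrument would not average out, and this sketch does not model that case.

```python
import math

def z_for_difference(true_difference, sd_between_people, sd_measurement, n_per_group):
    """Approximate z statistic for a difference between two group means.

    Hypothetical helper: assumes equal group sizes and independent,
    zero-mean measurement error added to every recorded score.
    """
    # Variance of one recorded score = real variation + measurement noise
    variance_per_score = sd_between_people ** 2 + sd_measurement ** 2
    # Standard error of the difference between two independent group means
    standard_error = math.sqrt(2 * variance_per_score / n_per_group)
    return true_difference / standard_error

# Same true difference (2 points), illustrative values only
z_precise = z_for_difference(2.0, 10.0, 0.0, 200)      # perfect instrument
z_noisy = z_for_difference(2.0, 10.0, 10.0, 200)       # noisy instrument
z_noisy_big = z_for_difference(2.0, 10.0, 10.0, 2000)  # noisy instrument, bigger groups

print(z_precise, z_noisy, z_noisy_big)
```

So a difference much smaller than the error on a single reading can still reach significance if the errors are random and the groups are large enough, which is why the shape of the error distribution matters.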

My concern is that people follow a ritual that they claim is scientific without thinking it through and hence bad judgements are made. I don't see how anyone can go down the RCT type route for more than simple tests without having a decent underlying mathematics (inc logic) to support reasoning. Stats on results don't tell anywhere near the full story.

Then how do we talk about and combine different pieces of evidence? There are evidential reasoning frameworks (largely based on Bayes and extensions) but these again are ignored.
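As a rough illustration of what combining evidence looks like in the simplest Bayesian setting, here is a minimal sketch. The function name, the prior, and the likelihood ratios are invented for the example, and it assumes the pieces of evidence are independent of each other given the hypothesis, which is a strong assumption that real studies often violate.

```python
def combine_evidence(prior_probability, likelihood_ratios):
    """Update a prior probability with several pieces of evidence.

    Each likelihood ratio is
        P(evidence | hypothesis true) / P(evidence | hypothesis false).
    Hypothetical helper: assumes the pieces of evidence are
    conditionally independent.
    """
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior_probability must be strictly between 0 and 1")
    odds = prior_probability / (1.0 - prior_probability)
    for ratio in likelihood_ratios:
        odds *= ratio  # Bayes' rule in odds form
    return odds / (1.0 + odds)

# Illustrative values: two supportive studies and one that points the other way
posterior = combine_evidence(0.5, [3.0, 0.5, 4.0])
print(posterior)
```

The odds form makes the bookkeeping explicit: every piece of evidence has to be given a stated weight, so the assumptions are at least open to discussion.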
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
Hi Firestormm, that's implementation of a control, not blinding. Even the control groups cannot know they are controls in a blinded experiment. Plus how do you get the clinicians to be unaware of who is getting what treatment? To be double blinded that is necessary.

RCTs without blinding are about all we can expect from psycho-social interventions. Double blinded placebo controlled randomized trials, the gold standard, work with drugs, but even there fail when patients or clinicians start to notice the presence or absence of side effects. It's not a valid protocol for psycho-social treatments.

Such studies can however be ranked in an EBM approach. That is all that EBM is at its core: a ranking system.

Bye, Alex
 

user9876

Senior Member
Messages
4,556
Economic data collection is much more established, codified and rigorous than that found in most other fields (although still problematic), from which economists can measure, correlate and devise multi factor regression equations to 'model' what they expect to happen in the short, medium and long-term. All very scientific you would imagine, despite the fact that many of the terms entered into the regressions are often no more than a finger in the air gut feeling. Policy makers would though be deeply impressed by the pages of calculus.

Economists seem to use very complex mathematics to express simple concepts. The economists I know tend to talk of models as being thought tools rather than being predictive. That is, they provide ways of structuring thoughts and assumptions and then looking at the consequences. I think this is useful in informing policy makers as long as they don't simply believe.

I have read a psychology paper around decision making which suggested decisions supported by numbers (and maths) were more likely to be made without thinking about the problem.
 

user9876

Senior Member
Messages
4,556
Marco. What about after-the-fact trials? You know how the newly elected party takes great strides to de-construct what has gone before - education being one area I think regularly gets a makeover with each new Minister. How about employing RCTs to look at existing policy in order to see how effective it really is? Or at least taking a more objective and quantitative approach to such things?

.

So you could do an RCT looking at the effects of smaller class sizes in primary schools, but what is an appropriate measurement point? Performance on leaving primary school, performance at 16 (7 years later), performance at university (10 years later) or say overall earning potential (? years later)? Then you need to control for your cohorts - I would argue that for each variable that you find it hard to control you should double the cohort size. So you need very big and long term trials. When do you make the decision? OK, for any decision there is value in waiting until the uncertainty is reduced (since economics is mentioned, real option theory can help frame this decision).
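The doubling rule of thumb above grows quickly. A minimal sketch of the arithmetic, where the function name and the starting cohort of 500 are made up for illustration and the rule itself is a heuristic, not a formal power calculation:

```python
def required_cohort(base_size, hard_to_control_variables):
    """Cohort size under the 'double it per hard-to-control variable' heuristic."""
    return base_size * 2 ** hard_to_control_variables

# Illustrative only: a base cohort of 500 with 0 to 5 awkward variables
for k in range(6):
    print(k, required_cohort(500, k))
```

With just four such variables the starting cohort is already multiplied sixteen-fold.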

But it's slow.

I don't see anything wrong with a post hoc analysis of what has happened, looking for failures and reacting. But that is how some conventional policy making happens (the non ideological part).
 

Firestormm

Senior Member
Messages
5,055
Location
Cornwall England
Thanks Alex. Nice find :)

Test, Learn, Adapt:
Developing Public Policy with Randomised Controlled Trials
Laura Haynes
Owain Service
Ben Goldacre
David Torgerson

Executive Summary:

Randomised controlled trials (RCTs) are the best way of determining whether a policy is working. They are now used extensively in international development, medicine, and business to identify which policy, drug or sales method is most effective. They are also at the heart of the Behavioural Insights Teamʼs methodology.

However, RCTs are not routinely used to test the effectiveness of public policy interventions in the UK. We think that they should be. What makes RCTs different from other types of evaluation is the introduction of a randomly assigned control group, which enables you to compare the effectiveness of a new intervention against what would have happened if you had changed nothing.

The introduction of a control group eliminates a whole host of biases that normally complicate the evaluation process – for example, if you introduce a new “back to work” scheme, how will you know whether those receiving the extra support might not have found a job anyway?

In the fictitious example below in Figure 1, we can see that those who received the back to work intervention were much more likely to find a job than those who did not. Because we have a control group, we know that it is the intervention that achieves the effect and not some other factor (such as generally improving economic conditions).
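[Illustration, not part of the paper.] The logic of the fictitious back-to-work example can be sketched as a small simulation. Everything here is an assumption for the sake of the example: the function name, the 30% base rate of finding work, and the 10-point true effect. The point is that whatever pushes the base rate up or down affects both randomly allocated groups alike, so the difference between them estimates the effect of the scheme itself.

```python
import random

def simulate_trial(n_people, base_rate, true_effect, seed=0):
    """Randomly allocate people to a scheme or to control and compare job rates.

    Hypothetical model: everyone finds work with probability base_rate
    (which would include things like economic conditions), and the scheme
    adds true_effect on top for those allocated to it.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    outcomes = {"intervention": [], "control": []}
    for _ in range(n_people):
        group = rng.choice(["intervention", "control"])  # random allocation
        probability = base_rate + (true_effect if group == "intervention" else 0.0)
        outcomes[group].append(rng.random() < probability)
    rates = {name: sum(found) / len(found) for name, found in outcomes.items()}
    return rates["intervention"], rates["control"]

treated_rate, control_rate = simulate_trial(20000, base_rate=0.30, true_effect=0.10)
estimated_effect = treated_rate - control_rate
print(treated_rate, control_rate, estimated_effect)
```

Without the control group only the first number would be available, and there would be no way to tell how much of it was due to the scheme.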

With the right academic and policy support, RCTs can be much cheaper and simpler to put in place than is often supposed. By enabling us to demonstrate just how well a policy is working, RCTs can save money in the long term - they are a powerful tool to help policymakers and practitioners decide which of several policies is the most cost effective, and also which interventions are not as effective as might have been supposed. It is especially important in times of shrinking public sector budgets to be confident that public money is spent on policies shown to deliver value for money.

We have identified nine separate steps that are required to set up any RCT. Many of these steps will be familiar to anyone putting in place a well-designed policy evaluation – for example, the need to be clear, from the outset, about what the policy is seeking to achieve. Some – in particular the need to randomly allocate individuals or institutions to different groups which receive different treatment – are what lend RCTs their power. The nine steps are at the heart of the Behavioural Insights Teamʼs ʻtest, learn, adaptʼ methodology, which focuses on understanding better what works and continually improving policy interventions to reflect what we have learnt. They are described in the box adjacent.

http://www.cabinetoffice.gov.uk/sites/default/files/resources/TLA-1906126.pdf

Published June 2012
 

Min

Guest
Messages
1,387
Location
UK
As far as I'm aware, Ben Goldacre is not a scientist, he's a psychiatrist who trained with Prof Simon Wessely.
 

Marco

Grrrrrrr!
Messages
2,386
Location
Near Cognac, France
The introduction of a control group eliminates a whole host of biases that normally complicate the evaluation process – for example, if you introduce a new “back to work” scheme, how will you know whether those receiving the extra support might not have found a job anyway?

Does this differ substantially from the common practice of using 'pilots' to test policy, whereby the results from the pilot group (the new back to work scheme in the above example) are compared to outcomes for those not included in the pilot (the base case)? If you suspect geographical variations may be a factor then you run a number of pilots in different locations.

Why dress this up as an RCT, which most people would associate with the practice of blinding or double blinding as discussed above, when in fact it's established practice in policy development, monitoring and evaluation (which is often post-hoc as Firestormm suggested)?

You could of course get over geographical or similar confounders by randomly allocating all participants in a scheme to either the new policy group or 'business as usual', but there are few areas of public policy (e.g., smaller school class sizes) where some malcontent won't bring a case under equality legislation that they or their nearest and dearest were denied an opportunity by being 'excluded' from the new policy.

At least with pilots you can usually get away with 'excluding' non-participants as the concept is well understood and usually only involves a minority.

Really, laboratory methods don't translate well into more naturalistic settings.
 

biophile

Places I'd rather be.
Messages
8,977
PACE was completely non-masked, except for some of the evaluation and analyses. The SMC group was a mediocre control, better than nothing but not as closely monitored as the therapy groups, and different groups received different expectations. Interestingly, the Lancet does not refer to PACE as an RCT, but as a randomised trial.

PACE methodology was better than most CBT/GET studies but it was clearly not a "gold standard" by drug trial standards and should never be hinted as such even if it is impractical to be anything more. Alex has previously used the clever phrase "lead standard". If PACE was a drug trial, it would be dumped on by Goldacre, but since he is a believer in a powerful placebo effect, perhaps he would just overlook the pesky issues of masking and reactivity, instead praising the wonders of the mind to heal the body in CFS even though the results were modest and not supported by other important measures?
 
Messages
13,774
Economics is a very large driving force, especially in today's climate. It was after all the driver behind the austerity measures that kicked off the welfare reforms. But. When this Government decided to reform the welfare system they began with a decision to slash £20bn from the expenditure (figure from memory). Then they determined how they'd achieve it. Then they determined 'work is good'. I couldn't help but feel the cart was before the horse.

I agree with the general point, but the specifics are really off here. The BPS approach to managing disability has been building up for twenty years. This was not just a response to the current economic problems, but part of a long-term approach to justifying pushing those with health problems in to work or poverty in a way that people would normally be instinctively repulsed by.

It also seems really f-ing complicated. I feel like I've got a vague sense of things now, but still with a lot of uncertainty, but it's something one could spend a lifetime reading about, and still have more to learn.

This thread on BPS includes a section on one of Aylward's reports, which is cited by ministers as the justification for their reforms:

http://forums.phoenixrising.me/inde...-biopsychosocial-model-paper-from-2004.17783/

I think that something like this article gets to some of the truth:

http://internationalgreensocialist....strategies-for-getting-the-sick-back-to-work/

I'm not really confident about all of it though and think that I end up disagreeing with aspects of almost everything I read on the topic. The trouble is that this is politics and history, so it's a lot harder to 'prove' what happened and why. This means that it's easy to slip in to trying to give people with power and authority the benefit of the doubt (it feels so much nicer to do so!), in ways which distort one's view of reality. I think that this is a really important area, very directly tied to how CFS has been treated, that would benefit from a lot more critical research and reading. (I don't want to do it though - so much work... )
 

barbc56

Senior Member
Messages
3,657
As far as I'm aware, Ben Goldacre is not a scientist, he's a psychiatrist who trained with Prof Simon Wessely.

Yes you are correct. He is a psychiatrist. I think he went to the same program as Wessely or was affiliated with the same department but I need to look that up.

This is all very fascinating but difficult to trudge through. But trudge I will. Thanks to all for the links.

The more I learn, the more I learn how little I know. -- Socrates
 
Messages
13,774
I just re-read some of that biopsychosocial thread I mentioned, and parts of it sounded quite relevant to the discussion here, eg:

re blame: I think that checks and balances require the potential for blame, and I don't think systematic problems absolve individuals of personal blame. Our systems were set up by individuals, and individual researchers had a responsibility to be aware of the systems which they were operating within.

I'm possibly less optimistic about our systems' potential to self-correct. As you said, I think that those of us who have been badly affected are still groping towards a firm understanding of exactly what has happened and why... and most people are not that interested in doing the work needed to understand these issues. It's really boring! I'm concerned that society's growing ability to pump out complicated and misleading research is going to lead to ever less genuine democratic participation, and much more potential for elites to be able to manipulate and control the social consensus - at least when it comes to matters which affect only a minority, and naturally tend to attract little public attention and interest. Or it could be that people will decide that it's important to make a change, and impose more accountability upon those in power and authority. Fingers crossed.

http://forums.phoenixrising.me/inde...cial-model-paper-from-2004.17783/#post-270979

I'm not sure what I think about all this, but I can feel as if I can sort of see it from both sides. I think I had an undue faith in 'science', 'evidence', researchers, etc prior to taking a close look at the specifics of a lot of research. If I hadn't had bad personal experiences related to CFS, I think that I could have been quite easily suckered by badly done 'research' which claimed to show that policies which one would think were harmful to the poor and helpful to the rich actually helped the least well off. I think that a lot of people have a bit of a blind spot here, particularly those people who have tended to be lucky-ish in life, and avoid social problems (middle-class liberals who can pass exams okay) - and these tend to be the sorts of people who slip in to positions of power and authority.

Obviously, it's great to have access to more useful and meaningful data, but I do think that there are lots of reasons to be concerned about this sort of approach to policy development that are poorly appreciated by most of the people who 'matter'.

(Haven't listened to the programme yet - wouldn't it be wonderful if Goldacre carefully addressed all of these sorts of concerns? That would be such a relief!)
 

natasa778

Senior Member
Messages
1,774
RCTs are almost useless when applied to diseases with complex and not-completely-understood pathologies and manifestations unless they take into account the complexities of the issues. Which they cannot, by default.
 

barbc56

Senior Member
Messages
3,657
I haven't listened to the podcast, either. I would love to hear what people think.

Part of my training is in experimental design and statistics. I can't believe how much I've forgotten. It does come back but not as readily as before I got sick. There are all sorts of problems with applying studies to political policy. For that matter, translating any soft science to a lot of issues, political policy being one of them, is inherently difficult.

Barb
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
I agree with the general point, but the specifics are really off here. The BPS approach to managing disability has been building up for twenty years. This was not just a response to the current economic problems, but part of a long-term approach to justifying pushing those with health problems in to work or poverty in a way that people would normally be instinctively repulsed by.

It also seems really f-ing complicated. I feel like I've got a vague sense of things now, but still with a lot of uncertainty, but it's something one could spend a lifetime reading about, and still have more to learn.

This thread on BPS includes a section on one of Aylward's reports, which is cited by ministers as the justification for their reforms:

http://forums.phoenixrising.me/inde...-biopsychosocial-model-paper-from-2004.17783/

I think that something like this article gets to some of the truth:

http://internationalgreensocialist....strategies-for-getting-the-sick-back-to-work/

I'm not really confident about all of it though and think that I end up disagreeing with aspects of almost everything I read on the topic. The trouble is that this is politics and history, so it's a lot harder to 'prove' what happened and why. This means that it's easy to slip in to trying to give people with power and authority the benefit of the doubt (it feels so much nicer to do so!), in ways which distort one's view of reality. I think that this is a really important area, very directly tied to how CFS has been treated, that would benefit from a lot more critical research and reading. (I don't want to do it though - so much work... )

Hi Esther12, yes, it's an important area. The amount of work involved is why I guesstimated it will take me 10 years. I commented on the Thorburn article when it first came out. One of the huge issues is that the theory about what those in BPS are doing, and the implementation of that, are not the same. Bye, Alex
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
What RCTs are good for is prototyping simple changes in policy, like how a service is delivered. Even then you have to get your measures to reflect reality. You can optimize what you can measure this way, but you can't optimize things that cannot be measured. So a government might find a more streamlined optimized service could result, but the optimization might have made the service worse for the people it is supposed to help. Optimizing to number targets is an area where management is failing. Remember ATOS? That's an extreme example, but recall that ATOS is highly secretive ... and therefore not appropriately transparent and accountable.

Having said that I am going back to finish reading the article later. What I have read so far indicates that it's small things they want the RCTs to apply to, especially things for which government is already tracking data. This will mitigate many of the issues I have with this approach. They also mention that government needs to be transparent.

We have to know the data to be able to analyze things, and to properly optimize any service requires public input ... hence public data.

More to come in a later post.

Alex.
 
Messages
13,774
For that matter, translating any soft science to a lot of issues, political policy being one of them, is inherently difficult.

It can also be difficult to distinguish where 'hard' science blurs in to 'soft'.

Hi Esther12, yes, it's an important area. The amount of work involved is why I guesstimated it will take me 10 years. I commented on the Thorburn article when it first came out. One of the huge issues is that the theory about what those in BPS are doing, and the implementation of that, are not the same. Bye, Alex

We discussed it a bit in the thread I mentioned. (It's a bit shocking how much of what I write I forget!).

10 years is quite a project! Might be worth trying to break that up in to blog posts, or you'll have found everything's changed before you get it done. It is really difficult being ill though, as a 7 hour day can be so much more productive than 7 one hour days.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
Yes Esther12, my planned blogs are more or less in parallel with my book. Given the limitations of blogs I cannot put everything I do into them. Additionally it's an incremental approach, which is just one way to deal with brain fog. I do not know that it can all be done piecemeal in blogs. What I might do from time to time is upload extended files, with the understanding that they are a work in progress not a final result and things could change a lot. Bye, Alex
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
http://www.cabinetoffice.gov.uk/sites/default/files/resources/TLA-1906126.pdf

If you look at the optimization mentioned, such as with payment of fines in the court system, they got some positive results. However the outcome measure was court defined and specific. It is indeed possible to improve such measures, but what if we should be considering, for example, the disabled and their services and benefits? If you optimize for cash saved by that department, you will get results that look good on paper. However, the real question is whether the services better address the needs of BOTH government and the disabled.

The government works for society, and should be accountable to it. This is not what I see most of the time.

If the disabled are forced off benefits, this might be a very bad thing not a good thing. For a start, are costs of constant appeals being taken into account? Appeals cost money.

Are medical costs being taken into account? A broke disabled person might not even be able to get to a doctor or hospital, or be otherwise unable to look after themselves adequately. Hence they might have more medical issues. On the one hand they might die, oh look, that's a big saving to government. On the other hand their health issues might worsen and take up more and more medical resources in an already overstretched system. Is that a saving?

What about legal issues? For example, a broke disabled person who has some physical capacity might turn to crime. This hurts society, and then court means the legal process is invoked ... at huge financial cost. Increase in policing and public relations costs also go with this.

Are secondary administrative costs being taken into account? How much extra paperwork is being generated at all levels of government including local councils?

What about secondary demands? They might put more pressure on other services, especially those involving food or accommodation. Are those costs taken into account?

A narrow outcome measure such as how much money is saved on disability payments has to be balanced against costs and benefits in other areas. Simply isolating that one figure and optimizing it might cost society and government MORE not LESS in the big picture. Is this being tracked?
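A minimal sketch of the bookkeeping this implies. Every figure below is invented, in arbitrary units, purely to show how a headline saving can turn into a net loss once knock-on costs are counted; none of it is real data, and the function name and cost categories are just taken from the list of questions above.

```python
def net_saving(headline_saving, knock_on_costs):
    """Headline saving minus the sum of the costs it causes elsewhere."""
    return headline_saving - sum(knock_on_costs.values())

# Invented figures in arbitrary units, for illustration only
knock_on = {
    "appeals": 30,
    "extra medical care": 45,
    "courts and policing": 20,
    "secondary administration": 15,
    "food and accommodation services": 10,
}

result = net_saving(100, knock_on)
print(result)  # negative means the 'saving' cost more than it saved
```

An evaluation that only tracks the first number would report a success either way.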

How about intangible benefits? Public dissent is a factor in government policy. If government got things right more often they would not have to spend the huge fortune that is currently spent on public relations. The disabled would also be better served, and this would have a flow-on effect to their friends and family.

The paper also cites an example of RCTs saving lives, in this case the use of steroids in head injury. It had been presumed this saved lives, when in fact it increased the death rate. This is a valid use of an RCT, and is only one of several examples I am aware of. However the death rate is a clear issue, and the use of steroid therapy can be strictly defined.

They also stress that RCTs are a good way to show value for money. That this can save money I have no doubt, the question is about value. Saving money for identical or better results is great, saving a lot of money for slightly worse results might be justifiable, but outcomes often include intangibles that are not factored in, and other issues that are deliberately ignored.

Money is often at the heart of this. Money is easily represented as a number. How about social cohesion, justice or fairness? How do you enumerate those?

Step two of the test recommendations is where I have my major issues: "Determine the outcome that the policy is intended to influence and how it will be measured in the trial." A mistake here can lead to much worse results instead of better ones. For example, in an education policy they comment the results might be examination results. This then ties the test to an interpretation of what the exam results mean. It's not just about numbers. Better test scores in a more poorly educated student population are a risk of teaching to tests. The overall education strategy has to be more broad than just test results.

What RCTs can do is essentially the same as what pilot studies do, but they offer a methodology to combine multiple pilot studies and compare results. Suppose there were three proposed small changes in education policy, all of which are optimized to exam results. An RCT with a control group would enable you to compare these policies. It's additional information to policy makers. All that information is useful, but the results might tend to be over-emphasized on the grounds the results are more "scientific" than other analyses. It could also be spun that way to push an agenda.

Step 7 discusses other issues, including secondary measures and qualitative data. Such measures can be useful in helping to interpret the study, and to some extent, for well designed and appropriate studies, will mitigate many of my concerns.

I assert again, to go down this road we first need accountable and transparent government.

The Test, Learn, Adapt strategy I have no problem with, in fact I will be promoting similar ideas. It's the implementation and how reliable such studies are considered to be that are the worry, particularly if the raw data is hidden and for issues driven by purely economic considerations.

Bye, Alex