
Paper on therapeutic allegiance of researchers affecting RCT results

Esther12

free full paper here: http://mentalhealthpros.com/mhp/pdf/Dodo-bird-meta-analys.pdf

I got a bit tired toward the end, but tried to pull out the bits which would be key to assessing the value of CBT/GET for CFS.

I've started looking more generally at the extent to which psychological interventions can be seen as 'placebo', and the extent to which 'placebo' can simply mean patients wanting to be polite and positive in questionnaires about those who have tried to help them, rather than the patients themselves seeing any significant improvement in their health problems. It seems like a lot of psychiatrists/therapists/etc. see themselves as providing little more than placebo, but think that this is a wonderful and life-changing thing. I'm not sure how evidence-based this view is.

THE DODO BIRD VERDICT IS ALIVE AND WELL – MOSTLY
Lester Luborsky, Robert Rosenthal, Louis Diguer, Tomasz P. Andrusyna, Jeffrey S. Berman, Jill T. Levitt, David A. Seligman, Elizabeth D. Krause

17 meta-analyses have been examined of comparisons of active treatments with each other, in contrast to the more usual comparisons of active treatments with controls. These meta-analyses yield a mean uncorrected absolute effect size for Cohen’s d of .20, which is small and non-significant (an equivalent Pearson’s r would be .10). Its smallness confirms Rosenzweig’s supposition in 1936 about the likely results of such comparisons. In the present sample, when such differences are then corrected for the therapeutic allegiance of the researchers involved in comparing the different psychotherapies, these differences tend to become even further reduced in size and significance, as shown in Luborsky et al. (1999).
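(As a quick check on the d-to-r equivalence quoted above, this is my own sketch rather than anything from the paper: for two groups of roughly equal size, the standard conversion is r = d / sqrt(d² + 4).)

```python
# Minimal sketch (not from the paper): standard conversion of Cohen's d to an
# equivalent Pearson's r, assuming two groups of roughly equal size.
import math

def d_to_r(d: float) -> float:
    return d / math.sqrt(d ** 2 + 4)

print(round(d_to_r(0.20), 2))  # prints 0.1, matching the paper's d = .20 ~ r = .10
```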

The effect sizes were further reduced after corrections for the researcher’s allegiance and for other factors.

There is another major influence that can alter the typically modest and non-significant difference effect – it is the researcher’s allegiance effect. This effect is the association of measures of the researcher’s allegiance to each of the treatments compared with measures of the outcomes of the treatments. There had been hints of this effect for many years, as first noted in Luborsky et al. (1975). Now there is a really exhaustive review of the topic (Luborsky et al., 1999) that shows a well-established researcher’s allegiance effect – the correlation between the mean of 3 measures of the researcher’s allegiance and the outcome of the treatments compared was a huge Pearson’s r of .85 for a sample of 29 comparative treatment studies! The 3 measures, described in Luborsky et al. (1999), are ratings of the reprint, ratings by colleagues who know the researcher’s work well, and self-ratings of allegiance by the researchers themselves.
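(Again my own illustration, with invented numbers rather than the paper's data: the allegiance analysis amounts to averaging the three allegiance ratings for each study, then correlating that mean with the outcome of the comparison across studies.)

```python
# Hypothetical illustration of the allegiance-outcome correlation described above;
# the ratings and outcomes below are invented, not taken from Luborsky et al. (1999).
from statistics import mean, correlation  # statistics.correlation needs Python 3.10+

# One entry per comparative study: (reprint rating, colleague rating, self-rating)
allegiance_ratings = [(0.9, 0.8, 1.0), (0.2, 0.3, 0.1), (0.6, 0.7, 0.5), (0.4, 0.5, 0.4)]
outcome_effect_sizes = [0.35, -0.10, 0.15, 0.05]  # difference favouring the preferred treatment

mean_allegiance = [mean(study) for study in allegiance_ratings]
print(f"Pearson's r: {correlation(mean_allegiance, outcome_effect_sizes):.2f}")
```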

This high correlation of the mean of the 3 allegiance measures with the outcomes of the treatments compared implies that the usual comparison of psychotherapies has a limited validity, because so far it is not easy to rule out the presence of the large researcher allegiance effect. To make matters worse, it is not at all clear how the allegiance effect comes about. A variety of methods have been suggested by Luborsky et al. (1999) for reducing the intrusion of the researcher’s allegiance, but, even when they are implemented, the impact of such methods is likely to remain ambiguous in the precise amount of correction to be applied. Among the recommended precautionary steps: it might be valuable a) to include researchers with a variety of allegiances in the research group carrying out the study and b) to choose, as a comparison to the preferred treatment, a treatment that is equally likely to be judged as credible (Berman and Luborsky, in preparation; Berman and Weaver, 1997).

A sample of the effects of corrections is noted as follows: when the uncorrected correlations in Robinson et al. (1990) were corrected for researchers’ allegiance by the mean of their three corrected allegiance scores (the most common type of correction used here) (Luborsky et al., 1999), the correlations became lower and non-significant. The data from Smith, Glass, and Miller (1980) were corrected for reactivity (meaning, influenceable by therapist or by researcher), and Luborsky et al. (1993) was corrected for the quality of the research design (Luborsky et al., 1999). The more exact changes can be seen by comparing the uncorrected with the corrected effect sizes in Table 1. For example, in Luborsky, Diguer, Luborsky, Singer, and Dickter (1993), the uncorrected effect size for the comparison of two active treatments was .00 (non-significant), and the effect size after correcting for research quality was similar in size: -.01 (non-significant).

To summarize these results, we compared the mean of the effect sizes of corrected comparisons of active treatments from 11 meta-analyses in Table 1 – the 11 were all those for which we had data to compute corrections – with the mean of the corresponding uncorrected effect sizes. We first converted all these effect sizes into Cohen’s d (Cohen, 1977) and then took the mean of the absolute value of the effect sizes. The mean uncorrected Cohen’s d effect size was .20, but the mean corrected Cohen’s d effect size was only .12; the reductions of the corrected effect sizes meant they were no longer significant. Also, the median uncorrected effect size was .21 as compared to a corrected median effect size of .14; the reductions also meant they were no longer significant.
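(The aggregation step described here is simple enough to sketch; the effect sizes below are placeholders, not the actual Table 1 values: convert each comparison to Cohen's d, take absolute values, then compare the mean and median of the uncorrected and corrected sets.)

```python
# Illustrative sketch of the summary computation described above; the effect sizes
# are placeholder values, not the real ones from Table 1.
from statistics import mean, median

uncorrected_d = [0.21, -0.15, 0.30, 0.05, 0.18, -0.25, 0.22, 0.10, 0.28, 0.19, 0.27]
corrected_d   = [0.14, -0.08, 0.20, 0.01, 0.10, -0.16, 0.13, 0.05, 0.18, 0.12, 0.15]

def summarize(effect_sizes):
    absolute = [abs(d) for d in effect_sizes]  # the paper averages absolute values
    return mean(absolute), median(absolute)

for label, values in [("uncorrected", uncorrected_d), ("corrected", corrected_d)]:
    m, md = summarize(values)
    print(f"{label}: mean |d| = {m:.2f}, median |d| = {md:.2f}")
```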


The main explanations for the “small” effect sizes for differences in outcomes of active treatments

The effect sizes for comparisons of active treatments, both corrected and uncorrected, for the 17 meta-analyses, were usually relatively “small” and non-significant. The adjective “small” is in quotes and preceded by the ambiguous qualifier “relatively” because the choice of a corresponding effect size level varies among the writers on the topic. Cohen (1977), for example, would call a d of .20 “small” (equivalent to a Pearson’s r of only .10), but Rosenthal (1990, 1995) would call it greater than small because the designation is somewhat dependent on the requirements of the situation; for example, if only 4 of 100 persons having a heart attack are saved by taking aspirin, that is not a small percentage if you are one of the 4 people! Now we are more ready to consider some probable explanations for this relatively small and non-significant relationship:

Explanation 1: The types of treatments do not differ much in their main effective ingredients and therefore “small” differences with non-significant effects are the rule.
The treatment components that are in common between the treatments compared may be the most influential basis for explaining the small and non-significant difference effect. This was the explanation offered by Rosenzweig (1936) and later restated by Frank and Frank (1991), Luborsky et al. (1975), Strupp and Hadley (1979), and Lambert and Bergin (1994). The last especially stressed the role of common factors across different psychotherapies in explaining the trend toward non-significant differences among the outcomes of different forms of psychotherapy. Elkin et al. (1989) and Imber et al. (1990) also considered the common factors across interpersonal and cognitive-behavioral psychotherapy in their explanations for the non-significant differences between different treatments in the NIMH Treatment for Depression Collaborative Research Program. This explanation emphasizes that the common components of different treatments may be so large and so much more potent than specific ingredients that the comparisons result in small and non-significant differences. Other components have also been suggested as common across treatments: the helping relationship with the therapist, the opportunity to express one’s thoughts (sometimes called abreaction), and the gains in self-understanding.

Explanation 2: The researcher’s allegiance to each type of treatment compared differs, sometimes favoring one treatment and sometimes favoring the other.

The researcher’s allegiance to each of the treatments in comparative treatment studies appears to influence the small effect sizes of each treatment outcome in the expected direction, as shown in the comprehensive evaluation by Luborsky et al. (1999). To explain this more concretely: treatment A in a meta-analysis may be favored by the researcher’s positive allegiance in one study, while in another study treatment A may suffer from a researcher’s negative allegiance.

Explanation 3: Clinical and procedural difficulties in comparative treatment studies may contribute to the non-significant differences trends.

There has been a series of rebuttals trying to explain the methodological problems that lead to the Dodo bird trend – among these are Beutler (1991), Elliott, Stiles and Shapiro (1993), Norcross (1995), and Shadish and Sweeney (1991). These discussions tend to agree that, although research shows that the “small” and non-significant difference effect exists, the effects of different treatments may appear in ways that have not yet been studied. Kazdin (1986), Kazdin and Bass (1989), Wampold (1997), and Howard et al. (1997) further explain that non-significant differences between treatments may reflect procedural and design limitations in comparative treatment outcome studies. These limitations include the representativeness of the measures of treatment process and outcome and the statistical power of the findings. Howard et al. (1997) further suggest doing separate meta-analyses for each contrasting pair of types of treatments, such as we have done for Cognitive and Cognitive-Behavioral vs. Dynamic and other treatments (on p. 11).

Explanation 4: Interactions between certain patient qualities and treatment types, if not taken into account, may contribute to the non-significant difference effects.

Several studies, such as those by Beutler et al. (1991), Blatt (1992), Blatt and Folsen (1993), and Blatt and Ford (1994), have shown that the match of the patient’s personality with different treatments can then succeed in producing significant effects; when such matches are not taken into account, they may contribute to the non-significant difference effects.

Comparisons of active treatments with each other often need a correction: The reexamination of 29 mostly newer studies by Luborsky et al. (1999) showed that a correction to the effect sizes is typically needed because researcher’s allegiance to each of the therapies compared is highly correlated with treatment outcomes – the correlation was a Pearson’s r of .85! Researcher’s allegiance is therefore a reasonable basis for correcting effect sizes. After corrections for researchers’ allegiance were applied, the effect sizes were usually reduced and non-significant.


Does anyone know whether the PACE researchers had any prior allegiance to any of the different treatments being assessed?

Ho ho.