Graham
Senior Moment
OK, now I need a statistician to explain Cohen's rules of thumb, and to re-interpret the PACE Trial results, in language that I can easily understand.
Always here for a challenge! If you file a coin so that it is lopsided and comes down heads with a probability of 0.501 (i.e. 501 times in a thousand, rather than 500), you would never detect it by tossing the coin 10 times, 100 times or even 1000 times. But if you tossed the coin millions of times, you would detect the slight bias towards heads. The reason for that is simple. For ten tosses, 6H and 4T (60% heads) is only one out from 5H and 5T, so the proportion of heads to tails can change dramatically with just a few variations. If you tossed a coin a thousand times, you would need an extra hundred heads (600 to 400) to get 60% heads, which isn't so likely. So 501 heads out of a thousand isn't too unusual - just a single blip, but scaled up, 501000 heads out of a million needs an extra 1000 heads - more than just a blip. If you have large enough samples you can detect much smaller changes. These would be statistically significant (i.e. a statistician would say, that's pretty unusual, something must be up!), but pretty trivial in the real world.
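For anyone who wants to check the coin argument with numbers, here is a rough sketch in Python. It is my own illustration, not anything from the PACE paper: the function name bias_z_score is made up, and it assumes the usual yardstick that the number of heads from a fair coin has a standard deviation of sqrt(n)/2. It works out how many of those standard deviations the expected extra heads amount to.

```python
import math

def bias_z_score(n_tosses, p_heads=0.501):
    """Expected surplus of heads, measured in standard deviations of a
    fair coin's head count (hypothetical helper, for illustration only)."""
    expected_heads = n_tosses * p_heads      # what the filed coin gives on average
    fair_mean = n_tosses / 2                 # what a fair coin gives on average
    fair_sd = math.sqrt(n_tosses) / 2        # spread of a fair coin's head count
    return (expected_heads - fair_mean) / fair_sd

for n in (10, 100, 1_000, 1_000_000, 10_000_000):
    print(f"{n:>10} tosses: surplus is {bias_z_score(n):.3f} standard deviations")
```

For a thousand tosses the surplus comes out at about 0.06 standard deviations, which is lost in the noise. For a million tosses it is about 2, which is roughly where a statistician starts to take notice, and it keeps growing with more tosses.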
Now men's heights have a mean of 70 inches with s.d. 3 inches; women's have a mean of 64 inches with s.d. 3 (roughly). Cohen says that if you calculate 70-64=6, then divide by the average standard deviation (3), 6/3=2, that final figure tells you in real world terms how noticeable the difference is. In other words, the average man's height is two standard deviations from the average woman's.
Doing the same calculation to compare GET with SMC on the Chalder scale, we have outcomes of 20.6 (s.d. 7.5) and 23.8 (s.d. 6.6). So we calculate 23.8 - 20.6 = 3.2. Finding the "average" standard deviation isn't a question of adding and dividing by two though, because standard deviations are really multiplying or scaling factors. To average them you need to calculate the square root of [(7.5 squared + 6.6 squared)/2] = 7.06. Then 3.2/7.06 = 0.45, which Cohen would call a medium effect size - in other words the difference between the SMC group and the GET group is about half a standard deviation. (I think I have done the sums right, but I am a maths teacher, so I am used to people spotting my mistakes for me.) In effect this is what the PACE trial has done, by considering a difference of 1 s.d. to be significant.
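If you would rather let a computer do the sums above, this is a minimal sketch in Python. The function name cohens_d is my own, and averaging the two standard deviations this way assumes the two groups are about the same size.

```python
import math

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Difference in means divided by the 'average' standard deviation.

    The average is the square root of the mean of the squared s.d.s,
    which assumes roughly equal group sizes (an assumption of this sketch).
    """
    average_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / average_sd

# Heights: men 70 (s.d. 3) against women 64 (s.d. 3)
print(f"heights: {cohens_d(70, 3, 64, 3):.2f}")

# Chalder scale: SMC 23.8 (s.d. 6.6) against GET 20.6 (s.d. 7.5)
print(f"SMC v GET: {cohens_d(23.8, 6.6, 20.6, 7.5):.2f}")
```

The first line gives 2.00 and the second gives 0.45, matching the hand calculation.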
So by now you will be feeling pretty disappointed. BUT (and it is in capitals) I don't think that is relevant. If every day you had 18 to 20 nuisance phone calls (say from a friend who wanted to chat about PACE results), but by having a word with the friend's partner, it was cut down to 15 or 16, that would count as a large effect. The problem is that the calculation does not take into account what is normal (i.e. no nuisance calls). If there were only people with ME in the world we would be doomed (perhaps that is what happened to the dinosaurs!), but the kind of change that the PACE trial recorded would be noticed. As it is, we are surrounded all the time by people who are well, so the changes are tiny when compared with the population at large.
Statistics cannot really give us a value for that because it is a judgement about quality of life. The question is, how much of a move towards real health would you consider as significant?
Sorry for such a long posting.