Jonathan Edwards
@Jonathan Edwards, I know you're not keen on the SF-36 scales, but I think SF-36 physical function has some benefits:
1. It's widely used, so you can compare scores with the general population and patients with other illnesses. That's been very useful when analysing e.g. the PACE trial outcomes.
2. It doesn't have a ceiling effect in terms of poor health, or severely poor health. So deterioration and improvement in very ill patients can be monitored. (The Chalder Fatigue Scale is a horrendously unhelpful scale because a significant proportion of patients score close to the maximum score, so deterioration can't be measured in all patients.)
3. It's a less relative/subjective questionnaire: It asks questions about engagement in specific activities rather than asking you to rate subjective symptoms in relation to a previous (long-forgotten) health status.
I think perhaps the biggest negative is that, to keep things simple, the questions aren't very nuanced, and so it's not a very subtle measure of change.
I'm not familiar with many other self-report scales, so I can't compare it with many others, but I've always thought that SF-36 physical function is OK-ish for a broad indicator of changes in function.
I don't know if others would agree.
I am happy to accept that there is value to be found in these scores but I am still wanting to get clear in my head what each component of a composite would really do. So I am not so much disagreeing with anyone as throwing out thoughts about where things do not quite gel for me.
The potential for comparisons with other studies I can see is an advantage, although one might worry about context dependent biases. So gathering an SF-36 dataset might be a good thing for such secondary purposes.
There also seems to be an issue about remembering previous status. Some people here have suggested that it is easier to compare two levels of well-being than to give an absolute indicator at any one time. (I think the outcome measure has to be a comparison so that is fine from that angle.) But others have suggested that it is easy to forget what you were like at the beginning. What would seem sensible to me would be for the trialist and patient to go through all the issues that impact on the global illness problem, symptomatic and functional, and to record them. SF-36 may be a good way to collect those data and if it is done again at the end then you have a pretty good idea of what the comparison is. (Looking back at the old score may help the patient come to an assessment of how much better they are.)
The downside of using something like an SF-36 as the basis for scoring improvement in a stereotyped way, possibly using some pre-defined scoring system, is that the SF-36 can only ever be 'the closest standard dataset to an indication of what really matters for that patient'. By definition it cannot be better than sitting down and deciding, after mulling over the SF-36 results, what does really matter to the patient. Surely, the patient is likely to say 'that pretty much figures but actually for me this problem is much more important than that one now that I think about it'. Maybe SF-36 is clever enough to handle that but I doubt it.
This is where my individualised criteria come in - you sit down and agree what is really important for this person at the beginning. It needs to be within reasonable limits but as for lupus I suspect that a short list can be drawn up. You then have an outcome which cannot be charged with being irrelevant to patient needs. But if you do the SF-36 first to get your eye in I have no problem with that - and you still have your data for comparison with other groups.
On a wider issue I note that you suggest SF-36 is good for function. But in the framework I was suggesting it has to fall under number 1 because it is open to reporting bias. Number 1 is allowed to be largely or completely based on functional deficit but number 3 needs to be free of any reporting issues. It has to be what actually happened in terms of functioning - probably by some sort of actimetry. Not that this generates a problem. Using SF-36 function and whatever else as at least a preliminary sorting exercise for 1 still makes sense to me.
A final mischievous thought is that maybe assessments should be done monthly for a year and the time point of the outcome assessment decided by some third party and sealed in a brown envelope at the start (or you might need 12 brown envelopes, 11 saying no and one saying yes). Then neither the therapist nor the patient knows which month is the one that matters. That should scotch any chance of patients struggling to get a better score on the assessment day, having rested for a week in preparation.
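For anyone who wants to see the brown envelope idea spelled out, here is a rough Python sketch. It is only an illustration under my own assumptions: the function names `seal_envelopes` and `outcome_score` are made up, the monthly scores are invented numbers, and a real trial would need a proper allocation-concealment procedure run by the third party rather than a script like this.

```python
import random

def seal_envelopes(n_months=12, rng=None):
    """Third party prepares one envelope per month: one 'yes', the rest 'no'.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    chosen = rng.randrange(1, n_months + 1)  # months numbered 1..n_months
    return {m: ("yes" if m == chosen else "no") for m in range(1, n_months + 1)}

def outcome_score(monthly_scores, envelopes):
    """Open the envelopes only after every monthly assessment is recorded.

    Returns (month, score) for the single month marked 'yes'.
    """
    missing = [m for m in envelopes if m not in monthly_scores]
    if missing:
        raise ValueError(f"assessments missing for months: {missing}")
    for month, slip in envelopes.items():
        if slip == "yes":
            return month, monthly_scores[month]
    raise ValueError("no envelope says 'yes'")

# Envelopes are sealed at the start of the trial.
envelopes = seal_envelopes(rng=random.Random(2024))

# Invented monthly scores for one patient (month -> score), assumption only.
scores = {m: 30 + 2 * m for m in range(1, 13)}

month, score = outcome_score(scores, envelopes)
print(f"Outcome month: {month}, score: {score}")
```

The point of the design is that all twelve assessments are collected the same way, and only afterwards does anyone learn which one counts.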