Good points. Personally, I think thresholds should be set individually: why should the threshold for recovery be the same for somebody who is 20 and somebody who is 60? Similarly, for some measures, I think thresholds should be broken down by gender. This could either be automated with a program or done by hand for each person, which again wouldn't take particularly long. It may (or may not) be for this reason that the mean+(-)1SD method was used, i.e. the mean of a mixed sample of all ages minus 1SD is likely to work out at around mean-2SD when set against the separate means for the different ages. Personally, I don't like mean+(-)1SD: (extending the point oceanblue made) a recovered group shouldn't be mostly or all "worse" than the mean. There should be roughly as many people a bit better than normal as there are a bit worse than normal. The way to test this is to get the mean and standard deviation of the recovered group and see if they are significantly different from those of a healthy group. If they are, it suggests it's not a proper recovered group.
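To make that last check concrete, here is a rough sketch in Python of comparing a recovered group against a healthy group using only their means and standard deviations. It is just one way of doing it: I've assumed Welch's unequal-variance t-test, the function names `welch_t` and `approx_two_sided_p` are my own, the scores are made up for illustration, and the p-value uses a normal approximation that is only rough for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def welch_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group's mean.
    se2_a = stdev(sample_a) ** 2 / n_a
    se2_b = stdev(sample_b) ** 2 / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value from a normal approximation (rough for small samples)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Made-up scores purely for illustration; not data from any study.
healthy = [82, 90, 75, 88, 95, 79, 85, 91, 70, 86]
recovered = [68, 72, 65, 74, 70, 66, 73, 69, 71, 67]

t, df = welch_t(recovered, healthy)
p = approx_two_sided_p(t)
print(f"t = {t:.2f}, df = {df:.1f}, approximate p = {p:.4f}")
```

A small p-value here would suggest the "recovered" group still differs from the healthy group. For real data, a proper t-distribution p-value (for example from `scipy.stats.ttest_ind` with `equal_var=False`) would be the better choice, ideally run separately within each age band and gender.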