Has this study ever been done?
I mean there are all sorts of small-scale studies of patient groups which suggest high rates of infection prior to development of an autoimmune condition. Coincidence? Who knows.
It is fair to be skeptical, but I'm not convinced that the work (e.g. a population cohort based study) to rule out the possibility has actually been done. Well, I looked and I couldn't find such a study, but I could be in error.
I wonder if the quality of the available data (e.g. specific questions, frequency of reporting etc.) on longitudinal cohorts permits it?
You wouldn't really 'do a study' here. You look at the epidemiological data that has been collected over the last century and apply logic, as Stastny did and any of us can do. I think it is useful to compare Reiter's syndrome, which is quite clearly triggered by infection, and rheumatoid arthritis, which is almost certainly not. Reiter's syndrome is not autoimmune, in the sense that nobody has ever found an anti-self response.
Reiter's syndrome was in fact first reported by Engelman, as an epidemic. Reiter reported another epidemic on a troop ship where the soldiers all had dysentery - probably Shigella. Since then epidemics have been of a rather different sort - most of the patients I saw came to me in about September having met new sexual partners on Spanish beach holidays in July and acquired another intracellular triggering agent - chlamydia.
Logic tells us that a disease based on genetics (for Reiter's it is HLA B27 or B7) and infection is likely to occur rather predictably at the earliest age you are likely to meet the infection. For Shigella that was the age of troops going to Africa. For chlamydia it is 18-25. Reiter's does occur after that but I would say 80% of cases are in young adults now. As with the troops, if the infection is something people do not often encounter, one might expect the age of onset to go on into middle life (as it did for the soldiers), but it really ought to become less common in the elderly, who tend not to travel any more. Where we know about infective triggers the pattern fits pretty reliably.
RA, in contrast, has a pattern of incidence that has been documented over populations of millions over many decades and does not look like this at all. It is rare in children but gets going after puberty and gradually becomes commoner and commoner as you get older, right up to the age of 80, when it tails off slightly. So we are looking for a mechanism that gets more likely with time, and to make it more convincing we want to explain why it goes down at 80.
An interesting fact is that breast cancer is the same - it gets commoner as women get older but then tails off at 80ish. And there is a good mathematical explanation that covers both RA and breast cancer. This is a situation where incidence depends on a variable dose of genetics (known for both) and a 'multiple hit' random component (known for breast cancer in the form of multiple somatic mutations). Multiple hit random mechanisms should tend to get commoner with time - enough time to maximise the chances of multiple hits. But if the genetic risk is high enough, as for some breast cancer risk genes, then the time needed to get the hits may be quite short. A few women are born with an almost 100% chance of breast cancer - even on both sides - by the time they are sixty. What this means is that by 80 a proportion of the cases that were going to happen will have been pretty sure to happen by then. So the maths predicts a slight tail off.
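To make the maths a bit more concrete, here is a minimal sketch of that kind of model. Everything numeric in it is an assumption picked for illustration, not fitted to RA or breast cancer data: 5 hits needed, 5% of people genetically susceptible with 0.05 random hits per year, and the other 95% never accumulating hits at all. Hits are assumed to arrive at random (a Poisson process), so the age at which the last required hit arrives follows an Erlang distribution. The function names are made up for the example.

```python
import math

HITS = 5  # assumed number of random hits needed for disease (illustrative)
# (fraction of population, hits per year): assumed values, not fitted to data
GROUPS = [(0.95, 0.0), (0.05, 0.05)]


def onset_density(age, rate, k=HITS):
    """Erlang density: chance per year that the k-th hit arrives at this age."""
    return rate**k * age**(k - 1) * math.exp(-rate * age) / math.factorial(k - 1)


def still_unaffected(age, rate, k=HITS):
    """Probability of having had fewer than k hits by this age."""
    x = rate * age
    return math.exp(-x) * sum(x**n / math.factorial(n) for n in range(k))


def incidence(age, groups=GROUPS, k=HITS):
    """New cases per year per person still unaffected, over the whole population."""
    new_cases = sum(w * onset_density(age, r, k) for w, r in groups)
    at_risk = sum(w * still_unaffected(age, r, k) for w, r in groups)
    return new_cases / at_risk


# Print incidence per 100,000 unaffected people per year at each decade of age.
for age in range(10, 101, 10):
    print(age, round(incidence(age) * 100_000, 1))
```

With these assumed numbers the population incidence climbs steadily through adult life, peaks at around 80 and then falls slightly. Within the susceptible group alone the risk per year keeps rising with age. The tail off only appears at the population level, because by then a good share of the susceptible people have already had their disease and the remaining population is increasingly made up of people who were never going to get it.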
So it seems reasonable to suggest that RA is due to genes plus a random multiple hit mechanism. We know the genes and we can probably track the hits. Studies of healthy people have shown that those who will get RA later acquire autoantibodies stepwise over a 5-10 year period. Each antibody represents a bunch of somatic mutations, just like breast cancer, so it seems to fit very well.
Another well studied aspect is time of onset in identical twins - which for both RA and breast cancer is all over the place - quite uncorrelated. If there were an external trigger you would expect some correlation, at least if the twins remained in an environment with the same infection risks.
But what about other autoimmune diseases? None of the others are so well documented, with the possible exception of type I diabetes. Type I diabetes may indeed be the one autoimmune condition (if it really is one, and that is not entirely clear) that is commonly precipitated by infection, because it is common in childhood. Very few other true autoimmune diseases are common in childhood. I am not aware of good studies indicating that other autoimmune conditions are triggered by infection. Retrospective information about infection or life events at disease onset is known to be of no use. People always attribute their diseases to life events, even if they have forgotten that the sequence is the wrong way around!
This sort of analysis never 'rules out' anything at a stroke. On the other hand the weight of epidemiological data for RA is easily enough to say that an infective trigger of any sort one can conceive of in the ecosystem we know we occupy simply does not fit the epidemiological facts. If you look at an incidence graph for RA and try to think how you would get that from an infective trigger it just doesn't work - unless you propose the infection alters the T cell repertoire decades before the symptoms appear - i.e. in infancy. There is in fact one paper suggesting that RA risk goes up if as a young child you have a cat sleeping on your bed, but nobody has ever confirmed it!