I am confused about the methodology too. It seems like it isn't really a replication.
Keller et al. and the other studies attempted to test maximal exertion, which takes around 15-20 minutes and involves exercising for several minutes (sometimes over 5) past the lactate and ventilatory thresholds. Maximal exertion is exactly as it sounds - it's hard and it hurts!
A milder exercise test, which seems to be what they are trying to do, might be easier to generalise (as it's less harmful to patients). But, well, unless they're testing maximal exertion, they might not find a reduction, as they're not truly testing the limits.