As a heads-up to those who are interested, there should be a major update to the Analyze My Genes program in the next week or two. It's been nearly two years since it was created, so I guess it's somewhat overdue.
I forgot to mention - it will also fully support the newer V4 23andMe chip. The older version was designed only around the V3 chip (960,000 SNPs), though the SNPs did overlap quite a bit with the V4 chip and gave quite a few results for patients with V4 data.
So this may mean many more results for those who bought their 23andMe test more recently and have the 600,000-SNP set of results. The V3 chip will still be fully supported as well.
As a status update, we (well, mostly Mr Valentijn) have now automated the process of building our local databases: the dbSNP data, the 23andMe data, a cross-referenced table of 23andMe data with the dbSNP data added, and tables of the 23andMe SNPs with a frequency of <=1% and <=10%. Everything is basically working already, but we ran into a problem figuring out the minor allele for missense mutations with no frequency data. That's been solved, but now I have to build all of the tables again ... which takes about 12 hours for the dbSNP data, though the others are much faster. After all of that, I need to check that it's creating accurate data, and figure out how to fix whatever is going wrong at that point.
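For anyone curious what the <=1% and <=10% tables amount to, here is a rough sketch in Python. It is not the actual build code. The row layout (a dict with "rsid", "gene" and "maf" keys) and the function name build_frequency_tables are made up for illustration, and setting aside rows with no frequency data in their own list is just an assumption for the sketch, not how we resolved the minor allele problem.

```python
from typing import Iterable, List, Tuple


def build_frequency_tables(rows: Iterable[dict]) -> Tuple[List[dict], List[dict], List[dict]]:
    """Split cross-referenced SNP rows into <=1%, <=10% and unknown-frequency lists.

    Each row is a hypothetical dict with a "maf" key holding the minor allele
    frequency as a fraction (0.01 means 1%), or None when no frequency is known.
    """
    rare_1: List[dict] = []
    rare_10: List[dict] = []
    unknown: List[dict] = []
    for row in rows:
        maf = row["maf"]
        if maf is None:
            # Assumption for this sketch: keep these apart for separate handling.
            unknown.append(row)
            continue
        if maf <= 0.10:
            rare_10.append(row)  # the 10% table also contains every <=1% SNP
        if maf <= 0.01:
            rare_1.append(row)
    return rare_1, rare_10, unknown
```

Note that in this sketch the 10% list is a superset of the 1% list, since any SNP at <=1% is also at <=10%.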
So the current plan is to have a default database of 1% data downloaded with the program, and an optional 10% file for those who want it, which is the same as the current setup with the older data. But the 10% file will hopefully be able to include an option to show missense mutations with 10+% frequency - probably a box to check in the output table, similar to the box for selecting homozygous results.
Additionally, we might offer the full 23andMe table cross-referenced with the dbSNP data for download. Someone could then automatically generate a new file with all of their 23andMe data in it, but with gene names, possible alleles, i numbers translated into rs numbers, mutation data, and allele frequency all included. This would make it easy for someone who is interested in specific genes to look at them all at once, instead of logging in to 23andMe and looking up data for one SNP at a time.
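To give an idea of what generating that annotated file could look like, here is a minimal Python sketch. It assumes the usual 23andMe raw data layout (tab-separated lines of ID, chromosome, position and genotype, with comment lines starting with "#"). The lookup table xref, its keys ("rsid", "gene", "alleles", "maf") and the function name annotate_raw_data are hypothetical stand-ins, not the format of the file we would actually distribute.

```python
from typing import Dict, Iterable, List


def annotate_raw_data(raw_lines: Iterable[str], xref: Dict[str, dict]) -> List[dict]:
    """Attach cross-reference info to each SNP line of a 23andMe raw data file.

    xref is a hypothetical lookup keyed by the 23andMe ID (an rs number or an
    i number); each value is a dict with "rsid", "gene", "alleles" and "maf".
    """
    annotated: List[dict] = []
    for line in raw_lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        snp_id, chromosome, position, genotype = line.split("\t")
        info = xref.get(snp_id, {})
        # Fall back to the ID itself only when it is already an rs number.
        fallback_rsid = snp_id if snp_id.startswith("rs") else ""
        annotated.append({
            "id": snp_id,
            "rsid": info.get("rsid", fallback_rsid),
            "chromosome": chromosome,
            "position": int(position),
            "genotype": genotype,
            "gene": info.get("gene", ""),
            "alleles": info.get("alleles", ""),
            "maf": info.get("maf"),
        })
    return annotated
```

The resulting rows could then be filtered by gene name, or written back out as a tab-separated file to open in a spreadsheet.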
Any thoughts?