Discussion in 'Genetic Testing and SNPs' started by Valentijn, Aug 29, 2013.
"I" is an insertion and "D" is a deletion.
The location data (chromosome # and position on that chromosome) can be used to look up the rs number, but the rare allele program also includes the rs number "translation" when 23andMe uses an i number.
Yeah, it's something I've been thinking about. But it would mean another database that needs to be downloaded, and added complexity in the program. I'll talk to the programmer about it.
So the double II means both base pairs got inserted?
Yup. It's pretty common to have insertions and deletions, and they usually don't indicate anything interesting.
@Valentijn - does your program have a report for those mutations that are greater than the 10% rarity? I'm thinking specifically of those which are labeled with an "i" code by 23andMe. I'm thinking these might also be interesting to identify?
Well, it will pick up heterozygous mutations where the minor allele is at 10% prevalence. That means that the heterozygous genotypes would be reported going up to a (calculated) 18% prevalence rate for that genotype.
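For anyone wondering where the 18% comes from: it matches the standard Hardy-Weinberg estimate, where a minor allele frequency q gives an expected heterozygous genotype frequency of 2 x (1 - q) x q. That the program calculates it this way is an assumption on my part; the function name below is just for illustration.

```python
def genotype_frequencies(minor_allele_freq):
    """Expected (homozygous major, heterozygous, homozygous minor)
    genotype frequencies, assuming Hardy-Weinberg equilibrium."""
    q = minor_allele_freq
    p = 1.0 - q
    return p * p, 2 * p * q, q * q

hom_major, het, hom_minor = genotype_frequencies(0.10)
print(f"heterozygous: {het:.0%}, homozygous minor: {hom_minor:.0%}")
# prints "heterozygous: 18%, homozygous minor: 1%"
```

So a 10% minor allele works out to roughly 18% of people being heterozygous and about 1% being homozygous for the minor allele.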
It's also then easy to sort out the "i" numbers, if you save the results as a .txt file and open it in Excel. Then you can sort based on the RSID (starts with "i" instead of "rs") or ETC columns (has an rs number instead of being empty). You could also do this with your full 23andMe file.
So I got round to seeing my GP today. I had to wait several weeks to get an appointment with her anyway. I explained about my 23andMe results, which was hard work because she didn't know what 23andMe was - so I had to explain that. She didn't seem to understand the difference between a predisposition and a mutation that definitely results in a condition. I found that quite a surprise but anyway, she's referring me to a neurologist who should be more familiar with this stuff.
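If you'd rather not use a spreadsheet, the same split can be sketched in a few lines of Python. This assumes the usual raw-data layout (tab-separated, comment lines starting with "#", and the ID in the first column); the sample rows and IDs below are made up for illustration.

```python
import csv
import io

def split_by_id_type(lines):
    """Split tab-separated rows into (rs_rows, i_rows) by the ID prefix."""
    rs_rows, i_rows = [], []
    # Skip blank lines and "#" comment/header lines.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        rsid = row[0]
        if rsid.startswith("i"):
            i_rows.append(row)
        elif rsid.startswith("rs"):
            rs_rows.append(row)
    return rs_rows, i_rows

# Made-up sample standing in for a real file.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "i0000002\t1\t2000\tII\n"
    "rs0000003\t2\t3000\tCC\n"
)
rs_rows, i_rows = split_by_id_type(sample)
print(len(rs_rows), len(i_rows))  # prints "2 1"
```

For a real file you would pass an open file handle instead of the `sample` object.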
Genome analysis is fun. I have three homozygous SNPs (no clinical data available). All three are located in intron regions of my genome. Can we therefore conclude, that their importance is probably minor? I'd love to have my whole genome sequenced and run through this analysis tool.
Yeah, those are probably not important. Known missense mutations, on the other hand, are a lot more interesting.
Val - Is there a database somewhere accessible that lists known missense mutations?
Probably, but I've never looked for it. I'd start with dbSNP, the 1000 Genomes Project, and/or OMIM.
this is neat, i analyzed my genome - now what? is there an easy way to interpret the SNPs? should we use SNPedia?
SNPedia is a good start. I prefer dbSNP, where I can easily see missense mutations and such, plus links to more of the research. Basically SNPedia is more user friendly, but contains less data and fewer SNPs.
This is such a great program. Thank you for creating it @Valentijn and for all you do to help others learn how to navigate their 23andMe data. Are you still collecting people's data and compiling it?
I am, but only for patients who have CCC/ICC-defined ME with post-exertional malaise if they exceed their limitations.
Looking at the criteria in an ICC primer I found on here, I do meet the ICC standards and I definitely have PEM. I have had a lot of the testing mentioned done, but have not been formally diagnosed so I will leave it up to you if you want to add it or not.
I remember how when CFS used to be on my charts, I'd be treated differently. They removed it when they found POTS, but little did I know, POTS seems to be more of a symptom than a cause of my problems in and of itself. For that reason, I am not quite ready to allow ME to be put on my charts since it is lumped in with CFS and I am able to get treatment for symptoms/abnormalities without the label. I know this is not a new thought in this community, but I hold out hope that a reason for my symptoms will be found and worry a commonly misunderstood label will cause some doctors to not do testing they would otherwise do.
As a heads-up to those who are interested, there should be a major update to the Analyze My Genes program in the next week or two. It's been nearly two years since it was created, so I guess it's somewhat overdue.
X, Y, and mitochondrial SNPs will be added. It will also automatically list the gene name for every result which is on a gene, and "intergenic" for the rest. Missense mutations will also be labeled, with the amino acid abbreviation and position number (such as S44L). If clinical significance has been officially indicated, such as Likely Benign or Pathogenic, that will also be included.
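For reading a label like S44L: the first letter is the original amino acid, the number is the position in the protein, and the last letter is the replacement. A small sketch of pulling that apart, using the standard one-letter amino acid codes (the function name is hypothetical and not part of the program):

```python
import re

# Standard one-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
_LETTERS = "".join(THREE_LETTER)
_PATTERN = re.compile(rf"([{_LETTERS}])(\d+)([{_LETTERS}])")

def parse_missense(label):
    """Split a label like 'S44L' into (original, position, replacement)."""
    match = _PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"not a simple missense label: {label!r}")
    ref, pos, alt = match.groups()
    return THREE_LETTER[ref], int(pos), THREE_LETTER[alt]

print(parse_missense("S44L"))  # prints ('Ser', 44, 'Leu')
```

So S44L means serine at position 44 is replaced by leucine.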
Also included will be the missense mutations which have no known prevalence data. In the previous version all SNPs with no prevalence data were excluded en masse, to avoid any "false positive" hits for SNP alleles which were actually common. So it was probably missing out on some relevant missense mutations where there was either no data or 0% prevalence rate as a direct result of those SNPs being highly pathogenic and/or extremely rare.
Non-rare missense mutations will also probably be included, though perhaps with an optional box to be checked if someone wants to include or exclude them from their data.
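To make the inclusion rules described above a bit more concrete, here is a rough sketch of how they might be expressed. This is only my reading of the description, not the program's actual code; the function and parameter names are invented.

```python
def should_report(maf, is_missense, threshold=0.01,
                  include_common_missense=False):
    """Decide whether a SNP result is listed.

    maf is the minor allele frequency (0.0 to 1.0), or None when no
    prevalence data exists.
    """
    if maf is None:
        # No prevalence data: previously excluded outright,
        # now kept when the SNP is a missense mutation.
        return is_missense
    if maf <= threshold:
        # Rare enough to be listed regardless of type.
        return True
    # Common allele: only listed if it is missense and the optional box is checked.
    return is_missense and include_common_missense

print(should_report(None, True))    # prints True
print(should_report(None, False))   # prints False
print(should_report(0.30, True))    # prints False
print(should_report(0.30, True, include_common_missense=True))  # prints True
```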
So there will be a lot more data, and the files will be somewhat larger as a result, though still relatively small. The old tiny version will also continue to be available.
Thanks @Valentijn !
I forgot to mention - it will also fully support the newer V4 23andMe chip. The older version was designed only around the V3 chip (960,000 SNPs), though the SNPs did overlap quite a bit with the V4 chip and gave quite a few results for patients with V4 data.
But this may mean many more results now for those who have bought their 23andMe testing more recently and have the 600,000 SNP set of results. And the V3 will still be fully supported as well.
As a status update, currently we (well, mostly Mr Valentijn) have automated the process to build our local databases of dbSNP data, 23andMe data, a cross-referenced table of 23andMe data with the dbSNP data added, and tables of <=1% and <=10% 23andMe data. Everything is already basically working, but we ran into a problem with trying to figure out the minor allele for missense mutations with no frequency data. That's been solved, but now I have to build all of the tables again ... which is about 12 hours for the dbSNP data, though the others are much faster. And after all of that, I need to check to make sure it's creating accurate data, and figure out how to fix whatever is going wrong at that point.
So the current plan is to have a default database of 1% data downloaded with the program, and an optional 10% file for those who want it, which is the same as the current setup with the older data. But the 10% file will hopefully be able to include an option to check to see missense mutations with 10+% frequency - probably a box to check in the output table, similar to the box for selecting homozygous results.
Additionally, we might offer the full 23andMe table correlated with the dbSNP data for download. So someone could automatically generate a new file with all of their 23andMe data in it, but with gene names, possible alleles, i numbers translated into rs numbers, mutation data, and allele frequency all automatically included. This could then make it easy for someone who is interested in specific genes to look at them all at once, instead of logging in to 23andMe and looking up data for 1 SNP at a time.
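As a rough illustration of what that kind of cross-referenced file involves, here is a sketch that matches each row to an annotation table by chromosome and position. The table layout, field names, gene names, and every value are invented for the example; none of it is real dbSNP data or the program's actual format.

```python
# Hypothetical annotation table keyed by (chromosome, position).
# All values are made up for illustration.
annotations = {
    ("1", 1000): {"rsid": "rs0000001", "gene": "GENE_A", "maf": 0.004},
    ("1", 2000): {"rsid": "rs0000002", "gene": "intergenic", "maf": None},
}

def annotate(rows, annotations):
    """Attach annotation fields to (id, chromosome, position, genotype) rows."""
    out = []
    for snp_id, chrom, pos, genotype in rows:
        info = annotations.get((chrom, int(pos)), {})
        out.append({
            "id": snp_id,
            # Translate an i number to an rs number when a match exists.
            "rsid": info.get("rsid", snp_id),
            "chromosome": chrom,
            "position": int(pos),
            "genotype": genotype,
            "gene": info.get("gene"),  # None when nothing matched
            "maf": info.get("maf"),
        })
    return out

rows = [
    ("rs0000001", "1", "1000", "AG"),
    ("i0000009", "1", "2000", "II"),
    ("rs0000003", "2", "3000", "CC"),
]
annotated = annotate(rows, annotations)
for entry in annotated:
    print(entry["id"], entry["rsid"], entry["gene"], entry["genotype"])
```

With everything in one table, picking out all results for a specific gene is just a filter on the gene field.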