Would anyone be interested in doing a patient collaborative study looking at whole genomes, or am I a lone nutter? I have a small list of candidate genes/risks whose prevalence I want to verify, and I also want to see if the algorithms can find other risks. I have also developed algorithms that can look at all copy number variants and structural variants in a genome. The algorithm could automatically assess loss of function, missense, CADD scores, PolyPhen/SIFT deleteriousness scores, etc.
Update rolled out with support for rare and uncommon SNPs that match ref, plus general bug fixes.
Made indel calling a little more liberal with a little magic, so if you don't completely match either the ref or alt allele, the indel may show anyway! This increased sensitivity has seemed OK in my testing, but it can come at the expense of specificity.
This indel algorithm is important because quite a few indels can be missed with typical techniques, since the alleles can vary and aren't always consistent for the same indel!
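To make the idea of "liberal" indel matching concrete, here is a rough sketch. It is not the app's actual logic (the "magic" isn't described here); it just assumes one plausible rule: if the observed allele matches neither ref nor alt exactly, but starts with the same anchor base and changes length in the same direction as alt, flag it as a probable indel. The function name `call_indel` and the labels it returns are made up for illustration.

```python
def call_indel(observed: str, ref: str, alt: str) -> str:
    """Classify an observed allele against a known indel (hypothetical rule).

    Returns "alt", "ref", "probable_alt", or "no_call".
    """
    # Strict matching: what typical techniques do.
    if observed == alt:
        return "alt"
    if observed == ref:
        return "ref"

    # Lenient step (an assumption, not the app's real method):
    # accept an inexact allele if it shares the anchor base with ref
    # and its length changes in the same direction as alt
    # (both insertions, or both deletions).
    alt_delta = len(alt) - len(ref)
    obs_delta = len(observed) - len(ref)
    same_anchor = observed[:1] == ref[:1]
    if same_anchor and obs_delta * alt_delta > 0:
        return "probable_alt"

    return "no_call"


if __name__ == "__main__":
    # Known insertion C -> CAG, but the file reports CAGG.
    print(call_indel("CAGG", ref="C", alt="CAG"))
    # A plain substitution is not treated as the indel.
    print(call_indel("T", ref="C", alt="CAG"))
```

The trade-off is visible in the rule itself: anything that merely looks like an insertion or deletion at that position gets through, which is where the extra sensitivity (and the lost specificity) comes from.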
If it says it's likely benign, then it's likely benign. However, the submitters who classified it could have been wrong. If there is only 1 submission and no research, you can't know for sure. But if the CADD score is over 20 with only 1 submission and no research, I would treat it as a suspect and bookmark it to see what happens with further research. If it's over 15, it's a potential suspect in my eyes, but from my observations variants with a score over 20 usually carry more weight. That's not always the case, though, as CADD scores can be off.
Also look at frequency. If it's not very rare, it probably isn't a causative variant. I personally haven't seen a true case of genetic EDS yet in the genomes I've looked at.
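Written out as code, the rule of thumb from the last two paragraphs looks roughly like this. It's only a sketch of my reasoning, not what the app runs. The CADD cutoffs of 20 and 15 come from the post above; the 1% frequency cutoff for "not very rare" is a placeholder I picked for the example, and the `Variant` fields and `triage` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    classification: str   # e.g. "likely benign", as reported by submitters
    submissions: int      # number of submissions behind the classification
    has_research: bool    # any published research on this variant?
    cadd_phred: float     # scaled CADD score
    allele_freq: float    # population allele frequency, 0.0 to 1.0


def triage(v: Variant, rare_cutoff: float = 0.01) -> str:
    """Apply the rule of thumb; rare_cutoff is an assumed placeholder."""
    # If it's not very rare, it probably isn't a causative variant.
    if v.allele_freq >= rare_cutoff:
        return "probably not causative (too common)"

    # With more than one submission or some research, go with the label.
    weak_evidence = v.submissions <= 1 and not v.has_research
    if not weak_evidence:
        return f"go with classification: {v.classification}"

    # Weakly supported label: let the CADD score decide whether to watch it.
    if v.cadd_phred > 20:
        return "suspect: bookmark and watch for further research"
    if v.cadd_phred > 15:
        return "potential suspect"
    return f"go with classification: {v.classification}"


if __name__ == "__main__":
    v = Variant("likely benign", submissions=1, has_research=False,
                cadd_phred=23.4, allele_freq=0.0002)
    print(triage(v))
```

Since CADD scores can be off, the output is only a reason to bookmark a variant, not a verdict on it.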
Ah, never mind my previous post. Perhaps my mind is getting too creative. I guess I am just sick of things moving so slowly. It's been a decade for me and I feel like I nearly gave up completely not too long ago.
Please don't give up. I'm not techy enough to understand all of what you're doing but it seems like some very good and tremendously important work. Dream big if you can. My faith, family and dreams (that I can't live out right now) are all I have left.
It's been almost 40 years for me and I keep hitting new baseline lows. Just when I think it can't get any worse, it does.
So for yourself and please, for us...Dream Big!!! (just pace yourself a bit so you don't completely wear out.)
Even with more recent files, some MT locations were offset by 1 and have since been corrected. It's well worth everyone with 23andMe data downloading a recent file before trying your app. It's from the same sample as before, just with the data cleaned up as 23andMe learns more about mistakes in their chips or as rsIDs change.
HLA is a complex topic, and is a bit over my head, but the short answer is yes.
My app does not calculate HLA types, but they can be calculated from WGS data. There are tools that do this.
edit: I think specifically, you need the FASTQ files to calculate HLA genotype. At this point in time, an application on your local machine is better suited than a web application because of the amount of data.