Resource for converting some gene formats

Discussion in 'Genetic Testing and SNPs' started by MeSci, Dec 29, 2013.

  1. Hip

    Hip Senior Member

    By the way, the raw data text file from 23andme, which you can upload to Genetic Genie via its methylation results upload page to get one of those nice results tables, is in a pretty simple format. The 23andme raw data file format is the following (you can see that there is one line per SNP):

    # rsid chromosome position genotype
    rs4680 22 19951271 AG
    rs4633 22 19950235 CT
    rs769224 22 19951804 GG
    rs1801133 1 11856378 AA
    rs2066470 1 11863057 GG
    rs234706 21 44485350 GG

    These four columns must be separated by tabs (not spaces), and I think Genetic Genie only uses columns 1 and 4 (the rsid and genotype) to extract the data and produce its tabular results. So you could put any filler digits (like say "0000") into columns 2 and 3, and make your own raw data file to upload to Genetic Genie. If you uploaded the above example in a plain text document (.txt) to Genetic Genie, it would understand it, and would return your results in those nice tables that Genetic Genie makes.

    There are in fact nearly a million lines in the 23andme raw data file, but I have just shown 6 lines above.
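
    As a rough illustration of the format described above, here is a minimal Python sketch that writes a small tab-separated file in the four-column layout and reads the rsid and genotype columns back. The function names and the "0000" filler values are assumptions made for this example, and whether Genetic Genie really ignores columns 2 and 3 is only a guess, as noted above.

```python
import csv
import io


def write_raw_data(snps, handle):
    """Write (rsid, genotype) pairs as tab-separated four-column lines."""
    # Header comment line, mirroring the layout shown in the post.
    handle.write("# rsid\tchromosome\tposition\tgenotype\n")
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for rsid, genotype in snps:
        # Columns 2 and 3 hold filler digits (an assumption: the reader
        # of the file is presumed to use only columns 1 and 4).
        writer.writerow([rsid, "0000", "0000", genotype])


def read_genotypes(handle):
    """Return a dict of rsid -> genotype, skipping comments and blank lines."""
    genotypes = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        rsid, _chromosome, _position, genotype = line.split("\t")
        genotypes[rsid] = genotype
    return genotypes


# Round trip through an in-memory buffer instead of a real .txt file.
buffer = io.StringIO()
write_raw_data([("rs4680", "AG"), ("rs1801133", "AA")], buffer)
text = buffer.getvalue()
parsed = read_genotypes(io.StringIO(text))
```

    To produce an actual file, the same write_raw_data call could be pointed at open("my_raw_data.txt", "w", newline="") instead of the in-memory buffer (the filename is just a placeholder).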
  2. Sea

    Sea Senior Member

    NSW Australia
    @MeSci I replied on your results thread.

    Also, if your 2 letters are different from each other then yes, you are heterozygous for that SNP. If your 2 letters are the same as each other you are either homozygous for the variation or homozygous for the normal version. Simply knowing your letters for a SNP won't tell you whether you have the major or minor version, or whether there is some risk associated or not. It is not necessarily the least frequent or the changed allele which contributes risk. Only reading the research will tell you that information.
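
    The same-or-different rule above can be sketched in a few lines of Python. The function name zygosity is made up for this example, and the sketch only handles plain two-letter A/C/G/T genotypes. As the paragraph says, the result tells you nothing about which allele is the variant or whether any risk is involved.

```python
VALID_BASES = set("ACGT")


def zygosity(genotype):
    """Classify a two-letter genotype as 'heterozygous' or 'homozygous'."""
    alleles = genotype.strip().upper()
    # Reject anything that is not exactly two A/C/G/T letters, such as
    # missing calls or other non-standard entries in a raw data file.
    if len(alleles) != 2 or not set(alleles) <= VALID_BASES:
        raise ValueError(f"unsupported genotype: {genotype!r}")
    # Different letters: heterozygous. Same letters: homozygous, but for
    # which version (variant or normal) cannot be known from the letters.
    return "heterozygous" if alleles[0] != alleles[1] else "homozygous"


examples = {g: zygosity(g) for g in ("AG", "CT", "GG", "AA")}
```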

    Yasko (and Genetic Genie) have tried to make it a little easier by adding the + sign if your allele is one they consider the risky version or a - sign if it isn't. The alleles Yasko has assigned a + to are not always the variation, not always the least frequent, and are often just as common as the normal version.

    Others use the + sign to note that you are positive for a variation and the - sign to note that you do not have the variation.

    The tricky thing is that not everyone agrees with Yasko's interpretation of risk. Some variations add risk, some are protective such that the normal one could be considered a risk, some variations make no difference and some add risk for something while the normal version adds risk for something different.
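
    To make the + and - labelling concrete, here is a small Python sketch that marks each allele in a genotype with + if it matches a flagged allele and - otherwise. The function name plus_minus, the "+/-" output style and the flagged allele passed in are all assumptions for illustration. Which allele gets flagged is exactly the interpretive choice discussed above, so the code cannot decide it for you.

```python
def plus_minus(genotype, flagged_allele):
    """Label a two-letter genotype as '+/+', '+/-' or '-/-'.

    flagged_allele is whichever letter a given interpreter has chosen to
    mark with '+'. It is an input here, not something the code can infer.
    """
    alleles = genotype.strip().upper()
    flagged = flagged_allele.strip().upper()
    if len(alleles) != 2 or len(flagged) != 1:
        raise ValueError("expected a two-letter genotype and a one-letter allele")
    # Count how many of the two alleles match the flagged letter.
    matches = alleles.count(flagged)
    return {0: "-/-", 1: "+/-", 2: "+/+"}[matches]


# The flagged letter "A" below is purely illustrative.
labels = [plus_minus(g, "A") for g in ("AA", "AG", "GG")]
```

    Swapping the flagged letter flips the labels for the same genotypes, which is why two sources using different conventions can report opposite signs for the same person.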
  3. MeSci

    MeSci ME/CFS since 1995; activity level 6?

    Cornwall, UK
    Thanks very much, @Hip and @Sea. Very useful and educational stuff.

    I am now in the process of comparing my results with @Hip's to ascertain which strands Sciona/Cellf used. It seems to be mostly the positive strand, but I have already found an exception, so perhaps they have not been consistent in their choices, as suggested in their patent application I linked to and quoted from.
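
    For anyone doing the same kind of strand comparison, here is a rough Python sketch of the logic. It assumes you already know which two alleles a SNP has on the positive strand (the positive_strand_alleles argument is a hypothetical input that you would have to look up yourself), and it checks whether a reported genotype fits those letters or their complements. SNPs whose alleles are A/T or C/G read the same on both strands, so they cannot settle the question.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def infer_strand(genotype, positive_strand_alleles):
    """Guess which strand a reported genotype was read from.

    Returns 'positive', 'negative', 'ambiguous' or 'inconsistent'.
    """
    observed = set(genotype.strip().upper())
    plus = set(positive_strand_alleles.strip().upper())
    # The negative strand carries the complementary bases.
    minus = {COMPLEMENT[base] for base in plus}
    if plus == minus:
        # A/T and C/G SNPs look identical on either strand.
        return "ambiguous"
    if observed <= plus:
        return "positive"
    if observed <= minus:
        return "negative"
    # Letters fit neither strand: possibly a different SNP or a typo.
    return "inconsistent"


results = {
    "same letters": infer_strand("AG", "AG"),
    "complement": infer_strand("CT", "AG"),
    "A/T SNP": infer_strand("AT", "AT"),
    "mixed": infer_strand("AC", "AG"),
}
```

    Because the check works on the set of possible alleles rather than on one person's exact genotype, it also works when the two people being compared have different genotypes at the same SNP.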

    I'll have to look at some other results after that as @Hip hasn't listed all the variations I was tested for.

    I'll get there in the end!

