So, how do you tell if the protein structure has been altered in a meaningful way, or at all, if there is literally no clinical research on your SNP? Let's be honest - we would rarely be interested in going so far. However, looking at last week's missense mutation on Von Willebrand Clotting Factor - well, it'd be a good idea to know how seriously to take that SNP.
The first and quickest way to gauge the impact of a missense mutation is to use a BLOSUM62 chart. The chart scores every pair of amino acids by how often one is swapped for the other in related proteins: swaps that turn up often score high, and swaps that rarely survive score low. If the amino acids are similarly shaped (and behave similarly enough), even a missense mutation might not cause a major problem. Last time, we noted that my VWF SNP exchanged proline for serine, so we use the chart to find where proline and serine cross. The higher the score, the more likely the amino acids will behave similarly enough that there isn't a problem. The lower the value on the chart, the more the SNP is a potential concern.
(Hannes Rost, 2008)
Finding where proline and serine meet, we see that this has a value of -1. Here are the possible values from the chart:
This is not a -4, -3, or -2, so it's not dire - but it's not marvelous, either. This kind of inconclusive result is bothersome enough that we can take it to the final and most complicated step to discover if this mutation is really going to cause clotting issues...
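If you'd rather not squint at the chart, the lookup can be sketched in a few lines of Python. This is only a sketch: the table below is a small hand-copied subset of BLOSUM62 (check it against a full chart before relying on it), and the names blosum62_score and describe, plus the cut-offs in describe, are my own shorthand for the reasoning above rather than any official scale.

```python
# Hand-copied subset of BLOSUM62 scores (assumption: verify against a full chart).
# The chart is symmetric, so each pair is stored only once.
BLOSUM62_SUBSET = {
    ("P", "P"): 7,
    ("S", "S"): 4,
    ("P", "S"): -1,
    ("I", "V"): 3,
    ("D", "E"): 2,
    ("K", "R"): 2,
    ("P", "W"): -4,
}


def blosum62_score(original, replacement):
    """Look up the score for swapping one amino acid (one-letter symbol) for another."""
    a, b = original.upper(), replacement.upper()
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    if (b, a) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(b, a)]
    raise KeyError(f"pair {a}/{b} is not in this hand-copied subset")


def describe(score):
    """Rough reading of a score; these cut-offs are an assumption, not a standard."""
    if score <= -2:
        return "potential concern"
    if score < 0:
        return "inconclusive"
    return "likely tolerated"


if __name__ == "__main__":
    score = blosum62_score("P", "S")  # proline -> serine, as in the VWF SNP
    print(score, describe(score))
```

For the full chart, Biopython can load it with Bio.Align.substitution_matrices.load("BLOSUM62"), which saves copying values by hand.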
Using a Three-Dimensional Modelling Program:
Let's start with a quote from Part II of this series:
Amino acids hook together to make every protein in your body. If the protein is still functional with the alteration, it's all good. If the alteration changes the shape of the protein to make it harder to fit into receptors or carry molecules - the way hemoglobin carries oxygen and carbon dioxide, for example - or not fit into the proper receptors at all - or makes it unstable so that the protein is fragile - then you could be in for a world of trouble.
What we're really looking for is the answer to the question: does this amino acid substitution significantly change the shape of the protein - enough to affect its function? Or perhaps the shift is minor but in a very important location, like at a receptor site?
Three-dimensional modelling can answer that question!
The SAP Disease-Association Predictor
I used the SAP Disease-Association Predictor. You will have to get an account with them using a valid email address, but the account is otherwise free, and so far they haven't sent me a single advertisement for anything. After you've sent them an email and they've replied with a link to activate your account, you will find yourself here:
This is the introductory page. Read about what an SNP is and all kinds of other introductory information! Or click right where it says 'Run sapred_seq' on the left-hand side.
Next, you want to be able to '* input fasta file'. But we don't yet have a FASTA file. We can, however, get one.
Generating the FASTA File:
Not actually tough, I promise. First, you're going to go to www.uniprot.org to search for the gene, but keep the previous window open. You'll need the gene's proper abbreviation, which is usually just a few letters (sometimes with a number). In the case of Von Willebrand Factor, it's VWF. Rather unsurprisingly. Put that into the search box and click 'search'.
You should see the following screen if you clicked on the proper thing:
This is pretty interesting; it says that VWF is involved in a variety of molecular and biological processes, and you can learn a lot about them. I'd save that for if the VWF SNP is significant. If it's not, we're wasting precious research time.
So: see the bright blue area to the left that reads 'Display'? Find 'Sequences' and click on it.
A bunch of letters should pop up. Believe it or not, each letter stands for one amino acid, and together they list the amino acids in the correct order in the VWF protein. This is what we want. Select everything on the screen and copy it.
Go back to your (hopefully still open) SAP Disease Predictor. Click on input fasta and paste your sequence into the area where it says 'OR paste into window'.
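If you want to double-check what you copied before pasting it, here is a minimal Python sketch that splits FASTA text into its header line and its sequence. The function name parse_fasta is my own, and the toy record in the example is made up for illustration; it is not the real VWF sequence.

```python
def parse_fasta(text):
    """Return (header, sequence) for the first record in a FASTA-formatted string.

    The header is the line starting with '>' (returned without the '>').
    The sequence is every following line joined together, up to the next '>'.
    """
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; stop after the first
            header = line[1:]
        else:
            chunks.append(line)
    return header, "".join(chunks)


if __name__ == "__main__":
    # Made-up toy record, only to show the shape of the format.
    toy = ">toy|EXAMPLE made-up protein\nMIPAR\nFAGVL\n"
    header, sequence = parse_fasta(toy)
    print(header)
    print(sequence, len(sequence))
```

Run on the text you copied, the length of the sequence tells you how many amino acids the protein has, which is a quick way to spot a partial copy.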
Naming the Mutation:
Finally, we need the name of the mutation. It combines the one-letter symbol of the 'correct' amino acid, its position in the protein, and the one-letter symbol of the amino acid that's taken its place in this SNP. The quickest place to find it is dbSNP: search for rs61750615 and click on the link for rs61750615. Once again, you'll see that the mutation is Pro2063Ser. Here, ditching all but the first letter of each amino acid turns this into P2063S.
This is the name of the mutation, so far as SAP Disease Predictor is concerned. For some amino acids the first-letter shortcut won't work; for example, leucine and lysine can't both be represented with 'L' (leucine is L, lysine is K). Go here to see how to write the correct one-letter symbol for the amino acid you need.
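The conversion from the long name to the short one can also be sketched in Python, which avoids the leucine/lysine trap. The table holds the standard one-letter symbols for the 20 common amino acids; the function name short_mutation_name is my own, and accepting an optional 'p.' prefix is an assumption about how the name might be written when you copy it.

```python
import re

# Standard three-letter to one-letter symbols for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def short_mutation_name(long_name):
    """Turn a name like 'Pro2063Ser' (or 'p.Pro2063Ser') into 'P2063S'."""
    match = re.fullmatch(r"(?:p\.)?([A-Za-z]{3})(\d+)([A-Za-z]{3})", long_name.strip())
    if match is None:
        raise ValueError(f"unrecognised mutation name: {long_name!r}")
    original, position, replacement = match.groups()
    # capitalize() normalises 'PRO' or 'pro' to 'Pro' before the lookup;
    # an unknown three-letter code raises KeyError.
    return (
        THREE_TO_ONE[original.capitalize()]
        + position
        + THREE_TO_ONE[replacement.capitalize()]
    )


if __name__ == "__main__":
    print(short_mutation_name("Pro2063Ser"))  # the VWF SNP from this post
    print(short_mutation_name("Lys10Leu"))    # made-up example: lysine is K, not L
```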
Generating the Report:
If you're planning on doing more than one of these, click where it says 'Untitled' and change the file name to something meaningful like 'Von Willebrand SNP'.
Input the name of the mutation where it says to do so.
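Before submitting, it can be worth confirming that the 'correct' amino acid in the mutation name really does sit at that position in the sequence you pasted. Here is a minimal sketch, assuming the position in the name counts from 1 and uses the same numbering as your FASTA sequence (which may not hold if the SNP was numbered against a different version of the protein). The function name check_mutation and the toy sequence are my own inventions.

```python
import re


def check_mutation(sequence, mutation):
    """Return True if the sequence has the expected original amino acid.

    'mutation' is a short name such as 'P2063S': original amino acid,
    1-based position, replacement amino acid.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised mutation name: {mutation!r}")
    original = match.group(1)
    position = int(match.group(2))
    if position < 1 or position > len(sequence):
        return False  # position falls outside the protein
    # Positions count from 1, Python indexes from 0.
    return sequence[position - 1].upper() == original


if __name__ == "__main__":
    toy_sequence = "MIPARFAGVL"  # made-up sequence, not VWF
    print(check_mutation(toy_sequence, "P3S"))
    print(check_mutation(toy_sequence, "S3P"))
```

If this comes back False for your real sequence and mutation, the numbering probably doesn't line up, and it's worth sorting that out before waiting on the report.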
So now we wait. VWF is actually a pretty large protein, so the analysis is going to take a few minutes. The program will send you an email once the analysis is complete, so do something that'll take about five minutes.
...to be continued