Use the Needleman-Wunsch algorithm (global alignment) to find the optimal alignment of the protein sequences "LAD" and "CAVA". The scoring rule is: use the following scoring matrix (PAM250) for matches and mismatches (for example, an RR matched pair scores 6, while an RC mismatched pair scores -4), and -1 for each gap.
1. Please provide the alignment matrix showing the scores for the best path (10pts)
2. Trace back the best path (any one path that leads to the best score) and give the corresponding alignment (5pts)
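For reference, the scoring rule above can be written as the usual Needleman-Wunsch recurrence. This is a sketch under one assumption: that "-1 for each gap" means a linear penalty of -1 per gapped position. The symbols F (score matrix), s (the PAM250 score for a residue pair), a and b (the two sequences), and d (the gap score) are notation introduced here, not part of the assignment.

```latex
\begin{aligned}
F(0,0) &= 0, \qquad F(i,0) = i\,d, \qquad F(0,j) = j\,d, \qquad d = -1,\\
F(i,j) &= \max
\begin{cases}
F(i-1,\,j-1) + s(a_i,\, b_j) & \text{(align } a_i \text{ with } b_j\text{)}\\
F(i-1,\,j) + d & \text{(gap in } b\text{)}\\
F(i,\,j-1) + d & \text{(gap in } a\text{)}
\end{cases}
\end{aligned}
```

The optimal global score is the bottom-right cell, F(m, n), where m and n are the lengths of the two sequences.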
If you decide to write your own program to do the above (just for fun), click here to download the PAM250 substitution matrix. Print out your code. (+3pts)
Hints for R:
PAM250 <- read.table("mypath/PAM250.dat", header=TRUE)  # read in the matrix
PAM250[2,1]      # -2 (row 2 is R, column 1 is A)
PAM250['R','A']  # the same cell as above, addressed by row and column names
# Other functions you may use include: nchar('LADD') for the length of a character string,
# substr('LADD', 2, 3) for a substring, and matrix(0, m, n) for an m-by-n matrix filled with zeros.
# For the traceback, you can use an extra matrix to keep track of the "arrows".
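
The hints above can be combined into a short R sketch. It is one possible layout, not a reference solution: the function name needleman_wunsch, its arguments, and the "diag"/"up"/"left" arrow labels are all hypothetical, and the small `toy` matrix (match +2, mismatch -1) is a stand-in used only so the example runs without the PAM250 file. It assumes a linear gap score and returns just one of the possibly several optimal alignments.

```r
# Needleman-Wunsch global alignment (illustrative sketch).
# `sub` is any substitution matrix or data frame with residue letters as
# row and column names; `gap` is the score added for each gapped position.
needleman_wunsch <- function(a, b, sub, gap = -1) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  m <- length(x)
  n <- length(y)

  score <- matrix(0, m + 1, n + 1)   # alignment scores
  arrow <- matrix("", m + 1, n + 1)  # extra matrix holding the "arrows"
  score[, 1] <- gap * (0:m)
  score[1, ] <- gap * (0:n)
  arrow[-1, 1] <- "up"
  arrow[1, -1] <- "left"

  for (i in seq_len(m)) {
    for (j in seq_len(n)) {
      cand <- c(diag = score[i, j] + as.numeric(sub[x[i], y[j]]),
                up   = score[i, j + 1] + gap,
                left = score[i + 1, j] + gap)
      score[i + 1, j + 1] <- max(cand)
      arrow[i + 1, j + 1] <- names(cand)[which.max(cand)]  # ties: first wins
    }
  }

  # Trace back from the bottom-right cell to the top-left cell.
  i <- m + 1
  j <- n + 1
  top <- character(0)
  bottom <- character(0)
  while (i > 1 || j > 1) {
    move <- arrow[i, j]
    if (move == "diag") {
      top <- c(x[i - 1], top); bottom <- c(y[j - 1], bottom)
      i <- i - 1; j <- j - 1
    } else if (move == "up") {
      top <- c(x[i - 1], top); bottom <- c("-", bottom)
      i <- i - 1
    } else {
      top <- c("-", top); bottom <- c(y[j - 1], bottom)
      j <- j - 1
    }
  }

  list(score = score[m + 1, n + 1],
       matrix = score,
       alignment = c(paste(top, collapse = ""), paste(bottom, collapse = "")))
}

# Toy substitution matrix (NOT PAM250): +2 for a match, -1 for a mismatch.
aa <- c("L", "A", "D", "C", "V")
toy <- matrix(-1, 5, 5, dimnames = list(aa, aa))
diag(toy) <- 2

res <- needleman_wunsch("LAD", "CAVA", toy, gap = -1)
print(res$matrix)
print(res$alignment)
```

To apply the sketch to the exercise itself, pass the PAM250 table read with read.table in place of `toy`; the scores and the alignment will then differ from the toy run.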
