Global Alignment with Scoring Matrix and Affine Gap Penalty
Mind the Gap
In “Global Alignment with Scoring Matrix”, we considered a linear gap penalty, in which each inserted or deleted symbol contributes the same amount to the alignment score. However, as we mentioned in “Global Alignment with Constant Gap Penalty”, a single large insertion or deletion (due to a rearrangement) is then punished very strictly, and so we proposed a constant gap penalty.
Yet large insertions occur far more rarely than small insertions and deletions. As a result, a more practical method of penalizing gaps is to use a hybrid of these two types of penalties in which we charge one constant penalty for beginning a gap and another constant penalty for every additional symbol added or deleted.
Problem
An affine gap penalty is written as a + b⋅(L − 1), where L is the length of the gap, a is a positive constant called the gap opening penalty, and b is a positive constant called the gap extension penalty.
We can view the gap opening penalty as charging for the first gap symbol, and the gap extension penalty as charging for each subsequent symbol added to the gap.
For example, if a = 11 and b = 1, then a gap of length 1 would be penalized by 11 (for an average cost of 11 per gap symbol), whereas a gap of length 100 would be penalized by 110 (for an average cost of 1.10 per gap symbol).
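The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name `affine_gap_penalty` and its default arguments are illustrative assumptions, not part of the problem statement.

```python
def affine_gap_penalty(length: int, opening: int = 11, extension: int = 1) -> int:
    """Penalty a + b*(L-1) for one gap of `length` symbols (hypothetical helper)."""
    if length < 1:
        raise ValueError("a gap must contain at least one symbol")
    return opening + extension * (length - 1)


# Total penalty and average cost per gap symbol for a few gap lengths.
for gap_length in (1, 3, 100):
    penalty = affine_gap_penalty(gap_length)
    print(gap_length, penalty, penalty / gap_length)
```

The average cost per symbol falls as the gap grows, which is the point of the affine scheme: opening a gap is expensive, lengthening it is cheap.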
Consider the strings “PRTEINS” and “PRTWPSEIN”. If we use the BLOSUM62 scoring matrix and an affine gap penalty with a = 11 and b = 1, then we obtain the following optimal alignment.
PRT---EINS
|||   |||
PRTWPSEIN-
Matched symbols contribute a total of 32 to the calculation of the alignment’s score, and the gaps cost 13 and 11 respectively, yielding a total score of 8.
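As a sanity check, the score of this particular alignment can be recomputed by hand or with a short script. The sketch below is not a general scorer: the names `MATCH_SCORES` and `score_alignment` are assumptions made for illustration, and only the six BLOSUM62 diagonal entries that this alignment needs are included.

```python
# BLOSUM62 diagonal entries for the residues matched in this example only.
MATCH_SCORES = {"P": 7, "R": 5, "T": 5, "E": 5, "I": 4, "N": 6}
GAP_OPEN, GAP_EXTEND = 11, 1


def score_alignment(row1: str, row2: str) -> int:
    """Score two equal-length gapped strings under the affine penalty."""
    if len(row1) != len(row2):
        raise ValueError("alignment rows must have equal length")
    total = 0
    gap_row = None  # row (1 or 2) holding the currently open gap, if any
    for x, y in zip(row1, row2):
        if x == "-" or y == "-":
            row = 1 if x == "-" else 2
            # Continuing a gap in the same row is an extension; otherwise it opens.
            total -= GAP_EXTEND if gap_row == row else GAP_OPEN
            gap_row = row
        else:
            if x != y:
                raise ValueError("this sketch only knows identical pairs")
            total += MATCH_SCORES[x]
            gap_row = None
    return total


print(score_alignment("PRT---EINS", "PRTWPSEIN-"))  # 32 - 13 - 11 = 8
```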
Given: Two protein strings s and t in FASTA format (each of length at most 100 aa).
Return: The maximum alignment score between s and t, followed by two augmented strings s′ and t′ representing an optimal alignment of s and t. Use:
- The BLOSUM62 scoring matrix.
- Gap opening penalty equal to 11.
- Gap extension penalty equal to 1.
Sample Dataset
>Rosalind_49
PRTEINS
>Rosalind_47
PRTWPSEIN
Sample Output
8
PRT---EINS
PRTWPSEIN-
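Solution: Dynamic Programming
The problem charges the opening penalty for the first symbol of a gap and only the extension penalty for each further consecutive symbol. The solution is therefore a variant of the dynamic programming used in the previous problem: instead of one score per cell, dp[i][j][k] keeps three scores for aligning s[:i] with t[:j], depending on whether the last column is a match or mismatch (k = 0), a gap in s (k = 1), or a gap in t (k = 2).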
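Written as a recurrence, the three-state dynamic programming looks roughly as follows. The symbols M, X, Y and σ are notation introduced here for illustration (they correspond to k = 0, 1, 2 in the code and to the BLOSUM62 lookup), with a = 11 and b = 1 taken from the problem statement.

```latex
\begin{aligned}
M(i,j) &= \max\{\,M(i-1,j-1),\; X(i-1,j-1),\; Y(i-1,j-1)\,\} + \sigma(s_i, t_j),\\
X(i,j) &= \max\{\,M(i,j-1) - a,\; X(i,j-1) - b,\; Y(i,j-1) - a\,\},\\
Y(i,j) &= \max\{\,M(i-1,j) - a,\; X(i-1,j) - a,\; Y(i-1,j) - b\,\},\\[4pt]
M(0,0) &= 0,\qquad X(0,j) = -a - b\,(j-1),\qquad Y(i,0) = -a - b\,(i-1)\quad (i, j \ge 1),
\end{aligned}
```

All other border entries are −∞, and the answer is max{M(m,n), X(m,n), Y(m,n)} for |s| = m and |t| = n.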
from typing import List
ch2idx = lambda ch: ord(ch) - ord('A')
BLOSUM62 = """
A C D E F G H I K L M N P Q R S T V W Y
A 4 0 -2 -1 -2 0 -2 -1 -1 -1 -1 -2 -1 -1 -1 1 0 0 -3 -2
C 0 9 -3 -4 -2 -3 -3 -1 -3 -1 -1 -3 -3 -3 -3 -1 -1 -1 -2 -2
D -2 -3 6 2 -3 -1 -1 -3 -1 -4 -3 1 -1 0 -2 0 -1 -3 -4 -3
E -1 -4 2 5 -3 -2 0 -3 1 -3 -2 0 -1 2 0 0 -1 -2 -3 -2
F -2 -2 -3 -3 6 -3 -1 0 -3 0 0 -3 -4 -3 -3 -2 -2 -1 1 3
G 0 -3 -1 -2 -3 6 -2 -4 -2 -4 -3 0 -2 -2 -2 0 -2 -3 -2 -3
H -2 -3 -1 0 -1 -2 8 -3 -1 -3 -2 1 -2 0 0 -1 -2 -3 -2 2
I -1 -1 -3 -3 0 -4 -3 4 -3 2 1 -3 -3 -3 -3 -2 -1 3 -3 -1
K -1 -3 -1 1 -3 -2 -1 -3 5 -2 -1 0 -1 1 2 0 -1 -2 -3 -2
L -1 -1 -4 -3 0 -4 -3 2 -2 4 2 -3 -3 -2 -2 -2 -1 1 -2 -1
M -1 -1 -3 -2 0 -3 -2 1 -1 2 5 -2 -2 0 -1 -1 -1 1 -1 -1
N -2 -3 1 0 -3 0 1 -3 0 -3 -2 6 -2 0 0 1 0 -3 -4 -2
P -1 -3 -1 -1 -4 -2 -2 -3 -1 -3 -2 -2 7 -1 -2 -1 -1 -2 -4 -3
Q -1 -3 0 2 -3 -2 0 -3 1 -2 0 0 -1 5 1 0 -1 -2 -2 -1
R -1 -3 -2 0 -3 -2 0 -3 2 -2 -1 0 -2 1 5 -1 -1 -3 -3 -2
S 1 -1 0 0 -2 0 -1 -2 0 -2 -1 1 -1 0 -1 4 1 -2 -3 -2
T 0 -1 -1 -1 -2 -2 -2 -1 -1 -1 -1 0 -1 -1 -1 1 5 0 -2 -2
V 0 -1 -3 -2 -1 -3 -3 3 -2 1 1 -3 -2 -2 -3 -2 0 4 -3 -1
W -3 -2 -4 -3 1 -2 -2 -3 -3 -2 -1 -4 -4 -2 -3 -3 -2 -3 11 2
Y -2 -2 -3 -2 3 -3 2 -1 -2 -1 -1 -2 -3 -1 -2 -2 -2 -1 2 7
"""
cost = [[0] * 26 for _ in range(26)]
head, *nxts = (line.split() for line in BLOSUM62.splitlines() if line.strip())
# print(head, nxts)
# The first line is the header; each remaining line is a row label followed by its scores
for base2, *nums in nxts:
    for base1, num in zip(head, nums):
        cost[ch2idx(base1)][ch2idx(base2)] = int(num)
openGAP, extensionGAP = -11, -1
class Solution:
    def globalAligmment(self, s: str, t: str) -> int:
        """Global alignment with an affine gap penalty."""
        m, n = len(s), len(t)
        # dp[i][j][k]: best score for s[:i] vs t[:j] when the last column is
        # a match/mismatch (k=0), a gap in s (k=1) or a gap in t (k=2)
        dp = [[[float('-inf')] * 3 for _ in range(n + 1)] for _ in range(m + 1)]
        dp[0][0][0] = 0
        # 1. Initialise the borders
        for i in range(1, m + 1):  # t is empty: s[:i] faces one gap of length i
            dp[i][0][2] = openGAP + (i - 1) * extensionGAP
        for j in range(1, n + 1):  # s is empty: t[:j] faces one gap of length j
            dp[0][j][1] = openGAP + (j - 1) * extensionGAP
        # 2. State transitions
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                dp[i][j][0] = max(
                    dp[i-1][j-1][0], dp[i-1][j-1][1], dp[i-1][j-1][2]
                ) + cost[ch2idx(s[i-1])][ch2idx(t[j-1])]
                # gap in s: t[j-1] is aligned with '-'
                dp[i][j][1] = max(
                    dp[i][j-1][0] + openGAP,       # opening penalty
                    dp[i][j-1][1] + extensionGAP,  # extension penalty
                    dp[i][j-1][2] + openGAP,       # opening penalty
                )
                # gap in t: s[i-1] is aligned with '-'
                dp[i][j][2] = max(
                    dp[i-1][j][0] + openGAP,       # opening penalty
                    dp[i-1][j][1] + openGAP,       # opening penalty
                    dp[i-1][j][2] + extensionGAP,  # extension penalty
                )

        def printPath(i: int, j: int, k: int, path1: List[str], path2: List[str]) -> None:
            # Backtrack from state (i, j, k) and print every optimal alignment
            if i + j == 0:
                print(''.join(reversed(path1)))
                print(''.join(reversed(path2)))
                print()
                return
            if 0 == k:
                matchScore = cost[ch2idx(s[i-1])][ch2idx(t[j-1])]
                for prev in range(3):
                    if dp[i][j][0] == dp[i-1][j-1][prev] + matchScore:
                        path1.append(s[i-1])
                        path2.append(t[j-1])
                        printPath(i - 1, j - 1, prev, path1, path2)
                        path1.pop()
                        path2.pop()
            elif 1 == k:
                for prev in range(3):
                    if dp[i][j][1] == dp[i][j-1][prev] + (openGAP if prev != 1 else extensionGAP):
                        path1.append('-')
                        path2.append(t[j-1])
                        printPath(i, j - 1, prev, path1, path2)
                        path1.pop()
                        path2.pop()
            else:
                for prev in range(3):
                    if dp[i][j][2] == dp[i-1][j][prev] + (openGAP if prev != 2 else extensionGAP):
                        path1.append(s[i-1])
                        path2.append('-')
                        printPath(i - 1, j, prev, path1, path2)
                        path1.pop()
                        path2.pop()

        ans = max(dp[m][n])
        print(ans)
        for k in range(3):
            if dp[m][n][k] == ans:
                printPath(m, n, k, [], [])
        return ans
fasta = """
>Rosalind_3162
WSPVKMGQCASMVGVEFEWSCRMQSPVQKLGLFKYDDGEVVGFKWHRNKRVHWMDFIPLN
YEYLLVQARSRSRTTKKITEKFKF
>Rosalind_5885
WSPVKMGQCMVGVEFDNSKYEKTNGDYDKLGCKKYDDGEHCGFKCDFRPKRFCWRWIAWE
HNPIAGEYMYLLVQARSRSRTKKITEKFKL
"""
"""
>Rosalind_49
PRTEINS
>Rosalind_47
PRTWPSEIN
"""
import re
fasta = [seq.replace('\n', '') for seq in re.split(r'>.*', fasta) if seq.replace('\n', '')]
print(Solution().globalAligmment(*fasta))
#include <bits/stdc++.h>
using namespace std;
int blosum62[26][26] =
{
{4, 0, 0, -2, -1, -2, 0, -2, -1, 0, -1, -1, -1, -2, 0, -1, -1, -1, 1, 0, 0, 0, -3, 0, -2, 0},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
{0, 0, 9, -3, -4, -2, -3, -3, -1, 0, -3, -1, -1, -3, 0, -3, -3, -3, -1, -1, 0, -1, -2, 0, -2, 0},
{-2, 0, -3, 6, 2, -3, -1, -1, -3, 0, -1, -4, -3, 1, 0, -1, 0, -2, 0, -1, 0, -3, -4, 0, -3, 0},
{-1, 0, -4, 2, 5, -3, -2, 0, -3, 0, 1, -3, -2, 0, 0, -1, 2, 0, 0, -1, 0, -2, -3, 0, -2, 0},
{-2, 0, -2, -3, -3, 6, -3, -1, 0, 0, -3, 0, 0, -3, 0, -4, -3, -3, -2, -2, 0, -1, 1, 0, 3, 0},
{0, 0, -3, -1, -2, -3, 6, -2, -4, 0, -2, -4, -3, 0, 0, -2, -2, -2, 0, -2, 0, -3, -2, 0, -3, 0},
{-2, 0, -3, -1, 0, -1, -2, 8, -3, 0, -1, -3, -2, 1, 0, -2, 0, 0, -1, -2, 0, -3, -2, 0, 2, 0},
{-1, 0, -1, -3, -3, 0, -4, -3, 4, 0, -3, 2, 1, -3, 0, -3, -3, -3, -2, -1, 0, 3, -3, 0, -1, 0},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
{-1, 0, -3, -1, 1, -3, -2, -1, -3, 0, 5, -2, -1, 0, 0, -1, 1, 2, 0, -1, 0, -2, -3, 0, -2, 0},
{-1, 0, -1, -4, -3, 0, -4, -3, 2, 0, -2, 4, 2, -3, 0, -3, -2, -2, -2, -1, 0, 1, -2, 0, -1, 0},
{-1, 0, -1, -3, -2, 0, -3, -2, 1, 0, -1, 2, 5, -2, 0, -2, 0, -1, -1, -1, 0, 1, -1, 0, -1, 0},
{-2, 0, -3, 1, 0, -3, 0, 1, -3, 0, 0, -3, -2, 6, 0, -2, 0, 0, 1, 0, 0, -3, -4, 0, -2, 0},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
{-1, 0, -3, -1, -1, -4, -2, -2, -3, 0, -1, -3, -2, -2, 0, 7, -1, -2, -1, -1, 0, -2, -4, 0, -3, 0},
{-1, 0, -3, 0, 2, -3, -2, 0, -3, 0, 1, -2, 0, 0, 0, -1, 5, 1, 0, -1, 0, -2, -2, 0, -1, 0},
{-1, 0, -3, -2, 0, -3, -2, 0, -3, 0, 2, -2, -1, 0, 0, -2, 1, 5, -1, -1, 0, -3, -3, 0, -2, 0},
{1, 0, -1, 0, 0, -2, 0, -1, -2, 0, 0, -2, -1, 1, 0, -1, 0, -1, 4, 1, 0, -2, -3, 0, -2, 0},
{0, 0, -1, -1, -1, -2, -2, -2, -1, 0, -1, -1, -1, 0, 0, -1, -1, -1, 1, 5, 0, 0, -2, 0, -2, 0},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
{0, 0, -1, -3, -2, -1, -3, -3, 3, 0, -2, 1, 1, -3, 0, -2, -2, -3, -2, 0, 0, 4, -3, 0, -1, 0},
{-3, 0, -2, -4, -3, 1, -2, -2, -3, 0, -3, -2, -1, -4, 0, -4, -2, -3, -3, -2, 0, -3, 11, 0, 2, 0},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
{-2, 0, -2, -3, -2, 3, -3, 2, -1, 0, -2, -1, -1, -2, 0, -3, -1, -2, -2, -2, 0, -1, 2, 0, 7, 0},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
};
int openGAP = -11, extensionGAP = -1;
// Note: unlike the Python code above, this function computes a local alignment:
// every state is clamped at 0 and the traceback starts from the best-scoring cell.
void localAligmentWithAllinPenalty(string &s, string &t) {
    int m = s.size(), n = t.size();
    // State initialisation: every cell starts at 0
    vector<vector<vector<int>>> dp(m + 1, vector<vector<int>> (n + 1, vector<int> (3, 0)));
    int ii = 0, jj = 0, kk = 0;
    for (int i = 1; i <= m; ++i) {
        for (int j = 1; j <= n; ++j) {
            // No insertion or deletion in the last column
            dp[i][j][0] = max(
                max({dp[i-1][j-1][0], dp[i-1][j-1][1], dp[i-1][j-1][2]}) + blosum62[s[i-1]-'A'][t[j-1]-'A'],
                0
            );
            // Run of insertions (gap in s)
            dp[i][j][1] = max({
                0,
                dp[i][j-1][0] + openGAP,       // opening penalty
                dp[i][j-1][1] + extensionGAP,  // extension penalty
                dp[i][j-1][2] + openGAP
            });
            // Run of deletions (gap in t)
            dp[i][j][2] = max({
                0,
                dp[i-1][j][0] + openGAP,       // opening penalty
                dp[i-1][j][1] + openGAP,
                dp[i-1][j][2] + extensionGAP   // extension penalty
            });
            // Track the cell with the best local score
            for (int k = 0; k <= 2; ++k) {
                if (dp[i][j][k] >= dp[ii][jj][kk]) {
                    ii = i; jj = j; kk = k;
                }
            }
        }
    }
    // Backtrack and print one of the best paths
    cout << dp[ii][jj][kk] << endl;
    string path1, path2;
    while ((0 != ii + jj) && (0 != dp[ii][jj][kk])) {
        // cout << ii << " " << jj << " " << kk << endl;
        if (0 == kk) {
            int matchScore = blosum62[s[ii-1]-'A'][t[jj-1]-'A'];
            for (int k = 0; k <= 2; ++k) {
                if (dp[ii][jj][kk] == dp[ii-1][jj-1][k] + matchScore) {
                    path1.push_back(s[ii-1]);
                    path2.push_back(t[jj-1]);
                    --ii, --jj; kk = k;
                    break;
                }
            }
        } else if (1 == kk) {
            for (int k = 0; k <= 2; ++k) {
                if (dp[ii][jj][kk] == dp[ii][jj-1][k] + (k != kk ? openGAP : extensionGAP)) {
                    path1.push_back('-');
                    path2.push_back(t[jj-1]);
                    --jj; kk = k;
                    break;
                }
            }
        } else {
            for (int k = 0; k <= 2; ++k) {
                if (dp[ii][jj][kk] == dp[ii-1][jj][k] + (k != kk ? openGAP : extensionGAP)) {
                    path1.push_back(s[ii-1]);
                    path2.push_back('-');
                    --ii; kk = k;
                    break;
                }
            }
        }
    }
    reverse(path1.begin(), path1.end());
    reverse(path2.begin(), path2.end());
    cout << path1 << "\n" << path2 << endl;
}
// trim from start (in place)
static inline void ltrim(string &s) {
    s.erase(s.begin(), find_if(s.begin(), s.end(), [](unsigned char ch) {
        return !isspace(ch);
    }));
}
// trim from end (in place)
static inline void rtrim(string &s) {
    s.erase(find_if(s.rbegin(), s.rend(), [](unsigned char ch) {
        return !isspace(ch);
    }).base(), s.end());
}
// trim from both ends (in place)
static inline void trim(string &s) {
    ltrim(s);
    rtrim(s);
}
vector<string> readFasta() {
    vector<string> seqs;
    string seq, line;
    while (cin >> line) {
        trim(line);
        if (!line.empty() && line[0] == '>') {
            if (!seq.empty()) {
                seqs.push_back(seq);
                seq.clear();
            }
        } else {
            seq += line;
        }
    }
    seqs.push_back(seq);
    return seqs;
}
int main() {
    vector<string> seqs = readFasta();
    // for (string &seq: seqs) cout << seq << endl;
    localAligmentWithAllinPenalty(seqs[0], seqs[1]);
    return 0;
}
>Rosalind_3230
MYCNADFSNKVVCVRAQMLTQIQSGCSPNRRLTTCVDYTDENLTCHTGSNMAKHQGHNYY
VCRAPSECNAFSIRYLYICDECPALNMKPDCCYDTPYSDSIMLTCEQSTNGRPVFCGIWY
LRKHNMISICTIVWTKEAGKRIDNVHRGSPVMGQFKMYGGYYVRKMNVVLQPGDAAVNSN
NQGINDSLYASWVPAMYNTYDCRDPLDLYVMPMPVPAHFIIVSAEPRHYPAKRRPRQMDY
YDKAVPVKHCCITKDGLFWEMPMYSWEHMGAKQREQPFITLPCPYIPANPQVCKLAAAYF
RYHKYHWMTKDRPKDWEWVRNIWCRSKNHTESFREKTHQCAQSDLQHYRYPHSCLKGIVS
QDKADATMWSQKSTGTHRDRNSQNAGEMSQRQYRMNTVGGRHDQGCIPMWFAEYKMLDVL
MHVATMDSIKEKPFGQNKNFPFMLRRPKGIWRRCGGFSKYEIERYAMMHDLGHIDEYWDD
CQSFLDSSWYGVDVKTENTWQRAGHCTTKRPGMSTYLSWIDWMADQAWMCDMTGGAEHFY
CDLLFWRIRVVVEMTGHMALDITGYAPDKDPILYNECGHTAGFVVIDEKKQMGPCWWNEK
DIRISVSSKFCEKFDLCNHICKVYLDASLVWSLKIPPQVVGHDAAVELACDFLVTDRRQG
ELAENRGQASTGNRHFMWDHIGYDTNRLMVWAIGAMPICWMADDTKTCNVSAEKTTPDSY
NQHKNMYHCKEMDEACVLTYWAKHMTVEVGKFCNGTLLKPVPCEIMVQPHQSVHRMINLN
TIKKRCRLDCMMHIKTKVTQVSYGILDTFLSVKWMVTAFHRCTLQDSWVKIPFYMSANTH
PFSDHVAFSYCHARRTGDNGWTRCKSEDKDYMHNYEETHHQHESCWVHNEMSAETVIRCT
KLVMMEFFDSGYDCIWAAEIMHTGFYGSAFLMHGRFAHFYKYWWHCACDVQLRGCFAQHT
NDSSLPQCDVPEMPVQSTYSMPVRKHVLDSYYMVTISKCPMIFKIPRLVLVSRHTAAGFQ
CSHTIQGYSSNAFQYESRLWERSSREKWEALRRGAECCNCSGHDIGGPKIINGQWAFRVY
PQLFSIIGGEVREMVSNEKEHYFEFMGHVWHYWEAIHFGMMIDNDANNGWALLRLKKGAC
TYQTTEESVFDDYTIPLLHHQRWMQSIWYWWSEDCKRIPWGESHGQTRDVRHRSWLFNSV
TLRHVGYMDDDAGGGFAWMECIELDKLYYQNIGWHDEIKGQYWANFNPQCPCIRKQDASW
GRWSTVWLMMRMSQHWCNNSVPELAQCRRINCHHAIVMWHWHDANAFNIDYCRDPVHYVA
VFPQSMREAYYCMLQISYNVEITPHEFRGYEQVRSKFINTFKPGFDMDPPKEWWLMWRQT
YQARCKVLACADDKKYMTYSVCMNAWIPIAGDSGKVGHFYWTWSCGCCPGDFTIHYNLTK
VKWQVYHIDSWEPVEKAWIDPQYGEFVRVYDDQGEFCVISDNIRYKLPHHLRQIPFGIMY
DEVYPHRECITKHKCDGCARDWWAPDPEDVTHTIPRMRNKTLMHNNKNHQAYFQCEGWEM
RRFWHELCTTEGFYHTCQMLTLSHTCYFDAKCCSQCMYLEGKHPPKFARTRCERYAPCYC
FIWALWTDGGYLFNENFIYENGLFHYQHLMEVCITYDTWTCYKLAEPCSKDPNAPSTFKT
TWAPPPWYYTSLIISQSGYHSECVFCWMSMYNFNSRRWPTTDCLDKFNPTIEPGSDTSMG
ELAPVCRWMPDEWFWSLNSIASPNDVRQFLTFVQDCGIQNFRRAEGCANTKVDRGHEQFD
YNTWATIIWYDQFMSIDDQGVLHVCHVRMAYCYQFESRCVLFEIMHYTQQGPFHMEMITM
ANPTPSSVQAIMTCETHATLDSDGIHKFKVMHQPWKAKKCIMNRFENDTAQDSCGYAEYR
MEQFIFTACSVVAGENKCKASVDSVEVTWMCNPASKYPTWVWIQRPKTIHFTWDKMEKTQ
DAIELCYACDDQECIYGAGSMQDIAYGRWTGSWEIWAWWIFNFFMFDSVDDGERRFAGFF
VYASMLYFIPKMIVEYSEFEWYQHLASKFTMVYDHAIICIGKNDWEHQRLEIFMQMMNAE
NENCGFGQHGTQQRMEWGLMSYRLETLMPSHKPDSDKQETGIINLRLMMSCHNGEAKHSD
WRLDWQCTDFNAREKPCRVASGYIRCTFQKVWNTPRPESQCCQYIKNLCDNVSMQHSRRW
KATTLSQTMSNRACKYTDKKDDNRDGYMNTPMCNFGHMQVYNEQWSHEEVDIPYVVMQSY
AGITLHMVAIAWKYQGWDVPMLGWDWHWRAFCWIDCIVAAWLIGKFRGSGDRYKCCAVLD
WHPHRRTIFCYDFFKYSATSRDDTRWEWKYTAIVMMGIMRINQKNNHRWTKTDQEEDQAV
IEIDSWPIKKARLFSEMLMMHVNSSRVFYNQTWNFIQVGCFMEGIVVCSKDALQTQQVHR
QCDALFLAAYMERAGYEYWNFTKAPDWYWDCQPVMRGCTTAPPEDNPIPKALHSEHHIQR
YPYLVCRGLNLPCNTNGVDCKVQQGVYMHSCAISFCRGHIEEIEAAAHHIWRPNPHYHMV
WGAVFLSKFDRAKCYTKAAYQFRCDDVWIRNPKYQQYIYAISRPKCMKSLFLMSTVANVH
GCIHMMKTHWMSENWWCWMYPPFHSKKPAWFQGMIQHSRVCAGFSEVVAVGISSHRQGTT
WATSSGIDVVGPCDDATQIGQMCHFAFVGQRFNDWPKFKSMRCKPKTSGAPQMHNFWCKS
RPHNPQHCTIGRNPSYWHYPGLTYIIPILKLAKVQQQDYCIRYTFMCMYVFTMCAFFDHS
IAIYQDANPNWPYYRLVCAPTTLQEYGRKKSKTSATQTKYFFTSCGSCSHPGEGLRNKEN
WEEVGNYIDVVGHRIPVPRCNVAEHHGQPTMSYACNATDSQMTPWHAYKPTQADEQDRTN
IWCLKAADNWEPEVGEKALEQYPSGLLKKFFVTHQGVIGKIVDPMLNGDEYLPWFQYSDN
DERLKKVQYSKLVNDPQCWLWAVDFSKQPLQLWDHSMDLYLEAEFRSFWHGCNIEPTMNL
TFCYMDIRIGHIVCGLYNNCIFMQTTPQYTYMFTIMNILLQHLISVASEQDGNWYYEDVC
ASAYLQRFARANRDWWTIWHLCNRVRHYLPIVRQRMKIWDYAWMNTDIHELPLVNRWVIR
MVCAWKNTVADREISAFQEQPERQYAKTHATPEHLKMRHDYMHLGPPWTDTYAWIFIEKD
TCKQSMGCGCCWFWCYCAAEVCWAFCEWFNDAWEIAMATFNKEMTYIAHVHTMRIYLWKM
PMETGTSMQSRNTIDDSIQVNDWRNGQLHMLPDFHPFMLAEAIWRWSLAANRLRAFAFWH
SFFGLQWQRTPFEGICFVTRYEFIRSVFRGVINWHRQEAWHYCEYKGNQNYDVYTCEIQF
DIEKQFCWWQMKCEFIGHQMHCCYTFIMTGFHNQIRESKITDCAIPKPCWDIYGYWVFHG
ICREDQEDPVKYTEQSNHMRKIPGDVKRSIMCTGNKAPYMIRCCNDCPGLFMCSIVWAFG
YQSYLQHVSTRGQKMMTHCTHWILCFGMFWSFQNPWPETFMVAHWEANWGPDLQPQFGFS
DYHGDSYITASLQWYYAGPWAAIWCGLHYAIFLPHPKKKFMINNWNPQHCVGPSCRWHRH
QTMLLDEKVNEHGVMPWGIVQENGYYARSFTRCLFTMEFDMCCAIISATTMRCLGCDQPE
KLAQHHQTSPQVNSELCAGRKNKPVPQHNCEKSQEPPSFWRMGRPCFPIYIRCYTCMTFQ
FPWNMTEATPQCAQDAGCHWKAGGKAWAEMYTNALHLGKLSDDFGIKESYTFCCTDWWPN
EMQNVIFRQDSRWDKPVLVHYIFWGRCIMSGGIPAQMEQGTHTPSKHHYYAWIQSNFYYQ
KWYHAPCPGQITEYQSNGSDQFCAGPACKPIMPPNCKFYIPQRWTGFAVGVSPVPNLENQ
LQCMHLTWEGTTDFQKFIHRNHTNPEMDWPTRKGDCSSIITGTMKSISQWLLRMWWMSSI
RGLTEVVQTNYRGMFNPGTSHFGKSFHPSSMITYINVSEACTHVCIKMVWRGNLLKEKMC
FTPLMCVEYCAHCEVGPIVTLWNVFYYLNPVQMFHTCNVKEEMYVFKHATAQYEQARMDA
PLKHIMDMEYHCWCCCCTQTAFRRFKDEHDCTLQLGHMVACLNAQFTMAWRKHAPFPLMA
PHKAFPLCDCWTIWYNFDHHQNVQMGFVANHGERAGCYNSWCRLMILNEGCCCFRRKHLL
YSTFELIQANTMTKFCDGLNRALFVYNKKKAALITAYPDPRNDCPIDWHRDFPKCIQQNN
QYGIAPRSLCRSRGMDNIDLCLHHFDNDNMRMKFICSHYFGRRNGINNYEWQRDQWEKCS
VTIAISFISIGSCSTLLIVSTGHFWSCEYWLRMPRGLHNMNVYCNGHHCSYGDHFDPREH
YQIIYKVMKGHHTAQFEYEQGTYWQQDGSQMVTETMHAQFTAVFADKLAYNYYITEQNLW
GRPKKSLDCSMGLPMDWFESVSHYEKHWPGAEVGAVADMEELFGGKMHLHDGMEFPPYTK
ETINLYMYIVVCRYIFWSEEWVSKYQDILSNFHDQQWNDRDDSLLVSYHRSKMWTIFTRV
GNCCLSIIVCPKHFAPSWFPALGRPHRLKTVTMNHCCNSWIIVYSQYYGVNKACMTHKKA
RCHRTRTWMDSKTYKFHYTDFGSSNLFNHHLWLAEPVIGLKSMSQGMRYEQTHRKFEIEY
DNVKNGICAYKNKKWFPYQFNLVNIIEFLKFWWRPSCWGHMQFVDKNGFMADHNHEMVPW
QNPGQNISACHFHQHTAQICFCARLIRHNNDPAYEECYHATVLMSLILICEMDVEDAHVR
WCRIRFCHNWLVMMVGGNSMAWHVWKRQATAACKQVRPNCDGENSDHAMQPYCATSYPLL
FLKEQHVQTSKAHEGEDYFRSMLAETTHTNCVGHIAMDQEQRPRDCWPTDIYFGQHDKFN
MNIQGEWCDIWPNKGIHFTPWFYLTIWCGDIWHECSMTTYQQEGAIFCKSPQGWQYNKPD
KKPDVRADVHSYDMRLWQTPPQVDFSYGAWTFTDMGFRHQDSGSPHKKTAVYWCCDLVME
WGVWDLYAHKTHVPLEFERYHCTEFKIEQRHPESLDQPGLSKKLADHLPRPRELKEMDAY
DGQGFKQHETDQLTYPVTWEENRAGVFWGLLEINMPNCVEDWTCYRCDMGIEHKKRPVYY
FEAYKYVTYSIGVPIFNKSRSAHNMTFYTYPTGCAPHHLWDHGTLGDAMMIRIPDTPDIQ
TYVDAVWTQSDRPWLPELRWANDTTQFVECGTIYCTEIWLNNLVIGLIWFQYLVDNFQRP
NSDDEEGAPSEYMWDTDTGQVTFMMDLCKTHPMARVGWKNIMWFCSEHLVTYHWLGYINF
TWFFMGCTPTKIWMGEWKDLQFWYSSVWQSEMRIKCIYPQMTKYNEPPQYCKQDDNPRAD
WWCYRYKFTLEAYRAMLVARLRTPASRYGNFRTPHWSVWWSEVMENVVGLYCDLVMNPYK
CMLFKHNCYTTHNWNKPQHTIECQDDRLFFQYNHQATTSPYAAECRAWWPHYFCAYDNMP
AKTPKMWETRGLDEEFGHHTMEKVVEEYTPLCEKVATAPRPNYSIEWGKSVHSCPNNERS
PMFDRHQEKCFHTRVVFHSVNGTNLNCPYMCQKWDHVDLKRGNGAALQERDSNWHNWNWC
WREPHTLCSPPMKQRINAFEIEMHSEVMEGLDRYRDYDSAFETCPIDHCFDRCGQPDGIP
WEMCEPKMQCEYTRLFCIQFICHSVLHRQSVVYTDRSCNQNGLCMMRVRQWSFRNLKQFP
QVVPSLPRWMTRCVKFIYTLFVCRWHCHTDYLNVFSDECFSDGKHFWVMLKWEFWFPAAW
QQMETPCLWYLWLHGQSVFCYEHKWEGANYEERMLIWTYTKYEWHAHCKTLLECHNVSIW
SQLHHLDYIPIADWTCQMRFESTYKTHVFPQAEAPGWKPFLINMVFPPCPLYKRGQYFPF
NCGPGSPQAWLSKRKVDEGPNDAHWADSAHVSIHHHFVAMGRYDYWYKTPTHVIWYSGPK
PGCPIWKHNVFVVHNHFWAIVYWEIMTKCLRFVIYPHIDFFLTSNTGRHIMKCGPWTQVE
ENSQQYTRLAAGKTDLDCSYNTACRHHLLMDNGGATDWGNQHQQLSEQTKSGATPAADAI
AKHFVKLNIVMPCNFWFCWVVDDFLNESIMILYYPRFTNHSNVDEIFRCLAHFDKSPTNH
REAMYGFAMRDHAEHWFLAFYHPNQVWDRDLMGARWNAKHSTCIVEHPMTLPNTDWDWKF
PQSVNCFNFKSAVVYDITIMNIKTKYRCKFFSTVYNMVWKQNRHSHPHECDQPYGSASKF
ETQLMRYPGKVDPGCFKCDVVQMDPPQLHPMLHCKQFINLKIMTPVMGPQPMMKVRVRPV
TWENYLCTQLCVGQKVKIVGYEMTWPGWAVKESKYRLFIVGMDDPSKFSPFGDNQECPYM
GVCVQPCLHLVHHPKLCYYSLIMRYRQNQESLKRSDNKYMPVETCIHWHTGQEPPYWVEV
FYLKFVKYGKCWVYVGFKNRSILWAKSCHWCWYMSMCLHESGASMMDMAQMWQGDDWWSH
TCAAECCTRGSDHVKFNTKHYSCDCFKARPRVSLTPEPEMSMEDNWNHNADALLTLQFNF
RKERERAYMHFCPGTWGHNKEWSRFHELKIVDVFNNYQMMSPAHIWQPMELDNQMNDGPT
TLHNPKTSLIHPTNFHARDYCVGKKLQTEHIYRENNFPPSQVVGTMSCHGMTITSCPISD
LQRQTEIKQCASQAFDKKFAVTYYFTPGLYPWPTIACKWVNVRSHRSHAGCMAHGCMNPR
MTDNIRARRESNMPDFEWKIDCRYFAAEDMFEYLPKLFFWMYQSGYSCFEVVPTCYYMDQ
GLMWDAEIGATHWKVWAMCRVHMEWDIQHNFGITPICSVMNYEAYDCPPDDECWFNRWFY
DKGAQKRCEIMTFRTRLECAWPRHTWRRMTTYYQMKRVMPTTRAGTGRYHIQYIWKCQVF
WHKFMQYDYWINNPCQNNCKVIRTAIGHARRWHDMLMTKIRRGWSVVPIRIPHPFSNYCC
ANPRIKDHGESCIYVQVKNDIFYCDARFWWAMENTTRKCPHAMNEAYQSQWYQRFHAPYM
CAETYLRVLLIHPWRVIHITKVQYNQVIGVPCSACNTFLDDVTACMVPVEWKDYAGYGRF
LFTIVPLNMIGQMIPDQVSMRVLFKQRFPIQCYKCVEFFREQFHPINQRNICIWTECCMD
TRIASDRHWLHRAYCYDPYRYDFWVDWALPQIWRNFALPGYLANFPHGDQQADQLWWETC
WSARPWNCYLSFLDHEVHPNHDVLLQYKCETMKTHSPDVYLRIDPLEWFWHNENPDRPGM
GPDVPFYIIVDQGDDLWCPSWFHWEVNFVYAHKAPEIMTCWDREFMESRENTHYNHLTYE
HNGSIRARHCYWLRHDFKLREFGLAPQIYQFREMDKMYLFHPKYCREDDEPNKACGKKGH
CCIAYHNGFCIMYYSNGRNLICEMCWPCTTSTDMRPWICLTQKFEQGCARALLAMNNIDV
HYSESRNGCGQRHHTMRPNQQNMFMGVLGRWESGCVNVQYSVNRAPQWVFSLKNRHDFCL
GMTMQPEAIHTVKGENNKQHRKMCQIEKMPSLAHGHNDCHLHFWVDCGDAALGHMWNGMC
YESKEPQLFRMILNNQLVQGWNCTDAPAITAQKRQQLLYVKWMPPEMFISCRSMTYPIDT
AIICPGFGYFRRTDLKDKDMKFPFDYMFCIGCGNMHSTSHMVCNVWFVGYCKDIDKTNDP
PTADRQRKDRKAFSILMHRMFGSLAQFYVREPWFDWCEYDVRAASLFWREFCAARELWYN
HNPYIYEVTGHEYGNWHIQMSYKCKQYKRITMSGRWIVKCLKALIAAPYIVGVVSMLGHG
DYRECNNEDYNATETQGVSEPLMTNGTPYSAEQGYREFCPYSAWEIIANSQWHRRSYHHA
CFKGNFWYCIMINEHKGHMYCYSVGQAVQANTRFHQNWTCVQNVIGHNQTRHWEMVVGMT
LTWEFMCSEAEIVTKIMNETDMCDTDVLPNWAFGRRMWTLVQVLETFFCQMYKGICKDEL
PACIKWKLPPYTRKSWGHNCPSADYHVWYPTIDDKKQQLDMNTPIDLCTPWNDSLAPEEH
MRHTRDKDSLRKETYQLEYRGLATFKWFGKCCMMLSFNASGHHSMKWHRCKTGEHKVQFQ
HPDMYKPFTIVCECRILLGDGDCQCPKQDRVMSIWHYIVTYNECMSNAGEMNAYQHWTEV
NVVWRLFRFPLPPYGIVPMMIYRVFPHMGGAREFKYEVTKKMHHRYQGDSVPKTPTADFV
PRFNLWAVYRRICTHASRPTYYIQGMNPHMHVHEDVYWGPSLFEIWHDPKYKLCMACGHR
AGNHGNTQERFAALIKQVNEVPWHNKACTMLAKPRIGTTLPAYLVQHVFMNVIHCILENV
SDLWATGSLHRIMRCGAVWKSLRSYQYNFLAGHVNHPSTVGEQWKCAHFIFCVWWGFAFH
EYLQQLNCVKPGYLTNRRSVKYMDKNNAYRKIYVRMLAQLEHIKNVGRYIAVRQQIGIAL
HPGHWICEYNMFCAILYPQALHPSDEFVYDTWNNCWHYYISCPWFPTVKHSYCALKLSQT
MYNSSCVATDMRPYEEDALMCEQHRCFKSYHFSELVLIFDLWYHTLPESQATSCNDVDVA
LAP
>Rosalind_4714
MYCDASCLFSLEGVPTCMITQIQSGCSPNRRMRITTTVDYTNLTAHTGSNMAKHQGHNAL
LCNALWIRYLYICDACPALNMKPDSIMLTCQSTNGRPVFSMYHMVWGIWYLMRFDWKHDW
KFVQTKMTISDCTIVWTSFERLTGACRIDNVHRGSPVMGQFKEDQMYGGYYVRCCEMNVV
LQPGDAAVNSNNQGINDSLYASSVCAMYNMPVPFIIVSARDHEIRLPRHYPAKRGGRPGM
GCECCQDQMDYYDKACDICTKPHPVKHCVITKDGLFWEMPMPTWLIDEFGAPFITLDWRA
RRKPANPQVCKLAAQCSFRYHHLYQCRHYHWTKYPHKKKDWKNPPWVRNIWCRSKNHTQS
FREKWLQHYRPTAIPHSCHKGIVSQDKADARMWSQKSTQTERNSQRQYRVVGRSDQGCIF
SDHWHRFWGVAMLDVLMHVATMDSWPFMLYRPHHFKMIGIWRRCGGFSKMVYRTIEIHRY
AMMHDLGHICMIRGSECMYWDDCQSFLDSSWYYVDWFWLDETKKTENTWQRAGPCTTPRP
GMSSWIDWDQAWMCDMTGGIEHFYCDLLFWRIRWYYVREMTPHMERYITGYAFAHEEGFH
LQWAGFVVIDEKKQMHQCWWNEKDIRISVTTRSDQEPWSDLCNHICKVYLDASLVWFNFE
ELKIPPQVVGHDACVKLACDLVTDRRQGHLAEWRHPGFCPQAITKGNRHFLHHNANHDHI
GYDTNRLMVVAIGAMPDCWMADDTKTCPVSAEKTTPDSYRQHKKEMDEACVLTWWCYNWK
LKPVPSEIMVQPFCEQYPCHHRMINLNTIKKKCRLDCMMHIKTTICTEAASYGSVKWMVT
AFHSNNTPPCFCTLQDSWVKKPFYMLELSHVANTHPFSDHVAASYCHARTFTDIVEHSWG
HNKDYMHNYEETHHNGKHVGQAHESNEMSAEVPLITVIFFDEQCHIHGYDKNWAAKIMHT
GFYGSVQLMHGRFAHYSNKYKLVQLRGTCRNVEPPHAKFEACFRHLQCDVPPMPVYFVLD
SYYHVTISKAPPYEPMITSWCCEEIPRLVLASRHTAAFQYESRLWCIACHCRSSRKKWLA
WAGKALRRGAECCNASGCDIGGPILNGQWCTSGFSIAGGENRYFEFMGHVWYFYWEAIHF
GQMARLKKGACTSQCTEAYSVFDTIPLLHHQRWMQSIWWSEDCTRIPWGESHSDKFNSVT
LRHVPILVCIGQGFAWMECIERDKLYYPKETHWGDDHSKFQGGWHLEIKGQYLEIQKWDV
SPQSNSIRKQDASWGRWSTVGYVQCHMMRMSQHWCNNGECRVPELAQCRWQMSMWHDANA
DYCRDPVHYVAVFPKFSSMYCMLNITCVGSYNVEIIQFKRKCPHEFRSKFIMCWHKWKDP
PKEWWLMWRQTYGVLACADDKKYMTYSVCMNAWTDECPSGKVWHFYWTQHEQEKIWSCGA
EFRDFTIHYNLTKPFWQIDGWEPVEKAPIDPQYGEIGYDDDNIRPKLPHHLRQIPFGIMY
DEVYPHREDARDQWAPPEGRNKTLMHNRKNHQAYFQCTHIKENAGWEMRRFWYELCTTEH
FYHTCQMLTLSHTCYFSQCMYVEGKVPHHIDHIKFAEPIDHTRCERYAPCYCFIWALWTD
GGYTFNENFIYEMSHLRYITGLLMESCITEDRLAEFCSKDPTANFAYKGHAPDHERECTA
GFTYRIPVWVKHYAIFFMFQTPQQPVIISPFYYTNLKEAFGTLIISQSGYHSNHCVFCKM
SMYNFNSRRWPTCLDKFNPTIEPGRDTSMGELAPVCRWMPDEWFWSLNSIASPNDVRQFL
TPVQNPTEAKCGIEGCFTHNVNEKQHFCYCSCEANEYNTWATIGWYSPSLQDGFMSIDDQ
GVLVGMNHAVCHVRMAYCNLWNTAKTEDYDQFEIMHPNQQGPFITQAIMTCETCNCKFQA
TLDSHKFKVMGWPWKEKACIMTFYENDMAQDSCGYAEYRMEQFIPLMEMSAPQPACSVVR
AVGFHSVDSVTVTRPMVMMCNSYMRPCYPLNPPTNHKREKTMDAIITIMCLDVILCYACD
DQECIYMIAYGRWKQDMFMFDSMCRQYRHVGLFAGFQVYSMLYFIPVEYSEFEWENAEAP
CHLASKFTMDYAIKGDQIGKNDWEHQRQYEIFAENENHMMYRLVTLMPSHKPDSDKQETG
IILRNMISYLYMRLKGCHCHSDWRLIGQCTDFNAREKPCRFPVPMCMVGIETMQGDIRCT
FTNTFCGRPELQCCQYIKNLCDSVSMQHSRRQKSQTACKYTLHQGYHLEDLYFNGWDHGF
GHMQVYNEQWSHEEVHDASVFGSIPYVVMQSYATYTLDVPMLGWDHAFCWIQMEIAWLIG
KFRGSGDRAVLDWHPHRYQRTEWIIFCYDFFKYKATIHYMEKCKWCNIFHPWKYTMIYCN
MMGIMRINQKNNHESWKKAGLMMWVNSSRVYYNQGDIVFNPWNFIQVGCFMWVIGGWAKD
ALQTQQVQRLCYCNYFTDISWEQLFEAAYMERAGYEYWNFTKAPQWQPSTIQTWDCQAPS
TDNPIPKALHSEHHIQRYPYLVCRGLNLCATNGVDEQAISFPFMGPWHCPGHIEEIEAAA
HHIWRPNMTWGANFFSKFDRAKKYKKMQYQFRCDWVWIRYQQYTYSRPKCAYGFTSDLNL
ENSLKHCVRKMLMFGSNQYIRTKANTNPSKWNGTDIGYRKTHWMSENWWSKKPAWFQGQI
QDSRNCAVAVGISSHRQGTVEILPCATWPATTSGQRRPMHVIEDVVGFIGQFQYSLNTHF
AFVGNDEMIWHTQREHAHGWHGNEWPKEHDRPSEVKSMRCKPDTSKDLLGEFQCESRPHN
PQIEKEGESEFVTIVRIPISRNPSYWHYPFGNPILKLAKVQQQDYLICMYVFTMCAFFDQ
DYVWTYHRDIAIYQDANPNWPQYRLVCAPTTYQEYQRKKSKRTSCGSCSMPGEGLRNKCN
WEEVGNIIIEKIDVVGHRIPVPRCNVAIRTPVHGQGMHWFESMSYLCNETDSQMTPLNIR
WHAYKHPGRYGIEYTQADEQDRTNIWCLKAADNWEPQYPSGLLLKWFVTHQIVDPMLNGQ
YSDNDERKKKVQYSKLVNDPQCFQYVNWWLWAVDFSKQPLQPELWDHSMDLYLEAWFAKT
FCMNLTFCYIVCGLYNWCIFMQTTPQYTYMVTHDFFIKSSYDQRGENILLQHLISWKQNH
MNECYESVCAPWMTWIDIIPLQREQFKHGYARANRVWWTIWEEGETFSTYSCNRVRFKYH
PIVRQRAWVNTDIHGLPLVNRWVIRMVCAWENTIAGLISSFQFNSYNHNSRQPERQGAKT
KADAERKMRHDYLHLEEHGWFMTGDTIEKPSKKPEAKRGCCWFWCYYVDMDPKAHDQLVC
AAFCEVQDYSEWFWEIAFLPATFPKEIYRLGEWEKNSMQWRNTIDESIQYTIKPCCINDC
RNGQLHMLPDFHDDMLMDDEAIWRWSLYDNRLRAFHNHGWSRHDFWHSFFGLKWQTPFSG
ICFVTRYEGVITWHRQKGNQNYDVYTCEIQFDIEKHAKWEMKCLFHCCYTFIMTGFHNQI
RESKITDIRWPSPKQAIPEVPCWDIYYEIMVRETAYWVHVTGDTDPTGNTAMQEDPVKET
IPGDVKRSIMCTGNKCLFRISFNPPGLFMCSIVWAFGYQSYLQSDVSTRGTCTFWSFQNP
WPETFMVAWGGPVLQPQFEFSDYHGDSYYTQWYYAGPWCALWCGLHYAIFLPHPKKKFMH
VISRGHQGPSCRWHPIPSHYSWTMLLDEKKNEHGVMPWGIKQETGWMFYARSFRRELFTD
FKCMTCCAIISDTHVSILFNWMRCLGCDQPRKLAAHHQTMPQYNSELCAGRKNKPVPQHN
CLLWGHQLCGKSVHPEPPSFWRMGRPCFWAQEHIYIRNGHYTTWNPAFYTTFQFPWENQY
QRMTEATPHCAQDAGLHWKAGGKRFGANALHLGWVPKLAKLSDDKTNKHMPGGIKESYTF
YCTDHTNGGHTQQDSRLFDKPVLVHYLFWGVYCWEEFCICRTGGIVNVKNPKDDAQMEQD
QGVAYKTHTPSKHHYYACIQSNFYYQKWYAAPCPGQITEYGSDQFCYGPEGTRYCKPIMP
PNCTGFKVGVSPVNQLQCMHLTWKGTHRNHPNPRWQRNDWETRKGDCSSIITGTMKPING
HAIISQWLLRMWWKGWSSIYEWEAPQPALLRENFWVVKAPCTNYRGQDYTQSFCPGCCHT
HFGKSFHRRDSSMITYINVWCYIRSEACTHVTEMKFQQIWRGNLSKWECKDTKMCFTPLA
HCEVGWRIRVTCDAWEAQYEQARMDKPLKHIMDMEYHCWTCCTQTAVRLQLGHMVACLNA
SFTMLWRHKLETCMNCSVMAPHKAFPLCDSWTIWRNFDHHQNDQKGFVANHGERATWCVN
ATQCCNESDCDRLNCRLMILNEGCCCFHRKHLLYSTFELIFCDYLNRALFVYNKKKMGFK
WYALITAYPHGPRNDCPIDWYGMTGQNAIAPRSLCRSRGMDNIDLCCRHAFDNDSLQMRM
HGRSSMVAYIQGPRRNGINNYEWQRDLWEIAISFISIGSCSTLLWRPHYMVSTWHFWSCE
YWLRMPRGMNVYCKGHHCSYIDHFDGPREHYQRTWAYIYGHVHKGHHTAQFEYEQGTYWI
QDFYMERHMSQMVTETMHAQKIVISLDVLAYNYNIWEVPFGNIQPHTWGRPKKSLDCSMG
LPMDWFESVSHYEKGYKIPWWKHADMIELFGGKMALQDGMEFPPYTKETINLYRLWLYIV
EYVSKYQIILSNFHDQQWNDRDDSYNRSYMSWTGWNEFSGNCCLSIIVCPKHEDPSWEPA
LGRVHRLWTMYARIHHCTCWYIPANHCCNSWIIVYSQYTHKKARPHRTRTWMFSKTYKFH
YSSNLFNHCLWLAEPVIGHLKSMSQGMRYEQTHRKCLEIENDIMCENEVKNEICAWKNKK
YFYYQFMTMIEFLKFWWRPSCWKALHMQFVDRYGFMADHNHEMVSWQNMFSYVFAVQQDS
TEWYHWSACHFHQHIAQICFRHNNDPAYEECYHAMVLMSLILICEMDVEDAHVGQKMAVV
HKCFIRFCHNNSMAWHVNVTRGKKRQATAACKYNCDGENCHAMTHWKGVPLFVLCKEQHV
QTSKAHHFEDRFSSMLAEDTMTNCVAMDQEQRKRDCWPMDIYFGQHDKFNMNIQGIRCDI
WPNKYLTIWCGRIWHECSGWADFSRDHTTYQQEGAIFAWHDYTQYNKPDKKPDVRADVHY
YPVNIDHYDMSKDIQLTQTFSYGAWTFTDMAPHKKLAVYVCVWTLYAHKTHVPLEFERYV
LPQPWIEQDWVQHPESLDKLADHLLNDPRYRPRELKMMFKMDAYDGQGFALPKEIESQHE
TDQLTYPVTWEENRAGVFWGLLWGNMNYFLTHEENCVIDWTCYNCDMGIMHKKIPVYYFE
AYKYVTYSIPVPILNNKAHNMTDWYTYPTGCAPHHLWDHMTLGDAMHIRAPDTPMRCIYV
DPVWTQWQEYQEKDRPWLPELRWANDTMQFKTRRLKLEGTIYCTEIWLNNLVIGVNIWSP
EGPWMPAECRHRENYCMAPSEYMWTGQVTFMMDLCKTHPMAVDWANFKCTLSRVTDCGLK
YPRYINTTWFFMGCTPTKIWMWWWKDLQFGFSNMVGYSSVWQSEMRICCIYPQMTKYNEP
PQYCIQDTNPRAYRTKFTLEAYRAMLVARFYHLVIASRYGNFSEVRELVVGLYCDLVMNP
YKINVNKPVHTIAFAKFVGNRQFQYKFHQATTQAEYPYAAECRAWWPHYTRAPAKTPTYE
MLDEEFMHHTMHKVVEEYTPLCEKVATMPRPNSSIDWVHSCPFDRHYHSVNGTNLNCIYM
CQFLFMPDKWDDLKANGMWGWWGAALEEDPHTLFDQYDGTSPPQKVGQVKQRWNSKCACY
NARMRCDFIEYVTLWGEVMEFHWQCELDRYRKYDSAFETCPIDHRCGQPDGIPWEMKTAE
PKMQCEYTFWMAKYLSVLHRQDRSQCMWVPQGFNQCGFGLCAMRVRQWKKNRNLYDWHFP
VIVPSLRWGTRIYTVFVCRWHFSSDGKHFDVMLPAAWQQMFTPCLWYVFCYEMVRNFWWE
GACYELRIWTYTKWLFPWMMSHYAHCKTLHECHNVSIWSMTGGIIAVSLRIYVWAGKALW
TCRLNCLWCMRFIPTYKTHVFPGVHIRQAEAPGNSPFLINMVFPPNPCYIRLNIETVIRG
QYCPFQCGPGSPQWLSDWPNNIAPLLGKRKVDEGPNDAHWADSQWYQDCHLMVSIHHHSK
FPEKHVAMGRYDYWYKTPWYSSPKPGCNVFVNWAIVYPELDRNMTDMHWCLRFIKRNMKI
YPHTSNTGRQIMKCGPETQVEENSQSMYTRLAAGQTDLDCSYNTACRHHLLMDNGGATDW
GNQHQQLSEQTKSGQCTSRYAHPAATIDLRRWTSAIAKHFVGLNFVMTVPTCFWPRFWFC
WVVDDFLDVGGEQMILYYPRFTNHSNVDEIFICLPQGTCHDKSTTNHREAMYKVAMRDHA
AHWFLAFTHPMQVWDYYQWVDRMGARWNAKHSTCIVIHPMTWDSTEKFPQSVNCKYHNFK
AWHGCAGIHMGISTKYRCKHFSTVYNMVWKQNRHSHPHETDQPYGCASKFETQLMNYPGK
PKMPWTPFKMDPCKQFINLKIMTPVMRCEPQPPENYENRLCVRIKGYEMTWPGWAVAESK
YRPFIDDPSKPFGDNFNRDPQKSREMGVCVFPCLHGVESPPWRRHPKLCMMSLIMRYRQN
CESLKRSDNKYEPVETCKHWHTGQEPPYWVFYLKFVKYCKNRSIWFANLMPWAKSFWIQG
QGWRHWCQYMSMCLHGASMMAMAQMWGGDNWWSHAEGQNPACTRGDDHVKFNTKHYSCYA
IGCFKLRHLGDRKTTYLLPVTEPVEELPEPEMCSFPHSCNWMHNNRTTFERLVMKNHFCP
GTWGHNKEWSMFHEDMTLTKVWVFNNYWMMSCLKPANAHIWQDNQMNDYSETKWPMTHTG
EEEHNPKTSYQCGKGNVHPTNFHARDYCVGKKLQTEHIERENNFPPSQWVGTMQCHCYCT
SIIYTKISPRVIEMSDLQRQTEIKQCASQAIKKFWSVDDKKFAVTRAKHDIFYATPGWYS
WPTIGCKCVCVVSHRSHAGCMAHGCMWRMQPNIRNKCMTAFSRRESNMPNFEWKIDCRYR
KAAEDMHQGCENRFYEYLQPGYSCFEVVPTCYYMDQGLMWDAEIGAWIHWKVWAMCRVHM
EWDDQHNFGITPFHEAYDCPPDDHPSSEKNMNRWRCFPGAGMTFRTTWRHIWGPPYYYLG
QYQWKCQVFWIKFMQYWYNNCHVIRTAIGHARREHDMLMTKIRRGWSKSVMPIRIPHPFS
NYQNFMSGQVGESCIYVQVKNKLFYEGWDFWWQSNSHTKWAMENTNKSNYACVRHAMNQR
FHAPYMRKTMVFENYLRVLLIHPWRVIHIQGKVQMPQTWPDNGYKCVQPQTMQHVTSLIG
VPCSACNTDVIAMHPVEWKDYAGYGRFHMTIVPLPEDWGKEIWDMRVLFKQRFPIQCYTL
FRNICIWTECCMDTRIAQDRDKPWHLQWLHRAYCYDWYRNANDRADFWVQIWRQGEMAHM
GMYNPPLVVQADLQLPMIPAWPWNISFLTHEVHPNHRVEKIAMKTHYPDVYLRIDPLEFQ
FGHYYSPSNENPDRPGMGPDVPFYIIVDYGDDLDCWVSFRFHWEVNFCYAHKAPEGMEPQ
LEEPLCWCGFFQEVDEWLIGMRHSHANHTPQICLHSTYEHNGTSTMTFMHIRARHIYWLR
HDFKFYPWREFGVDYWRIAMQIYQFRFHPKKGISARSKLCGVKGVCCILYHNGFCIMYYC
EMCWGCTTVTDERPWCLTHKFELYAGHRNNDAHAVLPNSDQMLYMMNNIDVHYSESRNGC
DFYHCTHSVTMRPNQQNMGTQCCGVLGRWESGCVNVQYSVHQCRAPQWVFSLKNRHDLPN
YCMGVVACMTVTQGIGTDAMPLQVSFHKHHRFQSTQMCQIEKMVEFCRSMGNPHGHNDCH
LHFWVSFRARVTAICGDAALGHGWNGMCYESKEPCHFRMNQYWCVQFKQDVQGLNCTDAP
AITAQKRQQLLYVKWRPPEMFSYTYPIDIINDRIGPLPPGFGYFRRTDLKDKDMKFPFDY
MFCIDQCGCCRQHMHSTSHMHCNVWAKHWDKYVGYCKDIDQQRLETMYQGADRKASRHCK
FHILMHRTFGELAQFYVRGAPWWCEYDVRLFWRERMNPGFCANRELKMTWGDYNKNPYAG
DEKTGHEYGNWHIQSYLQEMRFSSNCFQYKRIMSGRNRQCILIVVSALGHGDYRNCNNED
YNATEKQGVWEFGPGVHTMTPYSAPQGYREFDLYSAWEIIINSMWHERSYHSSACSYYEE
HSQQWGNFSAGNYCIMINEGKGHMYCYSVGGAVQANTRFHQMWGCVQNVIGHNWNQRCGM
TLTSKANRLFMCSEREIVTKIMTRVWVFKPNWAFMRRMWTLVQVLETFFCGICKDFLPAS
FWMFSWKLPYWPHNCWSADYHVWYPTIDDKGSDCMMTFLMPTGPGWNDHMRICALKDETR
DKDLMYELDTGQRYRGLATFKWFGKCPHEKSLILSFNASGHHSMKWHPCKTGEHGYPDGN
EQFQHPDMGKPVRASDDIWKRSHQHCGECRILLGDQHVNYDCQCPKIDATVKVMSIMSYI
VTYNEIWVFEADSSNAWEDWDFFEMNAYQHWTEVNVVWTRFDVLAWIEFRFPYPPYGIHP
MMIMTTAGAREFKYQLLSGQSARVEKMHWRYQGDSVPKTPTCDFVPRFNCEAVYRNICTH
ASRPTYYIQGMNKHMHLNPHHLAYVQEDVYWTPAGYQSHDPKYLLCMACGHRAGNHGNTQ
ERFAALIQQVNEVPWHNKACCMSAKPRIGTTLQTVFMNVIHCIVAWQHASAMENVSDQVA
TGSLHRIMRCRIYMDKYQHNILAGHVNHPSTVGEQWWCAHFIFKVWWGFAFHEYLQQLNF
NRVVLGYLTRRSLWQNMHKPFPRPEGWAHSFVWQVWRIVRMLAQLIKNVGRYIAAATTER
QQGHWICEYNMALKPTSYICAILYPQALHPSYYICCFPVVKHSYYNSSCVATDMRPYELD
ALMCEQVCNWKLQSMRCFDTSRQWYHFSELVLCSWRQTIDFDLWYHHLIESEYDDGVMAT
SCNDVDVGLAP