Mind the gap

In "Global alignment with scoring matrix¹" we considered a linear gap penalty, in which each inserted/deleted symbol contributes the exact same amount to the calculation of alignment score. However, as we mentioned in "Global alignment with constant gap penalty²", a single large insertion/deletion (due to a rearrangement) is then punished very strictly, and so we proposed a constant gap penalty.

Yet large insertions occur far more rarely than small insertions and deletions. As a result, a more practical method of penalizing gaps is to use a hybrid of these two types of penalties in which we charge one constant penalty for beginning a gap and another constant penalty for every additional symbol added or deleted.

Assignment

An affine gap penalty is written as $$a + b\cdot (L−1)$$, where $$L$$ is the length of the gap, $$a$$ is a positive constant called the gap opening penalty, and $$b$$ is a positive constant called the gap extension penalty.

We can view the gap opening penalty as charging for the first gap symbol, and the gap extension penalty as charging for each subsequent symbol added to the gap.

For example, if $$a = 11$$ and $$b = 1$$, then a gap of length 1 would be penalized by 11 (for an average cost of 11 per gap symbol), whereas a gap of length 100 would have a score of 110 (for an average cost of 1.10 per gap symbol).

Consider the protein strings PRTEINS and PRTWPSEIN. If we use the BLOSUM62 scoring matrix and an affine gap penalty with $$a = 11$$ and $$b=1$$, then we obtain the following optimal alignment.

PRT---EINS
|||   |||
PRTWPSEIN-

Matched symbols contribute a total of 32 to the calculation of the alignment's score, and the gaps cost 13 and 11 respectively, yielding a total score of 8.

Your task:

Write a function globalAlignmentScore that takes two protein strings $$s$$ and $$t$$. The function must return the maximum alignment score between $$s$$ and $$t$$. To align the two given DNA strings, the function must use the BLOSUM62 scoring matrix, a gap open penalty equal to 11 and a gap extension penalty equal to 1.
Write a function globalAlignment that takes two protein strings $$s$$ and $$t$$. The function must return a tuple containing two augmented strings $$s'$$ and $$t'$$ representing an optimal alignment of $$s$$ and $$t$$. To align the two given DNA strings, the function must use the BLOSUM62 scoring matrix, a gap open penalty equal to 11 and a gap extension penalty equal to 1.

Example

In the following interactive session, we assume the FASTA file data.faa³ to be located in the current directory.

        >>> from Bio import SeqIO

>>> globalAlignmentScore('PRTEINS', 'PRTWPSEIN')
8
>>> globalAlignmentScore(*SeqIO.parse('data.faa', 'fasta'))
298

>>> globalAlignment('PRTEINS', 'PRTWPSEIN')
('PRT---EINS', 'PRTWPSEIN-')
>>> globalAlignment(*SeqIO.parse('data.faa', 'fasta'))
('EIAYEERLPG-------KYGETGMEMCVVSGNLPVERDRYCGWEQWAHTTHMIVHHVPQYEKPWEWVSINMTLQYDHHGMCQCENFPSMLYVANPPL', 'EIAYEERLPGPPPVWQHENNETGME-CHVSGNLNPQTYERIG-EQWAHTTHMIVHHVKQYEKVWEWVSINMTLFYDHTG------FPSMLYVANPPL')