We say that position $$i$$ in $$k$$-mers $$s = s_0s_1\ldots s_{k-1}$$
and $$t = t_0t_1\ldots t_{k-1}$$ is a
mismatch if $$s_i \neq t_i$$.
For example, CGAAT and CGGAC have two mismatches, at positions 2 and 4 (counting from 0).
The number of mismatches between strings $$s$$ and $$t$$ is called the Hamming distance between these strings.
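The definition above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference solution: the helper name count_mismatches is hypothetical, and raising a ValueError for strings of different lengths is an assumption, since the definition only covers two $$k$$-mers of the same length.

```python
def count_mismatches(s, t):
    """Return the number of positions at which s and t differ."""
    # Assumption: the definition only covers strings of equal length
    # (both are k-mers), so anything else is rejected here.
    if len(s) != len(t):
        raise ValueError("strings must have the same length")
    # Walk over both strings in parallel and count positions i with s_i != t_i.
    return sum(1 for a, b in zip(s, t) if a != b)


print(count_mismatches("CGAAT", "CGGAC"))  # 2
```

Pairing the characters with zip avoids explicit index bookkeeping. The length check is done up front because zip would otherwise silently stop at the end of the shorter string.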
Assignment
Write a function hamming_distance that takes two DNA strings $$s$$ and $$t$$ of equal length. The function must return an integer representing the Hamming distance between $$s$$ and $$t$$.
Example
In the following interactive session, we assume that the FASTA file data.fna is located in the current directory.
>>> hamming_distance('GGGCCGTTGGT', 'GGACCGTTGAC')
3
>>> hamming_distance('AAAA', 'TTTT')
4
>>> hamming_distance('ACGTACGT', 'TACGTACG')
8
>>> hamming_distance('ACGTACGT', 'CCCCCCCC')
6
>>> from Bio import SeqIO
>>> hamming_distance(*SeqIO.parse('data.fna', 'fasta'))
859