A dot plot is one of the oldest graphical representations for comparing two biological sequences. Regions of the two sequences that closely resemble each other show up in a dot plot as diagonals.

Dot plot in which the entire genome sequence of Lactobacillus acidophilus is compared with that of the related organism L. bulgaricus. Positions that show a strong resemblance are indicated with a dark dot. The fact that the dot plot roughly follows a diagonal suggests that both species stem from a recent common ancestor.

Dot plots are built as two-dimensional matrices whose rows correspond to the consecutive windows of the first sequence and whose columns correspond to the consecutive windows of the second sequence. In this context, a window is a contiguous region within a sequence. In the simplest form, the windows are the individual residues (letters) of a sequence, but more generally a window can consist of $$n$$ consecutive residues. A cell of the matrix is colored black (represented as the Boolean value True) if the corresponding window of the first sequence sufficiently resembles the corresponding window of the second sequence. Otherwise, the cell stays white (represented as the Boolean value False).
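As a minimal sketch of the simplest case described above, where every window is a single residue, the matrix can be built with a nested list comprehension. The function name `residue_dot_matrix` and the text rendering with `#` and `.` are illustrative choices, not part of the assignment.

```python
def residue_dot_matrix(seq1, seq2):
    """Build a Boolean dot plot matrix with single-residue windows.

    Rows follow the residues of seq1, columns follow the residues of seq2.
    A cell is True when the two residues are identical.
    """
    return [[res1 == res2 for res2 in seq2] for res1 in seq1]


if __name__ == '__main__':
    matrix = residue_dot_matrix('ATCCTC', 'ATTCTCG')
    # Print a rough text rendering: '#' for a black cell, '.' for a white one.
    for row in matrix:
        print(''.join('#' if cell else '.' for cell in row))
```

Treating "enough resemblance" as exact equality only makes sense for windows of length one; longer windows need a threshold on the number of matching positions.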

Assignment

Define a class Dotplot that can be used to make dot plots for two given biological sequences. Biological sequences are represented as strings that consist only of letters of the alphabet (representing the individual residues). Positions within these sequences are indexed from zero. Objects of the class Dotplot must have the following methods:

Example



>>> dotplot = Dotplot('ATCCTC', 'ATTCTCG')

>>> dotplot.windows(start1=1, start2=4, length=3)
('TCC', 'TCG')
>>> dotplot.windows(start1=1, start2=4, length=-3)
Traceback (most recent call last):
AssertionError: invalid window size
>>> dotplot.windows(start1=1, start2=5, length=3)
Traceback (most recent call last):
AssertionError: invalid start position

>>> dotplot.equal(start1=1, start2=4, length=3)
True
>>> dotplot.equal(start1=1, start2=4, length=3, number=2)
True
>>> dotplot.equal(start1=1, start2=4, length=3, number=3)
False

>>> dotplot.plot(length=1, step=1, number=1)
[[True, False, False, False, False, False, False], [False, True, True, False, True, False, False], [False, False, False, True, False, True, False], [False, False, False, True, False, True, False], [False, True, True, False, True, False, False], [False, False, False, True, False, True, False]]
>>> dotplot.plot(length=3, step=1, number=1)
[[True, True, False, True, False], [False, True, True, True, True], [True, False, True, True, True], [True, True, False, True, False]]
>>> dotplot.plot(length=3, step=1, number=2)
[[True, True, False, True, False], [False, True, True, False, True], [False, False, True, False, False], [False, True, False, True, False]]
>>> dotplot.plot(length=3, step=1, number=3)
[[False, False, False, False, False], [False, False, False, False, False], [False, False, False, False, False], [False, False, False, True, False]]
>>> dotplot.plot(length=2)
[[True, False, False], [False, True, True], [False, True, True]]
>>> dotplot.plot(length=2, number=2)
[[True, False, False], [False, False, False], [False, True, True]]
>>> dotplot.plot(length=3, number=2)
[[True, True], [False, True]]
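
The behavior in the session above can be reproduced by the following sketch. It is one possible reading, not a reference solution, and several details are assumptions inferred from the outputs alone: two windows count as similar when at least `number` positions hold identical residues, `number` defaults to 1, `step` defaults to the window length, and `length` defaults to 1. The order in which the two assertions are checked is also a guess.

```python
class Dotplot:
    """Sketch of a dot plot builder; defaults are inferred from the example session."""

    def __init__(self, seq1, seq2):
        self.seq1 = seq1
        self.seq2 = seq2

    def windows(self, start1, start2, length):
        """Return the windows of the given length starting at start1 and start2."""
        assert length > 0, 'invalid window size'
        # Each window must fit entirely within its own sequence.
        assert (
            0 <= start1 <= len(self.seq1) - length
            and 0 <= start2 <= len(self.seq2) - length
        ), 'invalid start position'
        return (
            self.seq1[start1:start1 + length],
            self.seq2[start2:start2 + length],
        )

    def equal(self, start1, start2, length, number=1):
        """Check whether the two windows match in at least `number` positions."""
        window1, window2 = self.windows(start1, start2, length)
        matches = sum(res1 == res2 for res1, res2 in zip(window1, window2))
        return matches >= number

    def plot(self, length=1, step=None, number=1):
        """Return the dot plot as a list of rows of Boolean values."""
        if step is None:
            # Assumption: without an explicit step, windows do not overlap.
            step = length
        rows = range(0, len(self.seq1) - length + 1, step)
        cols = range(0, len(self.seq2) - length + 1, step)
        return [
            [self.equal(row, col, length, number) for col in cols]
            for row in rows
        ]
```

Raising `number` toward `length` makes the comparison stricter, which is why the plot for `length=3, step=1, number=3` keeps only the single exact match between `CTC` and `CTC`.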