We are now ready to describe BWMATCHING, an algorithm that counts the total number of matches of Pattern in Text, where the only information that we are given is the last column of the Burrows-Wheeler matrix. From this column we can derive the first column and the last-to-first mapping. The pointers top and bottom are updated by the green lines in the following pseudocode.

    BWMATCHING(FirstColumn, LastColumn, Pattern, LastToFirst)
        top ← 0
        bottom ← |LastColumn| − 1
        while topbottom
            if Pattern is nonempty
                symbol ← last letter in Pattern
                remove last letter from Pattern
                if positions from top to bottom in LastColumn contain an occurrence of symbol
                    topIndex ← first position of symbol among positions from top to bottom in LastColumn
                    bottomIndex ← last position of symbol among positions from top to bottom in LastColumn
                    topLastToFirst(topIndex)
                    bottomLastToFirst(bottomIndex)

                else
                    return 0
            else
                return bottomtop + 1

Assignment

Implement the bw_matching method which takes a FASTA filename. This FASTA file contains a Burrows-Wheeler transformation of a text (containing a '$') and a couple of string patterns (without a '$').

The function should return a tuple of integers, where the i-th integer corresponds to the number of substring matches of the i-th pattern in the text.

Example

>>> bw_matching('data01.fna'1)
(0, 1, 0, 0, 0, 1, 0, 0, 0, 0)

>>> bw_matching('data02.fna'2)
(1, 0, 0, 1, 0, 0, 1, 0, 2, 0, 0, 1, 1)