Disjoint motifs

In this assignment, we will consider an algorithmic (but not particularly practical) variant of motif finding for multiple motifs. Say we have two motifs corresponding to the coding regions of genes, and we want to know whether these motifs can be found in genes occupying the same region of the genome. To prevent exons from coinciding, we further stipulate that the two motifs are non-overlapping.

In this assignment, we will ask whether two disjoint motifs can be located in a given string. We considered a similar problem in "Interleaving two motifs1", which asked us to find a shortest possible string containing two motifs. However, in that problem, the motifs were allowed to coincide.

Assignment

Given three strings $$s$$, $$t$$ and $$u$$, we say that $$t$$ and $$u$$ can be interwoven into $$s$$ if there is some substring of $$s$$ made up of $$t$$ and $$u$$ as disjoint subsequences.

For example, the strings ACAG and CCG can be interwoven into GACCACGGTT. However, they cannot be interwoven into GACCACAAAAGGTT because of the appearance of the four As in the middle of the subsequences. Similarly, even though both ACACG is a shortest common supersequence of ACAG and CCG, it is not possible to interweave these two strings into ACACGACACG because the two desired subsequences must be disjoint. See "Interleaving two motifs2" for details on finding a shortest common supersequence of two strings.

Write a function interwoven that takes three DNA strings $$s$$,  $$t$$ and $$u$$. The function must return a Boolean that indicates whether or not $$t$$ and $$u$$ can be interwoven into $$s$$

Example

In the following interactive session, we assume the FASTA file data.fna3 to be located in the current directory.

>>> interwoven('GACCACGGTT', 'ACAG', 'CCG')
True
>>> interwoven('GACCACAAAAGGTT', 'ACAG', 'CCG')
False

>>> from Bio import SeqIO
>>> interwoven(*SeqIO.parse('data.fna', 'fasta'))
False