Luciferase is a generic term for a class of enzymes that produce bioluminescence in nature. The name is derived from the Latin word lucifer, meaning "lightbearer", which in turn is derived from the Latin words for "light" (lux) and "to bring or carry" (ferre). One well-known example is a luciferase of the Photinini firefly Photinus pyralis.

Bioluminescence by luciferase in a firefly.

In luminescent reactions, light is produced by the reduction-oxidation of a luciferin (a pigment) and adenosine triphosphate (ATP). The reaction between luciferin and oxygen is extremely slow unless it is catalyzed by luciferase, often facilitated by the presence of calcium ions (similar to muscle contraction). The reaction proceeds in two steps:

luciferin + ATP → luciferyl adenylate + PPi

luciferyl adenylate + O2 → oxyluciferin + AMP + light

Light is produced because the reaction forms oxyluciferin in an electronically excited state. The reaction releases a photon of light as oxyluciferin goes back to the ground state. This reaction is very efficient: almost all energy is converted into light. By comparison, an incandescent lamp loses nearly 90% of its power in the form of heat.

Luciferins and luciferases are produced in different forms by different species. In addition to fireflies, some species of fungi (Omphalotus olearius), insects, fish, molluscs and algae such as Noctiluca scintillans (milky seas effect) are known to produce light by luciferase. Some species even have multiple luciferases that can produce different colors of light using the same luciferin.

But the most diabolical of all must be Oceanobacillus caeni: a Gram-positive, rod-shaped, spore-forming bacterium that was isolated from the activated sludge of a wastewater treatment system in South Korea. Its luciferase (KPH78743) contains the peptide ASPGLUVALILELEU:

MKLSILDQSP ISKGKTPKDA LEASIELAKL TDELGYHRYW VAEHHDLGGL ASPAPDILLG IIGSQTEQIR
IGSGAVLLPN YSPYHIAERY NELATLYPNR VDLGLGRAPG GSAEVSIALA GNFLEKVRMY PKLVDEVILF
LHQDFPSDHM YAKVSATPVP KTPPVPWLLG TSNKSAKLAI EKRLPFVFGH FMSNEDGPSI VKEYMKNVLN
GKSNVIVTVS AICAETTEEA EEIAMSNYLW KILQDKGEGK EGVPSIEEAK AYPYSLEEKE RIERMKQNQI
VGNPSQVREQ LENLQSEYEV DELMIVTITH SYEARKKSYQ LLAEEFCLA
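
The peptide ASPGLUVALILELEU is written in three-letter amino acid codes, whereas the sequence above uses one-letter codes. The following minimal sketch translates between the two using the standard code table. The names THREE_TO_ONE and to_one_letter are hypothetical helpers that are not part of the assignment, and the fragment is the last block of ten letters on the second line of the sequence.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    'ALA': 'A', 'ARG': 'R', 'ASN': 'N', 'ASP': 'D', 'CYS': 'C',
    'GLN': 'Q', 'GLU': 'E', 'GLY': 'G', 'HIS': 'H', 'ILE': 'I',
    'LEU': 'L', 'LYS': 'K', 'MET': 'M', 'PHE': 'F', 'PRO': 'P',
    'SER': 'S', 'THR': 'T', 'TRP': 'W', 'TYR': 'Y', 'VAL': 'V',
}


def to_one_letter(peptide):
    """Translate concatenated three-letter codes into one-letter codes."""
    # Cut the string into chunks of three characters.
    codes = [peptide[i:i + 3] for i in range(0, len(peptide), 3)]
    return ''.join(THREE_TO_ONE[code.upper()] for code in codes)


# Last block of ten letters on the second line of the sequence above.
fragment = 'PKLVDEVILF'
motif = to_one_letter('ASPGLUVALILELEU')
print(motif, fragment.find(motif))
```

The printed index is the zero-based position of the peptide within the fragment only, not within the full protein.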

Assignment

In biology, a sequence motif is a peptide (a part of a protein) that is widespread in many different proteins. The repetitive nature usually relates to a biological function of a macromolecule containing the motif. The search for motifs is often complicated by the fact that small variations occur, so repetitions are not necessarily identical.

To represent all possible variations of a motif, the protein database PROSITE uses a pattern that consists of one or more units separated by dashes (-), for example [AC]-x-V-x-{ED}. Each unit describes which letters may occur at a particular position in the motif. There are four different notations for units:

- an uppercase letter represents that letter itself
- the lowercase letter x represents any uppercase letter
- uppercase letters between square brackets ([]) represent any of the letters between the brackets
- uppercase letters between curly brackets ({}) represent any uppercase letter except those between the brackets

The same uppercase letter may occur multiple times between a pair of square or curly brackets, but this doesn't change the pattern. Since each unit corresponds to a single position in a motif, the length of a motif that matches a pattern always equals the number of units in the pattern.

For example, the five-unit pattern [DJINN]-x-V-x-{SATAN} matches five-letter motifs whose first letter is a D, I, J or N, whose second and fourth letters can be any letter, whose third letter is a V, and whose fifth letter is not an A, N, S or T. One motif that matches this pattern is DEVIL.
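
The notation described above can be sketched in Python, the language of the examples in this assignment. This is one possible approach rather than a reference solution. It assumes that a bracketed unit must contain at least one uppercase letter, which the text does not state explicitly. The function names unit and expand are taken from the Example section.

```python
from string import ascii_uppercase


def unit(u):
    """Return the sorted uppercase letters allowed by a single unit."""
    assert isinstance(u, str) and u, 'invalid pattern'
    if u == 'x':                        # x stands for any uppercase letter
        return ascii_uppercase
    if len(u) == 1:                     # a single uppercase letter stands for itself
        assert u in ascii_uppercase, 'invalid pattern'
        return u
    # Otherwise the unit must be letters wrapped in [] or {}
    # (assumption: at least one letter between the brackets).
    assert u[0] + u[-1] in ('[]', '{}') and len(u) > 2, 'invalid pattern'
    letters = set(u[1:-1])              # a set drops repeated letters
    assert letters <= set(ascii_uppercase), 'invalid pattern'
    if u[0] == '{':                     # curly brackets exclude the listed letters
        letters = set(ascii_uppercase) - letters
    return ''.join(sorted(letters))


def expand(pattern):
    """Return the list of allowed-letter strings, one per unit."""
    assert isinstance(pattern, str), 'invalid pattern'
    return [unit(u) for u in pattern.split('-')]


print(expand('[AC]-x-V-x-{ED}'))
```

Building a set from the letters between the brackets is what makes repeated letters harmless, so [DJINN] and [DJIN] expand to the same string. Note that assert statements are skipped when Python runs with the -O flag.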

Your task:

- Write a function unit that takes a single unit of a pattern. The function must return a string that contains the uppercase letters that may occur at the corresponding position, in alphabetical order and without repetition.
- Write a function expand that takes a pattern. The function must return a list that contains the string of allowed letters for each unit of the pattern, in order.
- Write a function ismotif that takes a pattern and a string. The function must return a Boolean value (bool) that indicates whether the string matches the pattern.
- Write a function motifs that takes a pattern and a sequence. The function must return a set of tuples (position, motif), one for each motif in the sequence that matches the pattern, where positions are counted from zero.

If the first argument passed to any of these functions is not a string (str) that represents a valid pattern, an AssertionError must be raised with the message invalid pattern.

Example

>>> unit('V')
'V'
>>> unit('x')
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> unit('[DJINN]')
'DIJN'
>>> unit('{SATAN}')
'BCDEFGHIJKLMOPQRUVWXYZ'
>>> unit('abc')
Traceback (most recent call last):
AssertionError: invalid pattern

>>> expand('[DJINN]-x-V-x-{SATAN}')
['DIJN', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'V', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'BCDEFGHIJKLMOPQRUVWXYZ']

>>> ismotif('[DJINN]-x-V-x-{SATAN}', 'DEVIL')
True
>>> ismotif('[DJINN]-x-V-x-{SATAN}', 'dive')
False
>>> ismotif('[DJINN]-x-V-x-{SATAN}', 'SATAN')
False

>>> motifs('[DJINN]-x-V-x-{SATAN}', 'GNFLEKVRMYPKLVDEVILFLHQDFPSDHMYAKVSATPVPKTPPVPWLLGTSNKSAKLAI')
{(14, 'DEVIL')}
>>> motifs('[DIL]-x-{AEIOU}-[AEIOU]-x-x-[ORS]', 'fanhymkkcllnpwsdetailslmmipiedqcwwffvluciferrhaqcnhgqdyytspmhinfernodkwcfiyveagp')
{(15, 'details'), (37, 'lucifer'), (61, 'inferno')}
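
The ismotif and motifs examples above suggest that matching ignores case (the lowercase sequence yields lowercase motifs) and that positions are counted from zero. The following minimal sketch shows the matching step under those assumptions, and assumes plain ASCII sequences. It starts from an already expanded pattern, copied from the expand example, and the names matches_expanded and find_expanded are hypothetical helpers, not functions required by the assignment.

```python
ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

# Expanded form of [DJINN]-x-V-x-{SATAN}, copied from the expand example.
EXPANDED = ['DIJN', ALPHABET, 'V', ALPHABET, 'BCDEFGHIJKLMOPQRUVWXYZ']


def matches_expanded(expanded, candidate):
    """Check a candidate against a list of allowed-letter strings."""
    # One unit per position, so the lengths must agree first.
    if len(candidate) != len(expanded):
        return False
    return all(
        letter in allowed
        for letter, allowed in zip(candidate.upper(), expanded)
    )


def find_expanded(expanded, sequence):
    """Return the set of (zero-based position, motif) pairs in a sequence."""
    width = len(expanded)
    return {
        (start, sequence[start:start + width])
        for start in range(len(sequence) - width + 1)
        if matches_expanded(expanded, sequence[start:start + width])
    }


sequence = 'GNFLEKVRMYPKLVDEVILFLHQDFPSDHMYAKVSATPVPKTPPVPWLLGTSNKSAKLAI'
print(matches_expanded(EXPANDED, 'DEVIL'))
print(find_expanded(EXPANDED, sequence))
```

Sliding a window one position at a time also reports overlapping motifs, and slicing the original sequence keeps each motif in the case in which it occurs.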