The genome of an organism is the whole of hereditary infromation in a cell. This hereditary information is either coded in DNA or — for some types of viruses — in RNA. Genes are structural components of DNA and RNA that code for a polypeptide or an RNA chain that has a certain function in an organism. The image below, for example, shows the different genes that are situated on a ring-shaped mitochondrial DNA (mrDNA) of a human being. The arrows that are used in the representation of genes, indicate that they can be oriented both forward and backward (DNA is double-stranded). A gene is localized on a genome by giving the position of the first and last base of the gene. Positions are indicated with regard to the start of the genome (for ring-shaped genomes, an initial position is chosen), that gets index 1.

menselijk mtDNA
The mitochondrial genome has 16569 bases and codes for 37 genes: 13 polypeptides, 22 tRNAs and 2 ribosomal RNAs.

What is unclear in the image above, is that genes can also (partially) overlap. This is clear in the structure of the HIV virus (an RNA lentivirus that causes AIDS) that is given below. The genes are indicated as rectangles (RNA has only one string, which results in all genes being oriented forward).

HIV genoom
Structural components of the HIV-1 genome of the HXB2 stem. Genes are indicated as rectangles. The initial position of a gene — indicated with the small number in the left upper corner of each rectangle — normally indicates the position of the a in the ATG start codon for that gene, whil the small number in the right lower corner indicates the last position of the stop codon.

The density of a genome is a percentage that indicates how many positions within the genome are taken by genes (or other structural elements). The genome density can be determined by summing up the lengths of the genes and then dividing it by the length of the genome. This, however, is a naive way of working, that doesn't take the overlapping genes into account and consequentially calculates some positions multiple times. A better way consists of determining what the percentage of the positions within the genome that are situated in at least one gene. 

Assignment

Define a class Gene which can be used to represent gene objects that contains the following methods:

Also, define another class Genome which can be used to represent genome objects. Genome objects must be able to keep the position of their genes, and contain the following methods: 

Example

>>> gene1 = Gene(3309, 4264)
>>> len(gene1)
956
>>> gene1
Gene(3309, 4264)
>>> print(gene1)
3309..4264

>>> gene2 = Gene(14675, 14151)
>>> len(gene2)
525
>>> gene2
Gene(14675, 14151)
>>> print(gene2)
complement(14151..14675)

>>> hiv = Genome(9719)
>>> len(hiv)
9719
>>> hiv.addGene(Gene(1, 634))
>>> hiv.addGene(Gene(790, 2292))
>>> hiv.addGene(Gene(2085, 5096))
>>> hiv.addGene(Gene(5041, 5619))
>>> hiv.addGene(Gene(5559, 5850))
>>> hiv.addGene(Gene(5831, 6045))
>>> hiv.addGene(Gene(5970, 6045))
>>> hiv.addGene(Gene(6062, 6310))
>>> hiv.addGene(Gene(6225, 8795))
>>> hiv.addGene(Gene(8379, 8424))
>>> hiv.addGene(Gene(8379, 8653))
>>> hiv.addGene(Gene(8797, 9417))
>>> hiv.addGene(Gene(9086, 9719))
>>> hiv.density()
98.2302706039716
>>> hiv.density(overlap=False)
110.16565490276777

>>> hiv.addGene(Gene(8888, 9999))
Traceback (most recent call last):
AssertionError: invalid coordinate