# Peptide mass
Proteins are not only important building blocks of the muscles in our bodies; they are also the “workhorses” of our cells, where they speed up biochemical reactions, for example. Proteins are chains built from combinations of 20 amino acids. A chain of at least 100 amino acids is called a protein; a shorter chain is called a peptide. Each protein or peptide has a specific mass, which is usually expressed in Daltons. The mass of a single amino acid can be calculated from its structural formula. When an amino acid chain forms, one H2O molecule (18.0153 Daltons) is released for each covalent bond formed between two amino acids.
That is why the mass of an amino acid is often given as the sum of the masses of all its atoms, minus 18.0153; the mass table lists these reduced values. A chain of n amino acids contains only n - 1 bonds, so summing the table values subtracts one H2O too many. The mass of a peptide is therefore the sum of the masses of its amino acids as shown in the mass table, plus the mass of one H2O (18.0153 Daltons).
Mass table:

amino acid | mass (Daltons) |
---|---|
A | 71.03711 |
C | 103.00919 |
D | 115.02694 |
E | 129.04259 |
F | 147.06841 |
G | 57.02146 |
H | 137.05891 |
I | 113.08406 |
K | 128.09496 |
L | 113.08406 |
M | 131.04049 |
N | 114.04293 |
P | 97.05276 |
Q | 128.05858 |
R | 156.10111 |
S | 87.03203 |
T | 101.04768 |
V | 99.06841 |
W | 186.07931 |
Y | 163.06333 |
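
As a quick check of the rule above, the mass of the dipeptide GA can be worked out from two table values plus one water. The snippet below is a minimal sketch of that arithmetic; the names `RESIDUE_MASS`, `WATER_MASS` and `peptide_mass` are illustrative choices, not part of the assignment, and only the two table entries needed here are copied in.

```python
# Two entries copied from the mass table (Daltons); illustrative subset only.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711}
WATER_MASS = 18.0153  # mass of one H2O, as stated in the text


def peptide_mass(sequence):
    """Sum the table masses of all amino acids and add one H2O."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


# 57.02146 + 71.03711 + 18.0153, rounded to the table's 5 decimals
print(round(peptide_mass("GA"), 5))  # 146.07387
```

Rounding to five decimals matches the precision of the table and hides floating-point noise in the sum.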

Write the following functions:

- `is_aa_sequence` takes an amino acid sequence as input and returns the boolean `True` if the input is an amino acid sequence, i.e., it consists only of uppercase letters representing the 20 amino acids, and `False` otherwise.
- `peptide_or_protein` takes an amino acid sequence as input and returns `True` if it is a peptide (fewer than 100 amino acids) and `False` if it is a protein.
- `frequency_aa` takes an amino acid sequence as input and returns a tuple of 20 integers, where each position gives the number of amino acids of that type in the sequence. The amino acids are in alphabetical order of their one-letter codes, so the first position gives the number of A’s, the second the number of C’s, the third the number of D’s, etc.
- `mass_aa` takes a tuple of integers with the number of amino acids per type, in the order used by `frequency_aa`, and returns the mass of the corresponding peptide or protein as a floating-point number.
- `info_sequence` uses the above functions, takes an amino acid sequence as input, and returns a string. For a peptide, the string is “This is a peptide of x amino acids and a mass of y Daltons”; for a protein, it is “This is a protein of x amino acids and a mass of y Daltons”. In both cases x is the number of amino acids and y is the mass. If the input is not an amino acid sequence, the string is “That is not an amino acid sequence”.

>>> is_aa_sequence("DIEFRVLHQ")
True
>>> is_aa_sequence("ABCDEFGHILJKLMNOPQRSTUV")
False
>>> peptide_or_protein("DIEFRVLHQ")
True
>>> peptide_or_protein("MEKFLKYEIKVNNEQARANPNYGIFEVGPLESGFVITIGNAMRRVLLSCIPGASVFALSISGAKQEFAAVEGMKEDVTEVVLNFKQLVVKISDLLFEDGEMVEPPLERWPLLTVTAEKAG")
False
>>> frequency_aa("DIEFRVLHQ")
(0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0)
>>> mass_aa((0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0))
1155.60837
>>> info_sequence("DIEFRVLHQ")
"This is a peptide of 9 amino acids and a mass of 1155.60837 Daltons"
>>> info_sequence("ABCDEFGHILJKLMNOPQRSTUV")
"That is not an amino acid sequence"
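
One possible way to put the five functions together is sketched below. It is not an official solution and rests on a few assumptions that the task does not state: an empty string is treated as not being an amino acid sequence, the mass is rounded to five decimals (the precision of the table) to avoid floating-point noise, and the message wording follows the task description ("Daltons"). The constants `MASS_TABLE`, `AMINO_ACIDS`, `WATER_MASS` and `PROTEIN_MIN_LENGTH` are names chosen for this sketch.

```python
# Masses from the mass table (Daltons), keyed by one-letter code.
MASS_TABLE = {
    "A": 71.03711, "C": 103.00919, "D": 115.02694, "E": 129.04259,
    "F": 147.06841, "G": 57.02146, "H": 137.05891, "I": 113.08406,
    "K": 128.09496, "L": 113.08406, "M": 131.04049, "N": 114.04293,
    "P": 97.05276, "Q": 128.05858, "R": 156.10111, "S": 87.03203,
    "T": 101.04768, "V": 99.06841, "W": 186.07931, "Y": 163.06333,
}
AMINO_ACIDS = sorted(MASS_TABLE)  # alphabetical order: A, C, D, ..., W, Y
WATER_MASS = 18.0153              # one H2O, added back once per chain
PROTEIN_MIN_LENGTH = 100          # chains this long or longer are proteins


def is_aa_sequence(sequence):
    """True if every character is one of the 20 uppercase codes."""
    # Assumption: an empty string does not count as a sequence.
    return len(sequence) > 0 and all(aa in MASS_TABLE for aa in sequence)


def peptide_or_protein(sequence):
    """True for a peptide (fewer than 100 amino acids), False for a protein."""
    return len(sequence) < PROTEIN_MIN_LENGTH


def frequency_aa(sequence):
    """Tuple of 20 counts, one per amino acid in alphabetical order."""
    return tuple(sequence.count(aa) for aa in AMINO_ACIDS)


def mass_aa(frequencies):
    """Mass of the chain described by a tuple from frequency_aa."""
    total = sum(n * MASS_TABLE[aa] for n, aa in zip(frequencies, AMINO_ACIDS))
    # Assumption: round to the 5 decimals used in the mass table.
    return round(total + WATER_MASS, 5)


def info_sequence(sequence):
    """Describe the sequence, or report that it is not a valid one."""
    if not is_aa_sequence(sequence):
        return "That is not an amino acid sequence"
    kind = "peptide" if peptide_or_protein(sequence) else "protein"
    mass = mass_aa(frequency_aa(sequence))
    return (f"This is a {kind} of {len(sequence)} amino acids "
            f"and a mass of {mass} Daltons")
```

Keeping the table in a dictionary lets `is_aa_sequence`, `frequency_aa` and `mass_aa` share one source of truth, and sorting its keys yields the alphabetical order that the frequency tuple requires.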