DNA sequencing.
★★★ 3-letter sequences encode amino acids in DNA. For example, TTT is phenylalanine and TTA is leucine. This program reads a DNA sequence stored in a file and outputs how many times an amino acid requested by the user appears in that sequence. For example, in the sequence ACGTTTGTATTT, the sequence TTT appears twice.
Write a program that asks the user to enter three characters and outputs how many times that sequence of characters appears in a file.
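The counting idea can be sketched on the example sequence above. This is a minimal illustration, assuming the letters are read in non-overlapping groups of three starting from the first letter; the names `split_into_triplets` and `count_triplet` are hypothetical helpers and not part of the assignment.

```python
def split_into_triplets(sequence):
    """Return the non-overlapping 3-letter groups of sequence, left to right."""
    # Stop at len - 2 so a trailing group of one or two letters is ignored
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]


def count_triplet(sequence, triplet):
    """Count how many 3-letter groups of sequence equal triplet."""
    return split_into_triplets(sequence).count(triplet)


print(split_into_triplets("ACGTTTGTATTT"))  # ['ACG', 'TTT', 'GTA', 'TTT']
print(count_triplet("ACGTTTGTATTT", "TTT"))  # 2
```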
Remember to add a comment before a subprogram, selection or iteration statement to explain its purpose.
Write a subprogram get_amino_acid that:
Write a subprogram check_sequence that:
Opens dna.txt for reading. Note this is included in the Trinket above for you to use as source data.
The main program should call:
get_amino_acid to input a valid amino acid.
check_sequence to return the number of the amino acids in the file.
The dna.txt file:
ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGC
CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCGG
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAG
Enter the amino acid to find: CCC
There are 4 CCC amino acids in the DNA sequence.
Enter the amino acid to find: GGT
There are 0 GGT amino acids in the DNA sequence.
Enter the amino acid to find: GGG
There are 3 GGG amino acids in the DNA sequence.
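The sample runs above can be checked without a file. The sketch below holds the three lines of dna.txt in a list and counts in groups of three per line, which is an assumption about how the counts were produced; `DNA_LINES` and `count_in_lines` are illustrative names only.

```python
# The three lines of dna.txt shown above, held in memory for this sketch
DNA_LINES = [
    "ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGC",
    "CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCGG",
    "CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAG",
]


def count_in_lines(lines, amino_acid):
    """Count amino_acid among the non-overlapping triplets of each line."""
    count = 0
    for line in lines:
        # Step through the line three letters at a time
        for index in range(0, len(line) - 2, 3):
            if line[index:index + 3] == amino_acid:
                count += 1
    return count


for amino_acid in ("CCC", "GGT", "GGG"):
    print(amino_acid, count_in_lines(DNA_LINES, amino_acid))
```

Under this assumption the counts agree with the sample runs: 4 for CCC, 0 for GGT and 3 for GGG. The letters GGT do occur in the file, but only across the boundary between two groups of three, so a plain substring search would give a different answer for GGT.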
get_amino_acid
# Input the amino acid
def get_amino_acid():
    choice = ""
    valid = False

    # Validation
    while not valid:
        choice = input("Enter the amino acid to find: ")
        valid = True

        # Amino acid must be 3 letters
        if len(choice) != 3:
            valid = False
        else:
            # Check each letter of the choice
            for letter in range(len(choice)):
                # Amino acid must contain only the letters ACGT
                if choice[letter] not in "ACGT":
                    valid = False

    return choice
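The validation rule used in get_amino_acid can also be written as a separate check, which makes it possible to try out without typing input. This is one possible sketch rather than the required approach; `is_valid_amino_acid` is a hypothetical name. Like the subprogram above, it rejects lowercase letters.

```python
def is_valid_amino_acid(choice):
    """Return True if choice is exactly three letters drawn from A, C, G, T."""
    return len(choice) == 3 and all(letter in "ACGT" for letter in choice)


# Try a valid choice and several invalid ones
for candidate in ("CCC", "ggt", "TT", "TTAG", "AXG"):
    print(candidate, is_valid_amino_acid(candidate))
```

Only the first candidate, CCC, passes. The others fail because of lowercase letters, the wrong length, or a letter outside ACGT.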
check_sequence
# Read the DNA sequence file and count the amino acid
def check_sequence(amino_acid):
    # Check file exists
    try:
        file = open("dna.txt", "r")
    except FileNotFoundError:
        return -1
    else:
        count = 0

        # Read in each line
        for line in file:
            line = line.strip()

            # Consider data letters in threes, ignoring an incomplete
            # group of one or two letters at the end of a line
            for index in range(0, len(line) - 2, 3):
                sequence = line[index:index + 3]

                # Add to the count if amino acid found
                if sequence == amino_acid:
                    count = count + 1

        file.close()
        return count
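The file handling in check_sequence can also be sketched with a with statement, which closes the file automatically. The version below is an illustration under assumptions: the name `count_amino_acid` and the `filename` parameter are hypothetical additions so the sketch can be pointed at any file, and it keeps the convention of returning -1 when the file is missing.

```python
def count_amino_acid(amino_acid, filename="dna.txt"):
    """Count amino_acid in filename, or return -1 if the file is missing."""
    try:
        # The with statement closes the file even if an error occurs mid-read
        with open(filename, "r") as file:
            count = 0
            for line in file:
                line = line.strip()
                # Step through the line three letters at a time
                for index in range(0, len(line) - 2, 3):
                    if line[index:index + 3] == amino_acid:
                        count += 1
            return count
    except FileNotFoundError:
        return -1
```

Calling `count_amino_acid("CCC")` reads dna.txt from the current folder, and passing a second argument reads a different file instead.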
# -------------------------
# Main program
# -------------------------
amino_acid = get_amino_acid()
number = check_sequence(amino_acid)

# If -1 is returned the file does not exist
if number == -1:
    print("DNA file not found.")
else:
    print("There are", number, amino_acid, "amino acids in the DNA sequence.")