How often do words appear in a text?
A word in a text is defined as the longest sequence of letters, where every character that is not a letter is considered a word boundary. Your task:
Write a function word_split
that takes the location of a text file (str
). The function must return a list
containing the sequence of words from the given text file.
Write a function word_count
that takes the location of a text file (str
). The function must return a dictionary (dict
) that maps each word in the given text file onto the number of occurrences of the word in the given text file. The function may not make a distinction between uppercase and lowercase letters, and must lowercase words as keys in the dictionary it returns.
In the following interactive session we assume the text file data.txt
1 to be located in the current directory.
>>> worden_split('data.txt')
['How', 'much', 'wood', 'would', 'a', 'woodchuck', 'chuck', 'If', 'a', 'woodchuck', 'could', 'chuck', 'wood', 'He', 'would', 'chuck', 'he', 'would', 'as', 'much', 'as', 'he', 'could', 'And', 'chuck', 'as', 'much', 'as', 'a', 'woodchuck', 'would', 'If', 'a', 'woodchuck', 'could', 'chuck', 'wood']
>>> word_count('data.txt')
{'how': 1, 'much': 3, 'wood': 3, 'would': 4, 'a': 4, 'woodchuck': 4, 'chuck': 5, 'if': 2, 'could': 3, 'he': 3, 'as': 4, 'and': 1}