A rebus is a puzzle that uses emoji to depict one or more words. Each emoji has zero or more associated actions that indicate which letters should be added, removed or replaced with other letters. The rebus groups emoji (and their associated actions) that together form one word of the solution. For example, the following rebus describes two words: the first word is described by the first two emoji and the second word by the last two emoji.
We will represent a rebus as a list (list) of groups. Each group describes one word (a series of lowercase letters) of the solution and is represented as a list (list) of patterns. Each pattern describes part of a word and is represented as a string (str) consisting of a combination of one or more emoji followed by zero or more associated actions. There is one space after the combination and between each consecutive pair of actions. Actions themselves never contain spaces. As such, the example rebus is represented as
[['📦📦 -s x=h', '🐜 mi+ -t'], ['🔧 +a -w -enc', '🐩 le=y o=s']]
Determining the sequence of letters described by a pattern is done in two steps. First, we search a lexicon for the word associated with the combination of emoji from the pattern. A lexicon is a text file where each line consists of a combination of emoji, a space and the word associated with that combination. Each combination appears at most once in the lexicon and all words consist of one or more lowercase letters. For example, this is part of a lexicon that contains all combinations of emoji from the example rebus:
… 📦 box 📦📦 boxes 🐜 ant 🔧 wrench 🐩 poodle …
This assignment uses UTF-81 (8-bit Unicode Transformation Format) character encoding to represent Unicode symbols as a stream of bytes. It is important to know that an emoji may consist of multiple bytes, so you should not assume that an emoji consists of a single character.
When the word associated with the combination of emoji from the pattern is found, we then apply all associated actions to it. It is important that these actions are applied in the given order (from left to right). There are four types of actions:
prefix: one or more lowercase letters followed by a plus sign (abc+); the letters (abc) must be added at the beginning of the word
suffix: a plus sign followed by one or more lowercase letters (+abc); the letters (abc) must be added at the end of the word
deletion: a minus sign followed by one or more lowercase letters (-abc); the first occurrence of the substring of letters (abc) must be removed from the word
substitution: an equal sign preceded and followed by one or more lowercase letters (abc=def); the first occurrence of the substring of letters before the equal sign (abc) must be replaced in the word by the substring of letters after the equal sign (def)
This table shows how the example rebus can be solved by converting each pattern in two steps into part of the solution.
pattern | word (step 1) | actions (step 2) | part of solution |
---|---|---|---|
📦📦 -s x=h | boxes | boxes → boxe → bohe | bohe |
🐜 mi+ -t | ant | ant → miant → mian | mian |
🔧 +a -w -enc | wrench | wrench → wrencha → rencha → rha | rha |
🐩 le=y o=s | poodle | poodle → poody → psody | psody |
If we combine all parts of a group into one word and combine all words with spaces in between, we get the solution of the example rebus: bohemian rhapsody2.
Write a program that solves rebuses. This is done in the following way:
Write a function read_lexicon that takes the location (str) of a lexicon $$\mathcal{L}$$. The function must return a dictionary (dict) that maps each combination of emoji from the lexicon onto its associated word. This result is called the dictionary representation of lexicon $$\mathcal{L}$$.
Write a function edit that takes two arguments: i) a word $$w$$ (str) and ii) an action $$a$$ (str). The function must return the sequence of letters (str) that results from applying action $$a$$ on word $$w$$.
Write a function part that takes two arguments: i) a pattern $$p$$ (str) and ii) the dictionary representation (dict) of a lexicon $$\mathcal{L}$$. The function must return the sequence of letters described by pattern $$p$$ according to lexicon $$\mathcal{L}$$.
Write a function rebus that takes two arguments: i) the representation (list) of a rebus and ii) the dictionary representation (dict) of a lexicon $$\mathcal{L}$$. The function must return the solution (str) of the rebus according to lexicon $$\mathcal{L}$$.
This interactive session assumes the current directory contains the text file lexicon.txt3.
>>> lexicon = read_lexicon('lexicon.txt4')
>>> lexicon['🔧']
'wrench'
>>> lexicon['🐜']
'ant'
>>> lexicon['🐩']
'poodle'
>>> lexicon['📦📦']
'boxes'
>>> edit('wrench', '+a')
'wrencha'
>>> edit('ant', 'mi+')
'miant'
>>> edit('poodle', 'le=y')
'poody'
>>> edit('boxes', '-s')
'boxe'
>>> part('📦📦 -s x=h', lexicon)
'bohe'
>>> part('🐜 mi+ -t', lexicon)
'mian'
>>> part('🔧 +a -w -enc', lexicon)
'rha'
>>> part('🐩 le=y o=s', lexicon)
'psody'
>>> part('🔥👅 c=et p=oc', lexicon)
'society5'
>>> rebus([['📦📦 -s x=h', '🐜 mi+ -t'], ['🔧 +a -w -enc', '🐩 le=y o=s']], lexicon)
'bohemian rhapsody6'
>>> rebus([['🏨'], ['🐈 t=l', '📯 h=if', '🐈 c=i -t']], lexicon)
'hotel california7'
>>> rebus([['🧂 a=u', '🐜🐜 -t'], ['🐂 x=f'], ['💍 r=sw']], lexicon)
'sultans of swing8'
>>> rebus([[🐚🐚 h=m'], ['🚲 b=l'], ['🟢 gr=t'], ['🕷 -de', '🪁 -k -e']], lexicon)
'smells like teen spirit9'