A universally unique identifier (UUID) is a 128-bit number used to identify information in computer systems. The term globally unique identifier (GUID) is also used.

When generated according to the standard methods, UUIDs are for practical purposes unique, without depending for their uniqueness on a central registration authority or coordination between the parties generating them, unlike most other numbering schemes. While the probability that a UUID will be duplicated is not zero, it is close enough to zero to be negligible.

Thus, anyone can create a UUID and use it to identify something with near certainty that the identifier does not duplicate one that has already been, or will be, created to identify something else. Information labeled with UUIDs by independent parties can therefore be later combined into a single database, or transmitted on the same channel, with a negligible probability of duplication.
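To put a rough number on "negligible", the birthday approximation gives a quick estimate. The sketch below assumes randomly generated (version-4) UUIDs, which carry 122 random bits. That assumption and the function name `collision_probability` are illustrative and are not taken from the text above.

```python
import math

RANDOM_BITS = 122  # assumption: version-4 UUIDs, where 6 of the 128 bits are fixed


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Birthday approximation: p is about 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** bits
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(-n * (n - 1) / (2 * space))


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>17,d} UUIDs -> p = {collision_probability(n):.3e}")
```

Under these assumptions, even a billion independently generated identifiers leave the chance of any duplicate far below anything that matters in practice.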

The structure and textual representation of UUIDs are defined in the RFC 4122 standard. In its canonical textual representation, the sixteen octets ($$16 \times 8$$ bits = $$128$$ bits) of a UUID are represented as 32 hexadecimal (base 16) digits, displayed in five groups separated by hyphens:

550e8400-e29b-41d4-a716-446655440000

The groups respectively contain 8, 4, 4, 4 and 12 hexadecimal digits, for a total of 36 characters (32 alphanumeric characters and four hyphens). The hexadecimal numeral system uses sixteen distinct symbols: the digits 0-9 to represent values zero to nine, and the letters a-f to represent values ten to fifteen. RFC 4122 requires that the hexadecimal letters be output in lower case.
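The layout described above can be checked mechanically. The following is a minimal sketch, not part of the assignment: the names `CANONICAL` and `is_canonical` are hypothetical, and the regular expression only encodes the 8-4-4-4-12 grouping with lower-case hexadecimal digits.

```python
import re
import uuid

# Five lower-case hexadecimal groups of 8, 4, 4, 4 and 12 digits.
CANONICAL = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")


def is_canonical(text: str) -> bool:
    """Return True if text is a canonical (lower-case) UUID string."""
    return CANONICAL.fullmatch(text) is not None


if __name__ == "__main__":
    sample = "550e8400-e29b-41d4-a716-446655440000"
    print(is_canonical(sample))          # the example from the text
    print(is_canonical(sample.upper()))  # upper case is not canonical
    print([len(group) for group in sample.split("-")])
    # uuid.UUID accepts either case and prints the lower-case form.
    print(str(uuid.UUID(sample.upper())) == sample)
```

Python's standard `uuid` module is used here only to show that a parser may accept upper case on input while still printing the lower-case form.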

Assignment

Each line of the text file uuid.txt consists of a pattern $$p \in \mathcal{P}$$, followed by a space and a word $$w \in \mathcal{W}$$. The set $$\mathcal{P}$$ contains the canonical textual representation (according to RFC 4122) of all possible UUIDs. The set $$\mathcal{W}$$ contains all words that consist only of letters. Your task:

  1. Determine the shortest possible regular expressions for the following subsets of $$\mathcal{P}$$:

    • $$\mathcal{P}_1 = \{\,p \in \mathcal{P}\,|\,$$all hexadecimal digits are different in each group of $$p\,\}$$

      example: a91e35fd-b0d4-b95a-fdb0-f0d7b839c645 $$\in \mathcal{P}_1$$
        0fa05af9-87e9-49d0-bfdd-463e9993f90f $$\not \in \mathcal{P}_1$$
    • $$\mathcal{P}_2 = \{\,p \in \mathcal{P}\,|\,$$the second and third groups of $$p$$ have no hexadecimal digits in common$$\,\}$$

      example: 6482cf60-dbf8-412c-a63c-d2b50e92ed6d $$\in \mathcal{P}_2$$
        7b17ccef-f9f2-4597-aa0d-5f186d206a36 $$\not \in \mathcal{P}_2$$
    • $$\mathcal{P}_3 = \{\,p \in \mathcal{P}\,|\,$$all letters are in strict alphabetic order in each group of $$p\,\}$$

      example: 9a578cf1-bc9f-49ad-8294-a61b736088c5 $$\in \mathcal{P}_3$$
        64efc3d9-448a-4902-a6c6-61849ac6c7ef $$\not \in \mathcal{P}_3$$
    • $$\mathcal{P}_4 = \{\,p \in \mathcal{P}\,|\,$$there is a hexadecimal digit that occurs at least once in each group of $$p\,\}$$

      example: 2beb1c38-8158-4448-a280-28c46a515b87 $$\in \mathcal{P}_4$$
        cd53e91b-4bce-4e58-ad28-9416cf468d71 $$\not \in \mathcal{P}_4$$

    For each subset, give a Unix command in which a command from the grep family uses the regular expression to write to stdout only those lines of the text file whose pattern $$p$$ belongs to $$\mathcal{P}_i\ (i = 1, 2, 3, 4)$$.

  2. Find the words $$w_1\ w_2\ w_3\ w_4$$ of a secret message in the following way:

    • the word $$w_1$$ is on the unique line whose pattern $$p$$ belongs to $$\mathcal{P}_1 \cap \mathcal{P}_2$$

    • the word $$w_2$$ is on the unique line whose pattern $$p$$ belongs to $$\mathcal{P}_2 \cap \mathcal{P}_3$$

    • the word $$w_3$$ is on the unique line whose pattern $$p$$ belongs to $$\mathcal{P}_3 \cap \mathcal{P}_4$$

    • the word $$w_4$$ is on the unique line whose pattern $$p$$ belongs to $$\mathcal{P}_4 \cap \mathcal{P}_1$$

    For each word, give a Unix command in which commands from the grep family use the regular expressions for the subsets $$\mathcal{P}_i\ (i = 1, 2, 3, 4)$$ to find the word $$w_j\ (j = 1, 2, 3, 4)$$ in the text file and write it to stdout. It is not allowed to write the word $$w_j$$ literally (e.g. echo $$w_j$$).
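The set definitions above can be cross-checked without regular expressions. The sketch below is one plain set-based reading of the four definitions, meant only for verifying the output of the grep commands and not as a substitute for them. All function names are hypothetical, and the words `alpha` and `beta` are placeholders that do not come from uuid.txt.

```python
def groups(pattern: str) -> list:
    """Split a canonical UUID string into its five groups."""
    return pattern.split("-")


def in_p1(pattern: str) -> bool:
    # Every group consists of pairwise different hexadecimal digits.
    return all(len(set(g)) == len(g) for g in groups(pattern))


def in_p2(pattern: str) -> bool:
    # The second and third groups share no hexadecimal digit.
    g = groups(pattern)
    return not (set(g[1]) & set(g[2]))


def in_p3(pattern: str) -> bool:
    # Within each group the letters a-f appear in strictly increasing order.
    for g in groups(pattern):
        letters = [c for c in g if c.isalpha()]
        if any(a >= b for a, b in zip(letters, letters[1:])):
            return False
    return True


def in_p4(pattern: str) -> bool:
    # At least one hexadecimal digit occurs in all five groups.
    return bool(set.intersection(*(set(g) for g in groups(pattern))))


def words_in(lines, *predicates):
    """Return the words on lines whose pattern satisfies every predicate."""
    found = []
    for line in lines:
        pattern, word = line.split()
        if all(pred(pattern) for pred in predicates):
            found.append(word)
    return found


if __name__ == "__main__":
    # Placeholder lines built from the examples in the text.
    lines = [
        "a91e35fd-b0d4-b95a-fdb0-f0d7b839c645 alpha",
        "0fa05af9-87e9-49d0-bfdd-463e9993f90f beta",
    ]
    print(words_in(lines, in_p1))
```

Passing two predicates to `words_in` selects an intersection such as $$\mathcal{P}_1 \cap \mathcal{P}_2$$, which mirrors chaining two grep filters with a pipe.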