A universally unique identifier (UUID) is a 128-bit number used to identify information in computer systems. The term globally unique identifier (GUID) is also used.

When generated according to the standard methods, UUIDs are for practical purposes unique, without depending for their uniqueness on a central registration authority or coordination between the parties generating them, unlike most other numbering schemes. While the probability that a UUID will be duplicated is not zero, it is close enough to zero to be negligible.

Thus, anyone can create a UUID and use it to identify something with near certainty that the identifier does not duplicate one that has already been, or will be, created to identify something else. Information labeled with UUIDs by independent parties can therefore be later combined into a single database, or transmitted on the same channel, with a negligible probability of duplication.
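To give a sense of scale for "close enough to zero", the usual birthday-bound approximation can be applied. The sketch below assumes randomly generated (version 4) UUIDs, which carry 122 random bits. The text above does not say which UUID version is meant, so both that figure and the sample size of one billion identifiers are illustrative assumptions.

```latex
% Approximate probability of at least one duplicate among n random UUIDs,
% assuming N = 2^{122} equally likely values (version 4 UUIDs).
p(n) \approx 1 - e^{-n^{2}/(2N)} \approx \frac{n^{2}}{2N}, \qquad N = 2^{122}

% Illustrative case: n = 10^{9} identifiers.
p(10^{9}) \approx \frac{10^{18}}{2^{123}} \approx 9.4 \times 10^{-20}
```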

The structure and textual representation of UUIDs are defined in the RFC 4122 standard. In its canonical textual representation, the sixteen octets (16×8 bits = 128 bits) of a UUID are represented as 32 hexadecimal (base 16) digits, displayed in five groups separated by hyphens:

550e8400-e29b-41d4-a716-446655440000

The groups contain 8, 4, 4, 4 and 12 hexadecimal digits respectively, for a total of 36 characters (32 alphanumeric characters and four hyphens). The hexadecimal numeral system uses sixteen distinct symbols: the digits 0–9 to represent values zero to nine, and the letters a–f to represent values ten to fifteen. RFC 4122 requires that the hexadecimal letters be represented in lower case.
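The 8-4-4-4-12 layout described above translates directly into an extended regular expression. The following is a minimal sketch, not a reference implementation: the variable `uuid_re` and the function `is_canonical_uuid` are hypothetical names introduced only for illustration.

```shell
# Hypothetical helper: succeeds if $1 is a UUID in canonical lower-case form.
uuid_re='^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'

is_canonical_uuid() {
    # LC_ALL=C keeps the range a-f limited to the ASCII lower-case letters.
    printf '%s\n' "$1" | LC_ALL=C grep -Eq "$uuid_re"
}

is_canonical_uuid 550e8400-e29b-41d4-a716-446655440000 && echo 'canonical'
is_canonical_uuid 550E8400-E29B-41D4-A716-446655440000 || echo 'not canonical'
```

The anchors `^` and `$` matter here: without them, a longer string that merely contains a UUID would also be accepted.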

Assignment

Each line of the text file uuid.txt consists of a pattern p ∈ P, followed by a space and a word w ∈ W. The set P contains the canonical textual representation (according to RFC 4122) of all possible UUIDs. The set W contains all words that consist only of letters. Your task:

  1. Determine the shortest possible regular expressions for the following subsets of P:

    • P1 = {p ∈ P | all hexadecimal digits are different in each group of p}

      example: a91e35fd-b0d4-b95a-fdb0-f0d7b839c645 ∈ P1
               0fa05af9-87e9-49d0-bfdd-463e9993f90f ∉ P1
    • P2 = {p ∈ P | the second and third group of p have no hexadecimal digits in common}

      example: 6482cf60-dbf8-412c-a63c-d2b50e92ed6d ∈ P2
               7b17ccef-f9f2-4597-aa0d-5f186d206a36 ∉ P2
    • P3 = {p ∈ P | all letters are in strict alphabetic order in each group of p}

      example: 9a578cf1-bc9f-49ad-8294-a61b736088c5 ∈ P3
               64efc3d9-448a-4902-a6c6-61849ac6c7ef ∉ P3
    • P4 = {p ∈ P | there is a hexadecimal digit that occurs at least once in each group of p}

      example: 2beb1c38-8158-4448-a280-28c46a515b87 ∈ P4
               cd53e91b-4bce-4e58-ad28-9416cf468d71 ∉ P4

    For each subset, give a Unix command in which a command from the grep family uses the regular expression to write to stdout only those lines of the text file whose pattern p belongs to Pi (i = 1, 2, 3, 4).

  2. Find the words w1 w2 w3 w4 of a secret message in the following way:

    • the word w1 is on the unique line whose pattern p belongs to P1 ∩ P2

    • the word w2 is on the unique line whose pattern p belongs to P2 ∩ P3

    • the word w3 is on the unique line whose pattern p belongs to P3 ∩ P4

    • the word w4 is on the unique line whose pattern p belongs to P4 ∩ P1

    For each word, give a Unix command in which commands from the grep family use the regular expressions for the subsets Pi (i = 1, 2, 3, 4) to find the word wj (j = 1, 2, 3, 4) in the text file and write it to stdout. It is not allowed to write the word wj literally (e.g. echo wj).
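One way the two tasks could fit together is sketched below with `grep -E`. The expressions are illustrative and make no claim to being the shortest possible. P1, P2 and P3 are described through their complements and inverted with `-v`. The function names `p1` to `p4` and `word` are hypothetical, and so is the sample file: its first four UUIDs and all of its words are made up as a stand-in for `uuid.txt`. Back-references in extended regular expressions and the `-o` option are extensions supported by GNU and BSD grep, not part of POSIX, so this is a sketch under those assumptions.

```shell
# Hypothetical sample standing in for uuid.txt: one "pattern word" pair per line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
01234567-89ba-cdef-0123-0123456789ab meet
00123abc-12ab-34cd-5678-000000000000 me
1a1b1c1d-1abc-1def-1111-111111111111 at
fedcba98-f012-f345-f678-f0123456789a noon
a91e35fd-b0d4-b95a-fdb0-f0d7b839c645 alpha
6482cf60-dbf8-412c-a63c-d2b50e92ed6d bravo
9a578cf1-bc9f-49ad-8294-a61b736088c5 charlie
2beb1c38-8158-4448-a280-28c46a515b87 delta
EOF

# Each filter reads stdin (or a file argument) and keeps the lines whose pattern is in Pi.
# A trailing '.* ' or ' ' forces the match to lie left of the single space, inside p.

# P1: drop lines in which a character recurs before the next hyphen.
p1() { LC_ALL=C grep -vE '([^-])[^-]*\1.* ' "$@"; }
# P2: drop lines in which a character of group 2 recurs in group 3.
p2() { LC_ALL=C grep -vE '^[^-]*-[^-]*([^-])[^-]*-[^-]*\1' "$@"; }
# P3: drop lines in which a letter is followed, within its group, by an equal or earlier letter.
p3() { LC_ALL=C grep -vE '(f[^-]*[a-f]|e[^-]*[a-e]|d[^-]*[a-d]|c[^-]*[a-c]|b[^-]*[ab]|a[^-]*a).* ' "$@"; }
# P4: keep lines in which one character of group 1 recurs in each of the four later groups.
p4() { LC_ALL=C grep -E '([^-])[^-]*(-[^-]*\1[^-]*){4} ' "$@"; }
# Print only the word: the run of letters that ends the line.
word() { LC_ALL=C grep -oE '[A-Za-z]+$'; }

p1 "$sample" | p2 | word    # w1, from P1 ∩ P2
p2 "$sample" | p3 | word    # w2, from P2 ∩ P3
p3 "$sample" | p4 | word    # w3, from P3 ∩ P4
p4 "$sample" | p1 | word    # w4, from P4 ∩ P1
```

Chaining two filters in a pipeline computes the intersection, because a line survives only if both accept it. On the hypothetical sample the four pipelines print meet, me, at and noon; against the real `uuid.txt` the file name would replace `"$sample"`.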