Geohash-361 is a technique for encoding geographic locations as a sequence of letters and digits. For example, the location of The Shard2 building (London, England) at coordinates (-0.086667, 51.504444) is encoded as bdrdC26BqH and the location of the Statue of Liberty3 (New York, USA) at coordinates (-74.044445, 40.689168) is encoded as 9LVB4BH89g.

Geohashes offer properties like arbitrary precision and the possibility of gradually removing characters from the end of the code to reduce its size (and gradually lose precision). As such, the location of The Shard may be successfully shortened to geohash-36 code bdrdC26B, without dramatically reducing the precision. As a consequence of the gradual precision degradation, nearby places will often (but not always) present similar prefixes. The longer a shared prefix is, the closer the two places are.

Geohash-36 divides the world map into 36 areas of equal size according to a $$6 \times 6$$ grid. Each area is assigned a unique symbol from an alphabet of 36 letters and digits, starting at the North-West (top-left) coordinate and continuing, row by row, to the South-East (bottom-right). Uppercase and lowercase letters are considered to be different symbols. By default, geohash-36 uses the following alphabet:

23456789bBCdDFgGhHjJKlLMnNPqQrRtTVWX

The symbols in this alphabet are chosen to avoid vowels, vowel-like numbers, character confusion, and to use lowercase letters which are generally distinct from their uppercase equivalents in standard typefaces. Without vowels, unintended English-language words are avoided that may otherwise appear as a geohash-36 code.

geohash
Geohash-36 divides the world map into 36 areas of equal size according to a $$6 \times 6$$ grid and assigns each area a unique symbol. The Shard building in London, England (red marker) is in the area that is assigned the letter b.

For example, The Shard building in London (England) — indicated by a red marker in the above map — is in the area indicated by the letter b. That's the first symbol of the geohash-36 code. To indicate the location more accurately, area b is again divided into 36 subareas in the same way (map below). We then observe that London is in the subarea indicated by the letter d, which is the second symbol of the geohash-36 code. By iteratively dividing the remaining area into 36 subareas, we continuously zoom in and increase the accuracy of the geohash-36 code.

geohash
To indicate the location more accurately, area b are again divided into 36 sub-areas. We then observe that London is in the sub-area indicated by the letter d, which is the second symbol of the geohash-36 code.

A geohash-36 code may have an optional checksum4 that is represented using a lowercase letter at the end of the code and separated from the actual code by a hyphen (-). It confirms the code as a geohash-36 and provides a check for incorrect or transposed characters. It is calculated as modulus 26 of the sum of each symbol position in the geohash-36 code multiplied by its position in the geohash-36 alphabet. The positions of the symbols in a geohash-36 code are numbered right to left, starting from 1. The positions of the symbols in the geohash-36 alphabet are numbered left to right, starting from 0. As such, the positions of the symbols used in the default geohash-36 alphabet are numbered in the following way:

geohash-36 2 3 4 5 6 7 8 9 b B C d D F g G h H
position 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
 
geohash-36 j J K l L M n N P q Q r R t T V W X
position 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35

The checksum of geohash-36 code bdrdC26BqH-m with checksum m is then computed as \[ \left( 10 \times 8 + 9 \times 11 + 8 \times 29 + 7 \times 11 + 6 \times 10 + 5 \times 0 + 4 \times 4 + 3 \times 9 + 2 \times 27 + 1 \times 17 \right)\!\!\!\!\mod{26 }= 12 \] with the letter m at position 12 in the classical alphabet az, where positions are numbered left to right starting from 0.

Assignment

Convert a given geohash-36 code into coordinates $$(\lambda, \varphi)$$ with $$\lambda, \varphi \in \mathbb{R}$$, longitude $$\lambda \in [-180, 180]$$ and latitude $$\varphi \in [-90, 90]$$. The algorithm to accomplish this was outlined in the introduction, but here are some more details on how each symbol in a geohash-36 code can be used to zoom in to a more specific area.

Suppose we have already located the coordinate within a rectangular area with longitudes in the interval $$[l_{\lambda}, u_{\lambda}]$$ and latitudes in the interval $$[l_{\varphi}, u_{\varphi}]$$. We further divide this area into 36 subareas of equal size according to a $$6 \times 6$$ grid and number the rows from bottom to top and the columns from left to right, starting from 0.

geohash
Een gebied wordt opgedeeld in 36 gelijke deelgebieden volgens een $$6 \times 6$$ rooster, zodat nauwkeuriger kan bepaald worden tot welk deelgebied een locatie behoort.

If the next symbol corresponds to the subarea in row $$r$$ and column $$c$$, then the longitudes of the subarea are in the interval $$[l'_{\lambda}, u'_{\lambda}]$$ and the latitudes in the interval $$[l'_{\varphi}, u'_{\varphi}]$$, where the lower and upper limits of the intervals are computed in the following way:

\[ \begin{align} l'_{\lambda} &= l_{\lambda} + c \times \frac{u_{\lambda} - l_{\lambda}}{6} \\ u'_{\lambda} &= l_{\lambda} + (c + 1) \times \frac{u_{\lambda} - l_{\lambda}}{6} \\ l'_{\varphi} &= l_{\varphi} + r \times \frac{u_{\varphi} - l_{\varphi}}{6} \\ u'_{\varphi} &= l_{\varphi} + (r + 1) \times \frac{u_{\varphi} - l_{\varphi}}{6} \end{align} \]

The table below shows how the successive subareas are computed for geohash-36 code bdrdC26BqH. Column $$\lambda$$ contains an estimate of the longitude as the midpoint of the interval $$[l'_{\lambda}, u'_{\lambda}]$$ and column $$\varphi$$ contains an estimate of the latitude as the midpoint of the $$[l'_{\varphi}, u'_{\varphi}]$$. Each subsequent symbol of the geohash-36 code makes this estimate more accurate.

symbol $$r$$ $$k$$ $$l_{\lambda}$$ $$u_{\lambda}$$ $$l_{\varphi}$$ $$u_{\varphi}$$ $$l'_{\lambda}$$ $$u'_{\lambda}$$ $$l'_{\varphi}$$ $$u'_{\varphi}$$ $$\lambda$$ $$\varphi$$
b 4 2 -180.000000 180.000000 -90.000000 90.000000 -60.000000 0.000000 30.000000 60.000000 -30.000000 45.000000
d rr4 5 -60.000000 0.000000 30.000000 60.000000 -10.000000 0.000000 50.000000 55.000000 -5.000000 52.500000
r 1 5 -10.000000 0.000000 50.000000 55.000000 -1.666667 0.000000 50.833333 51.666667 -0.833333 51.250000
d 4 5 -1.666667 0.000000 50.833333 51.666667 -0.277778 0.000000 51.388889 51.527778 -0.138889 51.458333
C 4 4 -0.277778 0.000000 51.388889 51.527778 -0.092593 -0.046296 51.481481 51.504630 -0.069444 51.493056
2 5 0 -0.092593 -0.046296 51.481481 51.504630 -0.092593 -0.084877 51.500772 51.504630 -0.088735 51.502701
6 5 4 -0.092593 -0.084877 51.500772 51.504630 -0.087449 -0.086163 51.503987 51.504630 -0.086806 51.504308
B 4 3 -0.087449 -0.086163 51.503987 51.504630 -0.086806 -0.086591 51.504415 51.504522 -0.086698 51.504469
q 1 3 -0.086806 -0.086591 51.504415 51.504522 -0.086698 -0.086663 51.504433 51.504451 -0.086681 51.504442
H 3 5 -0.086698 -0.086663 51.504433 51.504451 -0.086669 -0.086663 51.504442 51.504445 -0.086666 51.504444

Define a class Geohash36 that can be used to represent geohash-36 codes. When Geohash36 objects are instantiated, a geohash-36 code (str) must be passed that may contain a checksum. In addition, the optional parameter alphabet may take a string (str) containing the alphabet used by the geohash-36 code. If no value is explicitly passed to the parameter alphabet, the geohash-36 code uses the default alphabet as given in the introduction.

It must be explicitly checked that the given alphabet is a string (str) that consists of 36 different letters and digits. If this is not the case, an AssertionError must be raised with the message invalid alphabet.

It must also be explicitly checked that the given geohash-36 code is a string (str) whose characters belong to the given alphabet. If the code contains a checksum, this checksum must also be valid. If this is not the case, an AssertionError must be raised with the message invalid code.

If a Geohash36 object is passed to the built-in functions repr and str, they must return a string (str) representation containing the checksum of the geohash-36 code. The string representation returned by the function repr must be a Python expression that creates a new Geohash36 object with the same state as the object that was passed as an argument. Only when the alphabet of the geohash-36 code differs from the default geohash-36 alphabet, an argument must explicitly be passed to the parameter alphabet, which must also be named explicitly. In addition, the class Geohash36 must support at least the following methods:

Example

>>> geohash = Geohash36('bdrdC26BqH')    # The Shard (London, UK)
>>> geohash.checksum()
'm'
>>> geohash
Geohash36('bdrdC26BqH-m')
>>> print(geohash)
bdrdC26BqH-m
>>> geohash.position('b')
(4, 2)
>>> geohash.position('d')
(4, 5)
>>> geohash.position('r')
(1, 5)
>>> geohash.position('c')
Traceback (most recent call last):
AssertionError: invalid symbol
>>> geohash.longitude()
(-0.08666861949397955, -0.0866626657521719)
>>> geohash.latitude()
(51.504442086762694, 51.5044450636336)
>>> geohash.coordinates()
(-0.08666564262307572, 51.504443575198145)

>>> geohash = Geohash36('9LVB4BH89g-m')  # Statue of Liberty (New York, USA)
>>> geohash.checksum()
'm'
>>> geohash
Geohash36('9LVB4BH89g-m')
>>> print(geohash)
9LVB4BH89g-m
>>> geohash.coordinates()
(-74.0444452779683, 40.68916794076742)

>>> geohash = Geohash36('EAQK46y', alphabet='i8jC4TsPkQplz6AZE5WB3R2oKymUrOc0t7MG')
>>> geohash.checksum()
'k'
>>> geohash
Geohash36('EAQK46y-k', alphabet='i8jC4TsPkQplz6AZE5WB3R2oKymUrOc0t7MG')
>>> print(geohash)
EAQK46y-k
>>> geohash.coordinates()
(85.19483024691357, 18.600501543209877)

>>> Geohash36('bdrdc26BqH')     # kleine letter c niet in alfabet
Traceback (most recent call last):
AssertionError: invalid code
>>> Geohash36('bdrdC26BqH-a')   # ongeldige checksum
Traceback (most recent call last):
AssertionError: invalid code
>>> geohash = Geohash36('EAQK46y', alphabet='ABCDE')
Traceback (most recent call last):
AssertionError: invalid alphabet