Geohash1 is a geocoding2 system that encodes a geographic location into a short string of letters and digits. For example, geohash ezs42 encodes for an area around the coordinate with decimal longitude and latitude (-5.6, 42.6). Geohashes offer properties like arbitrary precision and the possibility of gradually removing characters from the end of the code to reduce its size (and gradually lose precision). As a consequence of the gradual precision degradation, nearby places will often (but not always) present similar prefixes. The longer a shared prefix is, the closer the two places are.
Using the area around coordinate (-5.6, 42.6) as an example — indicated by a red marker on the world map below — here is how it is encoded as a geohash. First we divide the world map with a vertical line into two halves and represent each half with a binary value: 0 (left half) and 1 (right half). Because the location is in the left half, the first bit of the geohash is a zero (0).
To improve the resolution of the location, we subdivide the remaining area again in two halves. But now we use a horizontal line for the division and also indicate each half with a binary value: 0 (bottom half) and 1 (top half). Because the location is in the top half, the second bit of the geohash is a one (1).
We then subdivide the remaining area in two halves, but again use a vertical line. Because the location is in the right half, the third bit of the geohash is also a one (1).
By iteratively subdividing the remaining area in two halves — alternating between vertical and horizontal subdivisions — we double the resolution of the geohash in each step, but also make the geohash one bit longer.
Decode a given geohash into coordinate
The decoding algorithm is outlined using the geohash ezs42
we recall from the introduction. The first step is to decode the geohash
into its decimal value. Geohashes use 32 different
characters in their symbol set: the digits 0–9
and the letters a–z, excluding a,
i, l and o (no distinction is
made between uppercase and lowercase letters). As a result, the geohash
can be interpreted as a number in a base-32 numeral system. A number
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | b | c | d | e | f | g | |
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
h | j | k | m | n | p | q | r | s | t | u | v | w | x | y | z | |
16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 |
The decimal value of geohash ezs42 with
>>> bin(14672002) '0b110111111110000010000010'
When converting a decimal value into a bitstring, we might have to add
leading zeros to the binary representation, because each symbol in a
geohash encodes for 5 bits (
If we now assume that counting starts at 0 in the left side, the even bits are taken for the longitude code (0111110000000), while the odd bits are taken for the latitude code (101111001001). Each of these bitstring can be interpreted as the instructions in a series of divisions, considering one bit at a time, again from the left to the right side. For the latitude value, the interval [-90, 90] is subdivided into two equal halves, producing two intervals: [-90, 0] and [0, 90]. Since the first bit is 1, the higher interval is chosen, and becomes the current interval. The procedure is repeated for all bits in the latitude bitstring. Finally, the latitude value is the center of the resulting interval. Longitudes are processed in an equivalent way, keeping in mind that the initial interval is [-180, 180].
For example, in the latitude bitstring 101111001001, the first bit is 1, so we know the latitude is somewhere between 0° and 90°. Without any more bits, we would guess the latitude was 45°, giving us an error of ±45°. Since more bits are available, we can continue with the next bit, and each subsequent bit halves this error. The table below shows the effect of each bit. At each stage, the relevant half of the interval is highlighted in green: a 0-bit selects the lower interval and a 1-bit selects the upper interval. The column "latitude" shows the estimation of the latitude as the mean value of the interval. Each subsequent bit makes this estimation more precise.
position | bit | min | mid | max | latitude | maximum error |
---|---|---|---|---|---|---|
0 | 1 | -90.000 | 0.000 | 90.000 | 45.000 | 45.000 |
1 | 0 | 0.000 | 45.000 | 90.000 | 22.500 | 22.500 |
2 | 1 | 0.000 | 22.500 | 45.000 | 33.750 | 11.250 |
3 | 1 | 22.500 | 33.750 | 45.000 | 39.375 | 5.625 |
4 | 1 | 33.750 | 39.375 | 45.000 | 42.188 | 2.813 |
5 | 1 | 39.375 | 42.188 | 45.000 | 43.594 | 1.406 |
6 | 0 | 42.188 | 43.594 | 45.000 | 42.891 | 0.703 |
7 | 0 | 42.188 | 42.891 | 43.594 | 42.539 | 0.352 |
8 | 1 | 42.188 | 42.539 | 42.891 | 42.715 | 0.176 |
9 | 0 | 42.539 | 42.715 | 42.891 | 42.627 | 0.088 |
10 | 0 | 42.539 | 42.627 | 42.715 | 42.583 | 0.044 |
11 | 1 | 42.539 | 42.583 | 42.627 | 42.605 | 0.022 |
The same procedure yields an estimation for the longitude (bitstring 0111110000000).
position | bit | min | mid | max | longitude | maximum error |
---|---|---|---|---|---|---|
0 | 0 | -180.000 | 0.000 | 180.000 | -90.000 | 90.000 |
1 | 1 | -180.000 | -90.000 | 0.000 | -45.000 | 45.000 |
2 | 1 | -90.000 | -45.000 | 0.000 | -22.500 | 22.500 |
3 | 1 | -45.000 | -22.500 | 0.000 | -11.250 | 11.250 |
4 | 1 | -22.5000 | -11.250 | 0.000 | -5.625 | 5.625 |
5 | 1 | -11.250 | -5.625 | 0.000 | -2.813 | 2.813 |
6 | 0 | -5.625 | -2.813 | 0.00 | -4.219 | 1.406 |
7 | 0 | -5.625 | -4.219 | -2.813 | -4.922 | 0.703 |
8 | 0 | -5.625 | -4.922 | -4.219 | -5.273 | 0.352 |
9 | 0 | -5.625 | -5.273 | -4.922 | -5.449 | 0.176 |
10 | 0 | -5.625 | -5.449 | -5.273 | -5.537 | 0.088 |
11 | 0 | -5.625 | -5.537 | -5.449 | -5.581 | 0.044 |
12 | 0 | -5.625 | -5.581 | -5.537 | -5.603 | 0.022 |
Your task:
Write a function geo2dec that takes a geohash (str) and returns its decimal value (int).
Write a function geo2bin that takes a geohash (str) and returns its bitstring representation (str).
Write a function unravel that takes a string (str) argument. If we number the characters in the given string from left to right starting at zero, the function must return a tuple (tuple) containing two strings (str), where the first string is composed from the characters at the even positions and the second string from the characters at the odd positions.
Write a function bin2coord that takes three arguments:
i) a bitstring (str), ii) the lower limit
Write a function geo2coord that takes a geohash (str)
and returns its corresponding coordinate
>>> geo2dec('ezs42') 14672002 >>> geo2dec('DRUGGED') 13684424108 >>> geo2dec('ZUR1CH') 1068205424 >>> geo2bin('ezs42') '0110111111110000010000010' >>> geo2bin('DRUGGED') '01100101111101001111011110110101100' >>> geo2bin('ZUR1CH') '111111101010111000010101110000' >>> unravel('0110111111110000010000010') ('0111110000000', '101111001001') >>> unravel('01100101111101001111011110110101100') ('010011001101110010', '10111110111101110') >>> unravel('111111101010111000010101110000') ('111111110000100', '111000100111100') >>> bin2coord('0111110000000', -180, 180) (-5.625, -5.5810546875) >>> bin2coord('101111001001', -90, 90) (42.5830078125, 42.626953125) >>> bin2coord('010011001101110010', -180, 180) (-71.91375732421875, -71.91238403320312) >>> bin2coord('10111110111101110', -90, 90) (44.27215576171875, 44.273529052734375) >>> bin2coord('111111110000100', -180, 180) (178.6376953125, 178.648681640625) >>> bin2coord('111000100111100', -90, 90) (69.23583984375, 69.2413330078125) >>> geo2coord('ezs42') (-5.60302734375, 42.60498046875) >>> geo2coord('DRUGGED') (-71.91307067871094, 44.27284240722656) >>> geo2coord('ZUR1CH') (178.6431884765625, 69.23858642578125)