Profiling Distance Functions for Geodesic Search
Computing distance on road network is not as straightforward as on a map. The Earth is actually an ellipsoid and so distance should be computed on a sphere rather than on a plane. Accounting for all the ways the Earth is neither a plane nor a sphere can be computationally expensive. I recently implemented a choice few geodesic distance function in warthog, a pathfinding library.
Searching on a (irregular) sphere
In pathfinding, one often need to use distance heuristics. That is functions able to (quickly) evaluate distances. Classic examples include Euclidean distance, for searching on a plane; or, Manhattan distance, for searching on grids. However, these functions are not accurate when it comes to estimating distance in road networks.
The Earth is an ellipsoid,^{1} an oblong object resembling an elongated sphere. For example, it is not technically correct to refer to “the radius of the Earth;” it depends where you measure.^{2} Thus, measuring using the Euclidean distance quickly accumulates error.
When computing distances, it is easier to consider a geodetic system. That is, use spherical coordinate. In this post, I will look at three distance functions, with different degrees of accuracy:
 The spherical projected distance.
 The greatcircle distance.
 The Vincenty distance.
Different distance functions
There is more than one way to compute distances on the globe. The main question boils down to: how accurate do you want it? Obviously, the more accurate the function the more expensive it is.
Some notation:
 $\Delta\phi$ is the difference between the latitudes $\phi_2  \phi_1$, in radians, sign does not matter.
 $\Delta\lambda$ is the difference between the longitudes.
 The mean latitude is: $\phi_m = \frac{\phi_1 + \phi_2}{2}$.
 $R$ is the radius of the Earth.
The code is mainly in geography.cpp, with the cosinebased haversine in haversine_heuristic.cpp and Euclidean distance in euclidean_heuristic.cpp.
Spherical projected distance
The idea is to project a sphere (the Earth) onto a plane. This function takes into account the variation in distance depending on latitude. $\begin{equation} D = R \sqrt{(\Delta\phi)^2 + (\cos(\phi_m)\Delta\lambda)^2} \end{equation}$
Greatcircle distance
The shortest distance on a plane is a straight line, on a sphere it’s a greatcircle. An example of a greatcircle on the globe is the equator. A greatcircle divides a sphere into two (equal) hemisphere. The problem of finding the distance between two points on a sphere is thus to compute the arc between these points on the greatcircle joining them.^{3} The distance is thus given by the spherical law of cosines: $\begin{equation} D = R \arccos(\sin \phi_1 \sin \phi_2 + \cos \phi_1 \cos \phi_2 \cos D\lambda) \end{equation}$
Haversine distance
For small distances and small float precision, the greatcircle function can have large rounding errors. ^{4} In this case, we can use the haversine (half of a versine) formula: $\begin{equation*} \text{hav}(\theta) = \sin^2 \left(\frac{\theta}{2} \right) = \frac{1  \cos \theta}{2} \end{equation*}$
Historically, this function was the de facto standard due to the availability of precomputed tables, making it efficient to use by hand. Computing a distance on a sphere is then: $\begin{align} D =\ & 2 R \arcsin \left( \sqrt{\sin^2 \left(\frac{D\phi}{2} \right) + \cos \phi_1 \cos \phi_2 \sin^2 \left(\frac{D\lambda}{2} \right)} \right) \\ \iff & 2 R \arcsin \left( \sqrt{\cos \left(\frac{1  D\phi}{2} \right) + \cos \phi_1 \cos \phi_2 \cos \left(\frac{1  D\lambda}{2} \right)} \right) \end{align}$
The second form uses the cosine form of the haversine to avoid the expensive square operation.
The Vincenty formula
Finally, if you want a precision down to 0.6 mm, you can call on the Vincenty formula. This is a procedure that computes distances on an ellipsoid. But, first, there is a special case when the ellipsoid is a sphere which we can use: $\begin{equation} D = R \arctan \frac{\sqrt{(\cos \phi_2 sin \Delta \lambda)^2 + (\cos \phi_1 \sin \phi_2  \sin \phi_1 \cos \phi_2 cos\Delta\phi)^2}}{\sin \phi_1 \sin \phi_2 + \cos \phi_1 \cos \phi_2 \cos\Delta\lambda} \end{equation}$
It’s already a more involved formula. I won’t go into the details on the actual Vincenty formula, the most important thing is that it is an iterative procedure. It means that, as opposed to everything before, it’s not a single operation. The (direct) Vincenty formula finds the azimuths and ellipsoidal distance between two points. It refines its approximation up to a constant – except a few cases that cannot converge.
Benchmarking
Test case
We will start by validating the code. No point in measuring performance if the results are bogus. I took the test case from movabletype.^{5} We will use two points in Australia: Flinders Peak and Buninyong.
Name  Long  Lat 

Flinders Peak  144°25′29.52440″E  37°57′03.72030″S 
Buninyong  143°55′35.38390″E  37°39′10.15610″S 
First, let us convert the degreeminutesecond notation to decimal degrees. The formula to convert is: $\begin{equation*} ddd°\ mm'\ SS.ssss'' = ddd + mm / 60 + SS.ssss / 3600 \end{equation*}$
Name  Lon  Lat 

Flinders Peak  144.424868  37.951033 
Buninyong  143.926496  37.652821 
Now we can look at the results of calling these functions:
Function  Result (km)  Difference (%) 

base  54.972271  0.000 
spherical  54.925547725412486  0.085 
greatcircle  54.925395854762677  0.085 
vincenty  54.925395854763295  0.085 
vincentyexact  54.972227314037639  0.000 
haversine  54.925395854761305  0.085 
haversinedirect  54.925395854762719  0.085 
fasthaversine  54.925395854762833  0.085 
haversineapprox  54.925717823812342  0.085 
euclidean  58.077968914900702  5.650 
I think one interesting result from this is that geodesic functions underestimate the distance while Euclidean distance overestimates. The other is that, between these two points, all functions seems to have the same error – with the exception of the exact Vincenty procedure.
Using a database
I ran some tests by collecting known cases online. This is more of a sanity check rather than some actual tests.
The main result is that the approximate functions all seem to return the exact some values. The iterative Vincenty procedure is more accurate and for more points.
Benchmarking
Now, one thing I like a bit too much is looking at assembly. I tried to engineer the code so the distance function themselves are not too tightly bound to the search code. I had a quick look at how the compiler would optimise the functions:
 spherical: projection onto the plane, less operations.
 haversine: optimised to minimise power operations^{6} and has a direct
deg_to_rad()
.  fast haversine: implemented using
sincospi
to reduce trigonometric operations.  approximate haversine: omit some trigonometric operations (
sin
, mainly) at the cost of precision.  greatcircle: similar to haversine but with the classic formula.
 vincenty: special case where the Earth is a sphere, resembles the two above.
I will not the exact Vincenty – not quite exact in this case^{7} – because it’s not a direct computation.
I counted the number of lines in the main function and its branches. This means that it is an upper bound on the number of operations executed as most are behind conditional jumps – especially for the fast haversine. I also defined a few constants to shave off one operation here and there.^{8}
Function  # lines 

Spherical  36 
Greatcircle  38 
Haversine  51 
Haversine (sin)  40 
Haversine (fast)  400+ 
Haversine (approx)  53 
Vincenty  70 
It is interesting that the sin
version compiles down to less instructions in the end. I tried fiddling with the code a bit but could not get the cos
version to bo shorter. Let’s see how they fare.
I decided to run benchmarks on a decent dataset: 24M nodes of the road network of the USA. I ran some preliminary testing on small test sets and could not get reliable results.
Function  size  Total time (ns)  Time per query (ns) 

spherical  23947347  5.14306e+08  21.477 
greatcircle  23947347  2.60185e+09  108.649 
vincenty  23947347  3.1358e+09  130.946 
vincentyexact  23947347  2.22148e+10  927.652 
haversine  23947347  1.87201e+09  78.172 
haversinesin  23947347  1.68385e+09  70.315 
fasthaversine  23947347  4.58581e+09  191.496 
haversineapprox  23947347  1.60163e+09  66.881 
euclidean  23947347  1.08925e+08  4.549 
Euclidean, having no trigonometric functions, is orders of magnitude faster than the rest. Then comes the spherical projection which is much faster than geodetic functions. I am actually surprised by how slow the greatcircle is, I was expecting something competitive with the haversine.
Finally, the last interesting bit of information from that table: tweaking the haversine one way or the other is actually slower. Simply having a wellwritten implement ion gets optimised better under the hood. It’s even more true for the fast version, which may good on GPUs but lags behind on CPUs. As for the approximate form, I don’t think a 5% improvement is worth it.
I had a quick look at the perf
report to see where we were spending time. Not surprisingly, we get the same distribution. Below is a summary of the data collected, stopping at internal functions. The way to read it is by comparing children’s time with their parents. E.g., if warthog::geo::exact_distance
took 26.35% of the total running time and its parent, run_distance
took 44.39%, it means exact_distance
took 59.36% of the profiling time.
Note: the distance functions are so fast that running the profiling code only takes 40% of total execution time, the rest is parsing the data.
44.39%run_distance

26.35%warthog::geo::exact_distance
 11.07%__sincos
 6.63%__atan2
 1.27%__ieee754_asin_fma
 1.23%__atan_fma
 0.72%__tan_fma

4.46%warthog::geo::fast_haversine
 0.79%__fma_fma3
 0.62%fma@plt
 0.50%__ieee754_asin_fma

3.55%warthog::geo::vincenty_distance
 2.09%__sincos
 0.99%__atan_fma

2.87%warthog::geo::great_circle_distance
 1.40%__sincos
 1.04%__ieee754_acos_fma

2.13%warthog::haversine_heuristic::h
 1.24%__cos_fma
 0.55%__ieee754_asin_fma

1.82%warthog::geo::haversine_approx
 1.30%__atan2

1.72%warthog::geo::haversine
0.62%__cos_fma
As a comparison, the Euclidean distance ran for 0.05% of the total time.
Now, let’s have a look at their precision. I did not manage to plot it because there are 58M datapoints. The fivenumber summary is below.
Function  mean  std  min  25%  50%  75%  max 

spherical  6.32e04  1.68e03  4.97e01  7.59e04  6.29e04  4.94e04  2.60e01 
greatcircle  6.31e04  2.33e03  8.99e01  7.59e04  6.29e04  4.94e04  1.00e+00 
haversine  6.28e04  1.42e03  4.09e01  7.59e04  6.29e04  4.94e04  3.68e01 
haversinesin  6.32e04  1.68e03  4.97e01  7.59e04  6.29e04  4.94e04  2.60e01 
fasthaversine  6.32e04  1.68e03  4.97e01  7.59e04  6.29e04  4.94e04  2.60e01 
haversineapprox  6.32e04  1.68e03  4.97e01  7.59e04  6.29e04  4.94e04  2.60e01 
vincenty  6.32e04  1.68e03  4.97e01  7.59e04  6.29e04  4.94e04  2.60e01 
vincentyexact  9.39e04  2.48e03  4.95e01  2.70e03  9.86e04  6.53e04  2.57e01 
Interestingly, the exact procedure seems to have the same results as the approximate ones. I think this is due to the format:
 DIMACS coordinates are truncated to six significant digits then converted to
int
.  We need to convert them back to double again.
 The distances are also encoded as integers with a 0.1 m resolution.
So we may be using functions with too high a precision to matter. Plus, all the rounding and converting may already be enough to shoot down the precision.
Conclusion
Right now, I feel like I want a better dataset to try the functions. The precision results are underwhelming and point to the format. However, we got some good insights on the speed of the different distance functions.
There is still one thing we do not take into account: elevation. A PhD candidate in my lab has very detailed maps which he uses to route electric vehicles. In this case, elevation matters. For example, electric vehicles can use regenerative breaking downhill. The longest path achievable on one charge depends on elevation changes.
Footnotes
An “oblate spheroid.” ↩
The Earth’s physical radius varies between 6,378 and 6,356 km, a 22 km difference. ↩
Which is unique if the points are not antipodal. ↩
But at most 0.5% with 64 bit floating points. ↩
I also followed their implementation. ↩
Although, from looking at the output, most modern compilers can inline power operations well. ↩
Or rather, exact in a single datum: WSG84; and missing some convergence/reciprocal checks. ↩
For example:
2 * EARTH_RADIUS
. ↩