I love Websites that promise information about IP addresses. Feeling like an international operative, I'm able to ferret out all sorts of details from a simple binary number. There is one thing, though: I wish the location associated with the IP address was a bit more accurate... well, a lot more accurate, actually.
It appears my wish has come true. Establishing the location of an IP address to within 700 meters is now possible. A Northwestern University research team has announced: "We demonstrate that our system can geo-locate IP addresses with a median error distance of 690 meters in an academic environment."
Let's get to how they do it. The process involves three steps, or tiers, of increasing granularity. First, using Ping servers, Tier 1 converts response times into geographical distances, which allows the researchers to roughly estimate the location of the Target (the unknown location).
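To make the Tier 1 idea concrete, here is a minimal sketch of turning a ping round-trip time into a distance ceiling. The function name and the 2/3 speed-of-light factor for fiber are my assumptions for illustration; the conversion factor the researchers actually used is not stated here.

```python
# Speed of light in a vacuum, expressed in kilometers per millisecond.
SPEED_OF_LIGHT_KM_PER_MS = 299.792458

# Assumption: signals in optical fiber travel at roughly 2/3 of c.
# The research team's own conversion factor may differ.
FIBER_FACTOR = 2 / 3


def rtt_to_max_distance_km(rtt_ms, factor=FIBER_FACTOR):
    """Return the farthest the Target could be, given a round-trip time.

    Hypothetical helper: the signal travels out and back, so only half
    of the round-trip time counts toward the one-way distance.
    """
    if rtt_ms < 0:
        raise ValueError("round-trip time cannot be negative")
    one_way_ms = rtt_ms / 2
    return one_way_ms * SPEED_OF_LIGHT_KM_PER_MS * factor


if __name__ == "__main__":
    # A 10 ms ping bounds the Target to within roughly 1,000 km.
    print(round(rtt_to_max_distance_km(10.0), 1))
```

Because queuing and routing detours only add delay, the result is best read as an upper bound: the Target is somewhere inside that radius, not necessarily near its edge.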
Next, the researchers pick a Landmark (a known IP address and physical location) that is near the Target. The slide below is a representation of how Tier 2 works.
In Tier 2, a Landmark is chosen near the Target.
Diagram courtesy of Northwestern University.
Two servers, V1 and V2, send Traceroute traffic to the Landmark and Target. D1 and D2 are the routes taken from server V1. D3 and D4 are the routes taken from server V2. Looking at the diagram, one can see that D1+D2 is less than D3+D4. So the D1+D2 measurements, being the smaller total, would be used to estimate the geographical distance from the Landmark to the Target.
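The selection step just described can be sketched in a few lines. This is only an illustration: the function name, the dictionary layout, and the delay values are hypothetical, and the real system surely handles far more measurements than two.

```python
def smallest_relative_delay(measurements):
    """Pick the server whose two routes add up to the smallest delay.

    measurements maps a server name to a pair of delays in milliseconds:
    (route toward the Landmark, route toward the Target). For V1 that
    pair stands in for D1 and D2; for V2, D3 and D4.
    Returns (server_name, combined_delay_ms).
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    best = min(measurements, key=lambda name: sum(measurements[name]))
    return best, sum(measurements[best])


if __name__ == "__main__":
    # Made-up delays purely for illustration.
    sample = {
        "V1": (1.5, 0.5),  # D1, D2
        "V2": (2.0, 1.5),  # D3, D4
    }
    print(smallest_relative_delay(sample))
```

Taking the minimum makes sense because every measured total overstates the true Landmark-to-Target delay by some amount, so the smallest one is the tightest estimate available.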
Tier 3 repeats the process in Tier 2 but uses Landmarks located within the D1+D2 geographical area, making it more accurate yet.
One thing I did not understand is how Landmarks are chosen. Professor Aleksandar Kuzmanovic, a member of the research team, explained in an email exchange with me:
Kassner: The key appears to be establishing a database of known reference locations or Landmarks. How is that accomplished?
Kuzmanovic: Basically, we crawl the Web and scrape available geographical locations from Websites. To verify the accuracy of our system, we need some ground-truth, which we establish in three ways:
- PlanetLab Dataset: PlanetLab is an academic testbed consisting of hundreds of nodes worldwide. Because the locations of these nodes are well known, we used this data set. Because other researchers use this data set to estimate the accuracy of their systems, we were able to compare our results to prior work.
- Residential Dataset: We designed a Website where we enabled people we know all over the US to access the Website and leave their geographic location and IP address.
- Online Map Dataset: We obtained a large-scale query trace from an online-mapping service. This dataset contains three months of users' search logs for driving directions. From this data, we extracted another set of IPs with known locations.
Once Landmarks are pinned down, all attention is focused on how fast digital traffic travels from one point to another. The variables are mind-boggling, one being how fast digital traffic travels through fiber optic cables. The fact that we live on a sphere is also significant. On the surface of a sphere, the shortest path between two points is not a straight line but a segment of a great circle. Why is this important? The timing responses will be different, depending on whether the cable follows a great-circle route or not.
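For the curious, great-circle distance is easy to compute with the standard haversine formula. The sketch below assumes a perfectly spherical Earth with a mean radius of 6,371 km, which is an approximation; the function name is mine, not the researchers'.

```python
import math

# Assumption: treat the Earth as a sphere with its mean radius.
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)

    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point rounding errors.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


if __name__ == "__main__":
    # A quarter of the way around the equator.
    print(round(great_circle_km(0.0, 0.0, 0.0, 90.0), 1))
```

A cable that wanders away from the great-circle route is longer than this figure, so its timing responses will suggest a greater distance than the map does.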
I asked Dr. Kuzmanovic if the Northwestern team had any additional tricks. He responded:
Our contribution lies in showing that relative network distances (as opposed to absolute network distances) are the key to achieving accurate geo-location results.
I interpret this to mean latency and different paths are not an issue. That's because their approach runs timing samples in the immediate vicinity.
Since this approach uses IP addresses, something every device on the Internet has, GPS and other location-tracking techniques being talked about now are not required. Anyone with the appropriate know-how can locate you using this method.
It will be interesting to see what unfolds from this research and whether it will run headlong into the controversy surrounding Do Not Track.
— Michael Kassner is a writer and consultant specializing in information security.