Showing posts with label location tracking. Show all posts

Wednesday, April 27, 2011

A couple of discussions on location privacy and the iPhone

So my recent posts analyzing the iPhone location data log have gained a lot of traffic and attention over the past few days, from places including the Toronto Star, the Wall Street Journal, TUAW, PC World, MacDailyNews, Apfeltalk (German), Cisco, Pete Warden at O'Reilly, Business Insider, and more.

This led to me being invited to participate in a discussion on the Brian Lehrer Show yesterday on radio WNYC, the NPR affiliate in New York, together with Jennifer Valentino-Devries of the Wall Street Journal. We had a good sensible discussion, in contrast with a lot of the hysterical reporting that has been going on.

Today I will be taking part in a longer discussion on the KQED Forum show in San Francisco. Also participating will be Congresswoman Jackie Speier, Jim Dempsey from the Center for Democracy and Technology, and Kara Swisher of All Things Digital. This will be at 10am mountain time, and there should be a recording here sometime after that.

Apple issues Q&A on "Locationgate", and addresses key issues

Apple rather belatedly issued a Q&A on the whole "Locationgate" saga. This confirms what I said about the data being a cache of cell tower and WiFi locations. The fact that this was kept for up to a year was a bug. Within the next few weeks they will reduce this to 7 days, they will not back up the cache any longer, and they will turn off the cache when you turn location services off, which addresses the issue reported by the Wall Street Journal and widely re-reported. These are all good actions to take, and address the key issues in my opinion. It does reinforce the importance of developers being careful about location security, and Apple was slack in this case, even though the potential risks were much less dire than widely reported.

Note that in the short term if you are concerned, you can encrypt your iPhone database backup just by checking a box on the front page in iTunes (after plugging in your iPhone). If you do this, the current location log cannot be accessed by someone who hacks into your computer.

Sunday, April 24, 2011

The scoop: Apple's iPhone is NOT storing your accurate location, and NOT storing history

The Summary
So in my previous two posts I discussed how the data I was seeing in my iPhone location logs was actually not very accurate, and certainly didn't reveal where I lived or worked or had stayed on my travels - beyond showing the cities I had been to, including general areas I had visited, as well as some I hadn't. There had been some discussion that the data appeared to be, in a number of cases, the location of cell towers you had been in communication with, although in some cases locations were a long way from where you had been.

The quick summary: I believe I have confirmed that Apple is not storing your location, but the (actual or estimated) location of cell towers (and WiFi access points) that are close to you, to help locate you as you move (these are not necessarily towers that you have been in communication with). In the data I have examined there is nothing that is based on the accurate location of the iPhone. For a good example, see my previous post showing the location of cell equipment in Coors Field baseball stadium, and not revealing the location of my home which is very close to there. In my opinion, if Apple was storing this data in order to know where you had been, they would be storing different, more accurate location data that they have access to.

And, importantly, they are not storing history - the only thing that can be found from the files is when you last visited a general area, not if you made repeat visits. This is especially important as it means that many of the concerns expressed about this data are simply not valid: it cannot be used to determine where you live, or work, or go to school, or who your doctor is.

Here is a report of what Al Franken said:
Sen. Al Franken, a Minnesota Democrat, said it raises “serious privacy concerns,” especially for children using the devices, because “anyone who gains access to this single file could likely determine the location of a user’s home, the businesses he frequents, the doctors he visits, the schools his children attend and the trips he has taken — over the past months or even a year.”
The only part of this that is correct is that the data will show what cities you've visited, with some indication of which parts of a city you may have visited, though nothing definite - there will be records in areas you didn't visit. And it doesn't show repeated visits to the same location, only the last one.

Update: see below for a very interesting comment from "Anonymous", who includes a link to a document submitted by Apple to Congress in July 2010. This includes the following:
"When a customer requests current location information ... Apple will retrieve known locations for nearby cell towers and Wi-Fi access points from its proprietary database and transmit the data back to the device" ... "The device uses the information, along with GPS coordinates (if available), to determine its actual location. Information about the device's location is not transmitted to Apple, Skyhook or Google. Nor is it transmitted to any third-party application provider, unless the customer expressly consents". 
The data under discussion in this whole debate is clearly (in my opinion) a cache of the data mentioned here of nearby cell towers and Wi-Fi access points. I guess the remaining valid concern is that this cache is not stored as securely as it could be, and a fairly large amount of data is stored in the cache. But still this data provides only relatively coarse information as discussed here, and is stored only on the user's own computer, so the risks are relatively minor compared to many of the more dramatic scenarios that have been raised.

Update April 27: Apple has issued a Q&A document about all this, which confirms the conclusions I had drawn, and talks about changes they will make. See my thoughts here.

Read on to find out how I reached these conclusions.

The details
Last night someone called Jude commented on my last post, saying:
My Guess?

It's not a list of cell phone locations that you've been to, but the opposite, a list of cell phone locations near you downloaded to the iPhone from Apple in case you move into range of one of them. i.e. At a guess what is happening is location services identifies a cell tower and asks for its location, and is replied to with the list of locations that contains that cell tower, that list is then cached so that it does not need to be requested again.

Of course, this is only a guess based on the wide range of addresses people are seeing and how its near to, but not exactly where, the people have traveled.
Good thinking Jude! I thought this could explain a lot, so I investigated further. First I looked at some data from my fairly recent New York trip. I looked at the timestamps on some locations and did a query to display all the locations with the same timestamp. I found out that in general, quite a number of records shared the same timestamp, and they would be clustered in the same area. For example, this screen shot shows a set of records that were all loaded at exactly the same time:
Screen shot 2011-04-24 at 7.25.30 AM
This cluster of points is some way above where I drove: I was driving along the Long Island Expressway, going east from LaGuardia Airport. The timestamp appears to be in seconds and has 7 decimal places, so it is apparent that this set of data must have been downloaded in a single transaction; it was not obtained by communicating with cell towers at each of these locations independently. It seems reasonable to assume that this data was downloaded to help locate me in the event that I drove into this area (which I didn't). You can observe similar clusters by clicking a dot at random, copying the timestamp, and running a filter in Google Fusion Tables to display all dots with the same timestamp.
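The same check can be run directly against the data rather than by clicking around a map. Here is a minimal sketch of that query, using an in-memory SQLite table. The column names (Timestamp, Latitude, Longitude) and the sample rows are assumptions made up for illustration, not values from my logs.

```python
import sqlite3

# Hypothetical sample rows: (Timestamp, Latitude, Longitude).
# Column names and values are assumed for illustration only.
SAMPLE_ROWS = [
    (325000000.1234567, 40.790, -73.520),
    (325000000.1234567, 40.795, -73.510),
    (325000000.1234567, 40.800, -73.515),
    (325000450.7654321, 40.760, -73.400),
]


def build_sample_db(rows):
    """Create an in-memory table shaped like the one discussed above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CellLocation (Timestamp REAL, Latitude REAL, Longitude REAL)"
    )
    conn.executemany("INSERT INTO CellLocation VALUES (?, ?, ?)", rows)
    return conn


def batches_by_timestamp(conn, min_size=2):
    """Return (timestamp, record_count) for timestamps shared by several records."""
    query = """
        SELECT Timestamp, COUNT(*)
        FROM CellLocation
        GROUP BY Timestamp
        HAVING COUNT(*) >= ?
        ORDER BY COUNT(*) DESC
    """
    return conn.execute(query, (min_size,)).fetchall()


def records_at(conn, timestamp):
    """Return all (latitude, longitude) pairs that share one exact timestamp."""
    query = (
        "SELECT Latitude, Longitude FROM CellLocation "
        "WHERE Timestamp = ? ORDER BY Latitude"
    )
    return conn.execute(query, (timestamp,)).fetchall()


conn = build_sample_db(SAMPLE_ROWS)
for ts, count in batches_by_timestamp(conn):
    print(ts, count, records_at(conn, ts))
```

A timestamp shared by many records, all in one small area, is what you would expect from a single download rather than from separate contacts with each tower.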

What I really wanted to do now was to animate my data, to more easily visualize what was happening. I couldn't figure out an easy way to do this in Google Fusion Tables - although it has some capability for this, it wasn't recognizing the timestamp field as a date-time. So I went to look at the data that Sean Gorman had posted of his logs at GeoCommons (my original file had been too large to visualize there without me doing a little more work). GeoCommons has a cool animation capability, which you can try out on Sean's map by dragging the sliders at the bottom left.
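One way around a tool not recognizing the timestamp would be to convert it to a standard date-time string before uploading. The sketch below assumes the value counts seconds from January 1, 2001 UTC, which is the reference date used by Apple's date APIs. I have not confirmed that this file uses that convention, so treat the epoch as an assumption and sanity check the output against a trip whose date you know.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps count seconds from 2001-01-01 00:00:00 UTC.
# This is not confirmed for the location log; adjust if dates look wrong.
ASSUMED_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def to_datetime(raw_seconds):
    """Convert a raw timestamp (number or numeric string) to a UTC datetime."""
    return ASSUMED_EPOCH + timedelta(seconds=float(raw_seconds))


def to_iso(raw_seconds):
    """Format a raw timestamp as an ISO 8601 string, truncated to whole seconds."""
    return to_datetime(raw_seconds).isoformat(timespec="seconds")


# Accepts strings too, since exported values are often text.
print(to_iso("86400.5"))
```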

I found something really interesting when I zoomed in around the geoIQ office in Arlington, where Sean works. This screen shot shows that between November 11, 2010 and April 20, 2011, there is no record of Sean being at his office.
Screen shot 2011-04-24 at 8.12.15 AM
Now I know that Sean likes to escape for a spot of skiing in Colorado now and then, but that's a pretty long absence for a company President :) ! And I know I have met with him in the office during that time period.

If you drag the time slider a little further, then at the same instant, about 20 more locations appear on the map, covering a general area around the office, roughly half a mile square:
Screen shot 2011-04-24 at 8.12.31 AM
So from this data I can tell that Sean was somewhere in the general area of this half mile square (not necessarily inside it) on April 20. I know nothing about whether he was there before that, and I don't know anything about exactly where he went.

So, this data stored in the iPhone logs is much less revealing than it may initially seem. At a quick glance it does look like it is recording your location history, and I think that Pete Warden and Alasdair Allan were quite right to raise the concerns that they did. It takes some digging in the data to realize that the concerns are not nearly as bad as they appeared at first sight. By publicizing it as they did, and providing their tools and documentation on how to examine the data, they made it easy for others like myself, Sean Gorman and Will Clarke to analyze the data and figure out more about what is going on.

It's still not clear exactly what the data is for, but my guess, as Jude suggested, is that it is to aid in fast location determination. Once the iPhone figures out that you're in an area, it downloads data for surrounding cell towers (and WiFi hotspots, a detail I haven't gone into here, but the data is available for those too, as discussed in my previous post), so it can quickly locate you as you move around that area. (Update: see the first comment below, and my addition to the initial summary, which reference a document from Apple that confirms that this is the case.)

So to summarize again, there are still some concerns with this data - it does give an approximate indication of places you've been, but not good enough to identify specific buildings or businesses. It doesn't record history - there is no way to tell if you've visited a location multiple times, you can just tell the last time you visited a general area (though there might be clues about multiple visits - for example data showing you visited a neighboring area on a different date, but nothing definitive or detailed about repeat visits). But it definitely doesn't reveal the sort of detailed information that many people have been concerned about.
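To see why a cache like this would not hold history, here is a toy model. It assumes each tower gets one entry, and that a fresh download overwrites the stored timestamp. That is my reading of the behavior in the data, not Apple's actual implementation, and the class and tower names are invented for the example.

```python
class TowerCache:
    """Toy cache keyed by a tower identifier; each entry keeps one timestamp."""

    def __init__(self):
        self._entries = {}

    def store_batch(self, tower_ids, timestamp):
        """Record a downloaded batch; existing entries are overwritten."""
        for tower_id in tower_ids:
            self._entries[tower_id] = timestamp

    def last_seen(self, tower_id):
        """Return the single stored timestamp for a tower, or None."""
        return self._entries.get(tower_id)

    def __len__(self):
        return len(self._entries)


cache = TowerCache()
cache.store_batch(["tower-a", "tower-b"], timestamp=100.0)  # first visit
cache.store_batch(["tower-b", "tower-c"], timestamp=250.0)  # later visit nearby

# tower-b was downloaded twice, but only the later time survives.
print(cache.last_seen("tower-a"), cache.last_seen("tower-b"), len(cache))
```

In this model the earlier visit near tower-b leaves no trace, although tower-a still carries the older date, which is the sort of indirect clue about multiple visits mentioned above.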

Saturday, April 23, 2011

More on Apple recording your iPhone location history

In my previous post I discussed how the location data being recorded from my iPhone actually wasn't very accurate, and certainly not accurate enough to tell where I live or work (based on the data I've examined so far, which is in a table called CellLocation in the iPhone backup, and is the data discussed by Pete Warden and displayed by his iPhoneTracker app, which is what I used for the visualizations in my previous post). Pete's app aggregated data to a regular grid, partly to provide additional security.
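For anyone curious what aggregating to a regular grid means in practice, here is a rough sketch. The cell size and the coordinates are made up, and this is not how iPhoneTracker itself is written, just the general idea of snapping points to cells and counting them.

```python
import math
from collections import Counter


def snap_to_grid(lat, lon, cell_degrees=0.01):
    """Return integer (row, column) indices of the grid cell containing a point."""
    return (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))


def aggregate(points, cell_degrees=0.01):
    """Count how many points fall in each grid cell."""
    return Counter(snap_to_grid(lat, lon, cell_degrees) for lat, lon in points)


# Hypothetical points, roughly in downtown Denver.
points = [
    (39.7561, -104.9942),
    (39.7569, -104.9948),
    (39.7402, -104.9903),
]
for cell, count in aggregate(points).items():
    print(cell, count)
```

Once points are reduced to cell counts, the exact position of each record is lost, which is why the gridded maps show evenly spaced dots of different sizes.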

However, I was sufficiently intrigued to follow Pete's instructions to get at the raw data. My investigations with this reinforced the conclusion of the previous post, that the data does not accurately represent your location. But it did show up some interesting new patterns. I loaded the data into Google Fusion Tables and have made it public; you can view it here (and feel free to play around with it).

Here is an interesting map of downtown Denver, where I live.

This shows all the raw point data, with no aggregation or changes. There are actually no dots at all in the block where I live. However, there is a noticeable cluster in Coors Field, the Colorado Rockies baseball stadium which is 3 blocks away from where I live. I haven't been to the Rockies stadium over the time period that this data was recorded. There's also a strong cluster in Mile High Stadium, home of the Denver Broncos.

I would assume that there is additional cell phone infrastructure in these stadiums, to help cope with the heavy concentration of people. A quick Google search found this article about AT&T infrastructure at Coors Field:
AT&T at Coors Field
This reinforces the notion that at least some of these locations are the locations of cell equipment that your phone is communicating with. But I'm not sure that's the whole story.

Here's a map of Cropston, where I spent most of my time on my last two visits to England. It's a small village in a fairly rural area.

Here there are no locations shown in the village itself where I spent most of my time. A lot of the locations are clustered in towns or along streets, but some seem to be more in the middle of nowhere. Hard to draw any definite conclusions.

I just received a suggestion from Jonathan Barnes, via Pete Warden, that the HorizontalAccuracy field may be significant, with lower values indicating accurate locations via GPS. However, I did a quick test: for example, this map filters to only show records with this field set to 500.0, the minimum value I found from a quick skim (Fusion Tables seems to treat this as a string rather than a number). While this reduces the number of records, it doesn't offer any noticeable change in accuracy - it still includes all the readings from Coors Field and Mile High Stadium, where I haven't actually been.
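When a field like this is held as text, comparisons go wrong (as text, "1500.0" sorts before "500.0"), so it is worth converting to a number before filtering. A small sketch, with invented records and only the HorizontalAccuracy name taken from the table:

```python
def parse_accuracy(value):
    """Convert a text accuracy value to a float, or None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def filter_by_accuracy(records, max_accuracy):
    """Keep records whose numeric HorizontalAccuracy is at most max_accuracy."""
    kept = []
    for record in records:
        accuracy = parse_accuracy(record.get("HorizontalAccuracy"))
        if accuracy is not None and accuracy <= max_accuracy:
            kept.append(record)
    return kept


# Hypothetical records; the "id" key is only there to tell them apart.
records = [
    {"id": 1, "HorizontalAccuracy": "500.0"},
    {"id": 2, "HorizontalAccuracy": "1500.0"},
    {"id": 3, "HorizontalAccuracy": "80000.0"},
    {"id": 4, "HorizontalAccuracy": ""},
]
print([r["id"] for r in filter_by_accuracy(records, 500.0)])
```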

Pete also pointed me at another table in the backup called WifiLocation, which in my case was about 5 times larger than CellLocation. I have loaded this to Fusion Tables here. One interesting thing about this table is that data only shows up in North America (with one random exception in Munich, where I haven't been recently). It seems a little more focused on areas I've been to, but no more revealing in terms of showing specific locations where I've spent time.

As I said, feel free to play with the tables I uploaded, and let me know if you find anything interesting! But my conclusion remains that this data doesn't reveal where you've been with any degree of accuracy.

Update: see my latest post where I conclude that this is data being downloaded from Apple, rather than uploaded, and that detailed history is not stored - thanks to Jude in the comment below for his suggestion about this.

So actually, Apple isn't recording your (accurate) iPhone location

So over the past couple of days there has been mass hysteria, questions in Congress, etc, over the fact that Apple is apparently recording all the locations you've been to with your iPhone without telling you, and storing it without encryption. The news was broken by my friend Pete Warden at Where 2.0 last week and has escalated rapidly since then. As someone who publishes their location anyway (you can see where I am right now by checking the right hand panel on my blog) I was less concerned about this than many, though I agree that Apple should make it clear that they are recording this information and give you the option to turn it off, plus it should be stored more securely.

However, yesterday Sean Gorman posted that he had analyzed his data, and the interesting thing is that it wasn't accurate - it showed the general areas he'd been to, but didn't reveal where he lived or where he worked. And then I also found this post by Will Clarke, followed by this one, which also conclude that whatever the data is, it isn't your accurate location (though I think Will prematurely concludes that it is cell tower locations - Sean's analysis suggests that isn't the case, though it seems it may well be related to this).

I just had a good chat on the phone with Pete about these posts, and about my findings which I'll get onto in a moment, which similarly conclude that whatever is being tracked, it isn't your accurate location. Pete said that their conclusions were similar, but also that he didn't think it was simply cell towers. I know that my iPhone knows my location much more accurately than the locations that I see in the data I've looked at. For me, as for Sean, there was no cluster of points either at my home or my office. Pete asked me if I'm on WiFi rather than 3G at home and at work, and the answer is yes, so there may be some clue there.

But the main point of these posts, and mine, is that this data does NOT indicate where you live, where you work or any exact locations you've been to. This is not reflected in most of the reporting you see about the topic.

I thought I'd share some screen shots of maps that I got, which I actually thought were cool :). Since I travel quite a bit, I have a few interesting examples which might give some clues as to what this location data actually does represent. The detailed (larger scale) maps here show a grid of dots, which is something introduced by Pete's map display tool rather than how the underlying data is. I will try to play around a bit more to get at the raw data, but thought I would share these initial findings first.

So to start with, here's an overview of my world travels of the past few months, which seems pretty accurate, and goes back to at least September:
01 World

Here's a view zoomed in on the US. The interesting thing here is that New York has the largest bubble over it, but I only spent two days there on a recent trip (1 day in Manhattan, 1 day on Long Island). Denver where I live has a much smaller blob.
02 North America

Here's a map of Colorado - there seem to be quite a few outliers here on the south side of the map - I think that the closest I've been to these in recent months is Keystone, where you see a cluster of dots. Some of these dots are probably 50 miles away from where I was.
03 Colorado

Zooming in on Denver, you see a lot of activity. I'm sure I haven't covered Denver quite as comprehensively as the dots here suggest.
04 Denver

In this map of downtown Denver you see the gridding which somewhat obscures the underlying data. However, the largest dot is some way away from my home (which as I think everyone knows is above the famous Wynkoop Brewing Company), and the dots are fairly evenly spread - these certainly do not indicate where I spend most of my time downtown.
05 Denver downtown

Similarly, my office (where I usually work a couple of days a week, the other days I work at home) does not jump out on this map of the Denver Tech Center (as you can find out from our web site, the Ubisense office is at 5445 DTC Parkway).
06 Denver Tech Center

On to my UK travels - I think the data includes two trips there. There seem to be quite a few outliers here also, and some fairly large clusters in places I just passed through on the train. I spent most of my time on these trips at my mother's house in Cropston, just north of Leicester, which isn't reflected in the data. I spent some time in London, but it has a disproportionate representation on the map (as New York did in the US map).
07 UK

Zooming in to the Leicester area, you can see Cropston just to the north of the city, which is where I spent nearly all my time, and this has no readings. I didn't travel around Leicester nearly as much as the dots would suggest. So this map is very misleading in terms of where I spent my time in this area.
08 Leicester

This map of Zurich is interesting: I connected through Zurich airport in November, en route to Denmark. I spent maybe 3 hours in the airport and didn't leave it, but you can see lots of outliers, which are up to about 20 miles away.
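Distances like this can be checked rather than eyeballed. Here is a sketch using the standard haversine great-circle formula, with an approximate reference point and two invented records. The coordinates and the 10 mile threshold are assumptions for the example only.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def outliers(reference, points, threshold_miles):
    """Return (point, distance) pairs farther than threshold_miles from reference."""
    found = []
    for lat, lon in points:
        distance = haversine_miles(reference[0], reference[1], lat, lon)
        if distance > threshold_miles:
            found.append(((lat, lon), distance))
    return found


# Approximate, hypothetical coordinates near an airport.
reference = (47.46, 8.55)
points = [(47.47, 8.56), (47.70, 8.55)]
for point, distance in outliers(reference, points, threshold_miles=10):
    print(point, round(distance, 1))
```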
09 Zurich Airport

Here's a map of my trip to Denmark, where I spent time in Copenhagen, Aarhus and Naestved. The interesting thing on this one is that I just drove straight across the island of Funen (Fyn) in the middle of the map, but you can see quite a scatter of readings on either side of the road, especially to the south on the east side of the island.
10 Denmark

Almost at the end ... I included this map of Paris as I thought it was interesting that we traveled from London to Paris and back on the Eurostar train, but no points show up along the route. There's an odd horizontal line of locations to the north of Paris, but nothing apart from that between Paris and London.
11 Paris

And finally an example from Sydney. This shows a disproportionate number of readings at the airport in the south, where I just arrived and left but didn't spend any time. It doesn't show that I spent a good amount of time downtown, and I gave a talk in Parramatta where there is just one isolated dot. I stayed with friends north of Sydney but again you can't tell where.
12 Sydney

While I don't want to be an apologist for Apple, and what they are doing here is careless at best, my general conclusion is that this is likely something unintentional, similar to the Google Street View WiFi data fiasco. If Apple wanted to track your location history, why wouldn't they use your accurate location, which I know my phone knows much more accurately than is shown in the data in these files?

The interesting question for us geo-geeks is exactly what the location data is - something related to cell towers seems plausible. I will try to poke around in the raw data a little more. Since I have a few interesting example cases, I am happy to share my data if anyone wants to look at it. Pete just tweeted that there is another table with WiFi locations, which would be an interesting thing to explore.

Update: I've done a new post which includes maps with the raw data, using Google Fusion Tables. It doesn't change the conclusion that the data doesn't accurately represent your actual location, but does show some interesting new patterns.

Monday, May 10, 2010

Location based art: Audio Graffiti

One of the cool things about the Ubisense Real Time Location System (RTLS) is that customers come up with all sorts of interesting applications that we would never have thought of. We have had several artists doing cool things with the system - check out the video below showing "Audio Graffiti" by Zack Settel and Mike Wozniewski, powered by Ubisense. Users can "tag" or "spray" sounds at a location, and other users hear these as they move around the space. Click through on the video for more information.

Audio Graffiti no. 2 from Mike Wozniewski on Vimeo.

Thursday, April 24, 2008

Ubisense location tracking to be featured at Location Intelligence

My former company Ubisense (which I still hold some equity in) is going to have its indoor location tracking technology featured at the upcoming Location Intelligence conference. I'm looking forward to seeing the latest iteration of the technology, which has come along significantly since I left there in late 2005 to join Intergraph. The technology will feature heavily in the opening "speed networking session", and all conference attendees will be given tags which will track their location throughout the conference. Ubisense uses ultrawideband networking, which is generally accurate to within a foot or so (though that depends on a variety of factors). When I worked in this space, I was really surprised to find out how hard it was to do accurate tracking indoors. The difficulty is that whatever type of sensing technology you use, signals tend to reflect off walls, floors and ceilings, and direct signals are frequently blocked by furniture, people or other obstacles. This means it is very easy to get false readings which can result in serious errors in location calculation.

There are a variety of technologies that people are using to try to tackle the problem. Passive RFID really just measures when a tag is within a given distance of a sensor, so it can be used to track whether someone has entered or left a given room or area of a building, but can't give a more precise location (unless you have a very dense set of readers). WiFi is another option which has a pretty coarse accuracy - typically tens of feet, so not enough to reliably identify which room someone is in, and certainly not accurate enough to measure more specific interactions between people. As I mentioned above, ultrawideband is generally accurate to within a foot or so, which puts it at the higher end of the range in terms of accuracy, with relatively low infrastructure needs compared to other high accuracy solutions like ultrasonic or laser sensors.

For several years now I have been thinking that the precision indoor location tracking market is about to take off, and it hasn't quite done so. But based on some of the signs I am hearing about from my friends at Ubisense, I think it may finally be reaching that point. I will write more about how it all works out at the show next week, and also plan a few more updates on what Ubisense is up to.