Just like the perennial discussion on location-based services and Apple’s ability to track you, the question of accessing an iOS device’s data when the device is locked seems to come up every few months. This time around, the discussion was inspired by a CNET article, with the sensational title “Apple deluged by police demands to decrypt iPhones.”
The article seemed to be built around a single paragraph in a blurry copy of a search warrant affidavit from ATF, which stated that the writer “contacted Apple” and was told by “an employee [...] who is part of their Apple Litigation Group” that Apple “has the capabilities to bypass the security software” on the iPhone.
That’s it. That’s all we know. An ATF agent reports having talked to a single person at Apple, who told him that they can “bypass the security software” on iOS devices. And from that tenuous hold, the Twitters exploded with “See! I TOLD you Apple had a back door!” and other related Fear, Uncertainty, and Doom.
But is any of it warranted? What could Apple really be doing, and is that any different from what we already know? Let’s review what we know, and don’t know, about iOS security, passcodes, and encryption.
Filesystem Access via Boot Images
iOS devices will only boot from a drive with a boot image properly signed by Apple. This image is (usually) on the device itself, but the Device Firmware Update (DFU) mode can allow the device to boot from an external drive via USB. For older devices, an bug in the bootrom allowed unsigned drives to be booted. That’s since been fixed, but it’s always been an “open secret” that Apple could probably still boot from DFU (since, obviously, they would be able to create a signed external boot image).
Once booted off an external drive, the internal device can be mounted, and unprotected information read. Most built-in Apple apps do not provide extra encryption (to my knowledge, only the Mail application separately encrypts its data at this time). One reason is that some data needs to be accessible while the device is locked: Inbound SMS messages and phone call information have to be written to the disk, the Contacts list needs to be available for displaying the name of inbound calls (and for making outbound calls), etc. So there’s a fair amount of data that can be retrieved at this stage.
So far, we’ve simply replicated what commercial forensics providers do: Boot off an external drive, and find “easily extracted” data. The difference is that forensics tools take advantage of the DFU bug (and thus can’t extract data from iPhone 4S or 5), while Apple doesn’t need any stinking bugs and can do this magic with any device.
“But wait, iOS devices also have crypto! Crypto that uses ALL the bits! And this article PROVES that Apple can bypass that! They must have a back door.”
Well, yes, there are multiple layers of cryptographic support, but again, there’s no proof that Apple has any kind of way to get around that. First, let’s start with the device’s unique ID (UID). This isn’t the same as the “UDID” that’s been used by app developers to track devices and their users. This is a deeper ID, that’s “fused into the application processor during manufacturing.” Apple says that “No software or firmware can read [the UID] directly; they can only see the results of encryption or decryption operations performed using them” (see the excellent iOS Security overview paper, last updated October 2012).
This UID is used as the basis for all the rest of the keys on the system. At the lowest level, it’s used to derive the overall disk key, which provides a built-in full disk encryption for the iOS device. This means you can’t simply remove the flash drive from one iPhone and move it to another, since the key will still be back in the first device.
Additional encryption protection (alluded to above) can be added to a file’s data if the developer requests it, simply by setting an attribute when writing data to the disk. These files have their own encryption keys (it gets complicated — you really need to read that Apple paper. And when you do, keep this HITB 2011 presentation open in another window, it’ll help…) The keys for all the files are themselves protected with class-level keys (now we’re getting kind of hierarchical and/or meta), and those keys are stored in a keybag, which is itself encrypted using yet another key.
This last key is derived using the user’s passcode and the aforementioned UID device-unique key. Because the UID is tied to a device, any brute-force attempts to break the passcode have to happen on that device. And because “The UID is unique to each device and is not recorded by Apple or any of its suppliers” (again, the iOS Security paper) it is not possible to move any of these operations to another system, or to speed it up in anyway.
So how could apple “bypass” security? Several possibilities have been speculated on:
- They could have an escrow keybag that only they know about. True, this is possible. But this security system has been subject to some pretty heavy scrutiny, if there’s a hidden escrow bag, it’s very well hidden, and nobody’s discovered the mechanism for creating and updating that.
- There could be a back door in the crypto. Not likely, again, given the 3rd party research in the system. If there’s a back door, it’s an “NSA-LEVEL” hole and way beyond anything Apple would be doing.
- They could have a way to extract the UID after all. One person on Twitter said that “sending me marketing material a la It’s secure because the vendor says it is is THIS close to insulting my mother.” Okay, fair point. But this is also a very technically detailed bit of marketing material, with far more insight and transparency than just about any other vendor provides. And, again, pretty much everything in that paper has been verified by many other security researchers. Why would Apple risk everything and have a flat-out lie in this paper? It just doesn’t fit.
Finally, I think it’s important to apply Occam’s Razor to the situation. If any of these backdoors existed, then it would take like 10 minutes for Apple to completely unlock a phone, and the alleged “7 month backlog” wouldn’t exist (unless they had thousands and thousands of confiscated devices to process).
Now, there is one final way that Apple might be able to get at the encrypted data on a locked phone: iCloud. If the user has iCloud backups enabled, then there’s a real possibility that Apple has the ability to access that data. After all, you can restore an iCloud backup to a different device, and you can change your iCloud password without losing the data in the backup. But that also shouldn’t take much time at all, and so probably only happens rarely (not contributing to the 7-month backlog).
So, to sum up:
- Apple almost certainly gets confiscated iPhones sent to them by law enforcement
- With the proper search warrant, etc., Apple will do what they can to extract data from those phones
- They almost certainly can boot the phones from a legit, signed external drive, and gain access to much of the unencrypted data on the phone (damned near everything, unfortunately)
- If they want to get at data protected by a passcode, then they can start a brute-force attack, just as researchers and forensics tool companies have been doing for years
- If the user’s passcode is strong (5-6 alphanumeric characters), this could take months, if not years, to complete
- If the device was backed up to iCloud, it’s possible that all bets are off and the data would be easily retrieved from backup
Is any of this new? Any of it at all? Nope. Not a single item in that CNET article told us anything we didn’t already know, except maybe the length of the backlog. Which, really, should be a good demonstration that there isn’t any kind of magic back door, and that if you use a strong passcode and avoid iCloud backups, the data on your phone should be secure against just about anything, including being sent home to Cupertino.
One week from today I’ll be presenting a talk at Black Hat. Black Hat! Wow. I’m still a little amazed at this turn of events, but am trying not to dwell on it for fear of slipping into a blind panic.
But I think I’m ready. I submitted a nice long white paper a couple of weeks ago, and sent in my presentation yesterday. I’m comfortable with the material. I (think) I’ll be able to intelligently field questions. I’m pretty sure I won’t be a complete, blithering idiot on stage. And to settle my nerves, I’ve put in an early order for a bottle of Drambuie. Though I think I’ll save that for the obligatory post-talk celebration.
Of course, this isn’t the first time I’ve spoken at a conference — I was lucky enough to get a spot on the closing panel at ShmooCon this past January. There were four of us on the panel, so I didn’t get to speak long (only about 10 minutes). But being the closing session, most of the con was there — perhaps as many as 1000 people. I haven’t seen the video, but people tell me that I did well, so I guess there’s really no reason to be nervous here.
I still have yet to write up anything about that ShmooCon appearance, and hopefully I’ll finally do something soon. There’s been quite a bit happening in the password cracking / authentication business in the past six months, and I have a lot of interesting ideas swirling around that I really need to put down for others to comment on. Maybe I’ll write some on the flight to Vegas. You know, to keep my mind off of my talk.
It’s actually my talk that I’m writing now, to, er, talk about. Since joining Intrepidus Group, I’ve spent a good deal of time helping to assess risk and craft security guidelines for iOS devices in large enterprises. A large part of securing iStuff in the enterprise relies upon the use of Mobile Device Management technology (MDM). MDM has been around for a while, especially for some of the older, more corporately-established mobile devices (like BlackBerry or Windows Mobile). Last summer, though, Apple jumped into the arena, adding support for their devices as part of iOS 4.0.
Unfortunately, the way that MDM works for iOS hasn’t been very well described, publicly. Which makes it difficult when you’re trying to demonstrate to a customer that it will make their enviroment more secure.
So I set about doing everything I could to understand, at a deep, technical level, exactly how the technology worked. We were already pretty satisifed, abstractly, with the features and capabilities of Apple’s MDM, but we felt it necessary to go that extra step to truly know what it’s doing. The end result of this is that we now have a mostly-complete understanding of how the protocol works.
Which is what I’ll be talking about next week. I start with how iOS settings work, move into additional features available through the iPhone Configuration Utility, and then start talking about MDM. The talk shows in detail how MDM uses the Apple Push Notification Service, and describes the message format used to make that notification. It’ll also document the interaction between device and server, from authentication and enrollment to receiving commands and providing responses. Enough detail is provided to enable you to write your own experimental MDM server (or, you could simply use the one I’ll be releasing at the talk).
Finally, I’ll talk about some limitations and weaknesses I’ve uncovered, and their potential security ramifications. There might even be a surprise for those hardy enough to sit through the whole talk.
This is going to be quite the experience for me. If your work involves securing iOS devices, especially at the enterprise level, please drop by and give a listen. If you can’t make it, check out the Intrepidus Group website after the conference — I hope to write up some of the more interesting bits of the talk for a standalone post, and we should also have the slides, white paper, and source code available for download at some point.
See you in Vegas!
This “Your iPhone Is Tracking Your Every Move!!” craziness just won’t go away. I’ve been kind of disappointed by the lack of very detailed analysis of the data that’s actually being collected, so I spent some time collecting information of my own.
I have access to four iOS devices running 4.0 or better: my personal iPhone 3GS, a family iPad with 3G subscription, a company-owned iPad (whose 3G has never been activated), and just arrived an iPad 2 that belongs to a client. So I spent some time this weekend trying to better understand what the Core Location daemons are doing.
First, please forgive me if I’m retreading already explored ground. Turns out that a few other people did the same thing this weekend, and so maybe I’m late to the party. I don’t want to be a “Me, too!” poster, but I also think there’s a little that I’ve found that I haven’t seen mentioned yet. Plus, I should mention the work of Alex Levinson, who looked at this in detail a year ago and has been a solid voice of reason from the beginning.
Anyway, first I’ll talk about some what I observed, then I’ll see if I can’t draw a few (hopefully valid) inferences. Some of the data were taken from the devices just as they were last week. Saturday, though, we went out to lunch and I took my phone, company iPad, and personal iPad all with me. During that trip, I kept the personal iPad locked the entire time, and I used the company iPad on the road (with Google Maps open the whole way). I used my phone briefly to make a call, and checked twitter a couple times while at the restaurant, and also for a while in a parking lot as my wife went into the grocery store.
First, the database.
I can see 5 tables within the consolidated.db that seem to be pertinent: CellLocation, CellLocationLocal, CellLocationHarvest, WifiLocation, and WifiLocationHarvest. All of these include details about speed, accurracy, elevation, and other such items that I’m not really concerned with (and many of which don’t seem to be used, at any rate). All also include a timestamp, latitude, and longitude, as well as some way of uniquely identifying the point it represents. In the case of a Wi-Fi access point, this is the MAC address, and in the case of a cell tower, it’s a tuple of four data items. Each entry in these tables appears to be unique — that is, no single cell tower or Wi-Fi access point appears more than once. Point 1: The devices are not tracking my every movement.
Now, my phone.
I see several access points noted all around my house. The accuracy isn’t phenomenal, as it puts my access point on my deck, and a neighbor’s in the middle of my kitchen. In fact, there are 11 different access points displayed either in my house, my yard, or just into my neighbors’ yards. Point 2: The Wi-Fi data points are not precisely located.
Also, the timestamps are varied. Four of the 11 around my house show a date/time from a couple days before I dumped the database (and another 4 are stamped two seconds later). But the other three are from early March, late February, and mid January. Point 3: The Wi-Fi data does not represent the last time I visited a location.
Finally, huge swaths are blanketed with data about Wi-Fi access points. Neighborhoods I’ve not driven through in months, if not years (or ever). These points share similar timestamps as the data within my neighborhood. Point 4: Data is present in the database for locations I’ve not visited.
The cell tower data is very similar. It shows towers located in areas I’ve not recently visited, with locations not corresponding to actual towers (in many cases, not even close — several were shown in residential communities where I’ve never seen a tower). The timestamps are similarly varied, with some I randomly clicked on going back to October 2010. Point 5: Cell tower data is treated the same as Wi-Fi access point data.
I did not see any new data points appear during the drive to the restaurant, or while we ate. However, a batch of data, both Cell and Wi-Fi, was timestamped while we sat outside the grocery store. The cell data, in particular, was scattered over a very wide area, at least several miles on a side. Point 6: Data appears for a wide area simultaneously, and is not necessarily tied to length of time sitting still.
Finally, I observed new data in the WifiLocationHarvest table. A total of 11 Wi-Fi access points were simultaneously recorded while I waited in the parking lot. The precision on this was pretty good — only about 50 feet from where I was sitting. Points 7 and 8: Actual recording of new data is not predictable, and is highly accurate.
I was also able to look at some past data on the phone. I took a one-day trip to Dallas at the end of March, and found large collections of data centered on the location I’d visited, the area I ate lunch, and three locations on the highway leading from the airport. Those locations roughly, I believe, correspond with times when I’d refreshed Google Map directions. Point 9: You may be able to force a data fetch by refreshing the maps application.
My family iPad, which I’d woken up before we left and promptly locked again, did not record any new data the entire time. Point 10: When locked, the device might not record anything at all.
The company iPad was in use the whole way to the restaurant. It has no record of any cell towers, which isn’t terribly surprising, since it does not have an active 3G data plan (though it does have the 3G hardware). Point 11: No data plan, no cell info.
Obviously, since there was no data plan, it couldn’t collect any new data along the way. However, as we left the grocery store, I unlocked the device, refreshed the map location, and locked it again. Once we’d returned home, the iPad fetched 394 Wi-Fi points, in an area about a 1/2 mile by 1/2 mile square, roughly corresponding to the place we were when I refreshed the map. All these data points were timestamped when they were fetched — that is, when the iPad had access to the Wi-Fi at home — not when I was actually on the road. Point 12: The device may cache your last request and fetch related data the next time a network is availble.
All three iPads showed a curious distribution of points around my office. The customers’s iPad, which has only been to the customer facility and my office, displayed points in a very short and wide rectangle centered on my office. My family iPad, which has only been a few placed since I loaded 4.0 on it, showed virtually the same distribution around the office and a similar distribution, but not as wide, around my house. Not all of these points had the same timestamp, but over time, it definitely started filling out that shape. Point 13: When fetching data, the device appears to collect points over a nearly-fixed vertical range (about 30 arcseconds of Latitude) and a variable horizontal range.
Finally, my wife had taken the family iPad on a short trip last weekend. The iPad showed a square burst of Wi-Fi data points about where she pulled over to check a map, and another wide rectangle around the hotel she stayed in. It also showed data in the CellLocationLocal table. That table showed her track along the interstate, and appeared to be an actual positional track. Interestingly, the CellLocation table did not have tower locations for virtually anywhere along that track. On my phone, I had two points from my Dallas trip, and a half-dozen points from a taxi ride into Manhattan a week prior. Point 14: The CellLocationLocal table may record actual trip data, but it appears to be very limited.
One further point of (potential) interest: The timestamps on the data were, if you’ll pardon the pun, all over the map. Many data sets had timestamps only a few seconds or minutes apart. But when I stripped out data sets that were within five minutes of another set of points, the average time between updates was about 14 hours. Note that there’s very little stastical rigor to this, but I thought it was interesting. Point 15: When the device spends an extended time at one place, data appears to be fetched about twice a day.
Summary of Observations
So, to sum up, here are my observations thus far:
- Point 1: The devices are not tracking my every movement.
- Point 2: The Wi-Fi data points are not precisely located.
- Point 3: The Wi-Fi data does not represent the last time I visited a location.
- Point 4: Data is present in the database for locations I’ve not visited.
- Point 5: Cell tower data is treated the same as Wi-Fi access point data.
- Point 6: Data appears for a wide area simultaneously, and is not necessarily tied to length of time sitting still.
- Points 7 and 8: Actual recording of new data is not predictable, and is highly accurate.
- Point 9: You may be able to force a data fetch by refreshing the maps application.
- Point 10: When locked, the device might not record anything at all.
- Point 11: No data plan, no cell info.
- Point 12: The device may cache your last request and fetch related data the next time a network is available.
- Point 13: When fetching data, the device appears to collect points over a nearly-fixed vertical range (about 30 arcseconds of Latitude) and a variable horizontal range.
- Point 14: The CellLocationLocal table may record actual trip data, but it appears to be very limited.
What does all this tell us? I think we can infer at least a few things, which are consistent with what others have been saying, and with Apple’s statements last year.
- The data in WifiLocation and CellLocation are not your device’s actual location at any given point in time, but instead are the location of others’ Wi-Fi access points and cell towers.
- The location of these points are estimated by Apple based on data harvested by iOS devices and provided to Apple on a periodic basis.
- Individual devices periodically record the Wi-Fi points and cell towers visible to them, record a precise location, and send that data to Apple. (I have not yet observed this happen, but it makes sense, and Apple’s already said as much).
- Periodically, the device will poll Apple’s servers for location information nearby. This seems to happen when the device has been at rest for some time, or when the location information is refreshed in the map application (it may be reasonable to expect that other applications querying the Core Location service may also trigger a refresh). There may be some logic in terms of what data gets fetched, perhaps to avoid downloading duplicate information. I haven’t been able to dig into that yet.
- The timestamp for the fetched data appear to be the time the data was fetched. One may be able to look in the middle of a set of identically-stamped data to infer where the user was when that data was fetched. However, the data don’t appear to be fetched every time you’re in any given location, even if you’re there for an extended time (like, say, lunch).
So what’s my conclusion? I’m still not sure about the CellLocationLocal table, which perhaps might be for recording locations for future data fetches. But the rest of the data all seem very consistent with what Apple’s told us: they’re used to aid in geolocating the device. Why are so many points stored? So that it won’t have to pull data down again in the future. It’s a big, personalized cache, made to make my personal use of geolocated features faster and more accurate.
[Note -- if you're interested in the python script I used to load the data into Google Earth, I'm posting it on the Intrepidus Group blog. It should be attached to this post from last week about my first review of the data.]
While lying on the couch last Friday, trying to decompress after a busy day and expecting an even more hectic weekend, I had a crazy idea for how Apple might implement multiple user accounts on iOS devices like the iPad.
File System Overlays.
Applications in iOS are all restricted to their own sandbox — that is, they can only access files and data within their own application bundle, and nothing else. So right off the bat, data’s pretty well segregated.
Now, imagine that there’s an easy way for the operating system to distinguish between the application itself and its data. Like if all apps stored data in, say, Documents and Private Documents and Caches and other similarly-named folders. Anything that’s user-specific would be pretty easy to identify and peel away from the rest of the app.
Here’s where the hare-brained idea comes in: Across the entire filesystem, take any of those such folders, and move them off of the main disk, and into a second filesystem that’s mounted as an overlay on the actual disk.
This is sort of weird. It probably needs a picture.
The base iOS filesystem has system files (the operating system itself plus built-in apps and such), and has separate applications installed by the user. Let’s assume that each app stores user-specific data in a standardized place, like “Documents.”
The device simply puts all the Documents folder into a separate filesystem, then depending on which user has been activated, takes that filesystem and merges it with the base filesystem, overlaying the folders back into their proper locations. So to the device, to the apps, it’s as if nothing has changed. Data’s where you expect it to be.
You could merge preferences in a similar way. iOS already supports multiple configuration profiles, and dynamically merges them into a single active settings profile. So you could have perhaps one “master” account, that can make unalterable settings for the entire device, then create different users, each of which could add their own preferences to what’s already been defined.
Imagine going back to the main home screen, and doing a five-finger pinch to “zoom out” of the iPad and into a new screen with different users listed. Tap on a different user (and enter a passcode, if that user has one set), and the OS removes your overlay and installs the other user’s overlay. Then it’s a whole new iPad!
And the best part about this is it’s all handled at the operating system level. No changes to the applications are necessary (obviously, they need to be following at least some kind of predictable approach for storing data, though there might be some sneaky ways for the OS to figure that out on the fly as well). Of course, if users wanted to share data with other users on the same device (think music or videos), then applications would need to add support for that.
iOS already supports some pretty fancy magic at the filesystem level, with the built-in data protections present in iOS 4. (In fact, it was while musing on those protections that this idea occurred to me). So I don’t see this as being too far off in terms of difficulty to implement. Provided they can get the right filesystem support into the kernel, which I’m sure wouldn’t be too difficult.
Any comments? Is this totally whacked out, or is there some potential here? Also, think about taking this to the desktop…it could definitely add a lot more security to data at rest where multiple users (or the same user, with multiple roles) are sharing a system…..
Okay, so in iOS you can disable things. To protect the user, the device, the organization, from misuse, etc. One of the things you can do is disable Safari, so the end user can’t surf to anything bad. (I’m being a little snarky — there are some good cases where you’d want to prevent end-user web surfing: Gambling sites. Porn. Chat rooms. Competitors’ tip sites. Stuff like that). It’s very easy, and appears to be very complete.
But yesterday I was testing something out and found an easy way around the restrictions. You can install what’s called a Web Clip to the iOS device (iPhone, iPad, etc.) That clip is basically a single web page, taken from whatever URL you configure when you create the clip. This clip goes on the main application screen of the device, just like a “real” application would, and allows quick and easy access to, well, just about anything. You could have a clip that shows a security dashboard. Or a weather report. Or list of important emergency contacts. Really, just about anything you could put into a web page.
The trick is that the device disables any links within that clip. So though you could display, for example, the front page of CNN, you couldn’t navigate to any of the links on that site. Or so I thought.
window.location=url to replace the contents of the window with the contents of the supplied URL variable. Pretty simple stuff.
I looked around (via google, naturally) for any other writeups of this vulnerability, but couldn’t find any. So I wrote it up and posted it here, on the Intrepidus Group website.
If you’ve seen this before, or have any additional details or thoughts, or especially, suggestions for a workaround, please let me know. I can’t believe I’m the only person to have noticed this.