Bluetooth is a wireless personal area networking standard for exchanging data over short distances. Bluetooth Low Energy (BLE) is the power-efficient, application-friendly version of Bluetooth that was built for the Internet of Things (IoT). Its low energy consumption makes the protocol well suited to battery-operated devices. Since BLE now comes built into virtually every modern phone, tablet, and computer, it is a natural starting point for connecting with the vast multitude of devices that IoT promises to bring to the world. A Bluetooth device, like most wireless devices, announces itself to the world by sending out advertisement packets.
BLE advertisements are periodic, unidirectional broadcasts from the peripheral to all devices around it. A listener can use these packets either to gather the information being advertised or to connect to the advertiser. Not every device can be connected to; this depends on the advertisement type announced in the packet header. The four types of advertisements are:
- Connectable undirected advertising
- Connectable directed advertising
- Non-connectable undirected advertising
- Scannable undirected advertising
Devices that only transmit, such as beacons, use the third advertising type. Devices that need to quickly reconnect to a specific, already known peer use the second type. Most other devices use the first advertisement type. While advertising, the device also indicates whether it is using a random MAC address or its own fixed (public) MAC address. This becomes important when doing passive analytics.
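As a rough illustration of how a listener might read these two pieces of information, the sketch below decodes the first byte of a legacy advertising header. It assumes the listener already has the raw header byte (for example from a sniffer capture); the names `ADV_PDU_TYPES`, `AdvHeader`, and `decode_adv_header` are made up for this example.

```python
from typing import NamedTuple

# Legacy advertising PDU types, keyed by the low 4 bits of the first header byte.
# Each entry: (PDU name, advertisement type, connectable?, answers scan requests?)
ADV_PDU_TYPES = {
    0x0: ("ADV_IND", "Connectable undirected advertising", True, True),
    0x1: ("ADV_DIRECT_IND", "Connectable directed advertising", True, False),
    0x2: ("ADV_NONCONN_IND", "Non-connectable undirected advertising", False, False),
    0x6: ("ADV_SCAN_IND", "Scannable undirected advertising", False, True),
}


class AdvHeader(NamedTuple):
    pdu_name: str
    adv_type: str
    connectable: bool
    scannable: bool
    random_address: bool  # True when the TxAdd bit is set


def decode_adv_header(first_byte: int) -> AdvHeader:
    """Decode the first header byte of a legacy advertising PDU."""
    pdu_type = first_byte & 0x0F
    if pdu_type not in ADV_PDU_TYPES:
        raise ValueError(f"not one of the four advertising types: {pdu_type:#x}")
    name, adv_type, connectable, scannable = ADV_PDU_TYPES[pdu_type]
    # TxAdd (bit 6) tells the listener whether the advertiser's address is random.
    tx_add = bool(first_byte & 0x40)
    return AdvHeader(name, adv_type, connectable, scannable, tx_add)


# Example: a beacon-style packet sent from a random address.
beacon = decode_adv_header(0x42)
print(beacon.pdu_name, beacon.connectable, beacon.random_address)
```

Other PDU types that share the advertising channels (scan requests, scan responses, connection requests) are deliberately rejected here, since they are not advertisements.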
The advertisement packet has up to 31 bytes that can be used to advertise additional information about the device. The most common payloads are:
- Local name
- Manufacturer-specific data
- Transmit power level
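The payload is laid out as a sequence of length-type-data structures, so pulling these fields out is mostly bookkeeping. The following is a minimal sketch, assuming the raw payload bytes are already available; `parse_ad_structures` and the sample payload are illustrative, and the type codes shown are the ones assigned for these fields.

```python
from __future__ import annotations

# A few assigned AD type codes relevant to the payloads listed above.
AD_TYPE_NAMES = {
    0x01: "Flags",
    0x08: "Shortened Local Name",
    0x09: "Complete Local Name",
    0x0A: "Tx Power Level",
    0xFF: "Manufacturer Specific Data",
}


def parse_ad_structures(payload: bytes) -> list[tuple[int, bytes]]:
    """Split an advertising payload into (AD type, data) pairs."""
    if len(payload) > 31:
        raise ValueError("legacy advertising payloads carry at most 31 bytes")
    fields = []
    i = 0
    while i < len(payload):
        length = payload[i]  # counts the type byte plus the data bytes
        if length == 0:      # zero length marks padding; nothing follows
            break
        end = i + 1 + length
        if end > len(payload):
            raise ValueError("truncated AD structure")
        fields.append((payload[i + 1], payload[i + 2:end]))
        i = end
    return fields


# Made-up payload: Flags, the local name "Test", and a Tx power of -4 dBm.
sample = bytes.fromhex("020106" "0509") + b"Test" + bytes.fromhex("020afc")
fields = parse_ad_structures(sample)
for ad_type, data in fields:
    print(AD_TYPE_NAMES.get(ad_type, hex(ad_type)), data.hex())
```

The Tx power value is a signed 8-bit integer, so it needs `int.from_bytes(data, "little", signed=True)` rather than a plain byte read.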
The manufacturer-specific data, as the name indicates, is where a device manufacturer can slot in its own information while also identifying the make of the device. Every company that advertises over BLE is supposed to obtain a company identifier from the Bluetooth SIG, and these identifiers can then be used to distinguish devices that are heard over the air. The manufacturer-specific data is also where beacon formats such as iBeacon and AltBeacon carry their payloads (Eddystone, by contrast, uses the service data field). For standard BLE devices, this is where Apple, for example, places the information used by services such as Handoff, AirDrop, and AirPlay.
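To make this concrete, here is a hedged sketch of reading the company identifier and, where it applies, an iBeacon payload from a manufacturer-specific data field. It assumes the commonly documented iBeacon layout (a type byte 0x02, a length byte 0x15, a 16-byte UUID, big-endian major and minor, and a signed measured-power byte); the function names and the sample UUID are invented for illustration.

```python
import struct
import uuid
from typing import Optional, Tuple

APPLE_COMPANY_ID = 0x004C  # company identifier assigned to Apple


def split_manufacturer_data(data: bytes) -> Tuple[int, bytes]:
    """Return (company identifier, remaining vendor-defined bytes)."""
    if len(data) < 2:
        raise ValueError("manufacturer data must start with a 2-byte company ID")
    # The company identifier is transmitted little-endian.
    return int.from_bytes(data[:2], "little"), data[2:]


def parse_ibeacon(data: bytes) -> Optional[dict]:
    """Parse an iBeacon payload, or return None if the data is something else."""
    company_id, body = split_manufacturer_data(data)
    if company_id != APPLE_COMPANY_ID or len(body) != 23:
        return None
    if body[0] != 0x02 or body[1] != 0x15:
        return None
    major, minor, measured_power = struct.unpack(">HHb", body[18:23])
    return {
        "uuid": uuid.UUID(bytes=body[2:18]),
        "major": major,
        "minor": minor,
        "measured_power_dbm": measured_power,
    }


# Made-up example payload.
sample_uuid = uuid.UUID("12345678-1234-5678-1234-567812345678")
sample = bytes.fromhex("4c000215") + sample_uuid.bytes + bytes.fromhex("0001 0002 c5")
print(parse_ibeacon(sample))
```

Returning `None` for anything that does not match keeps the function usable as a filter over a stream of mixed advertisements.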
Analytics and Privacy Implications
With this knowledge, we can hypothesize about good mechanisms for ensuring a user’s privacy and then run some real-world analysis to see whether they hold up.
For starters, any device that does not require a connection should use non-connectable advertisements. If connections are required, and only to specific previously known devices, then the connectable directed advertisement would be a suitable advertisement type to use.
In both cases, and as we have seen in the world of Wi-Fi, randomizing the MAC address used to transmit is almost always useful.
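For a sense of what a randomized address looks like in BLE, the sketch below generates one kind of random address (a non-resolvable private address) and classifies random addresses by their two most significant bits. The helper names are hypothetical, and the classification is only meaningful for addresses the advertisement header has already flagged as random; a public address can have any leading bits.

```python
import secrets

RANDOM_SUBTYPES = {
    0b00: "non-resolvable private",
    0b01: "resolvable private",
    0b11: "static random",
}


def random_non_resolvable_address() -> str:
    """Generate a random address whose two most significant bits are 00."""
    while True:
        value = secrets.randbits(46)
        # The 46 random bits may not be all zeros or all ones.
        if value not in (0, (1 << 46) - 1):
            break
    raw = value.to_bytes(6, "big")  # top two bits stay zero because value < 2**46
    return ":".join(f"{b:02X}" for b in raw)


def random_address_subtype(address: str) -> str:
    """Classify a random address written most-significant byte first."""
    msb = int(address.split(":")[0], 16)
    return RANDOM_SUBTYPES.get(msb >> 6, "reserved")


addr = random_non_resolvable_address()
print(addr, random_address_subtype(addr))
```

Of the three subtypes, only the two private ones are meant to change over time; a static random address typically stays fixed at least until the device is power-cycled, so it offers much less protection against tracking.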
Taking iOS and macOS as an example, we see some interesting patterns. Both do a fairly good job of keeping things random and ensuring that the device is not easily trackable. In my experience, every time an iOS or macOS device wakes up, it uses a new random MAC address. The device also only advertises in some scenarios. When the screen is unlocked, it is possible to connect to the device via BLE and read out basic information such as the hardware model number, firmware version, and current battery status. Apple devices also send out some interesting packets in support of common features like Handoff, AirPlay, and AirDrop, provided Bluetooth is enabled on the device.
From what I’ve seen to date, only the Apple TV does not randomize its MAC address. From an analytics standpoint, this constant randomization does a great job of maintaining user privacy, while also making it seem that there are a lot more devices around than there actually are. So far, no pattern in the randomization has been identified, but this continues to be a work in progress.
A more detailed capture via Ubertooth sheds some light on the BLE behavior of other devices. Among the mobile devices, a few never seemed to randomize the MAC address in their advertisement packets, which implies that once such an address is traced to a user, that device can be recognized wherever its advertisements are heard.
Mobile accessories don’t seem to behave consistently. At home, for example, my smart TV and headphones both advertise over BLE without any randomization, while also being connectable. Some connectable devices share details such as the Device Information Service once a BLE connection is established, but also seem to have timeouts in place to kick off unsolicited connections. Using a mixture of listening for advertisements and sending scan requests to devices that use connectable advertisements, one can also derive the user-specific name of a device.
In general, accessories that need connections tend to avoid randomized MAC addresses. I believe this is to facilitate easy connection with an app. Examples of such devices are wireless headphones, headsets, and connectable lamps.
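One simple way to spot such devices in a capture is to look for addresses that keep reappearing over a long period. The sketch below is an assumption-laden illustration, not the analysis used here: it supposes the capture has already been reduced to (timestamp, address) pairs, and the one-hour threshold, the function name, and the sample log are all hypothetical.

```python
from datetime import datetime, timedelta


def persistent_addresses(sightings, min_span=timedelta(hours=1)):
    """Return addresses whose first and last sightings are at least min_span apart."""
    first_last = {}
    for seen_at, address in sightings:
        first, last = first_last.get(address, (seen_at, seen_at))
        first_last[address] = (min(first, seen_at), max(last, seen_at))
    return sorted(
        address
        for address, (first, last) in first_last.items()
        if last - first >= min_span
    )


# Made-up capture log: one address stays fixed, the others rotate.
t0 = datetime(2024, 1, 1, 9, 0)
log = [
    (t0, "AA:BB:CC:00:00:01"),
    (t0 + timedelta(minutes=5), "5B:10:20:30:40:50"),
    (t0 + timedelta(hours=3), "AA:BB:CC:00:00:01"),
    (t0 + timedelta(hours=3), "6C:11:21:31:41:51"),
]
print(persistent_addresses(log))
```

An address that survives well beyond the chosen threshold is a candidate for a device that does not randomize; the threshold would need tuning against how often the devices of interest actually rotate their addresses.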
While many devices do use methods to obscure themselves from prying eyes, there are still some ways in which one can run passive analytics for BLE devices. This has limited scope, however, and can get into murky waters when it comes to user privacy.
Active analytics shows more promise. By getting people to install apps, one can drive user engagement and make people more aware of the system overall. This also helps in de-anonymizing the data coming from devices and opens up the possibility of using Wi-Fi (connected as well as unconnected) in conjunction with BLE.