Soon after hackers made the first successful attempts at porting Siri to the iPhone 4, iPod Touch and 3GS, it was obvious that more was coming.
After a fair amount of reverse engineering, the Siri protocol is now reasonably well understood. To tap the app's communication with the cloud, the hackers set up a rogue DNS server that lets them track and manipulate the interactions.
Siri communicates on port 443 with a server at 126.96.36.199, which is none other than https://guzzoni.apple.com. The connection is, of course, over HTTPS, which relies on SSL certificates to verify that the endpoint is authentic. The hackers created a custom SSL certification authority, added it to their iPhone 4S, then used it to sign their own certificate for a fake “guzzoni.apple.com”. It worked: Siri happily sent its commands to the fake HTTPS server, and the setup can be replicated again and again. Using the captured traffic, they worked out what data is transmitted for every command.
Siri’s protocol is opaque. Let’s have a look at a Siri HTTP request. The request’s body is binary, but the headers look like this:
ACE /ace HTTP/1.0
User-Agent: Assistant(iPhone/iPhone4,1; iPhone OS/5.0/9A334) Ace/1.0
Facts about the Siri header:
- The request is using a custom “ACE” method, instead of a more usual GET.
- The url requested is “/ace”
- The Content-Length is nearly 2GB, which obviously does not conform to the HTTP standard.
- X-Ace-host is some form of GUID. After trying with several iPhone 4Ses, it seems to be tied to the actual device (pretty much like a UDID).
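To make the header observations concrete, here is a minimal Python sketch that splits a captured request head into its parts by hand, since a stock HTTP parser may not expect the custom “ACE” method. The function name `parse_request_head` is made up for illustration, and the sample only contains the two header lines shown above.

```python
def parse_request_head(raw: bytes):
    """Split a raw HTTP-style request head into method, path, version, headers."""
    # Everything before the first blank line is the head; the binary body follows.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:
            continue
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


# Sample built only from the header lines quoted in the article.
sample = (
    b"ACE /ace HTTP/1.0\r\n"
    b"User-Agent: Assistant(iPhone/iPhone4,1; iPhone OS/5.0/9A334) Ace/1.0\r\n"
    b"\r\n"
)
method, path, version, headers = parse_request_head(sample)
print(method, path, version)
print(headers["user-agent"])
```

Header names are lower-cased so that lookups such as `headers.get("x-ace-host")` work regardless of how the device capitalises them.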
Siri Body payload (binary data)
When you look at Siri's binary data in a hex editor, you will notice that it starts with 0xAACCEE. That looks like a header! Unfortunately, nothing after it is readable, because it is compressed using zlib.
To be more precise, this AACCEE header in the request body is 3 bytes long, but the actual data payload starts after the 4th byte. Decompressing everything after the 4th byte yields the actual data that is sent over the network.
The decompressed data still contains some binary artifacts, plus some human-readable text including the marker bplist00, i.e. the data is a binary plist.
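The decompression step can be sketched in a few lines of Python using the standard zlib module. This is an illustration under assumptions, not the hackers' actual tooling: the name `inflate_ace_body` is invented here, and the value of the 4th byte is not described in this article, so the demo uses a placeholder for it.

```python
import zlib

ACE_MAGIC = bytes.fromhex("aaccee")  # the 3-byte marker seen in the hex editor


def inflate_ace_body(body: bytes) -> bytes:
    """Check the AACCEE marker, skip the first 4 bytes, and inflate the rest."""
    if body[:3] != ACE_MAGIC:
        raise ValueError("body does not start with 0xAACCEE")
    # decompressobj() returns whatever has been received so far, which suits a
    # stream that stays open instead of ending after a single message.
    inflater = zlib.decompressobj()
    return inflater.decompress(body[4:])


# Demo with a locally built body; the 4th byte (0x00 here) is a placeholder.
demo_body = ACE_MAGIC + b"\x00" + zlib.compress(b"bplist00 example payload")
print(inflate_ace_body(demo_body))
```

For a live connection you would keep one `decompressobj()` per direction and feed it data as it arrives, rather than creating a new one per call as this sketch does.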
Here is the description of the payload chunks:
- Chunks starting with 0x020000xxxx are “plist” packets, xxxx being the size of the binary plist data that follows the header.
- Chunks starting with 0x030000xxxx are “ping” packets, sent by the iPhone to Siri’s servers to keep the connection alive. Here xxxx is the ping sequence number.
- Chunks starting with 0x040000xxxx are “pong” packets, sent by Siri’s server as a reply to ping packets. Unsurprisingly, xxxx is the pong sequence number.
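The chunk layout described above can be walked with a small Python loop. This sketch assumes each chunk has a 5-byte header (one type byte followed by a big-endian number holding either the plist size or the sequence number) and that ping and pong chunks carry no body; the byte order and the name `parse_chunks` are assumptions for illustration.

```python
import struct

CHUNK_NAMES = {0x02: "plist", 0x03: "ping", 0x04: "pong"}


def parse_chunks(data: bytes):
    """Split decompressed data into (kind, value) tuples.

    For "plist" the value is the raw binary plist; for "ping" and "pong"
    it is the sequence number.
    """
    chunks = []
    offset = 0
    while offset + 5 <= len(data):
        # 1 type byte + 4-byte big-endian number (0x0000xxxx in the article).
        kind, value = struct.unpack_from(">BI", data, offset)
        if kind not in CHUNK_NAMES:
            raise ValueError(f"unknown chunk type 0x{kind:02x} at offset {offset}")
        offset += 5
        if kind == 0x02:
            if offset + value > len(data):
                break  # plist body not fully received yet
            chunks.append(("plist", data[offset:offset + value]))
            offset += value
        else:
            chunks.append((CHUNK_NAMES[kind], value))
    return chunks


# Demo stream built by hand: one plist chunk, one ping, one pong.
stream = (
    b"\x02" + struct.pack(">I", 5) + b"hello"
    + b"\x03" + struct.pack(">I", 7)
    + b"\x04" + struct.pack(">I", 7)
)
print(parse_chunks(stream))
```

An unknown type byte raises an error instead of being skipped, because without a known length there is no safe way to find the next chunk.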
Deciphering the content of the binary plists is easy: you can do it on Mac OS X with the “plutil” command-line tool, or in Ruby with the CFPropertyList gem on any platform.
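Besides the two tools mentioned above, Python's standard plistlib module can also read binary plists. The sketch below is only an illustration: the helper name `decode_plist` and the keys in the example dictionary are invented, not real Siri fields.

```python
import plistlib


def decode_plist(payload: bytes):
    """Decode one binary plist payload into plain Python objects."""
    if not payload.startswith(b"bplist00"):
        raise ValueError("payload is not a binary plist")
    # plistlib.loads detects the binary format automatically.
    return plistlib.loads(payload)


# Round-trip demo with made-up keys (not actual Siri fields).
example = plistlib.dumps(
    {"exampleKey": "exampleValue", "sequence": 1},
    fmt=plistlib.FMT_BINARY,
)
print(decode_plist(example))
```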
How the iPhone 4S talks with Apple's servers:
The audio data: The iPhone 4S sends raw audio data compressed using Speex, a popular VoIP audio codec.
Signature: The iPhone 4S sends identifiers everywhere. So if you want to use Siri on another device, you still need the identifier of at least one iPhone 4S. You would need one of the tools from the tool chain below to extract it. But beware, Apple could blacklist an identifier.
The actual content: The protocol is actually very, very chatty. Your iPhone sends tons of things to Apple’s servers, and those servers reply with an incredible amount of information. For example, when you’re using speech recognition (dictation), Apple’s servers even return a confidence score and a timestamp for each word.
Writing your own Siri-based Application for Android, iOS
You can download Applidium’s tool chain and get started with your own Siri-enabled app.
Update: Spire: Install Siri on iPad, iPod Touch, iPhone 4, iPhone 3GS