Reverse Engineering FaceTime
FaceTime links looked simple enough that I initially underestimated them. A URL opens a web page, a guest types a name, the host taps approve, and media starts. If this were a normal browser-first calling product, I would expect a room id, a WebSocket, SDP, ICE, and maybe TURN. FaceTime did not fit that shape. The browser was not joining a room; it was asking a native Apple-controlled conversation to let it in.
The interesting part of this project was not that FaceTime eventually uses WebRTC. That is the least surprising part. The interesting part was the bridge around it: FaceTime link material, native host authority, IDS-flavoured web registration, WebCourier push delivery, LetMeIn admission, Quick Relay allocation, downlink stream selection, media-key material recovery, and SFrame encryption around encoded frames.
I am keeping real account material, handles, tokens, private keys, device identifiers, and live traffic out of the series. The snippets below are reconstructed/redacted shapes from my notes and local artifacts, not credentials and not a bypass guide. The point is to document the architecture I recovered and the reasoning path that got me there.
- [2026-06-02] Reverse Engineering FaceTime (Part 0): The Link Is Not The Room
- [2026-06-01] Reverse Engineering FaceTime (Part 1): The Mac Was The Oracle
- [2026-05-31] Reverse Engineering FaceTime (Part 2): WebCourier, IDS, Quick Relay, and SFrame
The map I ended up with
I found it useful to stop thinking in product screens and start thinking in protocol boundaries. Each boundary had a different authority, different failure space, and different evidence trail.
| Boundary | What I was trying to prove | Artifact that made it concrete |
|---|---|---|
| Link resolution | A FaceTime URL is not the call; it is an entry handle into a conversation/link object. | ConversationLink and conversation fields in the web bundle. |
| Admission | The browser cannot promote itself from waiting room to participant. | LetMeInState, LetMeInResult, host-visible pending/approved transitions. |
| Identity/routing | The browser still needs Apple-style signaling reachability. | IDS web register/query endpoints and WebCourier connect material. |
| Delivery | Push-like messages reach the browser through a WebSocket fabric. | wss://webcourier.../websocket/anon/<hex(protobuf)>. |
| Relay | Media path allocation is separate from being admitted. | QuickRelayWebProtocolMessage and allocation/status enums. |
| Downlink | Receiving media is an explicit subscription problem. | sessionInfoRequest carrying SubscribedStream entries. |
| Key state | A late joiner needs current media-key material, not only SDP. | MKM/prekey timing constants and recovery flows. |
| Media crypto | FaceTime encrypts encoded frames outside normal RTP payload visibility. | Dedicated SFrame worker and sender/receiver transformer names. |
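The URL shape in the Delivery row, a hex-encoded serialized message used as the final path segment, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the runtime's actual code: the function names are hypothetical, the host is a reserved placeholder because the real one is elided above, lowercase hex is assumed, and the payload is made-up bytes rather than a real connect message.

```python
from urllib.parse import urlsplit

# Path prefix taken from the URL shape in the table; everything else is illustrative.
PATH_PREFIX = "/websocket/anon/"


def build_connect_url(host: str, serialized_message: bytes) -> str:
    """Append hex(serialized_message) as the last path segment (hypothetical helper)."""
    if not serialized_message:
        raise ValueError("serialized message must not be empty")
    return f"wss://{host}{PATH_PREFIX}{serialized_message.hex()}"


def extract_serialized_message(url: str) -> bytes:
    """Recover the raw bytes from a URL of the same shape (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme != "wss" or not parts.path.startswith(PATH_PREFIX):
        raise ValueError("URL does not match the expected shape")
    return bytes.fromhex(parts.path[len(PATH_PREFIX):])


# Placeholder bytes standing in for a serialized protobuf; not real traffic.
payload = bytes([0x0A, 0x03]) + b"abc"
# example.invalid is a reserved name that never resolves; the real host is not shown here.
url = build_connect_url("webcourier.example.invalid", payload)
print(url)
```

The practical consequence for analysis is that the connect material travels in the URL itself, so a captured WebSocket URL can be hex-decoded back into message bytes before any frames are inspected.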
The rest of the series follows the order I actually worked through it. Part 0 is the first model and the first bad assumptions. Part 1 is the host/native side, where admission and conversation authority became obvious. Part 2 is the browser runtime, where the concrete protocol vocabulary leaked through minified JavaScript.
The thing that changed the project
The big shift was realizing that FaceTime Web is not a separate lightweight web product. It is a web-shaped participant inside a larger Apple calling stack. That explains why the runtime contains concepts that look too serious for a simple web invite: IDS web registration, push topics, WebCourier reconnect tokens, Quick Relay material stores, MKM rolling windows, AVC blob resend requests, downlink bandwidth allocation, and SFrame queue backpressure.
A simplified call path from my notes looked like this:
open link
-> parse/resolve conversation link material
-> create a web-reachable identity/routing surface
-> connect WebCourier delivery
-> submit LetMeIn request
-> wait for host authority
-> allocate Quick Relay participant resources
-> exchange/query key material and media blobs
-> subscribe to streams
-> transform encoded media through SFrame

The arrows matter because every step can fail independently. A guest can have a valid link and still not be admitted. A guest can be admitted and still fail relay allocation. A guest can receive relay traffic and still be unable to decrypt because the current MKM is missing or stale. A guest can have key material and still not know which stream ids to request. That separation is what makes the reverse engineering interesting.
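One way to keep those independent failures straight while reading traces is to model the call path as an ordered list of stages and record the first one that fails. The following is a minimal sketch under that assumption: the `Stage` names, `JoinResult`, and `run_join` are hypothetical labels for the steps in the call path above, not identifiers from the FaceTime runtime, and the `step` callback stands in for whatever evidence confirms a stage.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Stage(Enum):
    """Hypothetical names for the steps in the call path, in order."""
    OPEN_LINK = "open link"
    LINK_RESOLUTION = "parse/resolve conversation link material"
    IDENTITY = "create a web-reachable identity/routing surface"
    DELIVERY = "connect WebCourier delivery"
    ADMISSION_REQUEST = "submit LetMeIn request"
    HOST_APPROVAL = "wait for host authority"
    RELAY = "allocate Quick Relay participant resources"
    KEY_MATERIAL = "exchange/query key material and media blobs"
    SUBSCRIPTION = "subscribe to streams"
    MEDIA_TRANSFORM = "transform encoded media through SFrame"


@dataclass
class JoinResult:
    completed: List[Stage]       # stages that succeeded, in order
    failed_at: Optional[Stage]   # first stage that failed, or None


def run_join(step: Callable[[Stage], bool]) -> JoinResult:
    """Run stages in definition order and stop at the first failure."""
    completed: List[Stage] = []
    for stage in Stage:
        if not step(stage):
            return JoinResult(completed, stage)
        completed.append(stage)
    return JoinResult(completed, None)


# Simulate a guest who is admitted by the host but fails relay allocation.
result = run_join(lambda stage: stage is not Stage.RELAY)
print(result.failed_at.value if result.failed_at else "joined")
```

Recording only the first failing stage mirrors the point of the arrows: an approved LetMeIn request says nothing about relay, key, or subscription state, so each of those needs its own evidence.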