Shipping iMessage as a Channel for Browser-Native AI Agents
Most AI tools today live behind a browser tab. You open a URL, type into a chat box, get a response. It works. But it also means the AI only exists when you remember to visit it.
Personal AI agents should meet you where you already are. For a lot of people, that's iMessage.
OpenBrowserClaw is a browser-native AI assistant — no server, no Docker, no cloud account. Everything runs in a single browser tab: the orchestrator, the agent worker, the database, the file system. It already supported a browser chat UI and Telegram as channels. I added iMessage as the third.
Why iMessage
iMessage is the default. It's not something people install. It's already running. An AI agent that shows up in your iMessage conversation list — alongside your friends and family — is fundamentally different from one that lives behind a URL.
The interaction model changes completely. You don't "go to" your AI. You text it. Same muscle memory as texting anyone else. That's the whole point.
Architecture
OpenBrowserClaw's architecture is unusual. There's no backend. The browser tab is the server:
┌──────────────────────────────────────────┐
│  Browser Tab                             │
│                                          │
│  Orchestrator (main thread)              │
│  ├── BrowserChatChannel (built-in)       │
│  ├── TelegramChannel (HTTPS polling)     │
│  └── IMessageChannel (Socket.IO + REST)  │
│        │                                 │
│        ▼                                 │
│  Agent Worker → Claude API → Tools       │
│                                          │
│  Storage: IndexedDB + OPFS               │
└──────────────────────────────────────────┘
        │
        │  Socket.IO (realtime)
        │  REST (send, typing)
        ▼
┌──────────────────────────────────────────┐
│  iMessage Server (user-hosted)           │
│  ├── macOS with Messages.app             │
│  └── BlueBubbles or similar              │
└──────────────────────────────────────────┘
The browser can't talk to iMessage directly. So the iMessage channel connects to a user-hosted iMessage server that bridges between the iMessage protocol and a web-friendly API surface: Socket.IO for realtime events, REST for sending messages and managing typing indicators.
The key design constraint: everything in the browser tab must be stateless with respect to the iMessage server. The server is the source of truth. The browser is a client that connects, listens, sends, and disconnects cleanly.
The Channel Abstraction
OpenBrowserClaw defines a Channel interface:
interface Channel {
  readonly type: ChannelType;
  start(): void;
  stop(): void;
  send(groupId: string, text: string): Promise<void>;
  setTyping(groupId: string, typing: boolean): void;
  onMessage(callback: (msg: InboundMessage) => void): void;
}
Six members: one readonly property and five methods. Browser chat implements it. Telegram implements it. iMessage implements it. The orchestrator doesn't care which channel a message came from; it routes based on the groupId prefix (br:, tg:, im:).
This is what made the integration tractable. The iMessage channel is 277 lines of TypeScript. It doesn't need to understand the orchestrator, the agent worker, or the Claude API. It just needs to receive messages, send messages, and manage typing indicators.
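To make the prefix routing concrete, here is a minimal sketch of how a groupId could be mapped back to its channel. Only the three prefixes come from the project; the channel type names and the channelForGroup helper are hypothetical and not taken from the OpenBrowserClaw source.

```typescript
// Hypothetical channel type names; only the prefixes br:, tg:, im: come from the article.
type ChannelType = 'browser' | 'telegram' | 'imessage';

const PREFIX_TO_CHANNEL = new Map<string, ChannelType>([
  ['br:', 'browser'],
  ['tg:', 'telegram'],
  ['im:', 'imessage'],
]);

// Returns the channel that owns a groupId, or undefined for an unknown prefix.
function channelForGroup(groupId: string): ChannelType | undefined {
  // Everything up to and including the first colon is treated as the prefix.
  const prefix = groupId.slice(0, groupId.indexOf(':') + 1);
  return PREFIX_TO_CHANNEL.get(prefix);
}
```

With a lookup like this, the orchestrator only needs the groupId to decide where a reply goes, which is why a new channel does not touch the agent worker.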
What I Built
Socket.IO for realtime, REST for actions. The iMessage server emits new-message events over Socket.IO. For outbound actions — sending a message, starting/stopping typing indicators — I use REST endpoints. Clean separation between listening and acting.
GUID-based deduplication. iMessage servers can emit the same message more than once — on reconnection, on server restart, or due to internal replay. Every incoming message has a GUID. I maintain a map of recently-seen GUIDs with timestamps and a 5-minute TTL with periodic cleanup.
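A minimal sketch of such a cache, assuming timestamps in milliseconds. The DedupCache class and its method names are hypothetical; the 5-minute TTL is the only value taken from the text above.

```typescript
// Hypothetical helper, not the project's actual class.
class DedupCache {
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 5 * 60_000) {
    this.ttlMs = ttlMs;
  }

  // Returns true if `id` was seen within the TTL; otherwise records it and returns false.
  isDuplicate(id: string, now: number = Date.now()): boolean {
    const seenAt = this.seen.get(id);
    if (seenAt !== undefined && now - seenAt < this.ttlMs) return true;
    this.seen.set(id, now);
    return false;
  }

  // Periodic cleanup: call from a timer so the map does not grow without bound.
  prune(now: number = Date.now()): void {
    this.seen.forEach((seenAt, id) => {
      if (now - seenAt >= this.ttlMs) this.seen.delete(id);
    });
  }

  get size(): number {
    return this.seen.size;
  }
}
```

Checking the timestamp on lookup means correctness does not depend on how often prune() runs; the cleanup only bounds memory.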
Auto-trigger without @mention. Telegram is a group platform, so the bot requires an @mention to trigger. iMessage is typically a direct conversation. Requiring @Andy before every message is unnecessary friction. So iMessage messages bypass the trigger pattern check entirely:
const isBrowserMain = msg.groupId === DEFAULT_GROUP_ID;
const isImessage = msg.groupId.startsWith('im:');
const hasTrigger = this.triggerPattern.test(msg.content.trim());
if (isBrowserMain || isImessage || hasTrigger) {
  // trigger the agent
}
Typing indicators with timer management. The orchestrator calls setTyping(groupId, true) multiple times during a single agent invocation — once when processing starts, again on every tool-use step. Each call was stacking overlapping POST/DELETE pairs. Fixed by tracking a single timer: each new typing=true cancels the previous one, typing=false cancels immediately and sends the DELETE.
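The single-timer idea can be sketched as follows. This assumes each typing=true sends a POST and schedules an automatic DELETE after a fixed delay; the TypingIndicator class, the injected request callback, and the 10-second default are all hypothetical.

```typescript
// `request` stands in for the REST call; it is injected so the sketch has no network dependency.
type TypingRequest = (method: 'POST' | 'DELETE', chatGuid: string) => void;

class TypingIndicator {
  private timer: ReturnType<typeof setTimeout> | undefined;
  private readonly request: TypingRequest;
  private readonly autoStopMs: number;

  constructor(request: TypingRequest, autoStopMs: number = 10_000) {
    this.request = request;
    this.autoStopMs = autoStopMs;
  }

  set(chatGuid: string, typing: boolean): void {
    // Any new call supersedes the pending auto-stop, so DELETEs never stack up.
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (!typing) {
      this.request('DELETE', chatGuid);
      return;
    }
    this.request('POST', chatGuid);
    this.timer = setTimeout(() => {
      this.timer = undefined;
      this.request('DELETE', chatGuid);
    }, this.autoStopMs);
  }
}
```

Three typing=true calls followed by one typing=false produce three POSTs and exactly one DELETE, rather than one DELETE per POST.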
Echo loop prevention. When the agent sends a reply via REST, the server sees a new outgoing message and emits it back over Socket.IO. Without filtering, infinite loop. The server includes isFromMe: boolean on every event — I drop all messages where that's true, plus reactions and attachment-only messages.
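A rough sketch of that filter. Only isFromMe is named by the server's event payload as described above; the other fields (text, isReaction) are assumptions about the event shape and will differ between iMessage servers.

```typescript
// Hypothetical event shape; only `isFromMe` is taken from the article.
interface InboundEvent {
  guid: string;
  isFromMe: boolean;
  text?: string | null;
  isReaction?: boolean; // assumed flag for tapbacks/reactions
}

// Returns true only for messages the agent should respond to.
function shouldProcess(event: InboundEvent): boolean {
  if (event.isFromMe) return false;   // our own reply echoed back over the socket
  if (event.isReaction) return false; // reactions carry no prompt
  const text = (event.text ?? '').trim();
  return text.length > 0;             // attachment-only messages have no text
}
```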
Problems I Hit
Empty chat GUIDs. Some socket events arrive without a valid chats array. Originally I defaulted to an empty string, which meant messages without a chat GUID would all collapse into a single fake conversation. Replies would go to the wrong place. Fixed by treating missing chat GUIDs as invalid and dropping the message.
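In code, the fix amounts to returning "no group" instead of an empty string. A small sketch, assuming the chat GUID lives on the first entry of the chats array; the guid field name and the toGroupId helper are hypothetical.

```typescript
// Hypothetical shapes; only the `chats` array is mentioned in the article.
interface ChatRef {
  guid?: string;
}
interface EventWithChats {
  chats?: ChatRef[] | null;
}

// Returns an im:-prefixed groupId, or null when the event has no usable chat GUID.
function toGroupId(event: EventWithChats): string | null {
  const guid = event.chats?.[0]?.guid;
  if (typeof guid !== 'string' || guid.length === 0) return null;
  return `im:${guid}`;
}
```

A caller that gets null drops the message, so nothing is ever routed to a shared fake conversation.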
Hung REST requests blocking state. The orchestrator has a state machine: idle → thinking → responding → idle. If the iMessage server is unresponsive, fetch() has no timeout of its own and can hang, so the orchestrator never resets to idle and the agent appears permanently stuck. Fixed with a 15-second AbortController timeout on every request and a try/catch around delivery so the state always resets even if sending fails.
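Both halves of that fix can be sketched in a few lines. The function names, the injected doFetch parameter (a stand-in for the global fetch), and the setState callback are hypothetical; the 15-second default and the state names come from the text above.

```typescript
// `doFetch` stands in for fetch so the sketch can run without a network.
type FetchLike<T> = (url: string, init: { signal: AbortSignal }) => Promise<T>;

async function fetchWithTimeout<T>(
  doFetch: FetchLike<T>,
  url: string,
  timeoutMs: number = 15_000,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await doFetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // never leave the timer running after the request settles
  }
}

type State = 'idle' | 'thinking' | 'responding';

// Delivery wrapper: the state returns to idle whether or not sending succeeds.
async function deliver(send: () => Promise<void>, setState: (s: State) => void): Promise<void> {
  setState('responding');
  try {
    await send();
  } catch (err) {
    console.warn('delivery failed:', err);
  } finally {
    setState('idle');
  }
}
```

The timeout turns a hang into a rejection, and the finally block turns any rejection into a state reset.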
Dedup cache not actually expiring. The dedup map used Map.has() which returns true regardless of when the entry was added. Under normal traffic, entries lived forever. A message legitimately re-delivered after 5 minutes would be silently dropped. Fixed by checking the stored timestamp on every lookup:
const now = Date.now();
const seenAt = this.recentInboundIds.get(msg.id);
// Treat the message as a duplicate only if it was seen within the 5-minute TTL.
if (seenAt !== undefined && now - seenAt < 5 * 60_000) return;
Auth errors causing reconnection storms. Socket.IO has built-in reconnection with exponential backoff. If the server rejects the connection due to a bad API key, the client would keep reconnecting forever. Fixed by calling this.stop() on auth-error. Bad credentials should fail closed.
React StrictMode double-mounting. Development mode double-mounts every component, creating two orchestrators running simultaneously — both polling Telegram, both connecting to iMessage, both processing messages. Fixed with a module-level singleton that caches the initialization promise itself so concurrent callers don't race.
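The singleton pattern, sketched under the assumption that orchestrator setup is asynchronous. Orchestrator, createOrchestrator, and getOrchestrator are hypothetical stand-ins; the point is that the promise is cached, not the resolved instance.

```typescript
// Hypothetical stand-in for the real orchestrator.
interface Orchestrator {
  id: number;
}

let initPromise: Promise<Orchestrator> | undefined;
let created = 0;

async function createOrchestrator(): Promise<Orchestrator> {
  created += 1;
  await Promise.resolve(); // stand-in for real async setup (storage, channels)
  return { id: created };
}

// Caches the in-flight promise, so a second caller that arrives before
// initialization finishes receives the same promise instead of starting another.
function getOrchestrator(): Promise<Orchestrator> {
  if (initPromise === undefined) {
    initPromise = createOrchestrator();
  }
  return initPromise;
}
```

Caching the resolved instance instead would leave a window between the first call and its completion in which a second mount could start a second initialization.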
The Pattern
The most interesting part of this isn't the Socket.IO plumbing. It's how little it took.
Adding iMessage required:
- One new file (277 lines) implementing the Channel interface
- Wiring in the orchestrator (configure, start, stop, route)
- A settings card in the UI
- Two config keys
No changes to the agent worker. No changes to the tool system. No changes to the database schema. The channel abstraction isolated the entire integration.
The future of personal AI agents isn't a single interface. It's the ability to meet users wherever they are — iMessage, WhatsApp, Slack, Discord — through a consistent agent backend that doesn't care about the transport. The agent's intelligence lives in the worker. The memory lives in IndexedDB. The personality is the system prompt. The channel is just the wire.
The iMessage channel is part of OpenBrowserClaw. The integration is available in PR #1.