The Buddy protocol has only 5 message families at the application layer. Let's break them down one by one.
6.1 / Heartbeat Snapshot (Device ← Desktop)
{
"total": 3,
"running": 1,
"waiting": 1,
"msg": "approve: Bash",
"entries": ["10:42 git push", "10:41 yarn test", "10:39 reading file..."],
"tokens": 184502,
"tokens_today": 31200,
"prompt": {
"id": "req_abc123",
"tool": "Bash",
"hint": "rm -rf /tmp/foo"
}
}
Key Contracts
Frequency Contract
Sent on every state change, plus a keep-alive at least once every 10 seconds. If 30 seconds pass without data, the connection is considered dead. The device must implement its own 30-second no-data timeout and must not rely on BLE link state.
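A minimal sketch of the device-side timeout, assuming a monotonic clock is available. The class name `HeartbeatWatchdog` and its methods are hypothetical and not part of the protocol; only the 30-second figure comes from the contract above.

```python
import time

TIMEOUT_S = 30  # from the frequency contract: 30 s without data means dead


class HeartbeatWatchdog:
    """Tracks time since the last received data, independent of BLE link state."""

    def __init__(self, timeout_s=TIMEOUT_S, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_rx = clock()

    def feed(self):
        """Call whenever any data arrives from the desktop."""
        self._last_rx = self._clock()

    def is_dead(self):
        """True once timeout_s has elapsed with no data."""
        return self._clock() - self._last_rx >= self._timeout_s
```

The main loop would call `feed()` from the receive path and poll `is_dead()` on each tick, switching the display to a disconnected state when it returns true.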
Idempotency Contract
Each heartbeat is a complete snapshot, not a delta. The device can discard old heartbeats and only look at the latest one. This greatly simplifies the state machine—no reconciliation needed, just overwrite.
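The overwrite-only handling can be sketched as follows. `DeviceState` is a hypothetical name, and the sketch assumes that a heartbeat with no pending approval simply omits the `prompt` key, which the source does not state explicitly.

```python
import json


class DeviceState:
    """Holds only the most recent heartbeat; older snapshots are discarded."""

    def __init__(self):
        self.snapshot = {}

    def on_heartbeat(self, raw):
        # Replace the whole state. Nothing is merged, so a field that
        # disappears from the snapshot disappears from the device too.
        self.snapshot = json.loads(raw)

    @property
    def pending_prompt(self):
        # Assumption: absence of "prompt" means nothing awaits approval.
        return self.snapshot.get("prompt")
```

Because each snapshot replaces the last, a pending prompt clears itself as soon as a heartbeat arrives without one, with no explicit "prompt resolved" message needed.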
Pre-aggregation Contract
entries are already strings, not structured logs; tokens are already accumulated; prompt.hint is already truncated. All aggregation is done on the desktop side; the device only displays. Complexity is pushed to the host side; the device acts as a "dumb terminal."
6.2 / Permission Approval (Device → Desktop)
{"cmd":"permission","id":"req_abc123","decision":"once"}
{"cmd":"permission","id":"req_abc123","decision":"deny"}
Contracts:
- id must be byte-for-byte equal to the previous heartbeat's prompt.id
- decision has only two values: once or deny
- No always / forever / whitelist: the protocol layer deliberately does not expose persistent authorization
The last point is key. The Claude Code desktop client itself has an "always allow" mode; Buddy intentionally prevents the device from triggering it.
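A sketch of how a device might build the reply while enforcing these contracts. The function name `build_permission_reply` is hypothetical; the message fields and the two allowed decisions are taken from the examples above.

```python
import json

ALLOWED_DECISIONS = ("once", "deny")  # the protocol exposes nothing else


def build_permission_reply(snapshot, decision):
    """Return the JSON reply for the prompt in the latest heartbeat snapshot."""
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"unsupported decision: {decision!r}")
    prompt = snapshot.get("prompt")
    if not prompt:
        raise ValueError("no pending prompt to answer")
    # Copy the id verbatim so it stays byte-for-byte equal to prompt.id.
    reply = {"cmd": "permission", "id": prompt["id"], "decision": decision}
    return json.dumps(reply, separators=(",", ":"))
```

Rejecting any other decision at the point where the message is built means a UI bug on the device cannot produce a persistent-authorization request.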
Design Intent
Physical friction is a feature, not a bug. If you could "tap once on Buddy to permanently allow git push," the device would degrade from "approver" to "numb rubber-stamper." Buddy's product value lies in the light friction of "glance + press" for each approval—this friction itself prevents a class of errors. Once you allow "always," the entire product value collapses.
This kind of restraint—"the protocol deliberately withholds a capability"—is rare. Most protocol designers think "provide it to the client, let the client choose whether to use it"—Buddy does the opposite: it forbids the client from having this choice. This is an example of a product decision hardened into the protocol.
6.3 / Turn Event (Device ← Desktop, asynchronous)
{
"evt": "turn",
"role": "assistant",
"content": [{"type": "text", "text": "..."}]
}
Triggered once after each Claude reply completes. The content array is in the SDK's native format (text / tool_use / tool_result). Events exceeding 4 KB are dropped.
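The drop rule might look like this on the sending side. This is a sketch under two assumptions the source does not confirm: that the limit applies to the UTF-8 encoded JSON frame, and that exactly 4096 bytes is still allowed. `encode_turn` is a hypothetical helper name.

```python
import json

MAX_TURN_BYTES = 4 * 1024  # the 4 KB limit from the text


def encode_turn(role, content):
    """Return the encoded turn frame, or None if it must be dropped."""
    frame = json.dumps({"evt": "turn", "role": role, "content": content})
    encoded = frame.encode("utf-8")
    if len(encoded) > MAX_TURN_BYTES:
        return None  # oversized events are dropped, not truncated or split
    return encoded
```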
Why 4 KB? The limit is driven by engineering constraints:
- Measured BLE Notify throughput is ~10–20 KB/s (depending on connection parameters)
- A 4 KB frame takes about 200–400 ms to transmit
- Replies exceeding 4 KB are generally large code blocks or long explanations, which wouldn't fit on a small screen anyway
- 4 KB is also a manageable size for the ESP32 to buffer one frame in RAM (this SoC has 320 KB of RAM by default)
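The transmit-time figure follows directly from the throughput range. A quick check, treating 1 KB as 1024 bytes (`transfer_ms` is a hypothetical helper):

```python
def transfer_ms(size_bytes, throughput_bytes_per_s):
    """Time in milliseconds to move size_bytes at a sustained throughput."""
    return 1000 * size_bytes / throughput_bytes_per_s


FRAME = 4 * 1024
slow = transfer_ms(FRAME, 10 * 1024)  # at 10 KB/s -> 400.0 ms
fast = transfer_ms(FRAME, 20 * 1024)  # at 20 KB/s -> 200.0 ms
```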
Design intent: Turn is not for the device to fully replicate the conversation, but to trigger a "Claude spoke" animation. If you want complete logs, use a different transport (USB direct, WiFi). Buddy's role on BLE is "status indicator," not "log synchronizer."
6.4 / Status Report ACK (Desktop ← Device)
{
"ack": "status",
"ok": true,
"data": {
"name": "Clawd",
"sec": true,
"bat": {"pct": 87, "mV": 4012, "mA": -120, "usb": true},
"sys": {"up": 8412, "heap": 84200},
"stats": {"appr": 42, "deny": 3, "vel": 8, "nap": 12, "lvl": 5}
}
}
Desktop polls, device responds. The weakest contract in the protocol—every field is optional; if the device doesn't support it, it simply doesn't fill it in.
stats.lvl is the "level" tracked by the device itself (leveling up every 50k tokens), reported back to the desktop. This state is not maintained by the desktop—the device's NVS is the single source of truth. Another restrained design: let the device own the state it displays; the desktop is just a consumer.
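Because every field is optional, a desktop-side reader has to tolerate missing keys at every level. A minimal sketch, with `summarize_status` and its output keys chosen purely for illustration:

```python
def summarize_status(ack):
    """Extract a few display fields from a status ack; missing fields become None."""
    if ack.get("ack") != "status" or not ack.get("ok"):
        return {}
    data = ack.get("data") or {}
    bat = data.get("bat") or {}
    stats = data.get("stats") or {}
    return {
        "name": data.get("name"),
        "battery_pct": bat.get("pct"),
        "on_usb": bat.get("usb"),
        "level": stats.get("lvl"),  # owned by the device, only read here
    }
```

The desktop never computes `level` itself. It only reads what the device reports, consistent with the device being the source of truth.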
6.5 / Folder Push — Streaming Character Pack Transfer
The most complex part of the protocol. Triggered when the Hardware Buddy window's drop target receives a folder:
Desktop: {"cmd":"char_begin","name":"bufo","total":184320}
Device: {"ack":"char_begin","ok":true}
— per file —
Desktop: {"cmd":"file","path":"manifest.json","size":412}
Device: {"ack":"file","ok":true}
Desktop: {"cmd":"chunk","d":"<base64>"}
Device: {"ack":"chunk","ok":true,"n":4096}
… repeat chunks until size bytes are transferred …
Desktop: {"cmd":"file_end"}
Device: {"ack":"file_end","ok":true,"n":412}
— pack end —
Desktop: {"cmd":"char_end"}
Device: {"ack":"char_end","ok":true}
Design Highlights
CHUNK-level ACK = Flow Control
The desktop doesn't send the next chunk until it receives the ack for the previous one: application-layer stop-and-wait. The BLE link layer has its own ACKs, but the application-layer ack lets the protocol sense that the device's flash write is slow. The device simply delays the ack, and the desktop naturally slows down.
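The desktop side of this stop-and-wait loop for a single file can be sketched as below. The `send` and `recv` callables stand in for whatever BLE transport is used, and the `chunk_size` default is an assumption, since the source does not state the real chunk size. `send_file` is a hypothetical name.

```python
import base64


def send_file(path, data, send, recv, chunk_size=128):
    """Send one file with stop-and-wait; returns the device's final byte count."""

    def request(msg, expected_ack):
        send(msg)
        ack = recv()  # blocks until the device answers; a slow flash write delays this
        if ack.get("ack") != expected_ack or not ack.get("ok"):
            raise RuntimeError(f"device rejected {expected_ack}: {ack}")
        return ack

    request({"cmd": "file", "path": path, "size": len(data)}, "file")
    for offset in range(0, len(data), chunk_size):
        piece = data[offset:offset + chunk_size]
        encoded = base64.b64encode(piece).decode("ascii")
        request({"cmd": "chunk", "d": encoded}, "chunk")  # next chunk waits for this ack
    return request({"cmd": "file_end"}, "file_end").get("n", 0)
```

There is no window and no retransmission logic here: throughput is bounded by one round trip per chunk, which is the price paid for letting the device pace the transfer.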
n field = Progress
Returns cumulative byte count, providing the desktop with progress bar data—no separate calculation needed.
PATH validation = Receiver's responsibility
The protocol specification requires the receiver to reject .. and absolute paths. This is defense-in-depth expressed as protocol compliance: even if the desktop is compromised and sends a malicious path, the device itself must still reject it.
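A receiver-side check along these lines would cover both cases. `is_safe_pack_path` is a hypothetical name, and treating backslashes as separators and rejecting drive-letter prefixes are extra precautions assumed here, not requirements quoted from the specification.

```python
def is_safe_pack_path(path):
    """Accept only relative paths with no parent-directory components."""
    if not path:
        return False
    normalized = path.replace("\\", "/")  # assumption: treat backslash as a separator
    if normalized.startswith("/"):
        return False  # absolute path
    if len(normalized) > 1 and normalized[1] == ":":
        return False  # drive-letter style prefix such as C:
    # Reject ".." only as a whole path component, so "a..b.gif" stays valid.
    return ".." not in normalized.split("/")
```

The device would run this on the `path` field of every `file` command and answer with `"ok": false` when it fails.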
Content-agnostic = Protocol Reuse
GIFs, configs, firmware images all work. This means the protocol can extend to support new content types in the future without modifying the wire format—OTA, wallpaper, sound packs. Protocol = abstraction, not a specific feature.
6.6 / Character Pack Manifest Format
Within the content transferred by Folder Push, the schema of manifest.json:
{
"name": "bufo",
"colors": {
"body": "#6B8E23",
"bg": "#000000",
"text": "#FFFFFF",
"textDim": "#808080",
"ink": "#000000"
},
"states": {
"sleep": "sleep.gif",
"idle": ["idle_0.gif", "idle_1.gif", "idle_2.gif"],
"busy": "busy.gif",
"attention": "attention.gif",
"celebrate": "celebrate.gif",
"dizzy": "dizzy.gif",
"heart": "heart.gif"
}
}
- GIF width fixed at 96px, height limit ~140px (M5StickCPlus screen is 135×240 portrait)
- Total pack < 1.8 MB; gifsicle --lossy=80 -O3 --colors 64 typically compresses 40–60%
- idle can be an array: switch to the next GIF at the end of each loop (idle carousel)
- tools/prep_character.py for batch resizing; tools/flash_character.py to skip BLE and flash directly via USB
This manifest is the only place in the Buddy protocol with a strong constraint on specific content format. All other wire formats are structure (message families); the manifest is data (assets). The benefit of isolating the content format into a small schema: the protocol stays unchanged while assets can evolve independently—v2 adds a sound field linking to audio files; old devices read and ignore it, new devices enable it.
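A tolerant manifest reader illustrates both the idle carousel and the forward-compatibility argument. This is a sketch: `load_manifest` and `next_idle_index` are hypothetical helpers, and the `sound` field in the usage example is the hypothetical v2 addition mentioned above, not an existing field.

```python
import json


def load_manifest(raw):
    """Parse manifest.json, keeping known fields and ignoring unknown ones."""
    doc = json.loads(raw)
    states = {}
    for state, value in doc.get("states", {}).items():
        # A state maps to one GIF or a list of GIFs; normalize to a list.
        states[state] = [value] if isinstance(value, str) else list(value)
    return {"name": doc["name"], "colors": doc.get("colors", {}), "states": states}


def next_idle_index(frames, index):
    """Advance the carousel by one GIF, wrapping at the end of the list."""
    return (index + 1) % len(frames)
```

For example, a manifest with an extra top-level key still loads on an old reader:

```python
raw = '{"name": "bufo", "sound": "croak.wav", "states": {"sleep": "sleep.gif", "idle": ["idle_0.gif", "idle_1.gif"]}}'
manifest = load_manifest(raw)  # "sound" is silently ignored
```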