Hardware upgrades, cloud region moves, and disaster drills all touch the same OpenClaw workspace: markdown personas, tool manifests, model caches, and launchd paths that must stay coherent. Copying an entire home directory often ships stale secrets plus 120 GB of cache you never needed. This 2026 guide defines exclude rules, SHA-256 verification, seven ordered steps, a before/after checklist table, large-transfer bandwidth math, DNS cutover cautions, rollback timeboxes, and FAQ-style troubleshooting when the gateway loops after migration.
Baseline daemon health with headless onboard acceptance; secrets with Keychain and .env; upgrades with operations runbooks. Use VNC when you must click TCC prompts mid-migration.
Three path dependencies you must resolve before tar
- Workspace root: If
WorkingDirectoryis hard-coded in a plist, the destination user or plist must match—copying files alone is the top failure mode. - Caches: Model and npm caches may consume 30–120 GB; exclude them unless bandwidth and time windows allow full fidelity.
- Bind addresses: A gateway bound to a specific en0 IP breaks health checks after DHCP changes—document intended bind mode before you cut traffic.
Write acceptable drift into the change ticket: for example first-response latency may exceed baseline by 20% while caches warm, but tool-call failure rate must not move. Numeric acceptance beats subjective "feels fine" arguments at 2 a.m.
Archive exclude guidance
| Path pattern | Include? | Notes |
|---|---|---|
| SOUL.md, TOOLS.md, AGENTS.md | Required | Usually under 5 MB |
| **/node_modules | Usually exclude | Reinstall with npm ci for reproducibility |
| Model weight caches | Policy choice | Narrow uplinks may accept 15 min warm-up post-cut |
Large transfers and resumable sync
Beyond 50 GB, prefer rsync --partial or multipart object uploads, and checksum both sides. At 100 Mbps symmetric, 80 GB needs about 110 minutes in theory—real WAN links often double that, so communicate worst-case windows. Never push sensitive tarballs through unencrypted commodity buckets; wrap with customer KMS when a provider object store sits in the middle.
Seven-step migration sequence
- Read-only snapshot: Stop the gateway or flip a maintenance label so
MEMORY.mddoes not mutate mid-copy. - Archive and hash: Record SHA-256 in the ticket body, not only chat.
- Transfer: Prefer private networks; restrict ACLs on temporary storage.
- Extract to final path: Match ownership to the service user.
- Install dependencies: Pin Node minor and honor lockfiles—silent major jumps masquerade as "OpenClaw broke."
- Align LaunchAgent: Verify
ProgramArgumentsandEnvironmentVariables. - Dual acceptance: Three consecutive health checks plus a minimal tool invocation.
Before/after checklist table
| Check | Pre baseline | Post expectation |
|---|---|---|
| openclaw semver | Recorded triple | Match, or document intentional upgrade with doctor output |
| API key fingerprint | Last 6 hex chars | Match unless keys rotated in tandem |
| Listening port | lsof capture |
Match or security-group docs updated |
Rollback timebox: If dual acceptance fails within 45 minutes, default to the old gateway in read-only mode instead of dual-writes—split-brain states and double billing follow messy parallel paths. Keep read-only legacy online no longer than 72 hours unless compliance mandates longer archival.
FAQ-style troubleshooting after cutover
May I migrate config without caches?
Yes—often preferred. Ship a lightweight bundle of markdown and lockfiles, reinstall dependencies on the target, and accept 2–4× slower first responses while caches repopulate. For low-latency SLAs, incremental rsync the cache directory with the same checksum discipline as the main archive.
Gateway restart loops?
Compare plist environment variables for stray whitespace introduced during copy-paste. Validate Node absolute paths differ between hosts. Foreground the binary temporarily, capture the last 200 stderr lines, and attach them to the ticket—future you will appreciate the paper trail.
Should staging mirror production disk layout?
Yes. Rehearse the same paths, usernames, and plist locations on a non-production Mac mini quarterly. Teams that only practice migrations on laptops discover path surprises at the worst possible moment—Friday evening before a marketing launch. Staging drills cost far less than executive escalations.
DNS, ingress, and multi-gateway cautions
Freeze inbound routes that still target the old instance—Slack event subscriptions or IP allowlists otherwise deliver traffic to a host you thought was idle. Drop DNS TTL to about 60 seconds during the move for faster rollback, then restore longer TTL after 24 hours of stability to cut resolver chatter.
Publish a one-page migration diff: archive bytes, exclude list, old versus new ports, and doctor output diff. Searchable documentation beats Slack scrollback when someone asks three months later why node_modules stayed excluded.
Automation hooks: when to script versus stay manual
First-time migrations benefit from careful manual checkpoints—humans catch odd plist quoting and unexpected file modes. After you succeed twice with identical exclude lists, wrap the tar, hash, and rsync steps in a version-controlled script tagged with the OpenClaw version. Never let automation skip the checksum compare; silent bit-flip on cheap Wi-Fi has corrupted more archives than you would expect. Keep the script idempotent so reruns after partial failure do not duplicate directories.
Integrate migration events with your calendar: notify downstream teams when first-response SLO may widen, and attach the rollback command snippet to the calendar description. Operators searching their inbox six months later will find the context faster than digging through ephemeral chat threads.
Latency, residency, and compliance
Moving from Hong Kong to Tokyo may change RTT from 25 ms to 55 ms, surfacing hard-coded timeouts in tools. Logs may contain conversational PII—encrypt archives in transit and register processing in your ROPA. NodeMac regions across Hong Kong, Japan, South Korea, Singapore, and the United States let you place gateways next to data without rewriting application code.
Mac mini M4 unified memory and cache bandwidth rebuild dependencies quickly after migration, and native macOS avoids container path surprises. Renting dedicated hosts through NodeMac keeps SSH/VNC workflows identical from old host to new, turning hardware moves into repeatable operations instead of one-off heroics. Bare-metal predictability also makes before/after table comparisons trustworthy when you argue whether an incident was config drift or upstream model behavior. Treat each successful migration as a reusable template, not a one-off hero story.