Architecture · April 8, 2026

2026 decision matrix: multi-project Runner reservations, preemption, and cooldown on a dedicated Mac mini M4

NodeMac Team

build-infra editorial team

Once concurrency slices and shared-pool fairness are in place, the next failure mode is political: Project A “needs the Mac now,” Project B already has work running, and someone asks ops to “just bump priority.” Without explicit rules, every escalation becomes a manual reboot of trust. This 2026 guide treats dedicated Mac mini M4 hosts as a finite set of reservable slots, defines when preemption is legal, and adds cooldown windows so high-priority traffic cannot thrash runners or starve baseline teams. You get two matrices, eight implementation steps, and explicit ties to concurrency and capacity-lending playbooks you may already run.

Baseline fairness and slices: concurrency slices and CI/agent pool fairness. Temporary surges: capacity lending. Queue depth: wait-time SLOs. If you label hosts for automation, keep the same CMDB keys as your dispatchable Mac nodes. Pricing and connectivity details live on the pricing and help pages.

Reservations vs elastic overflow: vocabulary that prevents arguments

A reservation is a contract: team X is entitled to up to R concurrent heavy jobs on label mac-ci-heavy between 09:00 and 21:00 local, regardless of who else is in the building. Elastic overflow is everything else: best-effort jobs that may run when spare capacity exists but must yield when a reservation holder is queued. If your orchestrator cannot express both concepts, you will keep re-implementing them in Slack threads. Reservations should be few, time-bounded, and attached to a named cost center; elastic work should be the default for experiments and low-severity branches.
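The contract described above can be sketched as a small data model. This is a minimal illustration, assuming a scheduler that can query reservations at dispatch time; the class and field names (`Reservation`, `max_concurrent`, the `team-x` cost center) are hypothetical, not any real orchestrator API.

```python
from dataclasses import dataclass
from datetime import time

@dataclass(frozen=True)
class Reservation:
    team: str            # named cost center that owns the entitlement
    label: str           # runner label the contract applies to (e.g. mac-ci-heavy)
    max_concurrent: int  # R: entitled concurrent heavy jobs
    window_start: time   # local time the contract window opens
    window_end: time     # local time the contract window closes

    def active_at(self, now: time) -> bool:
        """True when the contract window covers the current local time."""
        return self.window_start <= now <= self.window_end

# Elastic work is simply everything without a Reservation: it runs on spare
# capacity and must yield when a reservation holder is queued.
r = Reservation("team-x", "mac-ci-heavy", 3, time(9, 0), time(21, 0))
```

Keeping these records few and explicit is the point: if a job has no `Reservation`, it is elastic by default.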

On Apple Silicon, “spare capacity” is not just CPU. Memory footprint of two overlapping xcodebuild jobs can leave almost no headroom for a third, even when Activity Monitor looks comfortable. Tie reservation counts to the same workload classes you used in the concurrency matrix—heavy compile, UI simulators, lint-only—rather than a single global number per team. That alignment stops the scenario where a team books “three slots” but each slot is actually a UI suite that should have been one.

Preemption policy matrix (when killing or re-queuing is allowed)

| Incoming job | Victim workload | Preempt? |
| --- | --- | --- |
| Sev-1 release hotfix (documented incident) | Elastic lint / docs build | Yes: re-queue victim with reason code |
| Sev-1 release hotfix | Another team’s reserved heavy slot in contract window | No: use spare host or executive exception with audit trail |
| VP “urgent” without incident ID | Any | No: treat as elastic; escalate capacity, not preemption |
| Scheduled nightly soak | Interactive developer CI | No: move soak start time or use dedicated soak labels |

The key invariant is that preemption is a governance action, not a scheduler knob anyone can flip. Each allowed preemption should emit three fields: the incident or change ticket ID, the victim job IDs, and the engineer who authorized it. If you cannot produce those three fields in under a minute during an audit, your policy is fiction. Most teams find that forbidding preemption against reservation holders (except via a written executive exception) reduces weekend pages more than any tuning of timeouts.
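A sketch of enforcing that invariant in code: refuse the action unless all three governance fields are present, and return an audit record. The record shape and `preempt` helper are assumptions for illustration; the `INC-` ticket format is a placeholder.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class PreemptionRecord:
    ticket_id: str             # incident or change ticket authorizing the kill
    victim_job_ids: list[str]  # jobs re-queued or killed
    authorized_by: str         # engineer who approved the action
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def preempt(ticket_id: str, victim_job_ids: list[str], authorized_by: str) -> PreemptionRecord:
    # Governance gate: no ticket, no victims, or no actor -> no preemption.
    if not (ticket_id and victim_job_ids and authorized_by):
        raise PermissionError("preemption requires ticket, victims, and actor")
    rec = PreemptionRecord(ticket_id, list(victim_job_ids), authorized_by)
    # ...re-queue victims with a reason code, then persist rec to the audit log...
    return rec
```

Because the gate lives in one place, the "produce three fields in under a minute" audit test reduces to querying stored records.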

Cooldown and anti-thrash parameters

| Event | Suggested cooldown | Why |
| --- | --- | --- |
| Job preempted (re-queued) | 15–30 min before the same project can preempt again | Stops ping-pong between two “urgent” teams |
| Reservation window starts | 5 min grace, no new preemptions | Lets in-flight elastic jobs drain cleanly |
| Host returns from maintenance | 10 min warm-up, elastic jobs only | Avoids slamming cold caches with heavy jobs |

Metric hook: track preemption count per week and correlate with p95 wall-clock regression. If preemptions rise but SLOs do not improve, you are thrashing—add hosts or narrow reservation windows instead of more kills.
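That metric hook can be expressed as a single heuristic: more kills without a matching p95 improvement means thrashing. This is a sketch; the 5% improvement floor is an assumed threshold, not a recommendation from any SRE standard.

```python
def thrashing(preempts_prev: int, preempts_now: int,
              p95_prev_s: float, p95_now_s: float,
              improvement_floor: float = 0.05) -> bool:
    """True when preemptions rose week-over-week but p95 wall-clock
    did not improve by at least improvement_floor (default 5%)."""
    more_kills = preempts_now > preempts_prev
    p95_improved = p95_now_s <= p95_prev_s * (1 - improvement_floor)
    return more_kills and not p95_improved
```

When this fires, the remedy from the text applies: add hosts or narrow reservation windows, not more kills.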

Eight-step rollout checklist

  1. Inventory labels: ensure mac-ci-* and agent labels do not accidentally satisfy the same reservation rules.
  2. Encode reservations in Git (YAML) with owner, window, and R per workload class.
  3. Orchestrator hooks: implement “reservation token” or queue priority that cannot be overridden from a repo file.
  4. Preemption API: single internal endpoint or slash-command that logs ticket + victims + actor.
  5. Cooldown state: store last preemption timestamp per project in your queue metadata or Redis with TTL.
  6. Dashboards: one panel for “reserved slots in use vs entitled” and one for “preemptions per day.”
  7. Game day: rehearse Sev-1 hotfix with a staged elastic victim; verify audit log completeness.
  8. Quarterly review: sunset reservations that stayed below 40% utilization for eight weeks—unused entitlement rots trust.
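Step 2 (reservations in Git) pays off only if the entries are validated in CI before merge. A minimal sketch of that check follows; the key names (`owner`, `window`, `slots`) and the workload-class list mirror the checklist and the slice matrix but are otherwise assumptions about your schema.

```python
REQUIRED = {"owner", "window", "slots"}
CLASSES = {"heavy-compile", "ui-sim", "lint-only"}  # classes from the slice matrix

def validate(entry: dict) -> list[str]:
    """Return a list of schema errors for one reservation entry (empty = valid)."""
    errors = [f"missing key: {k}" for k in REQUIRED - entry.keys()]
    for cls, r in entry.get("slots", {}).items():
        if cls not in CLASSES:
            errors.append(f"unknown workload class: {cls}")
        elif not (isinstance(r, int) and r > 0):
            errors.append(f"R must be a positive int for {cls}")
    return errors

entry = {"owner": "team-x", "window": "09:00-21:00",
         "slots": {"heavy-compile": 2, "ui-sim": 1}}
```

Running this as a pre-merge check keeps finance, SRE, and the requesting team arguing in a pull request instead of in Slack.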

How this pairs with concurrency slices and disk policy

Reservations set who may occupy slots; concurrency slices set how many jobs one host can run safely. If you grant a team two reserved heavy slots but your per-host slice only allows one heavy compile without swap, you have created a contractual lie. Before publishing reservation numbers, replay them against the slice matrix from the fairness article and against disk and artifact retention—preemption storms often correlate with DerivedData storms because killed jobs leave partial state on disk.
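The "contractual lie" check above is trivially automatable: compare the sum of contracted heavy slots against the fleet's safe concurrency from the slice matrix. The numbers in the example are illustrative, not measured M4 limits.

```python
def overcommitted(reserved_per_team: dict[str, int],
                  hosts: int, safe_heavy_per_host: int) -> int:
    """Slots promised beyond physical safe capacity (0 means honest)."""
    promised = sum(reserved_per_team.values())
    capacity = hosts * safe_heavy_per_host
    return max(0, promised - capacity)

# Two teams with 2 reserved heavy slots each on 3 hosts that each safely
# run 1 heavy compile: 4 promised vs 3 safe -> 1 slot of "air".
air = overcommitted({"team-a": 2, "team-b": 2}, hosts=3, safe_heavy_per_host=1)
```

Publishing this single number alongside the reservation YAML makes overcommit a merge-blocking fact rather than a weekend discovery.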

For orgs running OpenClaw or similar automation on the same fleet, treat agent traffic as a separate reservation class with its own R_agent, or forbid preemption of agent hosts entirely during business hours. Mixing “CI preemption” with “agent gateway restart” without coordination produces the worst class of flaky automation: jobs that pass in staging because no one preempted there.

Finance and chargeback without fantasy spreadsheets

Reservations are a natural input to internal chargeback: each reserved slot-hour can be priced from your Mac fleet amortization plus power and colo. Elastic usage can stay on a lower rate or be bundled into platform engineering. The mistake is letting finance invent slot counts without SRE sign-off—when reservations exceed physical safe concurrency, either projects pay for air, or SRE pays with pages. Publish a one-pager that shows “max safe heavy slots per M4” beside “sum of contracted reservations” every month.
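The slot-hour price in that one-pager is simple arithmetic. A sketch, with placeholder figures (none of these are NodeMac prices or real amortization schedules):

```python
def slot_hour_rate(fleet_capex: float, amort_months: int,
                   monthly_opex: float, safe_slots: int,
                   hours_per_month: float = 730.0) -> float:
    """Internal chargeback rate per reserved slot-hour:
    (monthly amortization + power/colo opex) spread over safe slot-hours."""
    monthly_cost = fleet_capex / amort_months + monthly_opex
    return monthly_cost / (safe_slots * hours_per_month)

# Illustrative: $30k fleet over 36 months, $400/mo opex, 6 safe heavy slots.
rate = slot_hour_rate(fleet_capex=30_000, amort_months=36,
                      monthly_opex=400, safe_slots=6)
```

Note that `safe_slots` comes from SRE's slice matrix, not from finance; that dependency is exactly the sign-off the paragraph demands.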

Common anti-patterns

  - Unlimited priority levels (P0–P9) with no preemption rules: everyone marks P0.
  - Silent preemption without victim notification: developers lose half a day reproducing “flaky CI.”
  - Permanent reservations for teams that ship monthly: capacity hoarding.
  - Cooldown set to zero: two VPs alternate preempting each other’s pipelines.

Fixing these is less about tools than about publishing the matrices above in the same repo that defines your runner images.

When you need additional dedicated hosts to validate reservation math before a fiscal quarter lock, short-lived Mac mini M4 rentals with SSH/VNC in multiple regions let you replay production labels without a capital request. NodeMac operates nodes in Hong Kong, Japan, Korea, Singapore, and the United States so you can mirror label topology and prove safe R values with real builds, not slide decks.

Ready to validate your reservation model on M4?

HK·JP·KR·SG·US: dedicated Mac mini M4 hosts with SSH/VNC.

NodeMac Cloud Mac
Deployment in ~5 min

Dedicated Apple Silicon Macs in the cloud. SSH/VNC across HK·JP·KR·SG·US.

Get started