In 2026, most indie devs are juggling Cursor, OpenClaw, homegrown scripts, and two or three CLI agents at once. Every tool wants its own API key. Every vendor ships a different base URL. The real headache isn't whether the model is smart enough—it's that you don't have a unified AI gateway: keys scattered everywhere, bills that never reconcile, and one provider outage taking down your entire stack. This post walks you through a copy-paste setup: VPSSpark Cloud Mac as control plane and execution layer, OpenRouter for upstream model aggregation, LiteLLM as your self-hosted gateway shell—a personal, enterprise-grade entry point with virtual keys, budget circuit breakers, and model fallbacks in about thirty minutes.
The gateway exposes http://127.0.0.1:4000 outward and routes inward to OpenRouter. Point Cursor, OpenClaw, and scripts at the local gateway. Master keys never touch client machines.
Fig. 1 · Cloud Mac + OpenRouter personal AI gateway stack
Why Cloud Mac + OpenRouter—not OpenRouter alone?
OpenRouter is a managed gateway: sign up, get a key, and you're routing. Its official docs describe an API that's largely OpenAI Chat Completions–compatible—great for fast integration. What it solves is upstream aggregation, not your governance boundary. You can't hand Cursor and OpenClaw separate, independently revocable keys. You can't easily layer team-level spend caps on top of provider billing. And you definitely can't colocate the gateway with macOS-native tooling—Xcode, AppleScript, local MCP—in the same execution environment.
A Cloud Mac merges control plane and Apple-ecosystem execution. The gateway runs under launchd; secrets live only in the server's .env. When you need OpenClaw, local Git, or an iOS build trigger, you don't shuttle context back to your laptop. If you already run a gateway on a Linux VPS, keep the VPS for IM/webhooks and dedicate the Cloud Mac to gateway + builds—see the Linux VPS OpenClaw gateway CI/CD matrix for that split. For launchd specifics on Cloud Mac, check the cloud Mac launchd FAQ.
Architecture: what each layer does
Get the responsibilities straight before you touch config—saves a lot of debugging later.
| Layer | Component | Owns | Does not |
|---|---|---|---|
| Upstream | OpenRouter | Unified billing, provider fallback, pay-per-token | Replace your key governance or network isolation |
| Gateway | LiteLLM Proxy (Cloud Mac) | Virtual keys, routing table, logs, budgets, OpenAI-compatible egress | Host long-lived chat sessions (that's OpenClaw's job) |
| Execution | Cloud Mac + OpenClaw | 24/7 agents, MCP, macOS automation, CI triggers | Push master API keys to laptops |
Master keys live only in the Cloud Mac's .env; clients always get virtual keys.
Pre-flight checklist (15 minutes)
Work through this in order. It eliminates most "can't connect" rabbit holes.
- OpenRouter account with an API key and a monthly credit cap set.
- Cloud Mac reachable via SSH, arm64 Apple Silicon, macOS 14+.
- Homebrew installed; Python 3.11+ or Docker Desktop (pick one; we use pip here, the lightest path).
- Gateway binds 127.0.0.1:4000 only. Remote access via Tailscale or SSH tunnel; never expose 4000 to the open internet.
- Laptop-side Cursor / OpenClaw can SSH to the Cloud Mac (key auth, password login disabled).
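The locally checkable items on this list can be scripted. A minimal sketch, assuming you run it with the interpreter you plan to use for the gateway; `is_loopback` and `preflight` are illustrative names, not part of any tool mentioned here.

```python
import ipaddress
import platform
import sys


def is_loopback(bind_host: str) -> bool:
    """True only if bind_host is a literal loopback IP such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        # Hostnames such as "localhost" are rejected on purpose: be explicit.
        return False


def preflight(bind_host: str = "127.0.0.1") -> dict:
    """Collect pass/fail results for the checks that can run on the machine itself."""
    return {
        "python_3_11_plus": sys.version_info >= (3, 11),
        "apple_silicon": platform.machine() == "arm64",
        "loopback_bind": is_loopback(bind_host),
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{'ok  ' if ok else 'FAIL'} {name}")
```

SSH reachability and the OpenRouter credit cap still need a manual look; the script only covers what the machine can tell you about itself.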
New to Cloud Mac? Start with the Mac VPS guide for basics and sizing.
Step 1: Configure OpenRouter upstream
In the OpenRouter console, create a key named cloud-mac-gateway. Enable a credit limit (e.g. $20/month) as a hard fuse. Once you have the key, write it to the Cloud Mac immediately—never commit it to Git.
Verify upstream from the Cloud Mac (replace $OPENROUTER_API_KEY):
export OPENROUTER_API_KEY="sk-or-v1-xxxxxxxx"
curl -s https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer $OPENROUTER_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [{"role": "user", "content": "ping"}]
}' | head -c 400
JSON with choices means upstream is healthy. Model IDs use provider/model format—see OpenRouter's Models page for the full list. Common picks: anthropic/claude-sonnet-4, openai/gpt-4o, google/gemini-2.5-pro-preview.
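If you would rather script this check than eyeball it, the rule "JSON with choices means healthy" fits in a few lines. A sketch that only inspects a response body you already have; `looks_healthy` is a hypothetical helper, and it assumes the OpenAI-style response shape with a `choices` list.

```python
import json


def looks_healthy(body: str) -> bool:
    """Return True if body is a JSON object with a non-empty `choices` list."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    choices = data.get("choices") if isinstance(data, dict) else None
    return isinstance(choices, list) and len(choices) > 0
```

Feed it the full curl output rather than the `head -c 400` slice, since truncated JSON will not parse.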
Step 2: Install LiteLLM gateway on Cloud Mac
LiteLLM is an open-source LLM gateway; docs live at docs.litellm.ai. It wraps OpenRouter behind your own OpenAI-compatible endpoint and adds virtual keys plus spend tracking.
# Inside the Cloud Mac SSH session
brew install python@3.12
mkdir -p ~/ai-gateway && cd ~/ai-gateway
python3.12 -m venv .venv
source .venv/bin/activate
pip install 'litellm[proxy]'
Create config.yaml—the routing heart of your gateway. This example defaults to Claude Sonnet with GPT-4o mini fallback (OpenRouter's models array triggers fallback):
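model_list:
  - model_name: smart
    litellm_params:
      model: openrouter/anthropic/claude-sonnet-4
      api_key: os.environ/OPENROUTER_API_KEY
      # Forwarded to OpenRouter as its fallback list, so these are
      # OpenRouter model IDs without LiteLLM's openrouter/ prefix.
      models:
        - anthropic/claude-sonnet-4
        - openai/gpt-4o-mini
  - model_name: fast
    litellm_params:
      model: openrouter/openai/gpt-4o-mini
      api_key: os.environ/OPENROUTER_API_KEY

litellm_settings:
  drop_params: true
  set_verbose: false

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  # Virtual keys need a database. LiteLLM's docs describe PostgreSQL here;
  # confirm your version accepts a SQLite URL before relying on it.
  database_url: "sqlite:///./litellm.db"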
Create .env in the same directory (chmod 600):
OPENROUTER_API_KEY=sk-or-v1-xxxxxxxx
LITELLM_MASTER_KEY=sk-local-master-xxxxxxxx
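A quick way to confirm the chmod 600 actually took effect is to check the permission bits. A minimal sketch using only the standard library; `is_owner_only` is an illustrative name.

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """True if no group or other permission bits are set (for example mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

Calling `is_owner_only(os.path.expanduser("~/ai-gateway/.env"))` before starting the proxy catches a file that was created with default permissions.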
Start the proxy (foreground first):
cd ~/ai-gateway
source .venv/bin/activate
set -a && source .env && set +a
litellm --config config.yaml --host 127.0.0.1 --port 4000
In another terminal, loopback test with the master key:
curl -s http://127.0.0.1:4000/v1/chat/completions \
-H "Authorization: Bearer $LITELLM_MASTER_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "smart",
"messages": [{"role": "user", "content": "gateway ok?"}]
}'
Next, open the LiteLLM admin UI (start with litellm --config config.yaml --detailed_debug, then hit the /ui path from the docs) to mint separate virtual keys for Cursor and OpenClaw, with $5/$10 monthly budgets. If one client leaks, revoke that virtual key only; the OpenRouter master key stays put.
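Virtual keys can also be minted over HTTP instead of through the UI. The sketch below only builds the request. It assumes LiteLLM's `/key/generate` endpoint with the `key_alias`, `models`, `max_budget`, and `budget_duration` fields as described in its virtual-key docs, and a proxy started with a master key and a working database. Verify the field names against your installed version; `key_request` is an illustrative helper.

```python
import json
import urllib.request


def key_request(base_url: str, master_key: str, alias: str,
                models: list, max_budget_usd: float) -> urllib.request.Request:
    """Build (but do not send) a request that asks the proxy for a virtual key."""
    body = {
        "key_alias": alias,            # human-readable label, e.g. "cursor"
        "models": models,              # aliases from config.yaml this key may call
        "max_budget": max_budget_usd,  # spend cap in USD
        "budget_duration": "30d",      # assumed reset window for the cap
    }
    return urllib.request.Request(
        url=f"{base_url}/key/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` on the Cloud Mac sends it. Run it once per client so that Cursor and OpenClaw each get their own revocable key.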
Step 3: launchd for 24/7 uptime
Once verified, hand the gateway to launchd so SSH disconnects don't kill it. Create ~/Library/LaunchAgents/com.vpsspark.litellm.plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key><string>com.vpsspark.litellm</string>
<key>ProgramArguments</key>
<array>
<string>/Users/YOUR_USER/ai-gateway/.venv/bin/litellm</string>
<string>--config</string>
<string>/Users/YOUR_USER/ai-gateway/config.yaml</string>
<string>--host</string><string>127.0.0.1</string>
<string>--port</string><string>4000</string>
</array>
<key>EnvironmentVariables</key>
<dict>
<key>OPENROUTER_API_KEY</key><string>sk-or-v1-xxx</string>
<key>LITELLM_MASTER_KEY</key><string>sk-local-master-xxx</string>
</dict>
<key>RunAtLoad</key><true/>
<key>KeepAlive</key><true/>
<key>StandardOutPath</key>
<string>/Users/YOUR_USER/ai-gateway/litellm.log</string>
<key>StandardErrorPath</key>
<string>/Users/YOUR_USER/ai-gateway/litellm.err</string>
</dict>
</plist>
Load and verify:
launchctl load ~/Library/LaunchAgents/com.vpsspark.litellm.plist
launchctl list | grep litellm
curl -fsS http://127.0.0.1:4000/health/liveliness || echo "check logs"
For deeper Cloud Mac persistence and troubleshooting, see the cloud Mac launchd FAQ section on login sessions vs. background daemons—gateway and OpenClaw can each have their own plist without stepping on restarts.
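Hand-editing the plist means pasting secrets into XML. One alternative is to generate the file from the existing .env. A sketch using the standard-library `plistlib`; `parse_env` and `build_plist` are hypothetical helpers, and the paths mirror the ~/ai-gateway layout used in this post.

```python
import plistlib
from pathlib import PurePosixPath


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def build_plist(home: PurePosixPath, env: dict) -> bytes:
    """Return launchd job XML equivalent to the hand-written plist."""
    gateway = home / "ai-gateway"
    job = {
        "Label": "com.vpsspark.litellm",
        "ProgramArguments": [
            str(gateway / ".venv" / "bin" / "litellm"),
            "--config", str(gateway / "config.yaml"),
            "--host", "127.0.0.1",
            "--port", "4000",
        ],
        "EnvironmentVariables": env,
        "RunAtLoad": True,
        "KeepAlive": True,
        "StandardOutPath": str(gateway / "litellm.log"),
        "StandardErrorPath": str(gateway / "litellm.err"),
    }
    return plistlib.dumps(job)  # XML is plistlib's default output format
```

Write the returned bytes to ~/Library/LaunchAgents/com.vpsspark.litellm.plist and restrict that file to mode 600, since it now carries the same secrets as .env.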
Step 4: Wire up Cursor, OpenClaw, and scripts
Every client changes two things: base URL points at your gateway; API key is a virtual key (master key for admin only).
Cursor: Settings → Models → Override OpenAI Base URL → http://127.0.0.1:4000/v1 (if Cursor runs on your laptop and the gateway is on Cloud Mac, tunnel first: ssh -L 4000:127.0.0.1:4000 user@cloud-mac, or use a Tailscale address). Model names: smart / fast, matching model_name in config.yaml.
OpenClaw: Per the gateway configuration docs, set the LLM provider to OpenAI-compatible: OPENAI_API_BASE=http://127.0.0.1:4000/v1, OPENAI_API_KEY=<virtual-key>. OpenClaw on the Cloud Mac itself needs no tunnel. OpenClaw on a Linux VPS? Either move LiteLLM to the VPS or use private networking—never put the master key in a public compose repo on the VPS.
Scripts: Anything using the OpenAI SDK just changes base_url:
from openai import OpenAI

# Point the SDK at the local gateway and authenticate with a virtual key.
client = OpenAI(
    base_url="http://127.0.0.1:4000/v1",
    api_key="sk-virtual-cursor-xxxxxxxx",
)

resp = client.chat.completions.create(
    model="smart",  # alias from config.yaml, not a real model ID
    messages=[{"role": "user", "content": "Summarize today's commits"}],
)
print(resp.choices[0].message.content)
Routing and cost control
An enterprise-grade gateway isn't about calling models—it's about knowing when to call the expensive ones. Three routing rules for a personal stack:
- Default to fast: completions, formatting, simple Q&A on gpt-4o-mini, roughly one-tenth the cost of Sonnet-class models.
- Explicit smart upgrade: architecture design, complex refactors, multi-file reasoning. Switch client or OpenClaw routing to smart (with OpenRouter fallback chain).
- Two-layer budgets: the OpenRouter console sets the total credit cap; LiteLLM virtual keys set per-client caps. Either one tripping stops spend: a per-client cap cuts off that client, the total cap cuts off everything.
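The first two rules reduce to a small lookup in a script that decides which alias to call. A sketch only: the task labels are made up for illustration, and anything unrecognized falls back to the cheap alias.

```python
# Hypothetical task labels that justify the expensive alias.
SMART_TASKS = {"architecture", "refactor", "multi_file_reasoning"}


def choose_alias(task_kind: str) -> str:
    """Return the gateway alias for a task, defaulting to the cheap one."""
    return "smart" if task_kind in SMART_TASKS else "fast"
```

Defaulting to fast means a mislabeled task costs you quality on one call instead of money on every call.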
OpenRouter accepts a models array for provider-level fallback. LiteLLM's model_list decouples business aliases (smart/fast) from real model IDs—swap models in YAML later, zero client changes.
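To make the alias decoupling concrete, here is roughly what the gateway does on your behalf: look up the alias, then send OpenRouter a body whose `models` array lists the fallback chain. This is an illustrative sketch, not LiteLLM's actual code. The `ALIASES` table mirrors the config.yaml in this post, and including the primary model in `models` follows that config.

```python
# Business alias -> ordered OpenRouter model IDs (primary first, then fallbacks).
ALIASES = {
    "smart": ["anthropic/claude-sonnet-4", "openai/gpt-4o-mini"],
    "fast": ["openai/gpt-4o-mini"],
}


def upstream_body(alias: str, messages: list) -> dict:
    """Translate an alias into the request body sent upstream."""
    chain = ALIASES[alias]
    body = {"model": chain[0], "messages": messages}
    if len(chain) > 1:
        # Only aliases with a fallback chain carry the models array.
        body["models"] = chain
    return body
```

Swapping Sonnet for another model later means editing one list; every client keeps sending "smart".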
Security baseline (non-negotiable)
The most common personal-gateway incidents: keys in Git, or the gateway exposed on the public internet. Minimum five rules:
- .env and plist secrets never in version control; .gitignore covers .env, *.db, litellm.log.
- LiteLLM binds 127.0.0.1 only; remote access via SSH -L or Tailscale; add Nginx + mTLS for multi-user setups.
- OpenRouter key leaked? Rotate immediately in the console. OpenRouter partners with GitHub Secret Scanning, but proactive rotation beats waiting for an alert.
- Periodically export spend logs from litellm.db with sqlite3 and reconcile against the OpenRouter dashboard; anomalous traffic means revoke virtual keys now.
- Enable FileVault on the Cloud Mac; SSH key-only login. Same discipline as Linux gateway delivery: separate change audit from secret audit.
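The reconciliation rule is simple arithmetic once you have the numbers. A sketch that assumes you have already exported per-key spend into a dict and read the total off the OpenRouter dashboard by hand; `reconcile` and the 5% tolerance are illustrative choices, not values from either tool.

```python
def reconcile(per_key_spend: dict, dashboard_total: float,
              tolerance: float = 0.05) -> tuple:
    """Compare gateway-side spend with the upstream total.

    Returns (within_tolerance, drift), where drift is the dashboard total
    minus the local total, in USD.
    """
    local_total = sum(per_key_spend.values())
    drift = dashboard_total - local_total
    limit = tolerance * max(dashboard_total, local_total)
    return abs(drift) <= limit, drift
```

A large positive drift means OpenRouter billed for traffic the gateway never logged, which is the signal to check whether the upstream key is being used somewhere else.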
FAQ
Can I skip LiteLLM and point clients straight at OpenRouter? Yes—for a solo minimal setup. You lose virtual keys, unified logs, and local routing aliases. Add a second client and you'll rebuild the gateway layer anyway.
Gateway on Cloud Mac or Linux VPS? Pure LLM forwarding with no macOS tooling? VPS is cheaper. OpenClaw + MCP + Xcode automation? One Cloud Mac for execution + gateway is simpler.
What if OpenRouter goes down? Add a second upstream in LiteLLM's model_list (e.g. direct Anthropic key as emergency route), or temporarily point fast at an OpenRouter free-tier model for degradation.
Does latency increase? One extra proxy hop is usually single-digit to low tens of milliseconds—noise compared to LLM inference. Bottlenecks are model choice and region routing, not LiteLLM itself.
Gateway and agent on the same Cloud Mac mini
Personal AI gateways fall apart when control plane lives on a laptop, execution on a cloud box, and secrets in chat logs—three machines, three sources of truth. Running LiteLLM and OpenClaw together on a VPSSpark Cloud Mac mini M4 puts unified memory to work: concurrent agents plus a lightweight proxy without breaking a sweat. Native macOS Unix means Homebrew, Python venv, launchd, and local MCP tooling work out of the box—no replicating macOS capabilities on Linux.
M-series Mac minis idle in single-digit watts—fine for 24/7 gateway duty without sweating the power bill. Gatekeeper, SIP, and FileVault stack up to a smaller long-term API-key exposure than a typical Windows workstation. When your laptop sleeps, your agent shouldn't die—collapse model ingress and execution onto one stable Cloud Mac.
Building an OpenRouter-backed personal AI gateway? VPSSpark Cloud Mac mini M4 is a one-stop control-plane starting point—see plans and pricing, and put keys, routing, and agents in the same room for once.