VPSSpark Blog

Cloud Mac + OpenRouter Hands-On: Build Your Personal Enterprise AI Gateway

Dev Tips · 2026.06.16 · ~9 min read

Cloud Mac workspace: OpenRouter upstream and local Gateway control plane on one always-on machine.

In 2026, most indie devs are juggling Cursor, OpenClaw, homegrown scripts, and two or three CLI agents at once. Every tool wants its own API key. Every vendor ships a different base URL. The real headache isn't whether the model is smart enough—it's that you don't have a unified AI gateway: keys scattered everywhere, bills that never reconcile, and one provider outage taking down your entire stack. This post walks you through a copy-paste setup: VPSSpark Cloud Mac as control plane and execution layer, OpenRouter for upstream model aggregation, LiteLLM as your self-hosted gateway shell—a personal, enterprise-grade entry point with virtual keys, budget circuit breakers, and model fallbacks in about thirty minutes.

TL;DR: OpenRouter gives you "one key, 500+ models." LiteLLM Proxy on a Cloud Mac is your gateway: clients see only http://127.0.0.1:4000, and the gateway forwards requests upstream to OpenRouter. Point Cursor, OpenClaw, and scripts at the local gateway. Master keys never touch client machines.

Fig. 1 · Cloud Mac + OpenRouter personal AI gateway stack

  • Client layer: Cursor · OpenClaw · CLI · MCP
  • Self-hosted gateway (Cloud Mac): LiteLLM Proxy · Virtual keys · Logs · Budgets
  • Upstream model layer: OpenRouter · Multi-provider fallback

At a glance: 500+ OpenRouter models · 1 gateway endpoint · 24/7 Cloud Mac launchd uptime

Why Cloud Mac + OpenRouter—not OpenRouter alone?

OpenRouter is a managed gateway: sign up, get a key, and you're routing. Its official docs describe an API that's largely OpenAI Chat Completions–compatible—great for fast integration. What it solves is upstream aggregation, not your governance boundary. You can't hand Cursor and OpenClaw separate, independently revocable keys. You can't easily layer team-level spend caps on top of provider billing. And you definitely can't colocate the gateway with macOS-native tooling—Xcode, AppleScript, local MCP—in the same execution environment.

A Cloud Mac merges control plane and Apple-ecosystem execution. The gateway runs under launchd; secrets live only in the server's .env. When you need OpenClaw, local Git, or an iOS build trigger, you don't shuttle context back to your laptop. If you already run a gateway on a Linux VPS, keep the VPS for IM/webhooks and dedicate the Cloud Mac to gateway + builds—see the Linux VPS OpenClaw gateway CI/CD matrix for that split. For launchd specifics on Cloud Mac, check the cloud Mac launchd FAQ.

Architecture: what each layer does

Get the responsibilities straight before you touch config—saves a lot of debugging later.

Layer | Component | Owns | Does not
Upstream | OpenRouter | Unified billing, provider fallback, pay-per-token | Replace your key governance or network isolation
Gateway | LiteLLM Proxy (Cloud Mac) | Virtual keys, routing table, logs, budgets, OpenAI-compatible egress | Host long-lived chat sessions (that's OpenClaw's job)
Execution | Cloud Mac + OpenClaw | 24/7 agents, MCP, macOS automation, CI triggers | Push master API keys to laptops

Solo vs. small team
Solo? LiteLLM's master key plus two or three virtual keys is plenty. Three to five people: add an Nginx reverse proxy and Tailscale, and don't expose the gateway directly on the public internet. At any scale, the OpenRouter master key lives only in the Cloud Mac .env; clients always get virtual keys.

Pre-flight checklist (15 minutes)

Work through this in order. It eliminates most "can't connect" rabbit holes.

  • OpenRouter account with an API key and a monthly credit cap set.
  • Cloud Mac reachable via SSH, arm64 Apple Silicon, macOS 14+.
  • Homebrew installed; Python 3.11+ or Docker Desktop (pick one—we use pip here, lightest path).
  • Gateway binds 127.0.0.1:4000 only. Remote access via Tailscale or SSH tunnel—never expose 4000 to the open internet.
  • Laptop-side Cursor / OpenClaw can SSH to the Cloud Mac (key auth, password login disabled).
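
The loopback-only rule in the checklist is easy to check mechanically before you start the proxy. A minimal sketch, assuming a hypothetical helper named is_loopback_bind (it is not part of LiteLLM) and using only Python's standard ipaddress module:

```python
import ipaddress


def is_loopback_bind(host: str) -> bool:
    """Return True only if `host` is a loopback IP such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Hostnames (e.g. "localhost") are rejected on purpose:
        # resolve them first so the check never depends on DNS.
        return False


for candidate in ("127.0.0.1", "0.0.0.0", "::1"):
    print(candidate, is_loopback_bind(candidate))
```

Run it against whatever value you plan to pass to --host; anything that is not a loopback address deserves a second look.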

New to Cloud Mac? Start with the Mac VPS guide for basics and sizing.

Step 1: Configure OpenRouter upstream

In the OpenRouter console, create a key named cloud-mac-gateway. Enable a credit limit (e.g. $20/month) as a hard fuse. Once you have the key, write it to the Cloud Mac immediately—never commit it to Git.

Verify upstream from the Cloud Mac (replace the placeholder key with your own):

Verify OpenRouter upstream
export OPENROUTER_API_KEY="sk-or-v1-xxxxxxxx"

curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "ping"}]
  }' | head -c 400

JSON with choices means upstream is healthy. Model IDs use provider/model format—see OpenRouter's Models page for the full list. Common picks: anthropic/claude-sonnet-4, openai/gpt-4o, google/gemini-2.5-pro-preview.
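
If you would rather script the "JSON with choices" check than eyeball it, something like the following could work. It is a sketch under assumptions: the helper names upstream_is_healthy and split_model_id are hypothetical, the sample body is hand-written rather than a captured OpenRouter response, and the check needs the full response body (so drop the head -c 400 truncation when feeding it real output).

```python
import json


def upstream_is_healthy(raw_body: str) -> bool:
    """True if the body is JSON with a non-empty `choices` list."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return False
    choices = payload.get("choices") if isinstance(payload, dict) else None
    return isinstance(choices, list) and len(choices) > 0


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'provider/model' ID into its two parts."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    return provider, model


# Hand-written sample body, not a captured OpenRouter response.
sample = '{"choices": [{"message": {"role": "assistant", "content": "pong"}}]}'
print(upstream_is_healthy(sample))
print(split_model_id("openai/gpt-4o-mini"))
```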

Step 2: Install LiteLLM gateway on Cloud Mac

LiteLLM is an open-source LLM gateway; docs live at docs.litellm.ai. It wraps OpenRouter behind your own OpenAI-compatible endpoint and adds virtual keys plus spend tracking.

Install and directory setup
# Inside the Cloud Mac SSH session
brew install python@3.12
mkdir -p ~/ai-gateway && cd ~/ai-gateway
python3.12 -m venv .venv
source .venv/bin/activate
pip install 'litellm[proxy]'

Create config.yaml—the routing heart of your gateway. This example defaults to Claude Sonnet with GPT-4o mini fallback (OpenRouter's models array triggers fallback):

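~/ai-gateway/config.yaml
model_list:
  - model_name: smart
    litellm_params:
      model: openrouter/anthropic/claude-sonnet-4
      api_key: os.environ/OPENROUTER_API_KEY
      # Fallback chain forwarded to OpenRouter. Entries are assumed to be plain
      # OpenRouter IDs (provider/model), without the openrouter/ prefix.
      models:
        - anthropic/claude-sonnet-4
        - openai/gpt-4o-mini

  - model_name: fast
    litellm_params:
      model: openrouter/openai/gpt-4o-mini
      api_key: os.environ/OPENROUTER_API_KEY

litellm_settings:
  drop_params: true
  set_verbose: false

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  # LiteLLM's virtual-key docs assume a PostgreSQL database_url; confirm that
  # your LiteLLM version accepts SQLite before relying on this line.
  database_url: "sqlite:///./litellm.db"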
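
Two properties of the model_list are worth linting before you start the proxy: every entry routes through the openrouter/ prefix, and every api_key is an os.environ/ reference rather than a pasted secret. A rough sketch, assuming a hypothetical lint_model_list helper and a hand-built dict that mirrors the YAML (load the real file with a YAML library of your choice); the second entry is deliberately wrong to show the output:

```python
def lint_model_list(model_list: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the entries look sane."""
    problems = []
    for entry in model_list:
        alias = entry.get("model_name", "<unnamed>")
        params = entry.get("litellm_params", {})
        if not str(params.get("model", "")).startswith("openrouter/"):
            problems.append(f"{alias}: model is not routed via openrouter/")
        if not str(params.get("api_key", "")).startswith("os.environ/"):
            problems.append(f"{alias}: api_key should reference os.environ/, not a literal")
    return problems


# Dict mirror of the YAML; the "fast" entry contains a deliberate mistake.
config = [
    {"model_name": "smart", "litellm_params": {
        "model": "openrouter/anthropic/claude-sonnet-4",
        "api_key": "os.environ/OPENROUTER_API_KEY"}},
    {"model_name": "fast", "litellm_params": {
        "model": "openrouter/openai/gpt-4o-mini",
        "api_key": "sk-or-v1-pasted-by-mistake"}},
]
print(lint_model_list(config))
```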

Create .env in the same directory (chmod 600):

~/ai-gateway/.env
OPENROUTER_API_KEY=sk-or-v1-xxxxxxxx
LITELLM_MASTER_KEY=sk-local-master-xxxxxxxx
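
The chmod 600 requirement can also be verified from a script, for example as a pre-start check. The sketch below is illustrative only: parse_env and is_private are hypothetical helpers, the parser handles just plain KEY=VALUE lines, and the demo runs on a throwaway temp file rather than the real ~/ai-gateway/.env.

```python
import os
import stat
import tempfile


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def is_private(path: str) -> bool:
    """True if group and others have no permission bits (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demo on a throwaway file, not the real .env.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as handle:
    handle.write("OPENROUTER_API_KEY=sk-or-v1-xxxxxxxx\n")
    handle.write("LITELLM_MASTER_KEY=sk-local-master-xxxxxxxx\n")
    demo_path = handle.name

os.chmod(demo_path, 0o600)
private_at_600 = is_private(demo_path)
os.chmod(demo_path, 0o644)
private_at_644 = is_private(demo_path)
with open(demo_path) as fh:
    demo_values = parse_env(fh.read())
os.remove(demo_path)

print(private_at_600, private_at_644, sorted(demo_values))
```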

Start the proxy (foreground first):

Start LiteLLM Proxy
cd ~/ai-gateway
source .venv/bin/activate
set -a && source .env && set +a

litellm --config config.yaml --host 127.0.0.1 --port 4000

In another terminal, loopback test with the master key:

Loopback test against the gateway
# A new terminal has no exported keys yet, so load .env first
set -a && source ~/ai-gateway/.env && set +a

curl -s http://127.0.0.1:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "smart",
    "messages": [{"role": "user", "content": "gateway ok?"}]
  }'

Virtual keys (do this next)
After the foreground test passes, open LiteLLM's Admin UI (served by the running proxy at the /ui path described in the docs; restart with litellm --config config.yaml --detailed_debug if you want verbose logs) and mint separate virtual keys for Cursor and OpenClaw, each with its own monthly budget (for example $5 and $10). If one client leaks, revoke that virtual key only; the OpenRouter master key stays put.
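
Virtual keys can also be minted without the UI through the proxy's key-management endpoint. The sketch below only builds the HTTP request and does not send it. The field names (key_alias, models, max_budget, budget_duration) follow my reading of LiteLLM's /key/generate documentation, so treat them as assumptions to verify against your installed version; build_key_request is a hypothetical helper.

```python
import json
import urllib.request


def build_key_request(gateway: str, master_key: str, alias: str,
                      models: list[str], monthly_budget_usd: float) -> urllib.request.Request:
    """Build (but do not send) a POST to the proxy's /key/generate endpoint."""
    body = {
        "key_alias": alias,                # label for this client
        "models": models,                  # aliases this key may call
        "max_budget": monthly_budget_usd,  # spend cap in USD
        "budget_duration": "30d",          # assumed reset window
    }
    return urllib.request.Request(
        url=f"{gateway.rstrip('/')}/key/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_key_request("http://127.0.0.1:4000", "sk-local-master-xxxxxxxx",
                        "cursor", ["smart", "fast"], 5.0)
print(req.get_method(), req.full_url)
# To actually mint the key, pass `req` to urllib.request.urlopen on the Cloud Mac
# and read the generated key from the JSON reply.
```

Keeping the request builder separate from the network call makes it easy to review exactly what the master key is being used for.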

Step 3: launchd for 24/7 uptime

Once verified, hand the gateway to launchd so SSH disconnects don't kill it. Create ~/Library/LaunchAgents/com.vpsspark.litellm.plist:

launchd plist (illustrative)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
 "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>com.vpsspark.litellm</string>
  <key>ProgramArguments</key>
  <array>
    <string>/Users/YOUR_USER/ai-gateway/.venv/bin/litellm</string>
    <string>--config</string>
    <string>/Users/YOUR_USER/ai-gateway/config.yaml</string>
    <string>--host</string><string>127.0.0.1</string>
    <string>--port</string><string>4000</string>
  </array>
  <key>EnvironmentVariables</key>
  <dict>
    <key>OPENROUTER_API_KEY</key><string>sk-or-v1-xxx</string>
    <key>LITELLM_MASTER_KEY</key><string>sk-local-master-xxx</string>
  </dict>
  <key>RunAtLoad</key><true/>
  <key>KeepAlive</key><true/>
  <key>StandardOutPath</key>
  <string>/Users/YOUR_USER/ai-gateway/litellm.log</string>
  <key>StandardErrorPath</key>
  <string>/Users/YOUR_USER/ai-gateway/litellm.err</string>
</dict>
</plist>
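
Hand-editing YOUR_USER in six places invites typos. One option is to render the same plist with Python's standard plistlib. This is a sketch, not a required step: build_launch_agent is a hypothetical helper, and the keys and paths simply mirror the plist above.

```python
import plistlib
from pathlib import PurePosixPath


def build_launch_agent(home: PurePosixPath, secrets: dict[str, str]) -> bytes:
    """Render the LaunchAgent plist for a given home directory as XML bytes."""
    gateway_dir = home / "ai-gateway"
    agent = {
        "Label": "com.vpsspark.litellm",
        "ProgramArguments": [
            str(gateway_dir / ".venv" / "bin" / "litellm"),
            "--config", str(gateway_dir / "config.yaml"),
            "--host", "127.0.0.1",
            "--port", "4000",
        ],
        "EnvironmentVariables": dict(secrets),
        "RunAtLoad": True,
        "KeepAlive": True,
        "StandardOutPath": str(gateway_dir / "litellm.log"),
        "StandardErrorPath": str(gateway_dir / "litellm.err"),
    }
    return plistlib.dumps(agent)  # XML is plistlib's default output format


xml_bytes = build_launch_agent(
    PurePosixPath("/Users/YOUR_USER"),
    {"OPENROUTER_API_KEY": "sk-or-v1-xxx", "LITELLM_MASTER_KEY": "sk-local-master-xxx"},
)
print(xml_bytes.decode("utf-8"))
```

Write the output to ~/Library/LaunchAgents/com.vpsspark.litellm.plist with owner-only permissions, since it contains both keys.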

Load and verify:

launchctl
launchctl load ~/Library/LaunchAgents/com.vpsspark.litellm.plist
launchctl list | grep litellm
# /health/liveliness answers without a key; plain /health expects the master key
curl -fsS http://127.0.0.1:4000/health/liveliness || echo "check logs"
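
Right after launchctl load the proxy may need a few seconds to start, so a single curl can fail spuriously. A small retry loop avoids that. This is a sketch with hypothetical helper names (http_ok, wait_until_up); the health URL is left as a parameter so you can use whichever endpoint your gateway answers without authentication, and the demo uses a fake probe instead of touching the network.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """True if a GET to `url` returns HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=3) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_up(probe: Callable[[], bool], attempts: int = 10,
                  delay_s: float = 1.0,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `probe` up to `attempts` times, sleeping between failures."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False


# Demo with a fake probe that succeeds on the third call. On the Cloud Mac,
# pass lambda: http_ok("<your gateway health URL>") instead.
fake_results = iter([False, False, True])
print(wait_until_up(lambda: next(fake_results), sleep=lambda _s: None))
```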

For deeper Cloud Mac persistence and troubleshooting, see the cloud Mac launchd FAQ section on login sessions vs. background daemons—gateway and OpenClaw can each have their own plist without stepping on restarts.

Step 4: Wire up Cursor, OpenClaw, and scripts

Every client changes two things: base URL points at your gateway; API key is a virtual key (master key for admin only).

Cursor: Settings → Models → Override OpenAI Base URL → http://127.0.0.1:4000/v1 (if Cursor runs on your laptop and the gateway is on Cloud Mac, tunnel first: ssh -L 4000:127.0.0.1:4000 user@cloud-mac, or use a Tailscale address). Model names: smart / fast, matching model_name in config.yaml.

OpenClaw: Per the gateway configuration docs, set the LLM provider to OpenAI-compatible: OPENAI_API_BASE=http://127.0.0.1:4000/v1, OPENAI_API_KEY=<virtual-key>. OpenClaw on the Cloud Mac itself needs no tunnel. OpenClaw on a Linux VPS? Either move LiteLLM to the VPS or use private networking—never put the master key in a public compose repo on the VPS.

Scripts: Anything using the OpenAI SDK just changes base_url:

Python example
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:4000/v1",
    api_key="sk-virtual-cursor-xxxxxxxx",
)

resp = client.chat.completions.create(
    model="smart",
    messages=[{"role": "user", "content": "Summarize today's commits"}],
)
print(resp.choices[0].message.content)
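
When a laptop client talks to the gateway through an SSH tunnel or Tailscale, most "can't connect" reports come down to the local port not being open. A quick TCP check before blaming the model can save time. A minimal sketch, assuming a hypothetical port_is_open helper; the demo probes a throwaway local listener rather than the real port 4000.

```python
import socket


def port_is_open(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


# Demo against a throwaway listener instead of the real gateway port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
reachable_while_listening = port_is_open("127.0.0.1", demo_port)
listener.close()

print(reachable_while_listening)
```

On the laptop, port_is_open("127.0.0.1", 4000) tells you whether the tunnel is up before any client is configured.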

Routing and cost control

An enterprise-grade gateway isn't about calling models—it's about knowing when to call the expensive ones. Three routing rules for a personal stack:

  • Default to fast: completions, formatting, simple Q&A on gpt-4o-mini, which costs a small fraction of what Sonnet-class models do (check current per-token prices on OpenRouter's Models page).
  • Explicit smart upgrade: architecture design, complex refactors, multi-file reasoning—switch client or OpenClaw routing to smart (with OpenRouter fallback chain).
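  • Two-layer budgets: the OpenRouter console sets the total credit cap; LiteLLM virtual keys set per-client caps. Either cap blocks requests on its own, so a runaway client stops at its per-key limit and total spend stays bounded even if a key budget is misconfigured.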
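
The three rules can be sketched as two small functions. This is an illustration of the decision logic only, not how LiteLLM or OpenRouter enforce budgets internally; the task labels and the names choose_alias and request_allowed are hypothetical. Note that either cap on its own is enough to block a request.

```python
SMART_TASKS = {"architecture", "refactor", "multi_file_reasoning"}


def choose_alias(task: str, force_smart: bool = False) -> str:
    """Default to the cheap alias; upgrade only for listed tasks or on request."""
    return "smart" if force_smart or task in SMART_TASKS else "fast"


def request_allowed(key_spend: float, key_cap: float,
                    total_spend: float, total_cap: float) -> bool:
    """A request passes only while both the per-key and the total cap have headroom."""
    return key_spend < key_cap and total_spend < total_cap


print(choose_alias("formatting"))
print(choose_alias("refactor"))
print(request_allowed(4.2, 5.0, 12.0, 20.0))
print(request_allowed(5.0, 5.0, 12.0, 20.0))
```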

OpenRouter accepts a models array for provider-level fallback. LiteLLM's model_list decouples business aliases (smart/fast) from real model IDs—swap models in YAML later, zero client changes.
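
To make the alias decoupling concrete, here is roughly what the translation from a business alias to an OpenRouter request body looks like. It is a sketch under assumptions: the ALIASES table and build_openrouter_body are hypothetical stand-ins for what the gateway does with config.yaml, and sending both model and a models list mirrors this post's config rather than a verified description of OpenRouter's fallback semantics.

```python
# Hypothetical alias table mirroring config.yaml; values are OpenRouter model IDs.
ALIASES = {
    "smart": ["anthropic/claude-sonnet-4", "openai/gpt-4o-mini"],
    "fast": ["openai/gpt-4o-mini"],
}


def build_openrouter_body(alias: str, messages: list[dict]) -> dict:
    """Translate a business alias into a chat-completions request body."""
    chain = ALIASES[alias]
    body = {"model": chain[0], "messages": messages}
    if len(chain) > 1:
        # Fallback candidates, in priority order.
        body["models"] = list(chain)
    return body


smart_body = build_openrouter_body("smart", [{"role": "user", "content": "hi"}])
print(smart_body["model"], smart_body.get("models"))
```

Swapping the model behind smart means editing one table entry; clients keep sending model="smart".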

Security baseline (non-negotiable)

The most common personal-gateway incidents: keys in Git, or the gateway exposed on the public internet. Minimum five rules:

  • .env and plist secrets never in version control; .gitignore covers .env, *.db, litellm.log.
  • LiteLLM binds 127.0.0.1 only; remote access via SSH -L or Tailscale; add Nginx + mTLS for multi-user setups.
  • OpenRouter key leaked? Rotate immediately in the console. OpenRouter partners with GitHub Secret Scanning, but proactive rotation beats waiting for an alert.
  • Periodically export spend logs from litellm.db (for example with sqlite3) and reconcile them against the OpenRouter dashboard; anomalous traffic means revoke virtual keys now.
  • Enable FileVault on the Cloud Mac; SSH key-only login. Same discipline as Linux gateway delivery: separate change audit from secret audit.

Common mistake
Pasting the OpenRouter key directly into Cursor on your laptop while running a gateway on Cloud Mac—double billing, double leak surface. One entry point: laptop tunnels in, all LLM traffic exits through the Cloud Mac gateway.
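
The spend reconciliation from the security list comes down to one comparison: upstream spend that the gateway never logged points to traffic bypassing the gateway, which is exactly the common mistake above. A sketch with made-up numbers; reconcile is a hypothetical helper, and how you export per-key spend from the gateway is left open because the database schema is not covered here.

```python
def reconcile(gateway_spend_by_key: dict[str, float], upstream_total_usd: float,
              tolerance_usd: float = 0.50) -> dict:
    """Compare summed gateway spend against the upstream dashboard total."""
    gateway_total = round(sum(gateway_spend_by_key.values()), 4)
    drift = round(upstream_total_usd - gateway_total, 4)
    return {
        "gateway_total": gateway_total,
        "drift": drift,
        # Upstream spend the gateway never saw suggests a key used outside it.
        "suspicious": drift > tolerance_usd,
    }


# Illustrative figures only, not real billing data.
report = reconcile({"cursor": 3.10, "openclaw": 6.25}, upstream_total_usd=14.80)
print(report)
```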

FAQ

Can I skip LiteLLM and point clients straight at OpenRouter? Yes—for a solo minimal setup. You lose virtual keys, unified logs, and local routing aliases. Add a second client and you'll rebuild the gateway layer anyway.

Gateway on Cloud Mac or Linux VPS? Pure LLM forwarding with no macOS tooling? VPS is cheaper. OpenClaw + MCP + Xcode automation? One Cloud Mac for execution + gateway is simpler.

What if OpenRouter goes down? Add a second upstream in LiteLLM's model_list (e.g. direct Anthropic key as emergency route), or temporarily point fast at an OpenRouter free-tier model for degradation.
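
The emergency-route idea is ordinary ordered fallback. LiteLLM handles this through its own configuration, so the sketch below only illustrates the logic with stand-in functions; call_with_fallback, openrouter_down, and direct_anthropic are hypothetical and make no network calls.

```python
from typing import Callable


def call_with_fallback(routes: list[tuple[str, Callable[[str], str]]],
                       prompt: str) -> tuple[str, str]:
    """Try each (name, call) route in order; return (name, reply) from the first success."""
    errors = []
    for name, call in routes:
        try:
            return name, call(prompt)
        except Exception as exc:  # demo only; catch narrower errors in real code
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all upstreams failed: " + "; ".join(errors))


def openrouter_down(prompt: str) -> str:
    """Stand-in for the primary route during an outage."""
    raise ConnectionError("simulated OpenRouter outage")


def direct_anthropic(prompt: str) -> str:
    """Stand-in for a direct-provider emergency route."""
    return f"[emergency route] echo: {prompt}"


result = call_with_fallback(
    [("openrouter", openrouter_down), ("anthropic-direct", direct_anthropic)],
    "ping",
)
print(result)
```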

Does latency increase? One extra proxy hop is usually single-digit to low tens of milliseconds—noise compared to LLM inference. Bottlenecks are model choice and region routing, not LiteLLM itself.

Gateway and agent on the same Cloud Mac mini

Personal AI gateways fall apart when control plane lives on a laptop, execution on a cloud box, and secrets in chat logs—three machines, three sources of truth. Running LiteLLM and OpenClaw together on a VPSSpark Cloud Mac mini M4 puts unified memory to work: concurrent agents plus a lightweight proxy without breaking a sweat. Native macOS Unix means Homebrew, Python venv, launchd, and local MCP tooling work out of the box—no replicating macOS capabilities on Linux.

M-series Mac minis idle in single-digit watts—fine for 24/7 gateway duty without sweating the power bill. Gatekeeper, SIP, and FileVault stack up to a smaller long-term API-key exposure than a typical Windows workstation. When your laptop sleeps, your agent shouldn't die—collapse model ingress and execution onto one stable Cloud Mac.

Building an OpenRouter-backed personal AI gateway? VPSSpark Cloud Mac mini M4 is a one-stop control-plane starting point: see plans and pricing, and put keys, routing, and agents in the same room for once.
