Cloud Mac + OpenRouter in Practice: Building a Personal, Enterprise-Grade AI Gateway

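As of 2026, an individual developer's desk often has Cursor, OpenClaw, homegrown scripts, and CLI agents all running at once. Each tool carries its own API key, and every vendor has a different Base URL. The real headache is not that "the models are weak" but that there is no unified AI Gateway: keys are scattered, bills do not reconcile, and when one provider goes down everything stops. This guide lays out a path you can follow step by step in about 30 minutes to a "personal enterprise" entry point with virtual keys, budget cutoffs, and model fallback. It uses a VPSSpark cloud Mac as the control and execution plane, OpenRouter as the upstream model aggregator, and LiteLLM as the self-hosted Gateway shell.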

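In one line: OpenRouter gives you "one key, 500+ models", and the LiteLLM Proxy on the cloud Mac is "your own Gateway". To clients it exposes only http://127.0.0.1:4000; internally it routes to OpenRouter. OpenClaw, Cursor, and scripts all point at the local Gateway, and the master key never goes into a client.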

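Figure 1 · Three-layer stack of the Cloud Mac + OpenRouter personal AI Gateway

Client layer: Cursor · OpenClaw · CLI · MCP
Self-hosted Gateway (cloud Mac): LiteLLM Proxy · virtual keys · logs · budgets
Upstream model layer: OpenRouter · multi-provider fallback

At a glance: 500+ models available on OpenRouter · a unified external Gateway endpoint · 24/7 operation under launchd on the cloud Mac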

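Why "cloud Mac + OpenRouter": the reasons OpenRouter alone is not enough

OpenRouter is a hosted gateway. Sign up and you get a key, and according to the official documentation its API is highly compatible with OpenAI Chat Completions. That suits quick integration, but what it solves is upstream aggregation, not your governance boundary. It is hard to give Cursor and OpenClaw keys that can each be revoked independently, and hard to apply a team-level spend cap outside of provider billing. Placing the gateway on the same macOS execution plane as Xcode, AppleScript, and local MCP is close to impossible.

The value of a cloud Mac is that the control plane and the Apple-ecosystem execution plane are one and the same. The Gateway process runs permanently under launchd, and secrets live only in the server's .env. Running OpenClaw, working with local Git, even triggering iOS builds: none of it requires pulling context back to your laptop. If you already run a gateway on a Linux VPS, it is better to split roles, with the VPS dedicated to IM/Webhook ingress and the cloud Mac dedicated to Gateway + builds. For the division of roles, see "Linux VPS OpenClaw Gateway and CI/CD Decision Matrix"; for how always-on launchd operation differs on the cloud Mac side, see "Cloud Mac OpenClaw Deployment and launchd FAQ".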

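Architecture choice: dividing roles across three layers

Separating responsibilities first keeps the later configuration from getting tangled.

Layer	Component	Role	What it does not do
Upstream	OpenRouter	Unified billing across models, provider fallback, pay-per-token pricing	Does not replace key governance or private-network isolation
Gateway	LiteLLM Proxy (cloud Mac)	Virtual keys, routing table, logs, budgets, OpenAI-compatible endpoint	Long-term hosting of chat sessions (delegated to OpenClaw)
Execution	Cloud Mac + OpenClaw	24/7 agents, MCP, macOS automation, CI triggers	Does not distribute the master API key to laptops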

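Individual vs. small team

Working solo, a LiteLLM master key plus two or three virtual keys is enough. For a small team of 3 to 5, add an Nginx reverse proxy and a Tailscale private network, and do not expose the Gateway directly to the public internet. Regardless of scale, write the OpenRouter master key only to the cloud Mac's .env, and have clients always use virtual keys.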

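Pre-flight checklist (15-minute self-check)

Working through these in order heads off most "can't connect" problems.

OpenRouter account created, API key generated, monthly credit limit set.
Cloud Mac reachable over SSH, arm64 (Apple Silicon), macOS 14 or later.
Homebrew installed. Python 3.11+ or Docker Desktop (pick one; this guide takes the pip route as the lightest option).
Gateway listens on 127.0.0.1:4000 only. For remote calls, go through Tailscale or an SSH tunnel first; do not open port 4000 directly to the public internet.
Cloud Mac reachable over SSH from the laptop running Cursor / OpenClaw (key-based login, password authentication disabled).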
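
The host-side items in the checklist can be checked mechanically. The sketch below is one way to do that; the function name preflight and its messages are illustrative assumptions, and it covers only architecture, macOS version, and Python version, not SSH or OpenRouter settings.

```python
from typing import List, Tuple


def preflight(machine: str, macos_version: str, python_version: Tuple[int, int]) -> List[str]:
    """Return a list of failed checks; an empty list means the host looks ready."""
    problems = []
    if machine != "arm64":
        problems.append(f"expected arm64 (Apple Silicon), got {machine!r}")
    # macOS versions look like "14.5"; compare the major component only.
    major = int(macos_version.split(".")[0]) if macos_version else 0
    if major < 14:
        problems.append(f"expected macOS 14 or later, got {macos_version!r}")
    if python_version < (3, 11):
        problems.append("expected Python 3.11 or later")
    return problems


if __name__ == "__main__":
    import platform
    import sys

    # On the cloud Mac these calls report the real host values.
    issues = preflight(platform.machine(), platform.mac_ver()[0], sys.version_info[:2])
    print("ready" if not issues else "\n".join(issues))
```

Run it inside the SSH session on the cloud Mac; any line it prints names a checklist item to fix before continuing.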

For cloud Mac basics and purchasing, see "Mac VPS and Cloud macOS Buying Guide (2026)".

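Step 1: Configure the OpenRouter upstream

Log in to the OpenRouter console and create a key named cloud-mac-gateway. It is recommended to enable a credit limit (for example, $20/month) and use it as a hard cutoff. Once you have the key, record it on the cloud Mac only and never commit it to Git.

On the cloud Mac, verify the upstream connection with curl (substitute your own key for $OPENROUTER_API_KEY):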

Verify the OpenRouter upstream

# Placeholder key; replace with the key created in the console
export OPENROUTER_API_KEY="sk-or-v1-xxxxxxxx"

# Send a one-message chat request and print the first 400 bytes of the reply
curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "ping"}]
  }' | head -c 400

If JSON comes back and it contains choices, the upstream is OK. Model IDs take the form provider/model. The full list is on the OpenRouter Models page; commonly used examples include anthropic/claude-sonnet-4, openai/gpt-4o, and google/gemini-2.5-pro-preview.
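
If you would rather script the "JSON with choices" check than eyeball it, a minimal sketch follows. The helper names upstream_ok and split_model_id are assumptions for illustration. Feed it the full response body: output truncated with head -c 400 may no longer be valid JSON.

```python
import json


def upstream_ok(body: str) -> bool:
    """True when the response body is JSON with a non-empty `choices` list."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and bool(data.get("choices"))


def split_model_id(model_id: str) -> tuple:
    """Split an OpenRouter-style `provider/model` ID at the first slash."""
    provider, _, model = model_id.partition("/")
    if not provider or not model:
        raise ValueError(f"not a provider/model ID: {model_id!r}")
    return provider, model
```

An error payload (for example, one returned for a bad key) has no choices field, so upstream_ok reports False for it.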

Step 2: Install the LiteLLM Gateway on the cloud Mac

LiteLLM is an open-source LLM gateway, documented at docs.litellm.ai. It wraps OpenRouter in your own OpenAI-compatible endpoint and supports virtual keys and spend tracking.

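Install and initialize the directory

# Inside an SSH session on the cloud Mac
brew install python@3.12

# Create the working directory first, then the virtual environment inside it
mkdir -p ~/ai-gateway && cd ~/ai-gateway
python3.12 -m venv .venv
source .venv/bin/activate
pip install 'litellm[proxy]'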

Create config.yaml, the heart of the Gateway's routing. The example below defaults to Claude Sonnet and falls back to GPT-4o mini on failure (the fallback is triggered through OpenRouter's models array):

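~/ai-gateway/config.yaml

model_list:
  # "smart": Claude Sonnet first, GPT-4o mini as the fallback
  - model_name: smart
    litellm_params:
      model: openrouter/anthropic/claude-sonnet-4
      api_key: os.environ/OPENROUTER_API_KEY
      # Forwarded to OpenRouter as its `models` fallback list, so these are
      # OpenRouter IDs (provider/model) without LiteLLM's openrouter/ prefix.
      models:
        - anthropic/claude-sonnet-4
        - openai/gpt-4o-mini

  # "fast": cheap default for everyday requests
  - model_name: fast
    litellm_params:
      model: openrouter/openai/gpt-4o-mini
      api_key: os.environ/OPENROUTER_API_KEY

litellm_settings:
  drop_params: true
  set_verbose: false

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  # Virtual keys and spend tracking need a database. LiteLLM's docs describe a
  # PostgreSQL URL (postgresql://...); confirm your version accepts SQLite.
  database_url: "sqlite:///./litellm.db"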
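
A quick sanity check on the routing table can catch the two mistakes that are easiest to make here: a model that is not routed through OpenRouter, and a literal secret pasted into the YAML. The sketch below assumes the file has already been parsed into a dict (for example with a YAML parser such as PyYAML's yaml.safe_load); check_model_list is an illustrative name, not part of LiteLLM.

```python
from typing import Any, Dict, List


def check_model_list(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a parsed LiteLLM-style config."""
    problems = []
    for entry in config.get("model_list", []):
        alias = entry.get("model_name")
        params = entry.get("litellm_params", {})
        if not alias:
            problems.append("entry without model_name")
            continue
        # Every route in this guide goes through OpenRouter.
        if not str(params.get("model", "")).startswith("openrouter/"):
            problems.append(f"{alias}: model is not routed through openrouter/")
        # Keys should be environment references, never literal secrets.
        if not str(params.get("api_key", "")).startswith("os.environ/"):
            problems.append(f"{alias}: api_key should reference os.environ/, not a literal")
    return problems
```

An empty list means both aliases pass; anything else names the alias and the rule it broke.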

Create .env in the same directory (set its permissions with chmod 600):

~/ai-gateway/.env

OPENROUTER_API_KEY=sk-or-v1-xxxxxxxx
LITELLM_MASTER_KEY=sk-local-master-xxxxxxxx
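
To confirm that the file really is private and that both variables are present, something like the following sketch can help. The helpers load_env and is_private are illustrative; the parser handles only simple KEY=VALUE lines and is not a full dotenv implementation.

```python
import os
import stat
from typing import Dict


def load_env(path: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def is_private(path: str) -> bool:
    """True when the file mode is exactly 0o600 (owner read/write only)."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600
```

For example, is_private(os.path.expanduser("~/ai-gateway/.env")) should be True after chmod 600, and load_env on the same path should contain both OPENROUTER_API_KEY and LITELLM_MASTER_KEY.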

Start the Proxy (try it in the foreground first):

Start LiteLLM Proxy

cd ~/ai-gateway
source .venv/bin/activate
# Export every variable defined in .env to the proxy process
set -a && source .env && set +a

litellm --config config.yaml --host 127.0.0.1 --port 4000

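From another terminal, run a loopback test with the master key:

Loopback check of the Gateway

# A new terminal does not inherit the variables, so load .env again
set -a && source ~/ai-gateway/.env && set +a

curl -s http://127.0.0.1:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "smart",
    "messages": [{"role": "user", "content": "gateway ok?"}]
  }'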

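Virtual keys (recommended next step)

After the foreground check, open the LiteLLM Admin UI (run litellm --config config.yaml --detailed_debug, then go to the /ui path described in the docs) and create separate virtual keys for Cursor and OpenClaw, each with a monthly budget such as $5 or $10. If a particular client leaks, revoke only that virtual key and leave the OpenRouter master key untouched.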
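
Virtual keys can also be created from a script instead of the UI. The sketch below targets LiteLLM's POST /key/generate endpoint with the fields key_alias, models, max_budget, and budget_duration as described in the LiteLLM docs; verify them against the version you installed. The function names are illustrative, and the call only works once the proxy's database is configured.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_key_request(alias: str, models: List[str], max_budget_usd: float,
                      budget_duration: str = "30d") -> Dict[str, Any]:
    """Body for POST /key/generate: one client, a model allowlist, a budget."""
    return {
        "key_alias": alias,
        "models": models,
        "max_budget": max_budget_usd,
        "budget_duration": budget_duration,
    }


def generate_key(base_url: str, master_key: str, body: Dict[str, Any]) -> str:
    """Send the request to the Gateway and return the new virtual key string."""
    request = urllib.request.Request(
        f"{base_url}/key/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {master_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["key"]
```

A call such as generate_key("http://127.0.0.1:4000", master_key, build_key_request("cursor", ["smart", "fast"], 5.0)) would request a key for Cursor with a $5 budget that resets every 30 days.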

Step 3: Keep it running with launchd (cloud Mac 24/7)

Once verification is done, hand the Gateway over to launchd so that it keeps running after SSH disconnects. Create ~/Library/LaunchAgents/com.vpsspark.litellm.plist:

launchd plist (illustrative)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
 "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <!-- Replace YOUR_USER with the account name on the cloud Mac -->
  <key>Label</key><string>com.vpsspark.litellm</string>
  <key>ProgramArguments</key>
  <array>
    <string>/Users/YOUR_USER/ai-gateway/.venv/bin/litellm</string>
    <string>--config</string>
    <string>/Users/YOUR_USER/ai-gateway/config.yaml</string>
    <string>--host</string><string>127.0.0.1</string>
    <string>--port</string><string>4000</string>
  </array>
  <!-- This file now holds secrets: keep it out of Git and restrict its permissions -->
  <key>EnvironmentVariables</key>
  <dict>
    <key>OPENROUTER_API_KEY</key><string>sk-or-v1-xxx</string>
    <key>LITELLM_MASTER_KEY</key><string>sk-local-master-xxx</string>
  </dict>
  <key>RunAtLoad</key><true/>
  <key>KeepAlive</key><true/>
  <key>StandardOutPath</key>
  <string>/Users/YOUR_USER/ai-gateway/litellm.log</string>
  <key>StandardErrorPath</key>
  <string>/Users/YOUR_USER/ai-gateway/litellm.err</string>
</dict>
</plist>

Load and verify:

launchctl

launchctl load ~/Library/LaunchAgents/com.vpsspark.litellm.plist
launchctl list | grep litellm
# /health/liveliness answers without a key; /health expects the master key
curl -fsS http://127.0.0.1:4000/health/liveliness || echo "check logs"
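
Hand-editing YOUR_USER in several places is error-prone, so the plist can instead be rendered from the real home directory. This is a minimal sketch using Python's standard plistlib; build_plist is an illustrative name, and the keys mirror the plist shown above.

```python
import plistlib
from pathlib import Path
from typing import Dict


def build_plist(home: Path, env: Dict[str, str]) -> bytes:
    """Render a LaunchAgent plist equivalent to the one shown above."""
    root = home / "ai-gateway"
    job = {
        "Label": "com.vpsspark.litellm",
        "ProgramArguments": [
            str(root / ".venv" / "bin" / "litellm"),
            "--config", str(root / "config.yaml"),
            "--host", "127.0.0.1",
            "--port", "4000",
        ],
        "EnvironmentVariables": dict(env),
        "RunAtLoad": True,
        "KeepAlive": True,
        "StandardOutPath": str(root / "litellm.log"),
        "StandardErrorPath": str(root / "litellm.err"),
    }
    # plistlib emits the XML format by default.
    return plistlib.dumps(job)
```

Writing build_plist(Path.home(), env) to ~/Library/LaunchAgents/com.vpsspark.litellm.plist, with env holding the two keys from .env, gives a file with no placeholders left to edit.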

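For always-on operation and incident handling on the cloud Mac, cross-check the "login session vs. background daemon" section of "launchd Environment Verification FAQ". The Gateway and OpenClaw can each use their own plist, so restarting one does not affect the other.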

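Step 4: Connect Cursor, OpenClaw, and scripts

Every client needs only two changes: point the Base URL at the Gateway, and use a virtual key as the API key (the master key is for administration only).

Cursor: Settings → Models → Override OpenAI Base URL → http://127.0.0.1:4000/v1. If Cursor runs on your laptop and the Gateway on the cloud Mac, use local port forwarding with ssh -L 4000:127.0.0.1:4000 user@cloud-mac, or a Tailscale internal address. Model names are smart / fast, matching model_name in config.yaml.

OpenClaw: following the Gateway configuration docs, set the LLM provider to OpenAI-compatible through environment variables or the config file, with OPENAI_API_BASE=http://127.0.0.1:4000/v1 and OPENAI_API_KEY=<virtual-key>. If OpenClaw runs on the cloud Mac itself, no tunnel is needed. If it runs on a Linux VPS, either move LiteLLM to the VPS or use a private-network tunnel, and do not write the master key into a public compose repository for the VPS.

General scripts: code that uses the OpenAI SDK only needs its base_url changed: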

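Python example

from openai import OpenAI

# Point the SDK at the Gateway and authenticate with a virtual key,
# never with the OpenRouter master key.
client = OpenAI(
    base_url="http://127.0.0.1:4000/v1",
    api_key="sk-virtual-cursor-xxxxxxxx",
)

# "smart" is the alias defined under model_name in config.yaml
resp = client.chat.completions.create(
    model="smart",
    messages=[{"role": "user", "content": "Summarize today's commits"}],
)
print(resp.choices[0].message.content)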
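
When a client on the laptop cannot reach the Gateway, the first thing to rule out is the tunnel itself. The sketch below only tests whether a TCP connection to the forwarded port succeeds; gateway_reachable is an illustrative name, and a successful connection does not prove that the key or model alias is valid.

```python
import socket


def gateway_reachable(host: str = "127.0.0.1", port: int = 4000, timeout: float = 2.0) -> bool:
    """True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: the tunnel or proxy is not up.
        return False
```

If this returns False on the laptop while the loopback test on the cloud Mac passes, the problem is in the SSH forwarding or Tailscale path rather than in LiteLLM.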

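Model routing and cost control

The core of an enterprise-grade Gateway is not "it can call models" but "it knows when an expensive model is warranted". For a personal stack, pin down the following three routing rules:

Default to fast: code completion, formatting, and simple Q&A go to a gpt-4o-mini-class model, which costs on the order of a tenth of Sonnet or less (check current OpenRouter pricing).
Opt in to smart explicitly: use smart (which includes the OpenRouter fallback chain) for architecture design, large refactors, and multi-file reasoning.
Dual budgets: a total credit cap in the OpenRouter console plus a per-client cap on each LiteLLM virtual key. Only with both in place do you have a real "cutoff".

OpenRouter supports provider-level fallback through the models array in the request, and LiteLLM's model_list separates business aliases (smart/fast) from actual model IDs, so swapping models later means editing only the YAML while clients stay unchanged.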
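
The three rules can be written down as a few lines of client-side logic. This is only an illustration of the decision, with hypothetical task labels and function names; the actual enforcement of caps happens in LiteLLM and in the OpenRouter console, not in this code.

```python
# Task labels that justify the expensive alias (hypothetical names).
SMART_TASKS = {"architecture", "refactor", "multi_file_reasoning"}


def choose_alias(task: str) -> str:
    """Default to the cheap alias; use `smart` only for the listed heavy tasks."""
    return "smart" if task in SMART_TASKS else "fast"


def request_allowed(key_spend: float, key_cap: float,
                    total_spend: float, total_cap: float) -> bool:
    """Both caps must have headroom: the per-client cap and the overall credit cap."""
    return key_spend < key_cap and total_spend < total_cap
```

With a $5 virtual-key cap and a $20 OpenRouter cap, a client that has spent $5 is blocked even though the account as a whole still has credit, which is the point of keeping both limits.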

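Security baseline (required, not optional)

The most common incidents with a personal Gateway are a key ending up in Git or the Gateway being exposed to the public internet. At a minimum, follow these five rules:

Keep .env and plist secrets out of version control. Pin .env, *.db, and litellm.log in .gitignore.
Bind LiteLLM to 127.0.0.1 only. For remote access use ssh -L or Tailscale; when sharing among several people, put Nginx + mTLS in front.
If the OpenRouter key leaks, rotate it in the console immediately. GitHub Secret Scanning integration exists, but proactive rotation is the baseline.
Periodically extract the spend logs from litellm.db with sqlite3 and compare them against the OpenRouter Dashboard. On abnormal traffic, revoke the virtual key immediately.
Enable FileVault on the cloud Mac and allow SSH by key only. As with the Linux gateway deployment, keep "changes" and "secrets" separate for auditing.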
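
The first rule is easy to verify automatically. The sketch below checks a .gitignore for the three patterns named above; missing_ignore_patterns is an illustrative helper, and it only matches exact lines, so broader patterns that happen to cover the same files are not recognized.

```python
REQUIRED_PATTERNS = (".env", "*.db", "litellm.log")


def missing_ignore_patterns(gitignore_text: str) -> list:
    """Return the required patterns that do not appear as their own line."""
    present = {line.strip() for line in gitignore_text.splitlines()
               if line.strip() and not line.lstrip().startswith("#")}
    return [pattern for pattern in REQUIRED_PATTERNS if pattern not in present]
```

Running it over the contents of ~/ai-gateway/.gitignore before the first commit should return an empty list.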

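Common mistake

Putting the OpenRouter key directly into Cursor on the laptop while also running the Gateway on the cloud Mac splits billing across two paths and doubles the leak surface. Use a single entry point: the laptop only tunnels, and all LLM traffic leaves for the public internet through the cloud Mac Gateway.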

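FAQ

Can clients connect to OpenRouter directly, without LiteLLM? Yes, and that suits a minimal one-person setup. You lose virtual keys, unified logs, and local route aliases, though, and as clients multiply you end up adding the Gateway layer anyway.

Should the Gateway live on the cloud Mac or on a Linux VPS? If you only relay LLM traffic and have no macOS tooling, a VPS is cheaper. If OpenClaw + MCP + Xcode automation are in the mix, a single cloud Mac serving as execution plane + Gateway is more convenient.

What if OpenRouter has an outage? Add a second upstream to LiteLLM's model_list (for example, an emergency route that connects directly with an Anthropic key), or temporarily downgrade fast to an OpenRouter free-tier model. The latter helps only while OpenRouter itself is still reachable.

Does latency increase? One extra proxy hop usually adds a few to a few tens of milliseconds, which is negligible next to LLM inference time. Bottlenecks tend to lie in model choice and regional routing, not in LiteLLM itself.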