LLM Gateway

Every exe.dev VM has access to the LLM Gateway, a built-in proxy to the Anthropic, OpenAI, and Fireworks APIs. Your subscription includes a monthly token allocation, and you can purchase additional tokens at https://exe.dev/user.

See the full list of supported models (JSON).

The gateway is available inside your VM at http://169.254.169.254/gateway/llm/provider, where provider is one of anthropic, openai, or fireworks. No API keys are necessary.

Shelley uses the LLM Gateway by default, but you can also use it directly from any program running on your VM.
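
For example, any program on the VM can talk to the gateway through an off-the-shelf SDK. Here is a minimal sketch using the official openai Python package (an assumption: that it is installed on your VM). The SDK requires an api_key argument, so a harmless placeholder is passed; the gateway authenticates the VM itself:

from openai import OpenAI

# The gateway authenticates the VM, so any placeholder string
# satisfies the SDK's required api_key argument.
client = OpenAI(
    base_url="http://169.254.169.254/gateway/llm/openai/v1",
    api_key="exe-gateway",
)

response = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)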

Using the gateway with Codex

Add an OpenAI-compatible provider to ~/.codex/config.toml:

model_provider = "exe-openai"

[model_providers.exe-openai]
name = "exe.dev LLM Gateway"
base_url = "http://169.254.169.254/gateway/llm/openai/v1"
requires_openai_auth = false

Then run Codex normally:

$ codex

The base_url ends at /v1. Codex adds the Responses API path when it makes model requests.
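
In other words, Codex's model requests land on {base_url}/responses. A minimal sketch of that resolved call with Python's requests library, assuming the gateway proxies the Responses endpoint like the other OpenAI paths:

import requests

# base_url ends at /v1; the Responses API path is appended to it.
resp = requests.post(
    "http://169.254.169.254/gateway/llm/openai/v1/responses",
    json={"model": "gpt-5.5", "input": "Hello!"},
)
print(resp.json())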

Using the gateway with Claude Code

Add the Anthropic gateway base URL to ~/.claude/settings.json:

{
  "apiKeyHelper": "printf exe-gateway",
  "env": {
    "ANTHROPIC_BASE_URL": "http://169.254.169.254/gateway/llm/anthropic"
  }
}

Claude Code expects an API key source, so apiKeyHelper returns a harmless placeholder. The gateway authenticates the VM; you do not need an Anthropic API key.

Then run Claude Code normally:

$ claude

The ANTHROPIC_BASE_URL ends at /anthropic. Claude Code adds the Anthropic API paths when it makes model requests.
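
The same placeholder trick applies outside Claude Code. A minimal sketch with the official anthropic Python package (an assumption: that it is installed on your VM), using the base URL and model name from this page:

from anthropic import Anthropic

# The gateway authenticates the VM; the SDK still insists on an
# api_key value, so a placeholder is enough.
client = Anthropic(
    base_url="http://169.254.169.254/gateway/llm/anthropic",
    api_key="exe-gateway",
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)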

Using the gateway with curl

Point your requests at the gateway URL instead of the provider:

$ curl -s http://169.254.169.254/gateway/llm/anthropic/v1/messages \
    -H "content-type: application/json" \
    -H "anthropic-version: 2023-06-01" \
    -d '{
      "model": "claude-sonnet-4-6",
      "max_tokens": 256,
      "messages": [{"role": "user", "content": "Hello!"}]
    }'

OpenAI and Fireworks work the same way:

$ curl -s http://169.254.169.254/gateway/llm/openai/v1/chat/completions \
    -H "content-type: application/json" \
    -d '{
      "model": "gpt-5.5",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
$ curl -s http://169.254.169.254/gateway/llm/fireworks/inference/v1/chat/completions \
    -H "content-type: application/json" \
    -d '{
      "model": "accounts/fireworks/models/llama-v3p1-8b-instruct",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'