
AI Models (LLM)

OpenAI-compatible reverse proxy for multiple AI models.

Devora provides an OpenAI-compatible reverse proxy that lets you access multiple AI models from different providers through a single unified endpoint.


How It Works

When you send a chat request, Devora acts as a proxy between you and the upstream AI provider:

┌─────────┐     ┌──────────────────┐     ┌─────────────────┐
│  You    │────▶│  Devora Proxy    │────▶│  AI Provider    │
│ (Client)│     │  /api/v1/ai      │     │  (OpenAI, etc.) │
└─────────┘     └──────────────────┘     └─────────────────┘

Devora handles authentication, model validation, routing, and streaming — you just use the standard OpenAI API format.


Getting Started

1. Get an API Key

Before using the LLM API, you need a Devora API key. Include it in every request as a Bearer token:

Authorization: Bearer devora_your_api_key

2. Choose a Model

Models are managed inside Devora by administrators. The list below is auto-fetched live from our database — if an admin adds a new model, it appears here automatically.

[Model table: loaded live from /api/v1/ai/models]

Where do models come from?
Admins add models via the AI Providers dashboard. Each model has an ID, name, provider (owned_by), and optional routing config (baseUrl). The table above pulls directly from /api/v1/ai/models.
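
As a rough sketch of reading that endpoint from a script: the snippet below assumes `/models` answers with the OpenAI-style list shape (`{"object": "list", "data": [...]}`), which this page does not spell out. The names `fetch_models` and `model_ids` are illustrative and not part of Devora.

```python
import json
import urllib.request

BASE_URL = "https://devora.my.id/api/v1/ai"


def model_ids(payload: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style list payload (assumed shape)."""
    return [m["id"] for m in payload.get("data", []) if "id" in m]


def fetch_models(api_key: str) -> dict:
    """GET /models with the Bearer token. This performs a real network call."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


# Offline illustration with a hand-written payload (not real Devora output).
sample = {"object": "list", "data": [{"id": "gpt-4o", "owned_by": "openai"}]}
print(model_ids(sample))
```

Calling `model_ids(fetch_models("devora_your_api_key"))` would then give the IDs to use as the `model` parameter, provided the response really has that shape.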

3. Send a Request

Use the base URL below with any OpenAI-compatible client or SDK.


Base Configuration

| Setting | Value |
| --- | --- |
| Base URL | https://devora.my.id/api/v1/ai |
| Chat Endpoint | /chat/completions |
| Models Endpoint | /models |
| Auth Header | Authorization: Bearer <token> |

[!TIP] When using the OpenAI SDK, set base_url to https://devora.my.id/api/v1/ai. The SDK automatically appends /chat/completions.


Authentication

All requests require an API Key in the Authorization header:

Authorization: Bearer devora_your_api_key

If your key is missing, expired, or invalid, the API returns:

{
  "error": {
    "message": "Invalid API Key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
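
One way to keep the key out of source code is to read it from the environment. This is a minimal sketch: the variable name `DEVORA_API_KEY` and both helper names are assumptions made for illustration, not something Devora defines.

```python
import os
from typing import Optional


def auth_headers(api_key: Optional[str] = None) -> dict[str, str]:
    """Build the headers a Devora request needs.

    Falls back to the DEVORA_API_KEY environment variable (hypothetical name).
    """
    key = api_key or os.environ.get("DEVORA_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


def is_invalid_key_error(body: dict) -> bool:
    """True if a parsed error body carries the invalid_api_key code shown above."""
    return (body.get("error") or {}).get("code") == "invalid_api_key"
```

Checking `is_invalid_key_error` on a failed response lets a client tell the user to rotate or re-enter the key instead of retrying.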

Model Identifiers

Use the exact id from the table above (for example, gpt-4o or gemini-1.5-pro) as the model parameter in your request body.

Private Models

Models whose ID starts with an email address (in the form <owner-email>/gpt-4o) are private. They are accessible only by:

  • The owner of that email address.
  • Users with the ULTRA role.
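
The access rule above can be sketched as a small check. This is only an illustration of the rule as described, not Devora's actual implementation: it assumes the email prefix is everything before the first `/`, and the role name `USER` in the usage lines is hypothetical (only `ULTRA` appears on this page).

```python
from typing import Optional


def owner_of(model_id: str) -> Optional[str]:
    """Return the email prefix of a private model ID, or None if it has none."""
    prefix, sep, _ = model_id.partition("/")
    return prefix if sep and "@" in prefix else None


def can_access(model_id: str, user_email: str, role: str) -> bool:
    """Apply the private-model rule: owner or ULTRA role only."""
    owner = owner_of(model_id)
    if owner is None:
        return True  # not private; model status checks still apply separately
    return role == "ULTRA" or user_email.lower() == owner.lower()


# "USER" is a made-up role name used only for this illustration.
print(can_access("alice@example.com/gpt-4o", "bob@example.com", "USER"))
print(can_access("alice@example.com/gpt-4o", "bob@example.com", "ULTRA"))
```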

Model Statuses

Models can be in different states which affect their availability:

| Status | Behavior | Visibility |
| --- | --- | --- |
| ACTIVE | Full access to all authorized users. | Public |
| SUSPENDED | Returns 403 Forbidden. Temporarily disabled. | Public |
| RESTRICTED | Only accessible by whitelisted emails. | Public |
| HIDDEN | Returns 403 Forbidden. Only visible to Admins. | Internal |
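
To make the table concrete, the sketch below maps each status to the HTTP status a caller might expect. It is one reading of the table, not Devora's code: it assumes non-whitelisted users of a RESTRICTED model get 403 (consistent with the error table later on this page) and that HIDDEN models return 403 to every caller. The function name and the whitelist argument are illustrative.

```python
def expected_http_status(model_status: str, user_email: str, whitelist: set[str]) -> int:
    """Return the HTTP status a chat request would likely receive."""
    if model_status == "ACTIVE":
        return 200
    if model_status == "RESTRICTED":
        # Only whitelisted emails may use the model.
        return 200 if user_email in whitelist else 403
    if model_status in ("SUSPENDED", "HIDDEN"):
        return 403
    raise ValueError(f"Unknown model status: {model_status}")


allowed = {"alice@example.com"}
print(expected_http_status("RESTRICTED", "alice@example.com", allowed))
print(expected_http_status("SUSPENDED", "alice@example.com", allowed))
```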

Request Format

The chat endpoint accepts a standard OpenAI-compatible JSON body:

{
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello!" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 2048
}

Supported Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Required. Model ID from the list above. |
| messages | array | Required. Conversation messages (role, content). |
| stream | boolean | Optional. Set true for Server-Sent Events (SSE) streaming. |
| temperature | number | Optional. Sampling temperature (0–2). |
| max_tokens | integer | Optional. Maximum tokens to generate. |
| top_p | number | Optional. Nucleus sampling. |
| presence_penalty | number | Optional. Penalizes tokens that have already appeared, encouraging new topics. |
| frequency_penalty | number | Optional. Penalizes tokens in proportion to how often they have appeared, reducing repetition. |
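
A client can catch mistakes before sending by validating the body locally. The helper below is a sketch under assumptions: `build_chat_body` is an illustrative name, and rejecting parameters outside the table is this sketch's choice, since the page does not say how Devora treats unlisted fields.

```python
OPTIONAL_PARAMS = {
    "stream", "temperature", "max_tokens",
    "top_p", "presence_penalty", "frequency_penalty",
}


def build_chat_body(model: str, messages: list[dict], **options) -> dict:
    """Assemble a chat request body from the parameters in the table above."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"Unsupported parameters: {sorted(unknown)}")
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    return {"model": model, "messages": messages, **options}


body = build_chat_body(
    "gpt-4o",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=2048,
)
print(body)
```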

Example Usage

Python (OpenAI SDK)

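from openai import OpenAI

# Point the official OpenAI SDK at the Devora proxy.
client = OpenAI(
    api_key="devora_your_key",
    base_url="https://devora.my.id/api/v1/ai",
)

# stream=True returns an iterator of chunks instead of a single response.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in response:
    # Some chunks carry no choices or an empty delta; skip them.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()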

cURL

curl https://devora.my.id/api/v1/ai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer devora_your_api_key" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'

JavaScript / TypeScript

const res = await fetch("https://devora.my.id/api/v1/ai/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: "Bearer devora_your_api_key",
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello!" }],
    stream: false,
  }),
});

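// Surface HTTP errors (see Error Handling) before reading the completion.
if (!res.ok) throw new Error(`Devora request failed: ${res.status}`);

const data = await res.json();
console.log(data.choices[0].message.content);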

Streaming Responses

Set stream: true to receive tokens in real time via Server-Sent Events (SSE). This is ideal for chat interfaces.

Headers returned:

Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
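
If you read the stream without an SDK, you have to parse the SSE lines yourself. The sketch below assumes Devora passes through the usual OpenAI streaming format (`data: {json}` lines carrying `choices[0].delta.content`, ended by `data: [DONE]`), which this page implies but does not state. `iter_content` is an illustrative name.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if choices:
            text = (choices[0].get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines, not captured Devora output.
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_content(sample_lines)))
```

In practice the `lines` argument would come from an HTTP client's line iterator over the response body.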

Error Handling

Devora returns standard HTTP status codes and OpenAI-compatible error objects:

| Status | Meaning | When It Happens |
| --- | --- | --- |
| 400 | Bad Request | Invalid JSON or missing model parameter. |
| 401 | Unauthorized | Missing or invalid API key. |
| 403 | Forbidden | Model is suspended, hidden, or restricted. |
| 404 | Not Found | Model ID does not exist in the database. |
| 429 | Too Many Requests | Rate limited: the upstream provider is in cooldown. |
| 502 | Bad Gateway | Cannot connect to the AI provider. |
| 504 | Gateway Timeout | Provider took too long to respond. |
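
The table suggests a split between errors worth retrying and errors that need a fix on the caller's side. The sketch below treats 429, 502, and 504 as transient and everything else as final. That classification and the backoff values are assumptions of this example, not a documented Devora policy.

```python
RETRYABLE_STATUSES = {429, 502, 504}


def is_retryable(status: int) -> bool:
    """True for statuses that describe a temporary upstream condition."""
    return status in RETRYABLE_STATUSES


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, capped."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


print(is_retryable(429), is_retryable(403))
print(backoff_delays(4))
```

A caller would sleep for each delay in turn between attempts and stop as soon as a response is not retryable.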

Example Error Response

{
  "error": {
    "message": "Model 'gpt-4o' is currently suspended.",
    "type": "access_denied",
    "code": 403
  }
}
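
Turning an error body into an exception keeps calling code tidy. In this sketch, `DevoraAPIError` and `parse_error` are hypothetical names. Note that the two sample errors on this page use different types for `code` (a string and a number), so the sketch keeps the value as-is instead of assuming one type.

```python
import json


class DevoraAPIError(Exception):
    """Carries the fields of an OpenAI-compatible error object."""

    def __init__(self, status: int, message: str, type_=None, code=None):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.type = type_
        self.code = code  # may be a string or a number


def parse_error(status: int, body_text: str) -> DevoraAPIError:
    """Build an exception from a response body, tolerating non-JSON bodies."""
    try:
        err = json.loads(body_text).get("error")
    except (ValueError, AttributeError):
        err = None
    if not isinstance(err, dict):
        err = {}
    return DevoraAPIError(
        status,
        err.get("message", f"HTTP {status}"),
        err.get("type"),
        err.get("code"),
    )


sample_body = json.dumps(
    {"error": {"message": "Model 'gpt-4o' is currently suspended.",
               "type": "access_denied", "code": 403}}
)
print(parse_error(403, sample_body))
```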


[!NOTE] The model table above is rendered by a React component (<ModelTable />) that fetches /api/v1/ai/models on page load. As soon as an admin adds or removes a model in the dashboard, the table updates automatically — no manual doc edits required.
