12. Advanced (1): LLM Providers & Models

The agents behind omicOS are powered by large language models. By default you can just use the models provided by the omicOS cloud, but advanced users can bring their own API key and specify a provider and model. This chapter explains the selection logic.

12.1 Provider Selection Priority

When omicos decides "which vendor to use," it tries the following in order and stops at the first match:

  1. config.provider in the chat request (passed explicitly from the web UI / API)
  2. The OMICOS_LLM_PROVIDER environment variable
  3. The OMICOS_PROVIDER environment variable (legacy alias)
  4. Inferred from the model name (e.g. if the model name contains deepseek, use deepseek)
  5. Detected from the first available API key (DEEPSEEK_API_KEY → deepseek; MINIMAX_API_KEY → minimax; OPENAI_API_KEY → openai)

If none of these match, it raises the error "no model provider configured".
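The five steps above can be sketched as a single fall-through function. This is a minimal illustration, not the omicos source: the function name resolve_provider, the KNOWN_PROVIDERS list, and the use of RuntimeError are assumptions made for the example. Only the order of the checks and the error text come from this section.

```python
import os

# Key-to-provider order taken from step 5 above.
KEY_DETECTION_ORDER = [
    ("DEEPSEEK_API_KEY", "deepseek"),
    ("MINIMAX_API_KEY", "minimax"),
    ("OPENAI_API_KEY", "openai"),
]

# Assumption: a short illustrative list of names to look for in the model name.
KNOWN_PROVIDERS = ["deepseek", "minimax", "openai"]


def resolve_provider(config=None, model=None, env=None):
    """Walk the five steps in order and return the first provider found."""
    config = config or {}
    env = os.environ if env is None else env

    # 1. Explicit provider in the chat request.
    if config.get("provider"):
        return config["provider"]
    # 2 and 3. Environment variables, with the legacy alias checked second.
    for var in ("OMICOS_LLM_PROVIDER", "OMICOS_PROVIDER"):
        if env.get(var):
            return env[var]
    # 4. Inference from the model name.
    if model:
        for name in KNOWN_PROVIDERS:
            if name in model.lower():
                return name
    # 5. First API key that is set.
    for key, provider in KEY_DETECTION_ORDER:
        if env.get(key):
            return provider
    # Assumption: the real exception type is not documented here.
    raise RuntimeError("no model provider configured")
```

Passing env explicitly (instead of reading os.environ directly) is only a convenience for trying the function out with different settings.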

12.2 Model Selection Priority

Deciding "which model to use" works similarly:

  1. config.model in the chat request
  2. OMICOS_LLM_MODEL
  3. OMICOS_MODEL (legacy alias)
  4. The provider's default: deepseek → deepseek-v4-flash, everything else → gpt-4o-mini

The model name may carry a provider/ prefix (which is stripped automatically), e.g. deepseek/deepseek-v4-flash.
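A minimal sketch of the model lookup, under stated assumptions: the function name resolve_model is hypothetical, and the prefix is stripped here only when it equals the resolved provider. The text above does not say exactly how omicos recognizes the prefix, so treat that rule as a guess.

```python
import os


def resolve_model(provider, config=None, env=None):
    """Return the model name for a provider, following the four steps above."""
    config = config or {}
    env = os.environ if env is None else env

    model = (
        config.get("model")                # 1. chat request
        or env.get("OMICOS_LLM_MODEL")     # 2. environment variable
        or env.get("OMICOS_MODEL")         # 3. legacy alias
        # 4. provider default
        or ("deepseek-v4-flash" if provider == "deepseek" else "gpt-4o-mini")
    )

    # Assumption: only a prefix that matches the provider is removed.
    prefix = provider + "/"
    if model.startswith(prefix):
        model = model[len(prefix):]
    return model
```

For example, resolve_model("deepseek", {"model": "deepseek/deepseek-v4-flash"}, env={}) returns deepseek-v4-flash.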

12.3 Which Providers Are Supported

| Category | Examples | Notes |
|---|---|---|
| OpenAI-compatible (catalog-driven) | openai, deepseek, qwen, zhipu, moonshot, xai, groq, mistral, ollama, openrouter, together, fireworks, deepinfra, cerebras, perplexity, minimax, siliconflow, and more | Supplied dynamically by the cloud model catalog; the most common case |
| OAuth | codex (OpenAI), gemini-cli (Google) | Use a third-party account's OAuth credentials instead of an API key |
| Custom | custom_openai, custom_anthropic | Point at your own self-hosted / private endpoint |

Note: The mock provider is explicitly disabled at runtime. Native anthropic is not implemented, so you need to connect through a custom_anthropic-compatible endpoint instead. In addition, omicos does not send a temperature parameter to the server, because some reasoning models reject any temperature other than 1.

12.4 Where the API Key Comes From

When omicos resolves the API key for a given provider, it looks in the following order:

  1. Environment variable: <PROVIDER>_API_KEY, where hyphens in the provider id become underscores. For example, the key for alibaba-coding-plan is ALIBABA_CODING_PLAN_API_KEY.
  2. The auth.json fallback file, searched in this order:
    • $OMICOS_LOCAL_HOME/auth.json
    • $OMICOS_RUNTIME_HOME/auth.json
    • <current directory>/.omicos/auth.json
    • ~/.omicos/auth.json

auth.json is a flat JSON object whose keys use the same names as the environment variables; only non-empty values take effect:

{
  "DEEPSEEK_API_KEY": "sk-...",
  "OPENAI_API_KEY": "sk-..."
}

Ollama is a special case: when no key is set it automatically uses the placeholder ollama, since a local Ollama instance does not require a real key.
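The lookup described in this section can be sketched as follows. The function names are hypothetical, and two behaviors are assumptions: candidate paths whose environment variable is unset are skipped, and a file that is missing, unreadable, or lacks the key simply lets the search continue to the next candidate.

```python
import json
import os
from pathlib import Path


def key_env_name(provider):
    """Build <PROVIDER>_API_KEY, turning hyphens into underscores."""
    return provider.upper().replace("-", "_") + "_API_KEY"


def auth_json_candidates(env, cwd=None, home=None):
    """List the auth.json locations in the documented search order."""
    cwd = Path(cwd) if cwd else Path.cwd()
    home = Path(home) if home else Path.home()
    paths = []
    for var in ("OMICOS_LOCAL_HOME", "OMICOS_RUNTIME_HOME"):
        if env.get(var):  # assumption: unset variables contribute no path
            paths.append(Path(env[var]) / "auth.json")
    paths.append(cwd / ".omicos" / "auth.json")
    paths.append(home / ".omicos" / "auth.json")
    return paths


def resolve_api_key(provider, env=None, cwd=None, home=None):
    """Return the API key for a provider, or None if nothing is configured."""
    env = os.environ if env is None else env
    name = key_env_name(provider)

    # 1. Environment variable.
    if env.get(name):
        return env[name]

    # 2. auth.json fallback; empty values are ignored.
    for path in auth_json_candidates(env, cwd, home):
        if not path.is_file():
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # assumption: unreadable or malformed files are skipped
        value = data.get(name) if isinstance(data, dict) else None
        if value:
            return value

    # Ollama special case: a placeholder stands in for the missing key.
    if provider == "ollama":
        return "ollama"
    return None
```

For example, key_env_name("alibaba-coding-plan") returns ALIBABA_CODING_PLAN_API_KEY.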

12.5 Custom Endpoints

Each provider's endpoint is resolved in this order:

  1. The <PROVIDER>_API_BASE environment variable
  2. The api_base from the cloud catalog (cached in ~/.omicos/cloud-models/models.json)

Special case: custom_openai defaults to the endpoint http://127.0.0.1:8000/v1 (convenient for connecting to a local inference service such as vLLM).

A complete example, using a self-hosted vLLM service:

export OMICOS_LLM_PROVIDER=custom_openai
export CUSTOM_OPENAI_API_BASE=http://127.0.0.1:8000/v1
export CUSTOM_OPENAI_API_KEY=dummy           # local services usually don't validate the key
export OMICOS_LLM_MODEL=Qwen2.5-72B-Instruct
omicos serve
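The endpoint lookup in this section can be sketched in a few lines. Several details are assumptions: the function name resolve_api_base is hypothetical, the catalog value is passed in as an argument because the catalog file layout is not described here, the hyphen-to-underscore rule from 12.4 is assumed to apply to <PROVIDER>_API_BASE as well, and the custom_openai default is assumed to apply only when neither source provides a value.

```python
import os


def resolve_api_base(provider, catalog_api_base=None, env=None):
    """Return the endpoint for a provider, or None if nothing is known."""
    env = os.environ if env is None else env

    # 1. <PROVIDER>_API_BASE environment variable.
    var = provider.upper().replace("-", "_") + "_API_BASE"
    if env.get(var):
        return env[var]

    # 2. api_base from the cloud catalog (looked up by the caller).
    if catalog_api_base:
        return catalog_api_base

    # Special case: default endpoint for custom_openai.
    if provider == "custom_openai":
        return "http://127.0.0.1:8000/v1"
    return None
```

With the exports from the vLLM example set, resolve_api_base("custom_openai") returns the value of CUSTOM_OPENAI_API_BASE; with no variables set, it falls back to http://127.0.0.1:8000/v1.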

12.6 The Cloud Model Catalog

omicos pulls a model catalog from omicos-admin (which models are available, their context windows, dynamic flags such as whether they support vision / inline images, etc.) and caches it in ~/.omicos/cloud-models/models.json. Related variables:

| Variable | Purpose |
|---|---|
| OMICOS_MODELS_OFFLINE | Offline mode; use only the local cache |
| OMICOS_MODELS_CLOUD_URL | Override the catalog fetch URL |
| OMICOS_MODELS_CACHE_DIR | Override the cache directory |
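As a rough illustration of how two of these variables might interact, the sketch below computes the cache file location and decides whether a fetch should be attempted. Both helper names are hypothetical. It assumes the file is still called models.json inside an overridden cache directory, and that any value of OMICOS_MODELS_OFFLINE other than empty, 0, or false enables offline mode; neither detail is stated above.

```python
import os
from pathlib import Path


def catalog_cache_file(env=None, home=None):
    """Return the path of the cached catalog file."""
    env = os.environ if env is None else env
    home = Path(home) if home else Path.home()
    override = env.get("OMICOS_MODELS_CACHE_DIR")
    base = Path(override) if override else home / ".omicos" / "cloud-models"
    # Assumption: the file name does not change when the directory is overridden.
    return base / "models.json"


def should_fetch_catalog(env=None):
    """Return False when offline mode asks for the local cache only."""
    env = os.environ if env is None else env
    # Assumption: the accepted truthy values are not documented.
    flag = env.get("OMICOS_MODELS_OFFLINE", "").strip().lower()
    return flag in ("", "0", "false")
```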

12.7 Vision Models

If your analysis needs the model to "look at images" (for example, to interpret a generated chart), you can configure a separate vision model, independent of the main chat model:

export OMICOS_VISION_MODEL=gpt-4o
export OMICOS_VISION_BASE_URL=https://api.openai.com/v1
export OMICOS_VISION_API_KEY=sk-...
