14. Troubleshooting & Common Pitfalls

This chapter collects the most common problems, their causes, and their fixes, along with a few design quirks that are "working as intended" but can look alarming.

14.1 Quick Troubleshooting Table

Symptom | Cause / How to Diagnose
Kernel reports scanpy not found | The Python environment isn't set up. OMICOS_ENV_DIR is unset or points to an invalid directory. Run omicos env setup and omicos env doctor, export the variable, then restart omicos.
omicos cli reports no user_token | Not logged in. Log in with omicos login (email/password) or omicos cli login (device code).
[omicos] offline | cloud_login.json has no process_token. Run omicos login again.
omicos serve already running on this workspace | A daemon is already running in the same directory. Stop it with Ctrl+C, run pkill -f 'omicos serve', or use a different --data-dir.
The web UI is stuck on "Connecting..." | The local daemon isn't running, or the port is taken. Run pgrep -fl 'omicos serve' to check for the process.
The web UI shows "No history yet" | You selected the wrong process. In the process picker, entries with the (cli) suffix and entries without it are two separate namespaces.
Remote access fails with connection refused | The SSH port-forwarding window dropped. Keep the SSH window open, or switch to the cloud relay (after running omicos login on the server, omicos serve --no-browser relays automatically; --upstream-base-url is not needed).
no model provider configured | No usable provider or API key. Set OMICOS_LLM_PROVIDER or configure the corresponding *_API_KEY.
Paid features suddenly unavailable | The plan token failed to renew and downgraded to community. Run omicos login again.

14.2 Design Pitfalls That Trip People Up

The following are not bugs. They are intentional, but they can be confusing the first time you hit them:

serve and cli sessions don't share state

The two use different data-dirs by default (.omicos vs .omicos/cli), so their session histories are isolated from each other. They also cannot run at the same time in the same workspace (they share a single lock, serve.pid).

--upstream-base-url doesn't decide where the kernel runs, and doesn't enable the cloud relay

It only configures a local HTTP proxy fallback: unmatched /api/* requests are forwarded to the backend, which is the escape hatch for cloud sync. It does not decide where the kernel runs, and it is not the switch for the cloud relay. Whether the relay is used depends on whether the daemon holds a process_token (from omicos login or OMICOS_PROCESS_TOKEN). The kernel defaults to port 5055 on the local machine and is controlled separately by --kernel-base-url. Setting --kernel-base-url without --upstream-base-url produces a warning but no error.
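
The relay rule above can be approximated with a small shell check. This is only a sketch of the decision as described here, not the daemon's actual logic: the function name relay_expected is hypothetical, the path to cloud_login.json must be supplied by you (its location is not stated in this chapter), and the check assumes the file stores the token as a JSON string under a "process_token" key.

```shell
# Sketch: guess whether the daemon would hold a process_token.
# $1 = path to cloud_login.json (assumption: you know where yours lives).
relay_expected() {
  login_file=$1
  # Strip whitespace so a whitespace-only variable counts as unset.
  token=$(printf '%s' "${OMICOS_PROCESS_TOKEN:-}" | tr -d '[:space:]')
  if [ -n "$token" ]; then
    echo "relay: yes (OMICOS_PROCESS_TOKEN is set)"
    return 0
  fi
  # Assumption: the token is stored as "process_token": "<non-empty string>".
  if [ -f "$login_file" ] &&
     grep -q '"process_token"[[:space:]]*:[[:space:]]*"[^"]' "$login_file"; then
    echo "relay: yes (process_token found in $login_file)"
    return 0
  fi
  echo "relay: no (no process_token; run omicos login)"
  return 1
}
```

Call it as relay_expected /path/to/cloud_login.json. Note that --upstream-base-url plays no part in the result, which mirrors the point of this section.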

scanpy not found is a deferred error

When the Python environment isn't set up, the daemon still starts fine — the error only surfaces once you actually run analysis code. So always run omicos env setup first.
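
Because the daemon starts even when the environment is broken, a quick preflight check before launching can surface the problem earlier. The following is a minimal sketch; check_env_dir is a hypothetical helper, and it only verifies that OMICOS_ENV_DIR is set and names an existing directory. It does not validate the directory's contents, so omicos env doctor remains the authoritative check.

```shell
# Sketch: fail fast if OMICOS_ENV_DIR is unset or not a directory.
check_env_dir() {
  # Trim surrounding whitespace; whitespace-only values count as unset.
  dir=$(printf '%s' "${OMICOS_ENV_DIR:-}" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
  if [ -z "$dir" ]; then
    echo "OMICOS_ENV_DIR is unset; run: omicos env setup" >&2
    return 1
  fi
  if [ ! -d "$dir" ]; then
    echo "OMICOS_ENV_DIR is not an existing directory: $dir" >&2
    return 1
  fi
  echo "OMICOS_ENV_DIR exists: $dir"
}
```

A typical use would be check_env_dir && omicos serve, so the daemon is only started once the variable is at least plausible.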

Switching accounts triggers a "self-healing restart"

When you switch login accounts within a workspace, the cloud sends WebSocket close code 4403, and omicos rotates the workspace_id and automatically restarts itself — it looks like a crash, but it's expected recovery. The cost is that the session-sync high-water mark is cleared, so the session list under the new account may appear empty (your old local session files are still there).

Hitting Ctrl+C too soon can drop unsynced messages

Session sync is asynchronous. If you Ctrl+C / kill right after sending a message, messages still in memory that haven't been synced out yet can be lost. After important work, wait a few seconds to let it finish syncing before you exit.

You can't use --debug in cli mode

The cli's stdout is taken over by the TUI, so --debug / --log-filter aren't available — you can only use RUST_LOG=... omicos cli. And invalid filter expressions in RUST_LOG / OMICOS_LOG_FILTER are silently ignored and fall back to the default, so you may not realize your filter never took effect.

Device-code login requires "another already-logged-in device"

omicos cli login (the device-code method) needs another device that's already logged into an omicOS account to confirm the pairing code. If all you have is a single brand-new machine that has never been logged in, just use omicos login with email/password instead — it completes entirely in the terminal and needs no other device.

Some values that get silently handled

  • OMICOS_CATALOG_SYNC_SECS < 30 is not clamped to 30; it's discarded and falls back to the 600-second default.
  • A trailing slash on URL variables such as OMICOS_CLOUD_BASE is silently trimmed off, so you don't need to include one.
  • Boolean switches only recognize 1/true/yes/on; any other value is treated as "off".
  • Empty or whitespace-only environment variables are always treated as unset.
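
To make these rules concrete, here is a shell sketch that mimics them so you can predict what a given value will turn into. It is an illustration of the behavior listed above, not omicos's own parsing code. All function names are hypothetical, and two details are assumptions: boolean matching is done on lowercase only (case handling is not stated here), and a non-numeric sync interval is treated like an unset one.

```shell
# Trim leading/trailing whitespace; an empty result means "unset".
normalize() {
  printf '%s' "$1" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Boolean switch: only 1/true/yes/on count as "on" (lowercase only: assumption).
is_on() {
  case "$(normalize "$1")" in
    1|true|yes|on) return 0 ;;
    *) return 1 ;;
  esac
}

# OMICOS_CATALOG_SYNC_SECS: values below 30 are discarded, not clamped.
effective_sync_secs() {
  v=$(normalize "$1")
  case "$v" in
    ''|*[!0-9]*) echo 600 ;;   # unset, or non-numeric (assumption): default
    *) if [ "$v" -lt 30 ]; then echo 600; else echo "$v"; fi ;;
  esac
}

# URL variables: a single trailing slash is dropped.
trim_url() {
  v=$(normalize "$1")
  printf '%s\n' "${v%/}"
}
```

For example, effective_sync_secs 10 prints 600 rather than 30, which is the case most likely to surprise someone who expects clamping.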

Credential files are not encrypted

cloud_login.json, auth.json, and plan_token.jwt are all plaintext, protected only by 0600 permissions. Never commit them to git, paste them anywhere, or hand them to anyone. On Unix-like systems, mind your umask when copying or recreating these files. Windows uses a different permission model, so do not assume the same protection there.
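
On Unix-like systems you can audit the mode of these files with a short helper. This is a sketch using only POSIX find; check_cred_mode is a hypothetical name, and you pass the file paths yourself because their location depends on your data directory.

```shell
# Sketch: warn about credential files whose mode is not exactly 0600.
# Returns 1 if any existing file has a different mode.
check_cred_mode() {
  status=0
  for f in "$@"; do
    [ -e "$f" ] || continue   # skip files that are not present
    # find prints the path only when the permission bits are exactly 600.
    if [ -z "$(find "$f" -prune -perm 600 -print)" ]; then
      echo "WARNING: $f is not mode 0600; fix with: chmod 600 $f" >&2
      status=1
    fi
  done
  return "$status"
}
```

For instance, check_cred_mode path/to/cloud_login.json path/to/auth.json path/to/plan_token.jwt stays silent when everything is in order.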

14.3 Collecting Diagnostic Information

Before reporting a problem, gather the following:

omicos env doctor                          # Python / kernel diagnostics
omicos login --status                      # login status (prints "not logged in" if not logged in)
curl -sS http://127.0.0.1:5055/api/version # version + git revision (omicos has no --version flag)
curl -sS http://127.0.0.1:5055/health      # service health (returns JSON)
RUST_LOG=omicos_core=debug omicos serve    # reproduce with debug logging (or omicos serve --debug true)
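
If you want the first four outputs in a single file to attach to a report, a wrapper along these lines may help. It is a sketch: run_section and collect_diagnostics are hypothetical helper names, the --max-time 5 timeout is an addition so that curl cannot hang when the daemon is down, and the debug-logging reproduction is left out because it starts a long-running daemon.

```shell
# Print a header, then run the command; failures are recorded, not fatal.
run_section() {
  printf '== %s ==\n' "$*"
  "$@" 2>&1 || printf '(command failed or unavailable: exit %s)\n' "$?"
}

# Sketch: write all diagnostics to the file named by $1.
collect_diagnostics() {
  out=$1
  {
    run_section omicos env doctor
    run_section omicos login --status
    run_section curl -sS --max-time 5 http://127.0.0.1:5055/api/version
    run_section curl -sS --max-time 5 http://127.0.0.1:5055/health
  } > "$out"
}
```

Run it as collect_diagnostics omicos-diag.txt, then review the file before sharing it to make sure it contains no tokens or other credentials.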

14.4 Notes on Offline Mode

When you enable the OMICOS_*_OFFLINE family of switches, agents / skills / memory / models may be a stale cache and won't reflect the latest content from the cloud. When investigating "why don't I see the new agent," first make sure you're not in offline mode.
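
A quick way to rule this out is to list every matching variable in the current shell. The sketch below assumes only the OMICOS_*_OFFLINE naming pattern mentioned above; list_offline_switches is a hypothetical helper name.

```shell
# Sketch: list every OMICOS_*_OFFLINE variable set in the environment.
list_offline_switches() {
  env | grep -E '^OMICOS_[A-Za-z0-9_]+_OFFLINE=' ||
    echo "no OMICOS_*_OFFLINE variables set"
}
```

Remember that a variable being listed does not mean it is active: only the values 1/true/yes/on turn a switch on. Also note that this inspects your current shell, so a daemon started from a different environment may have different settings.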
