Architecture Overview
Overview
Ibex is a multi-service data platform deployed on AWS EC2 behind Traefik, with two React UIs served via S3 + CloudFront. It provides federated SQL analytics, BI reporting, AI-powered data chat, and pipeline management across heterogeneous data sources.

Deployment Topology
DNS / Domains
| Domain | Purpose |
|---|---|
| api.triviz.cloud | All backend APIs (Traefik entry point) |
| ibex.triviz.cloud | Config UI — pipelines, data sources, monitoring (CloudFront → ibex-platform-ui) |
| bi.triviz.cloud | BI UI — dashboards, charts, reports, AI chat, RAG (CloudFront → ajna-data-platform-ui-lib) |
| listmonk.triviz.cloud | Email marketing (Listmonk, via Traefik) |
Services
Application Services
| Service | Language | Port | Image |
|---|---|---|---|
| ibex-data-platform | Python/FastAPI | 8080 | ghcr.io/ajnacloud-ksj/ibex-data-platform |
| ibex-identity-service | Python/FastAPI | 8090 | ghcr.io/ajnacloud-ksj/ibex-identity-service |
| ibex-analytics-service | Python/FastAPI + DuckDB | 8000 | ghcr.io/ajnacloud-ksj/ibex-analytics-service |
| ibex-bi-backend | Go | 8085 | ghcr.io/ajnacloud-ksj/ibex-bi-backend |
| ibex-ai-service | Python/FastAPI | 8010 | ghcr.io/ajnacloud-ksj/ibex-ai-service |
| ibex-agent-engine | Node.js | 3000 | ghcr.io/ajnacloud-ksj/ibex-agent-engine |
| ibex-listmonk | Go (Listmonk) | 9000 | ghcr.io/ajnacloud-ksj/ibex-listmonk |
Infrastructure Services
| Service | Purpose | Port |
|---|---|---|
| traefik | TLS reverse proxy + routing | 80, 443 |
| postgres-metadata | Central metadata PostgreSQL DB | 5432 (internal), 5433 (host) |
| minio | S3-compatible object storage (Iceberg warehouse, file uploads) | 9000 |
| minio-setup | One-shot: creates warehouse + data-ingestion buckets | — |
| iceberg-rest | Apache Iceberg REST catalog (backed by MinIO) | 8181 |
| vault-secrets | HashiCorp Vault — credential store | 8200 |
| vault-init | One-shot: unseals Vault, seeds credentials | — |
| redpanda | Kafka-compatible event streaming | 9092 |
| watchtower | Auto-redeploys containers when new images are pushed | — |
Container Reference
Startup Order
Containers start in dependency order. The chain ensures each service has its dependencies healthy before it launches.

traefik (ibex-traefik)
Image: traefik:v3.1
Ports: 80 (HTTP redirect), 443 (HTTPS)
Purpose: TLS-terminating reverse proxy. All external traffic enters here.
How it works:
- Reads Docker labels from other containers at runtime via the Docker socket (`/var/run/docker.sock`)
- Each container self-registers its own routing rules via `traefik.http.routers.*` labels — no central config file needed for app routes
- Automatically provisions and renews Let's Encrypt TLS certificates via ACME (stored in the `traefik_letsencrypt` volume)
- Routes requests by `Host` + `PathPrefix` to the correct backend container
- Config template (`traefik.yml.tpl`) is rendered at startup by `render-config.sh` into `/tmp/traefik.yml`
postgres-metadata (postgres-metadata)
Image: pgvector/pgvector:pg15
Port: 5433 (host) → 5432 (container)
Purpose: Central PostgreSQL database. Shared by multiple services.
Databases it hosts:
- `metadata_db` — main application database (pipelines, data sources, file uploads, business rules, demo data, Vault KV store)
- `listmonk` — email marketing database (lists, subscribers, campaigns)

Init scripts (mounted at `docker-entrypoint-initdb.d/`):
- `01-init-business-data.sql` — business tables, demo MySQL/orders/products/users data
- `02-init-vault.sql` — creates the `vault_kv_store` table used by Vault as its storage backend
- `03-init-bi-metadata.sql` — BI metadata schema (reports, connections)
- `04-enable-pgvector.sql` — enables the `pgvector` extension for vector similarity search

Uses the `pgvector/pgvector` image (PostgreSQL 15 + the pgvector extension) to support embedding storage for RAG.
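The `pgvector` extension adds vector columns and distance operators to SQL; its `<=>` operator, for example, computes cosine distance between embeddings. As a sketch of what that operator computes (illustration only, not the extension's code):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance (1 - cosine similarity), the quantity
    pgvector's <=> operator computes over two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Identical vectors sit at distance 0; orthogonal vectors at distance 1.
assert abs(cosine_distance([1.0, 0.0], [1.0, 0.0])) < 1e-9
```

A RAG similarity query would then look like `SELECT ... ORDER BY embedding <=> $1 LIMIT 5` (table and column names here are illustrative, not taken from the init scripts).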
redpanda (redpanda)
Image: redpandadata/redpanda:v25.1.7
Ports: 9092 (external Kafka), 29092 (internal Kafka)
Purpose: Kafka-compatible event streaming broker for CDC pipelines.
How it works:
- `ibex-data-platform` publishes pipeline events and CDC change records to Redpanda topics
- Auto-creates topics on first publish (`auto_create_topics_enabled=true`)
- Single-node, single-partition setup (sufficient for current load)
- Data persisted in the `redpanda_data` volume
- Two listeners: `PLAINTEXT://redpanda:29092` for internal container-to-container traffic, `OUTSIDE://localhost:9092` for host access
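The document does not show the shape of a change record, so here is a hypothetical sketch of how a producer might build and publish one over the Kafka protocol (the topic name, field names, and the `kafka-python` client are all assumptions):

```python
import json
import time

def build_cdc_event(table: str, op: str, row: dict) -> dict:
    """Assemble a CDC change record (hypothetical shape; the actual
    schema used by ibex-data-platform is not shown in this document)."""
    return {"table": table, "op": op, "row": row, "ts_ms": int(time.time() * 1000)}

event = build_cdc_event("orders", "update", {"id": 42, "status": "shipped"})
payload = json.dumps(event).encode("utf-8")

# Publishing uses the plain Kafka protocol; inside the compose network the
# internal listener is redpanda:29092 (localhost:9092 from the host), e.g.:
#   from kafka import KafkaProducer           # pip install kafka-python
#   producer = KafkaProducer(bootstrap_servers="redpanda:29092")
#   producer.send("cdc.orders", payload)      # topic auto-created on first publish
#   producer.flush()
```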
minio (minio)
Image: minio/minio:latest
Ports: 9010 (S3 API), 9011 (console UI)
Purpose: S3-compatible object storage. Stores Iceberg table data (Parquet files) and pipeline ingestion staging files.
Two buckets (created by `minio-setup`):
- `warehouse/` — Iceberg table data, managed by `iceberg-rest`; files are Parquet-format column data for analytics tables
- `data-ingestion/` — CDC and pipeline staging area where `ibex-data-platform` writes ingested records before they're committed to Iceberg

Used by `iceberg-rest` and `ibex-data-platform`.
minio-setup (one-shot)
Image: minio/mc:latest
Purpose: Runs once at startup to create the warehouse and data-ingestion buckets in MinIO. Exits after completion. Never restarts.
iceberg-rest (iceberg-rest)
Image: tabulario/iceberg-rest:0.9.0
Port: 8181
Purpose: Apache Iceberg REST catalog. Tracks table metadata (schemas, partition specs, snapshots) for Iceberg tables stored in MinIO.
How it works:
- `ibex-analytics-service` (DuckDB) connects to this catalog via the Iceberg REST API to discover and read Iceberg tables
- The catalog stores table metadata; actual data files live in MinIO `warehouse/` as Parquet
- When `ibex-data-platform` writes a CDC pipeline record to an Iceberg table, it registers the new snapshot here
- DuckDB then reads the latest snapshot via the catalog and executes queries directly against the Parquet files in MinIO
vault-secrets (vault-secrets)
Image: hashicorp/vault:1.15
Port: 8200
Purpose: HashiCorp Vault — secrets manager. Stores all database passwords, MinIO credentials, and API keys at runtime so they’re never hardcoded in config files.
How it works:
- Uses PostgreSQL (the `vault_kv_store` table in `metadata_db`) as its storage backend — no separate volume for secret data
- Starts sealed; `vault-init` unseals it and seeds initial secrets
- `ibex-data-platform` reads credentials from Vault at startup via a root token file mounted from the `vault_keys` shared volume
- Vault config is rendered from a template (`render-vault-config.sh`) on startup
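Reading a credential from Vault's KV engine is a single authenticated HTTP GET. A stdlib-only sketch, assuming the KV-v2 engine at the default `secret/` mount (the actual mount and secret paths seeded by `vault-init` are not listed in this document):

```python
import json
import urllib.request

def kv_v2_url(vault_addr: str, path: str, mount: str = "secret") -> str:
    """KV-v2 read URL per the Vault HTTP API: /v1/<mount>/data/<path>."""
    return f"{vault_addr}/v1/{mount}/data/{path}"

def read_secret(vault_addr: str, token: str, path: str) -> dict:
    """Fetch a KV-v2 secret; values live under data.data in the response."""
    req = urllib.request.Request(kv_v2_url(vault_addr, path),
                                 headers={"X-Vault-Token": token})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]["data"]

# e.g. (hypothetical secret path, token read from the vault_keys volume):
# creds = read_secret("http://vault-secrets:8200", root_token, "postgres")
```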
vault-init (one-shot)
Image: hashicorp/vault:1.15
Purpose: Runs once to initialize, unseal, and seed Vault with initial credentials. Writes the root token to vault_keys volume so ibex-data-platform can read it. Exits after completion.
Secrets it seeds: PostgreSQL password, MySQL password, MinIO credentials, Listmonk API credentials, config manager admin token.
ibex-listmonk (ibex-listmonk)
Image: ghcr.io/ajnacloud-ksj/ibex-listmonk:latest
Port: 9000
Purpose: Email marketing and transactional email platform (Listmonk). Manages mailing lists, subscribers, campaigns, and SMTP delivery.
How it works:
- Stores all data in the `listmonk` database on `postgres-metadata`
- SMTP configured for Gmail (configurable via env vars)
- Exposed externally at listmonk.triviz.cloud via Traefik (full app, not just API)
- File uploads stored in the `listmonk_uploads` volume
ibex-identity-service (ibex-identity-service)
Image: ghcr.io/ajnacloud-ksj/ibex-identity-service:latest
Port: 8090
Purpose: Authentication and user management. Issues and validates JWTs.
How it works:
- Supports two auth modes (set via `IDENTITY_AUTH_MODE`):
  - `local` — users stored in SQLite (`/data/ajna-identity.db`), passwords hashed locally
  - `cognito` — delegates auth to AWS Cognito (User Pool ID + Client ID from env)
- Issues JWT access tokens (default 24h) and refresh tokens (default 7d)
- All other services call `GET /validate` to verify tokens on every request
- Exposed via Traefik at `/auth/*` and `/users/*`
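A JWT is three base64url segments (`header.payload.signature`), and the expiry that distinguishes the 24h access token from the 7d refresh token lives in the payload's standard `exp` claim. A sketch of inspecting that claim without verifying the signature (inspection only; services must still call `GET /validate` for real verification):

```python
import base64
import json

def jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    # base64url decoding requires padding to a multiple of 4
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a toy token to demonstrate (claim values are illustrative):
def _b64(obj: dict) -> str:
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).decode().rstrip("=")

token = f'{_b64({"alg": "HS256"})}.{_b64({"sub": "user-1", "exp": 1735689600})}.sig'
```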
ibex-data-platform (ibex-data-platform)
Image: ghcr.io/ajnacloud-ksj/ibex-data-platform:latest
Port: 8080
Purpose: Config/platform manager. The control plane for data sources, CDC pipelines, business rules, and configurations.
How it works:
- Registers and manages data sources: MySQL, PostgreSQL, Iceberg, S3, REST APIs
- Configures CDC (Change Data Capture) pipelines using Redpanda as the message bus
- Syncs registered sources to `ibex-analytics-service` so DuckDB can attach them as catalogs
- Reads DB credentials from Vault at startup via the root token file mounted at `/vault/keys/root_token`
- Stores all state in `postgres-metadata` (`metadata_db`)
- Exposed via Traefik at `/api/*`, `/configs/*`, `/business-configs/*`, `/health`, `/metrics` (priority 10 — catch-all)
- UI: ibex.triviz.cloud (Config UI — `ibex-platform-ui`)
ibex-analytics-service (ibex-analytics-service)
Image: ghcr.io/ajnacloud-ksj/ibex-analytics-service:latest
Port: 8000
Purpose: Federated SQL execution engine. Embeds DuckDB and attaches heterogeneous data sources as virtual catalogs.
How it works:
- On startup, fetches the data source registry from `ibex-data-platform` (`GET /api/data-sources`)
- Attaches each registered source to DuckDB:
  - MySQL → DuckDB MySQL scanner extension
  - PostgreSQL → DuckDB Postgres scanner extension
  - Iceberg → Iceberg REST catalog → MinIO Parquet files (via S3FileIO)
  - S3/CSV/Parquet → direct file scan
- Queries can reference multiple sources in one SQL statement: `SELECT * FROM mysql_business.orders JOIN postgres_metadata.users`
- Internal only — no Traefik label; only reachable inside the `ajna` Docker network
- Called by: `ibex-bi-backend` (report/analytics queries), `ibex-ai-service` (direct SQL), `ibex-agent-engine` (agent SQL execution)
- Protected by an API key (`ANALYTICS_SERVICE_API_KEY`)
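Attaching registered sources boils down to issuing one DuckDB `ATTACH` per registry entry. A sketch mapping hypothetical registry entries onto DuckDB's MySQL/Postgres attach syntax (the field names are assumptions; the real `GET /api/data-sources` response shape is not shown in this document):

```python
def attach_statement(source: dict) -> str:
    """Build a DuckDB ATTACH statement for a registered source.
    DuckDB syntax: ATTACH '<conn string>' AS <name> (TYPE mysql|postgres)."""
    conn = (f"host={source['host']} user={source['user']} "
            f"password={source['password']} ")
    if source["type"] == "mysql":
        return f"ATTACH '{conn}database={source['database']}' AS {source['name']} (TYPE mysql);"
    if source["type"] == "postgres":
        return f"ATTACH '{conn}dbname={source['database']}' AS {source['name']} (TYPE postgres);"
    raise ValueError(f"unsupported source type: {source['type']}")

stmt = attach_statement({"type": "mysql", "name": "mysql_business",
                         "host": "demo-mysql", "user": "demo",
                         "password": "x", "database": "business"})
# Once attached, a single statement can join across catalogs, e.g.:
#   SELECT * FROM mysql_business.orders o
#   JOIN postgres_metadata.public.users u ON o.user_id = u.id;
```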
ibex-bi-backend (ibex-bi-backend)
Image: ghcr.io/ajnacloud-ksj/ibex-bi-backend:latest
Port: 8085
Purpose: Go-based BI metadata service. The primary API for the BI UI.
How it works:
- Validates every request by calling `ibex-identity-service` (`GET /validate`) — acts as an auth gateway
- Stores BI metadata in SQLite (`/data/ajna-bi-metadata.db` → `ajna_bi_metadata` volume): reports, database connections, dashboards, charts, RAG documents
- Proxies SQL execution to `ibex-analytics-service` for report preview and analytics queries
- Exposes CRUD for: reports, metadata, database connections, dashboards, charts, RAG knowledge base
- Exposed via Traefik at `/api/metadata`, `/api/reports`, `/api/analytics`, `/api/database-connections`, `/internal/data-sources`, `/api/dashboards`, `/api/charts`, `/api/rag` (priority 90)
- UI: bi.triviz.cloud (BI UI — `ajna_data_platform_ui_lib`)
ibex-ai-service (ibex-ai-service)
Image: ghcr.io/ajnacloud-ksj/ibex-ai-service:latest
Port: 8010
Purpose: AI conversation orchestrator. Manages chat sessions and routes user messages to the agent engine.
How it works:
- Receives chat messages from the BI UI (`POST /api/chat`)
- Manages conversation history (CRUD via `/api/conversations`)
- Forwards each user message to `ibex-agent-engine` (`POST /api/agents/{name}/query`)
- Validates JWTs via `ibex-identity-service` on every request
- Streams the agent response back to the browser
- Default agent: `demo_data_analyst` (configurable via `DEFAULT_AGENT_NAME`)
- Exposed via Traefik at `/api/chat` and `/api/conversations` (priority 90)
ibex-agent-engine (ibex-agent-engine)
Image: ghcr.io/ajnacloud-ksj/ibex-agent-engine:latest
Port: 3000
Purpose: Node.js LLM agent runtime. Executes multi-step reasoning pipelines that combine LLM calls with SQL execution.
How it works:
- Each agent is a defined pipeline of nodes (e.g. `tool-kb-1 → tool-exec-1 → tool-gen-1`):
  - `tool-kb-1` — searches the RAG knowledge base for relevant context (table schemas, business docs)
  - `tool-exec-1` — LLM generates SQL, executes it against `ibex-analytics-service`, and returns the data
  - `tool-gen-1` — LLM formats the data into a natural-language response
- LLM routing via `llm-routing.json`: endpoint `https://openrouter.ai/api/v1`, model `openai/gpt-4o-mini`
- Uses `OPENROUTER_API_KEY` to authenticate to OpenRouter
- Internal only — no Traefik label; called exclusively by `ibex-ai-service`
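The three-node flow can be pictured as a plain function chain. In this sketch the node names come from the document; the function signatures and stubbed tools are illustrative:

```python
from typing import Callable

def run_agent(question: str,
              kb_search: Callable[[str], str],
              sql_exec: Callable[[str], list[dict]],
              llm: Callable[[str], str]) -> str:
    """tool-kb-1 -> tool-exec-1 -> tool-gen-1 as a plain pipeline."""
    context = kb_search(question)                               # tool-kb-1: RAG context
    sql = llm(f"Write SQL for: {question}\nSchema: {context}")  # tool-exec-1: generate SQL
    rows = sql_exec(sql)                                        # tool-exec-1: run against analytics
    return llm(f"Answer '{question}' from rows: {rows}")        # tool-gen-1: format response

# Stubbed run (no LLM or database involved):
answer = run_agent(
    "total orders?",
    kb_search=lambda q: "orders(id, total)",
    sql_exec=lambda sql: [{"count": 3}],
    llm=lambda prompt: "SELECT count(*) FROM orders" if "Write SQL" in prompt
                       else "There are 3 orders.",
)
```

In production, `sql_exec` would POST to `ibex-analytics-service` and `llm` would call OpenRouter; the stubs just make the data flow between nodes visible.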
ibex-watchtower (ibex-watchtower)
Image: containrrr/watchtower:1.7.1
Purpose: Automatic container updater. Polls GHCR every 5 minutes and restarts containers when a new image is published.
How it works:
- Only monitors containers with the label `com.centurylinklabs.watchtower.enable=true`
- Pulls the new image → stops the old container → starts a new one (rolling restart)
- Uses the Docker credentials file (`/root/.docker/config.json`) to authenticate to GHCR
- `--cleanup` removes old image layers after each update to save disk
- Important: Watchtower only updates the image — it does NOT re-read `docker-compose.yml`. Changes to labels, env vars, or volumes require a manual `docker compose up -d <service>` on the server.
Traefik Routing
All requests to api.triviz.cloud are routed by path prefix:
| PathPrefix | → Service | Priority |
|---|---|---|
| /auth, /users | ibex-identity-service :8090 | default |
| /api/metadata, /api/reports, /api/analytics, /api/database-connections, /internal/data-sources, /api/dashboards, /api/charts, /api/rag | ibex-bi-backend :8085 | 90 |
| /api/chat, /api/conversations | ibex-ai-service :8010 | 90 |
| /api, /configs, /business-configs, /health, /metrics | ibex-data-platform :8080 | 10 (catch-all) |
| listmonk.triviz.cloud (all paths) | ibex-listmonk :9000 | — |
Note: `ibex-analytics-service` and `ibex-agent-engine` are internal only — no Traefik labels; they are reachable only inside the Docker `ajna` network.
Service Responsibilities
ibex-identity-service
- JWT-based authentication (local SQLite or AWS Cognito)
- User/role management
- Token validation endpoint consumed by `ibex-bi-backend` middleware
- Store: `/data/ajna-identity.db` (SQLite, persisted in a Docker volume)
ibex-data-platform (Config Manager)
- Registers data sources: MySQL, PostgreSQL, Iceberg, S3, APIs
- Manages CDC pipelines and Redpanda Kafka topics
- Syncs connection credentials from HashiCorp Vault
- Propagates the source registry to `ibex-analytics-service`
- UI: ibex.triviz.cloud (Config UI)
ibex-analytics-service
- Embeds DuckDB for in-process federated SQL execution
- Attaches MySQL/Postgres/Iceberg sources as DuckDB catalogs
- Supports fully-qualified cross-source queries: `mysql_business.orders JOIN postgres_metadata.public.users`
- Iceberg tables via REST catalog → MinIO S3 Parquet files
- Ad-hoc file uploads (CSV/Parquet) queryable as temporary DuckDB tables
- Not directly exposed to the internet — called by `ibex-bi-backend`, `ibex-ai-service`, and `ibex-agent-engine`
ibex-bi-backend
- Go service for BI metadata: reports, dashboards, charts, DB connections, RAG knowledge base
- Proxies SQL execution to `ibex-analytics-service`
- Auth middleware validates JWTs via `ibex-identity-service`
- Store: `/data/ajna-bi-metadata.db` (SQLite, persisted in a Docker volume)
- UI: bi.triviz.cloud (BI UI — dashboards, charts, reports, connections, RAG pages)
ibex-ai-service
- Conversation/chat session management (CRUD via `/api/conversations`)
- Orchestrates AI queries: receives the user message → calls `ibex-agent-engine` → returns the response
- Passes analytics context to agents
ibex-agent-engine
- Node.js LLM agent runtime
- Connects to the LLM via OpenRouter (`OPENROUTER_API_KEY`, endpoint `https://openrouter.ai/api/v1`)
- Executes SQL via `ibex-analytics-service` to answer data questions
- Exposes pre-defined agents (e.g. `demo_data_analyst`)
ibex-listmonk
- Email marketing and transactional email
- Uses the `postgres-metadata` DB (separate `listmonk` database)
- SMTP via Gmail (configurable)
Data Flow
Federated SQL Query (Reports / SQL Lab)
AI Chat Query
Authentication Flow
DB Connections — Unified List (UI)
Storage Architecture
Frontend Architecture
Two UIs, One Component Library
Both UIs are built from the shared `ajna_data_platform_ui_lib` component library (React + TypeScript + Vite + Tailwind + shadcn/ui).
| UI | Repo | Hosted | Purpose |
|---|---|---|---|
| BI UI | ajna_data_platform_ui_lib | bi.triviz.cloud (CloudFront → S3 /ajna-data-platform-ui-lib) | Dashboards, charts, reports, DB connections, AI chat, RAG |
| Config UI | ibex-platform-ui | ibex.triviz.cloud (CloudFront → S3 /ibex-platform-ui) | Pipeline management, data sources, monitoring |
Frontend → Backend Mapping
| UI Page | UI | API Calls |
|---|---|---|
| Login | Both | POST /auth/login → ibex-identity-service |
| DB Connections | BI UI | GET /api/database-connections/ (bi-backend) + GET /api/data-sources (data-platform) + GET /analytics/catalogs (analytics-service) |
| Reports (create/edit) | BI UI | POST/PUT /api/metadata/reports → bi-backend → analytics-service (SQL exec) |
| SQL Lab | BI UI | POST /api/analytics/query → bi-backend → analytics-service |
| AI Chat | BI UI | POST /api/chat, GET/POST /api/conversations/ → ai-service → agent-engine → analytics-service |
| Dashboards | BI UI | GET/POST /api/dashboards → bi-backend |
| Charts | BI UI | GET/POST /api/charts/, POST /api/charts/data → bi-backend |
| RAG Knowledge Base | BI UI | GET /api/rag/documents, POST /api/rag/upload, POST /api/rag/query → bi-backend |
| Pipelines | Config UI | GET/POST /api/pipelines → ibex-data-platform |
| Data Sources | Config UI | GET/POST /api/data-sources → ibex-data-platform |
Build & Deploy
BI UI (`ajna_data_platform_ui_lib` → bi.triviz.cloud):
- `VITE_API_BASE_URL` — `https://api.triviz.cloud/api` — base for bi-backend calls
- `VITE_AI_SERVICE_URL` — `https://api.triviz.cloud` — base for ai-service calls
- `VITE_ANALYTICS_API_URL` — `https://api.triviz.cloud/api` — analytics proxy via bi-backend
- `VITE_BI_SERVICE_URL` — `https://api.triviz.cloud` — bi-backend direct calls
- `VITE_PLATFORM_API_URL` — `https://api.triviz.cloud` — ibex-data-platform URL
- `VITE_IDENTITY_SERVICE_URL` — `https://api.triviz.cloud` — identity service

Config UI (`ibex-platform-ui` → ibex.triviz.cloud):
- `VITE_CONFIG_MANAGER_URL` — `https://api.triviz.cloud/api` — ibex-data-platform
- `VITE_IDENTITY_SERVICE_URL` — `https://api.triviz.cloud` — identity service
- `VITE_ANALYTICS_API_URL` — `https://api.triviz.cloud/api/analytics`
- `VITE_INGESTION_API_URL` — `https://api.triviz.cloud/api`
Auto-Deployment (Watchtower)
Inter-Service Communication
All services communicate over the internal Docker bridge network `ajna`. No inter-service traffic leaves the host.
Security
- TLS: Let’s Encrypt via Traefik ACME, all external traffic HTTPS only
- Auth: JWTs issued by `ibex-identity-service`, validated on every request by `ibex-bi-backend` and `ibex-ai-service`
- Credentials: All DB passwords and API keys stored in HashiCorp Vault, injected via environment at runtime
- CORS: Explicit allowlist — `https://ibex.triviz.cloud`, `https://bi.triviz.cloud` on all backend services
- Network isolation: Analytics and Agent Engine are internal-only (no Traefik exposure)
- Mixed Content: All frontend API calls use HTTPS; FastAPI services configured to avoid HTTP redirects
Key Configuration Files
| File | Location | Purpose |
|---|---|---|
| docker-compose.aws.yml | /opt/ibex-platform-runner/ on EC2 | Production service definitions |
| docker-compose.deploy-base.yml | same | Infrastructure services (Postgres, MinIO, Vault, etc.) |
| docker-compose.demo-data.yml | same | Demo MySQL + seed data |
| .env | /opt/ibex-platform-runner/ | Secrets and overrides (not in git) |
| deploy/traefik/traefik.yml.tpl | same | Traefik config template |
| deploy/init-db/init.sql | same | PostgreSQL schema + demo data |
| deploy/scripts/vault-init.sh | same | Vault unseal + credential seeding |
