
Deployment Logs & Monitoring

Real-time deployment monitoring with comprehensive logging, progress tracking, and diagnostics.


Overview

OEC.SH provides real-time deployment monitoring through Server-Sent Events (SSE), allowing you to watch deployments progress step-by-step with live log streaming. Every deployment action is tracked, logged, and persisted for historical analysis and troubleshooting.

Key Features:

  • Real-time Log Streaming: Watch logs appear as deployment progresses via SSE
  • 17-Stage Deployment Pipeline: Granular visibility into each deployment phase
  • Historical Logs: Access logs from any past deployment
  • Log Levels: INFO, WARNING, ERROR, DEBUG with filtering
  • Performance Metrics: Track deployment duration and stage timing
  • Error Diagnostics: Detailed error messages with troubleshooting context

Deployment Progress Tracking

Real-Time Progress UI

The deployment progress component provides live updates during deployment:

import { DeploymentProgress } from "@/components/DeploymentProgress";
 
<DeploymentProgress
  deploymentId={deployment.id}
  environmentId={environment.id}
  pollingInterval={2000}
  onComplete={() => console.log("Deployment complete!")}
  onError={(error) => console.error(error)}
/>

Features:

  • Collapsible Steps: Completed deployments collapsed by default, active deployments expanded
  • Progress Percentage: 0-100% progress bar based on completed steps
  • Step Status Icons: ✓ Completed, ⟳ Running, ✗ Failed, ○ Pending
  • Timing Information: Shows duration for each stage and total deployment time
  • Log Expansion: Click to view detailed logs for each deployment step
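
The progress percentage can be derived from the step list alone; a minimal sketch (the function name and shortened step list are illustrative, not the component's actual code):

```python
def progress_percent(steps: list[str], completed: set[str]) -> int:
    """0-100% progress bar value based on how many pipeline steps completed."""
    return int(100 * sum(step in completed for step in steps) / len(steps))

# A shortened pipeline for illustration
pipeline = ["initializing", "connecting", "creating_network", "configuring_dns"]
print(progress_percent(pipeline, {"initializing", "connecting"}))  # 50
```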

Deployment Stages

OEC.SH deployments follow a 17-stage pipeline. Each stage is tracked independently with its own logs and timing:

Stage 1: Initializing

Purpose: Prepare deployment configuration and validate environment

Logs:

Deployment created and queued
Deploying MyProject (production)
Instance UUID: 3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c
Target server: 165.22.65.97

What happens:

  • Validate environment configuration
  • Check VM assignment
  • Create deployment record in database
  • Set environment status to "deploying"

Stage 2: Connecting

Purpose: Establish SSH connection to target server

Logs:

Connecting to 165.22.65.97...
SSH connection established (0.8s)

What happens:

  • Decrypt SSH credentials (if encrypted)
  • Connect via SSH (password or key-based)
  • Validate server accessibility

Common Errors:

  • Failed to connect to server via SSH: Check server IP, SSH credentials, firewall rules

Stage 3: Creating Network

Purpose: Set up isolated Docker network for environment

Logs:

Creating network paasportal_net_3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c...
Network created (0.3s)

What happens:

  • Create isolated Docker bridge network: paasportal_net_{env_uuid}
  • Ensure container isolation between environments
  • Allow Odoo ↔ PostgreSQL communication

Network Naming Convention:

paasportal_net_{environment_uuid}

Stage 4: Configuring DNS

Purpose: Create DNS record early to allow propagation during deployment

Logs:

Configuring DNS record (early for propagation)...
DNS configured: staging-myproject.oecsh.com -> 165.22.65.97 (1.2s)

What happens:

  • Create/update DNS A record if DNS provider configured
  • Critical for SSL certificate issuance (Let's Encrypt ACME challenge)
  • Non-fatal: Deployment continues if DNS setup fails

DNS Configuration:

  • Provider: Cloudflare DNS (configured in Organization Settings)
  • Record format: {subdomain}.{apps_domain} → {vm_ip}
  • TTL: 300 seconds (5 minutes)
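
Assuming the Cloudflare v4 API record shape, the A record described above could be assembled as follows (the helper name is hypothetical, not the platform's actual code):

```python
def dns_record_payload(subdomain: str, apps_domain: str, vm_ip: str) -> dict:
    """Build an A-record payload in the Cloudflare v4 API shape."""
    return {
        "type": "A",
        "name": f"{subdomain}.{apps_domain}",  # e.g. staging-myproject.oecsh.com
        "content": vm_ip,                      # target VM IP
        "ttl": 300,                            # 5 minutes, as above
        "proxied": False,                      # plain DNS record (not proxied)
    }

print(dns_record_payload("staging-myproject", "oecsh.com", "165.22.65.97")["name"])
```

The payload would be POSTed to the zone's `dns_records` endpoint with a bearer token from Organization Settings.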

Common Warnings:

DNS setup skipped (1.0s): No DNS provider configured

Stage 5: Creating Database

Purpose: Deploy PostgreSQL container with optimized configuration

Logs:

Creating PostgreSQL container 3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c_db...
PostgreSQL container running (3.5s)
Waiting for PostgreSQL to accept connections...
PostgreSQL is ready and accepting connections
PostgreSQL optimized (2.1s)

What happens:

  1. Container Creation: Start PostgreSQL 15 container with resource limits
  2. Readiness Check: Poll until PostgreSQL accepts connections (max 30 retries)
  3. PGTune Optimization: Apply performance tuning based on available RAM
  4. PgBouncer Setup: Deploy connection pooler (transaction mode)
  5. Read Replica (Odoo 18+): Configure streaming replication if enabled

PostgreSQL Resources:

  • CPU: 30% of total environment allocation
  • RAM: 30% of total environment allocation
  • Volume: paasportal_pgdata_{env_uuid}

Container Naming:

{environment_uuid}_db            # Primary PostgreSQL
postgres-primary-{env_uuid}      # With read replica (Sprint 2E40)
postgres-replica-{env_uuid}      # Read replica container

PgBouncer Configuration:

  • Mode: Transaction pooling
  • Port: 6432 (primary), 6433 (replica)
  • Max connections: Based on RAM (50 per GB)
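
The "50 connections per GB" sizing rule can be sketched as a small helper (the function name and returned keys are illustrative, not the platform's actual configuration code):

```python
def pgbouncer_limits(ram_gb: float) -> dict:
    """Derive PgBouncer settings from available RAM (50 client connections per GB)."""
    return {
        "pool_mode": "transaction",           # transaction pooling, as above
        "listen_port": 6432,                  # primary; the replica pooler uses 6433
        "max_client_conn": int(ram_gb * 50),  # RAM-based connection cap
    }

print(pgbouncer_limits(2.0)["max_client_conn"])  # 100
```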

Stage 6: Cloning Platform Repos

Purpose: Clone platform-level addon repositories (shared across all projects)

Logs:

Cloning 3 platform addon repositories...
[1/3] Cloning platform-reporting (17.0)...
[1/3] platform-reporting ready (2.3s)
[2/3] Cloning platform-integrations (17.0)...
[2/3] platform-integrations ready (1.8s)
[3/3] Cloning platform-utilities (17.0)...
[3/3] platform-utilities ready (1.5s)
All 3 platform repos ready (5.6s)

What happens:

  • Clone all platform addon repositories configured by portal administrators
  • Each repo cloned to: /opt/paasportal/{env_uuid}/addons/{repo_slug}
  • Supports both public and private repositories (OAuth token authentication)
  • Failures logged as warnings (non-fatal)

Repository Structure:

/opt/paasportal/{env_uuid}/addons/
├── platform-reporting/       # Platform repo
├── platform-integrations/    # Platform repo
└── platform-utilities/       # Platform repo

Stage 7: Cloning Organization Repos

Purpose: Clone organization-level addon repositories (shared within organization)

Logs:

Cloning 2 organization addon repositories...
[1/2] Cloning org-custom-modules (17.0)...
[1/2] org-custom-modules ready (3.1s)
[2/2] Cloning org-theme (17.0)...
[2/2] org-theme ready (1.9s)
All 2 organization repos ready (5.0s)

What happens:

  • Clone organization-specific addon repositories
  • Same directory structure as platform repos
  • Supports Git connection authentication (GitHub, GitLab)

Stage 8: Cloning Project Repository

Purpose: Clone the primary project repository containing project-specific code

Logs:

Cloning project repository...
Project repository cloned (4.2s)
Commit: a3b2c1d by John Doe - Add customer portal module

What happens:

  • Clone project's primary Git repository
  • Capture Git metadata: commit SHA, message, author, date
  • Clone to: /opt/paasportal/{env_uuid}/addons/{repo_name}
  • Use specified branch (from environment configuration)

Git Information Captured:

{
  "git_commit": "a3b2c1d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0",
  "git_message": "Add customer portal module",
  "git_branch": "main",
  "committer_name": "John Doe",
  "committer_email": "john@example.com",
  "commit_date": "2025-01-08T14:32:15Z"
}

Private Repository Authentication:

  • GitHub/GitLab OAuth tokens used automatically
  • Tokens injected into Git URLs: https://{token}@github.com/org/repo.git
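
Token injection into a clone URL can be done with the standard library; a minimal sketch (the function name is hypothetical):

```python
from urllib.parse import urlsplit, urlunsplit

def authenticated_url(git_url: str, token: str) -> str:
    """Inject an OAuth token into an HTTPS clone URL."""
    parts = urlsplit(git_url)
    # Prefix the host with the token as userinfo; drop query/fragment
    return urlunsplit((parts.scheme, f"{token}@{parts.netloc}", parts.path, "", ""))

print(authenticated_url("https://github.com/org/repo.git", "TOKEN"))
# https://TOKEN@github.com/org/repo.git
```

In practice the resulting URL should never be logged, since it embeds the credential.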

Stage 9: Pulling Image

Purpose: Download Odoo Docker image from registry

Logs:

Pulling Odoo 18.0 image...
Image ready (12.4s)

What happens:

  • Pull Docker image specified in OdooVersion configuration
  • Supports private registries with authentication
  • Image format: registry.example.com/namespace/odoo:18.0
  • Uses cached image if already present on server

Docker Images:

  • Official: odoo:17.0, odoo:18.0, odoo:19.0
  • Custom: Portal administrators can configure custom image URLs
  • Private registries: Registry credentials stored securely (encrypted)

Cache Behavior:

# Docker checks local image first
docker pull odoo:18.0
# Only downloads if:
# - Image not present locally
# - Newer version available (for :latest tag)

Stage 10: Generating Config

Purpose: Create optimized odoo.conf configuration file

Logs:

Generating Odoo configuration...
Configuration generated
Generated odoo.conf with 8 custom overrides

What happens:

  1. Build addons_path: Include all addon repositories in priority order
  2. Database Connection: Configure PgBouncer connection (host={uuid}_pgbouncer, port=6432)
  3. Read Replica: Add db_replica_host for Odoo 18+ if replica enabled
  4. Performance Tuning: Apply CPU/RAM-based worker calculation
  5. Custom Overrides: Merge user-specified odoo.conf parameters

Addons Path Priority (highest to lowest):

  1. Project additional repositories
  2. Project primary repository
  3. Organization repositories
  4. Platform repositories
  5. Base Odoo addons (/usr/lib/python3/dist-packages/odoo/addons)
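
The priority order above amounts to a simple ordered concatenation; a sketch assuming each tier is already a list of directories (names are illustrative):

```python
def build_addons_path(project_extra, project_primary, org_repos, platform_repos):
    """Join addon directories in the priority order listed above (highest first)."""
    base_addons = "/usr/lib/python3/dist-packages/odoo/addons"
    return ",".join([*project_extra, *project_primary, *org_repos, *platform_repos, base_addons])

print(build_addons_path(
    [],                                    # no additional project repos
    ["/mnt/extra-addons/my-project"],      # primary project repo
    ["/mnt/extra-addons/org-modules"],     # organization repos
    ["/mnt/extra-addons/platform-utils"],  # platform repos
))
```

The output matches the addons_path line in the configuration example below.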

Configuration Example:

[options]
addons_path = /mnt/extra-addons/my-project,/mnt/extra-addons/org-modules,/mnt/extra-addons/platform-utils,/usr/lib/python3/dist-packages/odoo/addons
data_dir = /var/lib/odoo
db_host = 3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c_pgbouncer
db_port = 6432
db_user = odoo
db_name = 3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c
db_password = {generated_secure_password}
dbfilter = ^3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c$
workers = 4
limit_memory_hard = 2684354560
limit_memory_soft = 2147483648
limit_time_cpu = 900
proxy_mode = True
logfile = False

Odoo 18+ Replica Configuration:

db_replica_host = 3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c_pgbouncer-replica
db_replica_port = 6433

Custom Overrides: Users can specify custom odoo.conf parameters in environment settings. These override defaults:

{
  "performance": {
    "workers": 8,
    "max_cron_threads": 2
  },
  "logging": {
    "log_level": "debug",
    "log_handler": ":DEBUG"
  }
}

Stage 11: Starting Container

Purpose: Launch Odoo Docker container with resource limits and networking

Logs:

Starting Odoo container 3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c_odoo...
Applying Odoo resource limits: CPU=1.4 cores, RAM=1434m (70% of allocated)
Container started: f2e3d4c5b6a7 (3.2s)
Odoo container started on internal network: paasportal_net_3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c
Container connected to traefik-public network

What happens:

  1. Stop and remove existing container (if redeployment)
  2. Create directories with proper permissions (uid=100, gid=101 for Odoo user)
  3. Start Odoo container on internal network (for database access)
  4. Connect to traefik-public network (for HTTP routing)
  5. Apply Traefik labels for routing and SSL

Resource Allocation:

  • Total Environment Resources: CPU cores, RAM, Disk (configured in environment settings)
  • Odoo Container: 70% of total (remaining 30% for PostgreSQL)
  • Worker Calculation: workers = max(1, int(odoo_cpu_cores * 2))
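
The 70/30 split and the worker formula combine into one small calculation; a sketch (the helper name is assumed, not the platform's actual code):

```python
def odoo_resources(total_cpu: float, total_ram_mb: int) -> dict:
    """Apply the 70% Odoo share and the worker formula described above."""
    odoo_cpu = total_cpu * 0.7            # remaining 30% goes to PostgreSQL
    return {
        "cpus": round(odoo_cpu, 1),
        "memory_mb": int(total_ram_mb * 0.7),
        "workers": max(1, int(odoo_cpu * 2)),
    }

print(odoo_resources(2.0, 2048))  # {'cpus': 1.4, 'memory_mb': 1433, 'workers': 2}
```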

Docker Run Command:

docker run -d \
  --name {env_uuid}_odoo \
  --network paasportal_net_{env_uuid} \
  --restart unless-stopped \
  --memory 1434m \
  --cpus 1.4 \
  --log-driver json-file \
  --log-opt max-size=10m \
  --log-opt max-file=3 \
  -v /opt/paasportal/{env_uuid}/odoo.conf:/etc/odoo/odoo.conf:ro \
  -v /opt/paasportal/{env_uuid}/addons:/mnt/extra-addons:ro \
  -v /opt/paasportal/{env_uuid}/data:/var/lib/odoo \
  -v /opt/paasportal/{env_uuid}/logs:/var/log/odoo \
  -e ODOO_RC=/etc/odoo/odoo.conf \
  -l traefik.enable=true \
  -l traefik.http.routers.{env_uuid}.rule=Host(`staging-myproject.oecsh.com`) \
  -l traefik.http.routers.{env_uuid}.tls.certresolver=letsencrypt \
  odoo:18.0

Volume Mounts:

  • /opt/paasportal/{env_uuid}/odoo.conf → /etc/odoo/odoo.conf (read-only)
  • /opt/paasportal/{env_uuid}/addons → /mnt/extra-addons (read-only)
  • /opt/paasportal/{env_uuid}/data → /var/lib/odoo (read-write, contains filestore/sessions)
  • /opt/paasportal/{env_uuid}/logs → /var/log/odoo (read-write)

Networking:

  • Internal Network: paasportal_net_{env_uuid} - For PostgreSQL/PgBouncer communication
  • Traefik Network: traefik-public - For HTTP routing and SSL termination

Stage 12: Installing Dependencies

Purpose: Install apt.txt and requirements.txt from all addon repositories

Logs:

Checking ALL repositories for dependencies (apt.txt, requirements.txt)...
Scanning all repos for apt.txt and requirements.txt...
Dependencies installed from: platform-integrations, my-project
Total apt packages: 5
Total pip packages: 12
All dependencies installed (8.3s)

What happens:

  1. Scan all addon repositories for apt.txt and requirements.txt
  2. Merge all apt packages and install via apt-get install (inside container)
  3. Merge all pip packages and install via pip3 install (inside container)
  4. Execute as root user: docker exec --user root {container} ...

Dependency Files:

# apt.txt - System packages
postgresql-client
libldap2-dev
libsasl2-dev

# requirements.txt - Python packages
python-ldap==3.4.3
requests-oauthlib==1.3.1
stripe==5.4.0

Installation Process:

# 1. Collect all apt.txt files from repos
docker exec --user root {container} apt-get update
docker exec --user root {container} apt-get install -y \
  postgresql-client libldap2-dev libsasl2-dev wkhtmltopdf
 
# 2. Collect all requirements.txt files
docker exec --user root {container} pip3 install \
  python-ldap==3.4.3 requests-oauthlib==1.3.1 stripe==5.4.0
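
Merging the per-repo dependency files into a single install pass can be sketched as follows (the helper name is illustrative; the actual merge behavior may differ):

```python
def merge_dependency_files(contents: list[str]) -> list[str]:
    """Merge apt.txt/requirements.txt contents from all repos,
    dropping blanks, comments, and duplicates while keeping first-seen order."""
    seen, merged = set(), []
    for text in contents:
        for line in text.splitlines():
            pkg = line.strip()
            if not pkg or pkg.startswith("#") or pkg in seen:
                continue
            seen.add(pkg)
            merged.append(pkg)
    return merged

print(merge_dependency_files(["postgresql-client\nlibldap2-dev", "libldap2-dev\nlibsasl2-dev"]))
# ['postgresql-client', 'libldap2-dev', 'libsasl2-dev']
```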

Performance Note:

  • Cached on server after first installation
  • Subsequent deployments only install new packages
  • Failed installations logged as warnings (non-fatal)

Stage 13: Initializing Database

Purpose: Initialize Odoo database or restore from migration backup

Logs:

Fresh Database:

Initializing Odoo database (this may take 1-2 minutes)...
Database initialization complete (94.3s)

Migration Restore:

Restoring from migration backup...
Restoring database from migration backup...
Migration restore complete (127.5s)

What happens:

Fresh Database Initialization:

  1. Run odoo -d {db_name} -i base --stop-after-init
  2. Install base Odoo modules
  3. Create admin user with generated password
  4. Initialize database schema

Migration Restore:

  1. Download backup from R2 cloud storage
  2. Extract ZIP (dump.sql + filestore.tar.gz + manifest.json)
  3. Restore PostgreSQL dump via psql
  4. Restore filestore to /var/lib/odoo/filestore/{db_name}
  5. Update ir_config_parameter for sanitization

Critical Flag: migration_restore_completed

# IMPORTANT: Only restore ONCE on first deployment
# Prevents data loss on redeployment
if environment.migration_id and not environment.migration_restore_completed:
    # Restore from backup
    await restore_from_migration(config, db_password)
    # Mark as completed to prevent future restores
    environment.migration_restore_completed = True
    environment.migration_restore_completed_at = datetime.now(UTC)

Database Initialization Timing:

  • Fresh database: 60-120 seconds
  • Migration restore: 90-180 seconds (depends on backup size)
  • Large databases (>5GB): 300+ seconds

Stage 14: Verifying DNS

Purpose: Verify DNS propagation before SSL certificate issuance

Logs:

Verifying DNS propagation before SSL...
Verifying DNS propagation (may take up to 3 minutes)...
DNS verified - safe to proceed with SSL (28.5s)

What happens:

  1. Query DNS servers for environment subdomain
  2. Verify DNS record points to correct VM IP
  3. Retry every 10 seconds for up to 3 minutes
  4. Non-fatal: Continues even if verification times out

DNS Verification:

import time

import dns.resolver

def verify_dns_propagation(subdomain, apps_domain, vm_ip):
    hostname = f"{subdomain}.{apps_domain}"  # staging-myproject.oecsh.com
    deadline = time.monotonic() + 180        # retry for up to 3 minutes
    while time.monotonic() < deadline:
        try:
            answers = dns.resolver.resolve(hostname, "A")
            if str(answers[0]) == vm_ip:
                return True  # DNS propagated successfully
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            pass  # record not visible yet
        time.sleep(10)                       # retry every 10 seconds
    return False  # timed out - non-fatal, deployment continues

Why This Matters:

  • Let's Encrypt requires DNS to resolve for ACME HTTP-01 challenge
  • If DNS not propagated, SSL certificate issuance fails
  • Traefik will retry automatically, but delays environment accessibility

Common Warnings:

DNS verification timed out (180.0s) - proceeding anyway
SSL may take longer to provision

Stage 15: Configuring Traefik

Purpose: Configure HTTP routing and SSL certificate management

Logs:

Configuring Traefik routing...
Traefik configured (0.5s)

What happens:

  1. Traefik detects container via Docker labels (Docker provider)
  2. Creates HTTP router for domain
  3. Requests SSL certificate from Let's Encrypt
  4. Configures automatic HTTP → HTTPS redirect

Traefik Labels Applied:

traefik.enable: true
traefik.http.routers.{env_uuid}.rule: Host(`staging-myproject.oecsh.com`)
traefik.http.routers.{env_uuid}.entrypoints: websecure
traefik.http.routers.{env_uuid}.tls: true
traefik.http.routers.{env_uuid}.tls.certresolver: letsencrypt
traefik.http.services.{env_uuid}.loadbalancer.server.port: 8069
 
# Middlewares
traefik.http.middlewares.{env_uuid}-ratelimit.ratelimit.average: 100
traefik.http.middlewares.{env_uuid}-ratelimit.ratelimit.burst: 50

SSL Certificate Issuance:

  1. Traefik sends ACME HTTP-01 challenge to Let's Encrypt
  2. Let's Encrypt validates domain ownership via HTTP request
  3. Certificate issued and stored in /letsencrypt/acme.json
  4. Auto-renewal 30 days before expiration

Routing Flow:

Client Request (HTTP/HTTPS)
  ↓
Traefik (165.22.65.97:80/443)
  ↓  Host-based routing: staging-myproject.oecsh.com
Odoo Container (3f4a5e2b_odoo:8069)
  ↓
Odoo Web Server

Stage 16: Health Check

Purpose: Verify environment is accessible and responding

Logs:

Performing health check...
Health check passed! (5.2s)

What happens:

  1. Wait 10 seconds for Odoo to fully start
  2. Send HTTP request to http://{container_ip}:8069/web/health
  3. Verify HTTP 200 response
  4. Non-fatal: Logs warning if check fails but container is running

Health Check Implementation:

# Wait for Odoo to start
await asyncio.sleep(10)
 
# HTTP health check
url = f"http://{container_ip}:8069/web/health"
response = requests.get(url, timeout=10)
 
if response.status_code == 200:
    return True  # Healthy
else:
    logger.warning("Health check returned non-200 status")
    return False  # Warning, but deployment continues

Why Health Check May Fail:

  • Odoo still initializing (database migration in progress)
  • High CPU load (container slow to respond)
  • Network connectivity issues

Result:

  • Success: Environment immediately accessible
  • Warning: Environment may need a few more minutes to become fully responsive

Stage 17: Completed

Purpose: Finalize deployment and update status

Logs:

Deployment completed successfully! Total time: 187.3s (3.1 min)
Deployment completed! Total: 187.3s (3.1 min)

What happens:

  1. Calculate total deployment duration
  2. Update deployment status to SUCCESS
  3. Update environment status to RUNNING
  4. Store container ID and metadata
  5. Commit all database changes

Deployment Metadata:

{
  "deployment_id": "d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5",
  "status": "success",
  "duration_seconds": 187.3,
  "container_id": "f2e3d4c5b6a7...",
  "container_name": "3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c_odoo",
  "git_commit": "a3b2c1d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0",
  "git_branch": "main"
}

Average Deployment Times:

  • Fresh deployment: 2-4 minutes
  • Redeployment (cached images): 1-2 minutes
  • Large migrations: 5-10 minutes
  • With many addons: 3-6 minutes

Viewing Deployment Logs

Real-Time Log Streaming

Logs are streamed in real-time via Server-Sent Events (SSE) during active deployments:

import { useSSEEvent, subscribeToEvents } from "@/hooks/useEventStream";
 
// Subscribe to deployment progress events
useSSEEvent("deployment_progress", (event) => {
  console.log("Deployment update:", event.data);
  // event.data contains: deployment_id, status, step, message, timestamp
});
 
// Or subscribe programmatically
const unsubscribe = subscribeToEvents("deployment_progress", (event) => {
  if (event.data.deployment_id === myDeploymentId) {
    updateProgressUI(event.data);
  }
});

SSE Connection:

GET /api/v1/events/stream?token={jwt_token}

Event Format:

event: deployment_progress
data: {"type":"deployment_progress","data":{"deployment_id":"d4e5f6a7...","step":"pulling_image","message":"Image ready (12.4s)","level":"info","timestamp":"2025-01-08T14:35:22Z"}}

Historical Logs

Access logs from any past deployment via the API:

Endpoint:

GET /api/v1/deployments/{deployment_id}/logs

Query Parameters:

  • level (optional): Filter by log level (debug, info, warning, error)
  • skip (optional): Pagination offset (default: 0)
  • limit (optional): Max logs to return (default: 500)

Response:

{
  "deployment_id": "d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5",
  "logs": [
    {
      "id": "log-001",
      "level": "info",
      "message": "Deployment created and queued",
      "timestamp": "2025-01-08T14:32:15Z",
      "data": null
    },
    {
      "id": "log-002",
      "level": "info",
      "message": "Deploying MyProject (production)",
      "timestamp": "2025-01-08T14:32:16Z",
      "data": null
    },
    {
      "id": "log-015",
      "level": "error",
      "message": "Failed to clone repository: Authentication failed",
      "timestamp": "2025-01-08T14:33:45Z",
      "data": {
        "git_url": "https://github.com/org/private-repo.git",
        "error_code": "AUTH_FAILED"
      }
    }
  ]
}

Log Levels

All deployment logs are tagged with a severity level:

Level     Purpose               Examples
INFO      Normal operation      "Container started", "Database initialized"
WARNING   Non-fatal issues      "Health check failed", "DNS verification timed out"
ERROR     Fatal errors          "Failed to connect to server", "Container crashed"
DEBUG     Detailed diagnostics  SSH command output, Docker inspect results

Filtering by Level:

GET /api/v1/deployments/{id}/logs?level=error

Returns only ERROR-level logs for troubleshooting failed deployments.


SSE Integration

How SSE Works

Server-Sent Events (SSE) provide unidirectional real-time updates from server to client over HTTP.

Architecture:

Backend (FastAPI)
  ↓ Redis pub/sub
Redis Channel: "sse:events"
  ↓ Subscribe
SSE Endpoint: /api/v1/events/stream
  ↓ HTTP Stream
Frontend (EventSource API)
  ↓ Dispatch
React Components

Key Benefits:

  • Simple: Uses standard HTTP (no WebSocket complexity)
  • Scalable: Redis pub/sub allows multiple backend workers
  • Reliable: Automatic reconnection with exponential backoff

Backend Implementation

Publishing Events:

from api.v1.routes.events import broadcast_to_organization
 
# Broadcast deployment progress to organization members
await broadcast_to_organization(
    org_id=project.organization_id,
    event_type="deployment_progress",
    data={
        "deployment_id": str(deployment.id),
        "environment_id": str(environment.id),
        "project_id": str(project.id),
        "status": "running",
        "step": "pulling_image",
        "message": "Pulling Odoo 18.0 image...",
        "progress_percent": 45
    }
)

Event Stream Endpoint:

@router.get("/stream")
async def event_stream(
    request: Request,
    token: str = Query(..., description="JWT access token"),
):
    """
    SSE endpoint for real-time updates using Redis pub/sub.
 
    Authentication: Pass JWT token as query parameter since browser
    EventSource doesn't support custom headers.
    """
    user_id = await get_user_from_token(token)
 
    return StreamingResponse(
        event_generator(request, user_id),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
            "X-Accel-Buffering": "no",  # Disable nginx buffering
        },
    )

Redis Pub/Sub:

# backend/api/v1/routes/events.py
SSE_CHANNEL = "sse:events"
 
async def broadcast_to_organization(org_id: str, event_type: str, data: dict):
    """Broadcast event via Redis pub/sub to all connected SSE clients."""
    r = redis.from_url(get_redis_url())
    message = json.dumps({
        "org_id": str(org_id),
        "type": event_type,
        "data": data
    })
    subscriber_count = await r.publish(SSE_CHANNEL, message)
    logger.info(f"SSE broadcast: type={event_type}, receivers={subscriber_count}")
    await r.aclose()

Frontend Implementation

SSE Provider (React Context):

// frontend/src/providers/SSEProvider.tsx
import { SSEProvider } from "@/providers/SSEProvider";
 
// Wrap app with SSEProvider at root level
<SSEProvider>
  <App />
</SSEProvider>

useEventStream Hook:

// frontend/src/hooks/useEventStream.ts
export function useEventStream() {
  const { isAuthenticated } = useAuthStore();
  const eventSourceRef = useRef<EventSource | null>(null);
 
  const connect = useCallback(() => {
    const token = getAccessToken();
    const url = `${API_URL}/events/stream?token=${encodeURIComponent(token)}`;
 
    const eventSource = new EventSource(url);
    eventSourceRef.current = eventSource;
 
    eventSource.addEventListener("deployment_progress", (e) => {
      const parsed = JSON.parse(e.data);
      dispatchEvent(parsed);
    });
 
    eventSource.onerror = () => {
      // Reconnect with exponential backoff (see Reconnection Strategy below)
      eventSource.close();
      setTimeout(connect, backoffDelay);
    };
  }, [isAuthenticated]);
 
  useEffect(() => {
    connect();
    return () => eventSourceRef.current?.close();
  }, [connect]);
 
  return { isConnected: eventSourceRef.current?.readyState === EventSource.OPEN };
}

Subscribing to Events:

import { useSSEEvent } from "@/hooks/useEventStream";
 
function DeploymentMonitor({ deploymentId }: Props) {
  useSSEEvent("deployment_progress", (event) => {
    if (event.data.deployment_id === deploymentId) {
      setProgress(event.data);
 
      if (event.data.status === "success") {
        toast.success("Deployment completed!");
      } else if (event.data.status === "failed") {
        toast.error("Deployment failed: " + event.data.error_message);
      }
    }
  });
 
  return <div>Monitoring deployment {deploymentId}...</div>;
}

Event Types

OEC.SH broadcasts various event types via SSE:

Event Type              Description                  Payload
connected               SSE connection established   {"status": "connected"}
deployment_progress     Deployment step update       {"deployment_id", "step", "status", "message"}
environment_status      Environment status change    {"environment_id", "status"}
migration_progress      Migration restore progress   {"migration_id", "step", "progress_percent"}
replica.health_updated  Read replica health change   {"environment_id", "replica_status", "lag_bytes"}
alert_triggered         Monitoring alert fired       {"alert_id", "severity", "message"}
permissions_changed     User permissions updated     {"user_id", "organization_id"}
Example: Deployment Progress Event:

{
  "type": "deployment_progress",
  "data": {
    "deployment_id": "d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5",
    "environment_id": "3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c",
    "project_id": "1a2b3c4d-5e6f-7g8h-9i0j-1k2l3m4n5o6p",
    "project_name": "MyProject",
    "environment_name": "production",
    "status": "running",
    "step": "installing_dependencies",
    "message": "All dependencies installed (8.3s)",
    "progress_percent": 72,
    "triggered_by": "user-uuid",
    "branch": "main"
  }
}

Nginx Configuration

Critical for SSE: Nginx must disable buffering for /api/v1/events/ endpoint:

# nginx/conf.d/devsh.openeducat.ai.conf
location /api/v1/events/ {
    proxy_pass http://backend;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
 
    # Disable buffering for SSE
    proxy_buffering off;
    proxy_cache off;
    chunked_transfer_encoding off;
 
    # Long timeouts for SSE connections (24 hours)
    proxy_read_timeout 86400s;
    proxy_send_timeout 86400s;
 
    # Headers for SSE
    add_header Cache-Control "no-cache";
    add_header X-Accel-Buffering "no";
}

Why This Matters:

  • Default nginx buffering delays SSE events by up to 60 seconds
  • proxy_buffering off ensures events stream immediately
  • X-Accel-Buffering: no prevents any downstream buffering
  • Long timeouts (24 hours) keep SSE connections alive

Reconnection Strategy

Frontend Reconnection:

const maxReconnectAttempts = 10;
const baseReconnectDelay = 1000;  // 1 second
let reconnectAttempts = 0;
 
eventSource.onerror = () => {
  eventSource.close();
 
  if (reconnectAttempts < maxReconnectAttempts) {
    const delay = baseReconnectDelay * Math.pow(2, reconnectAttempts);
    console.log(`[SSE] Reconnecting in ${delay}ms (attempt ${reconnectAttempts + 1})`);
 
    setTimeout(() => {
      reconnectAttempts++;
      connect();
    }, delay);
  } else {
    console.error("[SSE] Max reconnect attempts reached");
  }
};
 
eventSource.onopen = () => {
  reconnectAttempts = 0;  // Reset on successful connection
};

Exponential Backoff:

  • Attempt 1: 1 second
  • Attempt 2: 2 seconds
  • Attempt 3: 4 seconds
  • Attempt 4: 8 seconds
  • Attempt 5: 16 seconds
  • Attempt 6: 32 seconds
  • Attempt 7: 64 seconds
  • Attempt 8: 128 seconds
  • Attempt 9: 256 seconds
  • Attempt 10: 512 seconds
  • Max attempts: Give up after 10 tries

Keepalive Messages:

# backend/api/v1/routes/events.py
try:
    message = await asyncio.wait_for(pubsub.get_message(), timeout=30.0)
    if message:
        yield f"event: {event_type}\ndata: {event_data}\n\n"
except TimeoutError:
    # Send keepalive ping every 30 seconds
    yield ": keepalive\n\n"

Prevents connection timeout from proxies and firewalls.


Deployment History

Listing Deployments

Endpoint:

GET /api/v1/deployments?project_id={uuid}

Query Parameters:

  • project_id (optional): Filter by project
  • status (optional): Filter by status (pending, running, success, failed)
  • skip (default: 0): Pagination offset
  • limit (default: 50): Max results per page

Response:

[
  {
    "id": "d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5",
    "project_id": "1a2b3c4d-5e6f-7g8h-9i0j-1k2l3m4n5o6p",
    "environment_id": "3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c",
    "version": 5,
    "status": "success",
    "trigger": "manual",
    "triggered_by": "user-uuid",
    "git_commit": "a3b2c1d4e5f6g7h8",
    "git_branch": "main",
    "git_message": "Add customer portal module",
    "started_at": "2025-01-08T14:32:15Z",
    "completed_at": "2025-01-08T14:35:22Z",
    "duration_seconds": 187.3,
    "container_id": "f2e3d4c5b6a7...",
    "error_message": null,
    "created_at": "2025-01-08T14:32:15Z"
  },
  {
    "id": "c3d4e5f6-a7b8-9c0d-1e2f-3a4b5c6d7e8f",
    "status": "failed",
    "error_message": "Failed to connect to server via SSH",
    "duration_seconds": 3.2,
    "...": "..."
  }
]

Latest Deployment

Endpoint:

GET /api/v1/deployments/environment/{environment_id}/latest

Response:

{
  "has_deployment": true,
  "environment_id": "3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c",
  "deployment_id": "d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5",
  "status": "success",
  "version": 5,
  "started_at": "2025-01-08T14:32:15Z",
  "completed_at": "2025-01-08T14:35:22Z",
  "error_message": null
}

Deployment Statistics

Endpoint:

GET /api/v1/deployments/project/{project_id}/stats

Response:

{
  "project_id": "1a2b3c4d-5e6f-7g8h-9i0j-1k2l3m4n5o6p",
  "total_deployments": 47,
  "status_counts": {
    "success": 42,
    "failed": 3,
    "cancelled": 2
  },
  "success_rate": 89.4
}

Metrics:

  • total_deployments: Total number of deployments for project
  • status_counts: Breakdown by deployment status
  • success_rate: Percentage of successful deployments
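
The success rate follows directly from the status counts; a sketch (the helper name is assumed, not the platform's actual code):

```python
def success_rate(status_counts: dict) -> float:
    """Percentage of successful deployments, rounded to one decimal."""
    total = sum(status_counts.values())
    return round(100 * status_counts.get("success", 0) / total, 1) if total else 0.0

print(success_rate({"success": 42, "failed": 3, "cancelled": 2}))  # 89.4
```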

Log Persistence

Storage

Database Table: deployment_logs

CREATE TABLE deployment_logs (
    id UUID PRIMARY KEY,
    deployment_id UUID REFERENCES deployments(id) ON DELETE CASCADE,
    timestamp TIMESTAMPTZ NOT NULL,
    level VARCHAR(10) NOT NULL,  -- debug, info, warning, error
    message TEXT NOT NULL,
    step VARCHAR(50),  -- initializing, connecting, pulling_image, etc.
    data JSONB DEFAULT '{}'  -- Additional structured data
);

CREATE INDEX idx_log_deployment_time
    ON deployment_logs (deployment_id, timestamp);

Indexes:

  • idx_log_deployment_time: Fast retrieval of logs for specific deployment
  • deployment_id: Cascading delete ensures cleanup when deployment deleted

Retention Policy

Default Retention: Unlimited

Logs are retained indefinitely by default. You can configure retention policies:

Environment Variable:

LOG_RETENTION_DAYS=90  # Keep logs for 90 days

Cleanup Job:

# backend/tasks/worker.py
from datetime import UTC, datetime, timedelta

from sqlalchemy import delete

@cron("0 3 * * *")  # Run daily at 3 AM
async def cleanup_old_deployment_logs(ctx):
    """Delete deployment logs older than the retention period."""
    retention_days = settings.log_retention_days or 90
    cutoff_date = datetime.now(UTC) - timedelta(days=retention_days)

    # Delete old logs and commit the transaction
    result = await db.execute(
        delete(DeploymentLog).where(DeploymentLog.timestamp < cutoff_date)
    )
    await db.commit()
    deleted_count = result.rowcount
    logger.info(f"Deleted {deleted_count} deployment logs older than {retention_days} days")

Export Capabilities

Export Deployment Logs (Future Feature):

GET /api/v1/deployments/{id}/logs/export?format=json
GET /api/v1/deployments/{id}/logs/export?format=txt

JSON Export:

{
  "deployment_id": "d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5",
  "project_name": "MyProject",
  "environment_name": "production",
  "exported_at": "2025-01-08T15:00:00Z",
  "logs": [...]
}

Text Export:

=== Deployment Logs ===
Deployment ID: d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5
Project: MyProject
Environment: production
Exported: 2025-01-08 15:00:00 UTC

[2025-01-08 14:32:15] [INFO] Deployment created and queued
[2025-01-08 14:32:16] [INFO] Deploying MyProject (production)
[2025-01-08 14:32:16] [INFO] Instance UUID: 3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c
...
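
A client could derive the text format from the JSON export. A minimal sketch, assuming the JSON shape shown above (logs_to_text is a hypothetical helper; the Exported header line is omitted for brevity):

```python
from datetime import datetime

def logs_to_text(export: dict) -> str:
    """Render a JSON log export in the plain-text export format."""
    lines = [
        "=== Deployment Logs ===",
        f"Deployment ID: {export['deployment_id']}",
        f"Project: {export['project_name']}",
        f"Environment: {export['environment_name']}",
        "",
    ]
    for log in export["logs"]:
        # Timestamps in the export are UTC with a trailing "Z"
        ts = datetime.fromisoformat(log["timestamp"].replace("Z", "+00:00"))
        lines.append(f"[{ts:%Y-%m-%d %H:%M:%S}] [{log['level'].upper()}] {log['message']}")
    return "\n".join(lines)
```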

Performance Metrics

Deployment Duration Tracking

Every deployment records precise timing for each stage:

Database Fields:

class Deployment(Base):
    started_at = Column(DateTime(timezone=True))
    completed_at = Column(DateTime(timezone=True))
    duration_seconds = Column(Float)

Stage Timing in Logs:

step_start = time.time()
# ... perform deployment step ...
step_elapsed = time.time() - step_start
 
await self._log_step("pulling_image", f"Image ready ({step_elapsed:.1f}s)")

Total Duration Calculation:

deployment_start = time.time()
# ... complete all deployment steps ...
total_elapsed = time.time() - deployment_start
 
deployment.duration_seconds = total_elapsed
deployment.completed_at = datetime.now(UTC)
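
The stage-timing pattern above can be wrapped in a small context manager so each step reports its elapsed time automatically. A sketch (timed_stage is a hypothetical helper, not part of the OEC.SH codebase):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed_stage(name: str, log):
    """Time a deployment stage and log the elapsed seconds when it finishes."""
    start = time.time()
    try:
        yield
    finally:
        log(name, f"{name} finished ({time.time() - start:.1f}s)")

# Usage: collect (step, message) pairs instead of writing to the database
events = []
with timed_stage("pulling_image", lambda step, msg: events.append((step, msg))):
    pass  # ... perform the deployment step ...
```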

Stage Timing Breakdown

Progress API Response Includes Timing:

GET /api/v1/deployments/{id}/progress
{
  "deployment_id": "d4e5f6a7...",
  "status": "success",
  "duration_seconds": 187.3,
  "steps": [
    {
      "id": "initializing",
      "name": "Initializing",
      "status": "completed",
      "latest_message": "Configuration loaded (0.5s)",
      "logs": [
        {
          "message": "Deployment created and queued",
          "timestamp": "2025-01-08T14:32:15Z"
        }
      ]
    },
    {
      "id": "pulling_image",
      "name": "Pulling Image",
      "status": "completed",
      "latest_message": "Image ready (12.4s)",
      "logs": [...]
    },
    ...
  ]
}

Timing Analysis:

Stage                      Average Duration   Notes
Initializing               0.5s               Configuration loading
Connecting                 0.8s               SSH connection
Creating Network           0.3s               Docker network setup
Configuring DNS            1.2s               DNS record creation (if enabled)
Creating Database          3.5s               PostgreSQL container start
Cloning Platform Repos     5-10s              Depends on repo count/size
Cloning Org Repos          5-10s              Depends on repo count/size
Cloning Project Repo       4-8s               Depends on repo size
Pulling Image              10-30s             Depends on image size (cached: 2s)
Generating Config          0.5s               odoo.conf generation
Starting Container         3.2s               Docker container start
Installing Dependencies    8-15s              apt/pip package installation
Initializing Database      60-120s            Fresh database creation
Verifying DNS              10-180s            DNS propagation check (skippable)
Configuring Traefik        0.5s               Traefik label detection
Health Check               5-15s              Odoo responsiveness test
Total                      2-4 minutes        Fresh deployment
Total (Redeployment)       1-2 minutes        Cached images/dependencies
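
As a sanity check, summing the stage averages (midpoints for ranged values, DNS verification excluded as skippable) lands inside the quoted 2-4 minute window:

```python
# Midpoints of the stage durations above, in seconds (Verifying DNS excluded)
stage_seconds = {
    "initializing": 0.5, "connecting": 0.8, "creating_network": 0.3,
    "configuring_dns": 1.2, "creating_database": 3.5,
    "cloning_platform_repos": 7.5, "cloning_org_repos": 7.5,
    "cloning_project_repo": 6.0, "pulling_image": 20.0,
    "generating_config": 0.5, "starting_container": 3.2,
    "installing_dependencies": 11.5, "initializing_database": 90.0,
    "configuring_traefik": 0.5, "health_check": 10.0,
}
total = sum(stage_seconds.values())
print(f"{total:.1f}s (~{total / 60:.1f} min)")  # → 163.0s (~2.7 min)
```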

Historical Performance Trends

Query Average Deployment Duration:

SELECT
    AVG(duration_seconds) AS avg_duration,
    MIN(duration_seconds) AS fastest,
    MAX(duration_seconds) AS slowest,
    COUNT(*) FILTER (WHERE status = 'success') AS successful,
    COUNT(*) FILTER (WHERE status = 'failed') AS failed
FROM deployments
WHERE project_id = '1a2b3c4d-5e6f-7g8h-9i0j-1k2l3m4n5o6p'
    AND created_at > NOW() - INTERVAL '30 days';

Result:

{
  "avg_duration": 142.5,
  "fastest": 87.2,
  "slowest": 315.8,
  "successful": 42,
  "failed": 3
}

Performance Optimization:

  • Docker Image Caching: Cached images reduce pull time from 30s → 2s
  • Dependency Caching: Cached apt/pip packages reduce install time
  • DNS Pre-configuration: Configure DNS early to avoid SSL delays
  • Parallel Repository Cloning (Future): Clone multiple repos simultaneously

Error Diagnostics

Common Deployment Errors

1. SSH Connection Failed

Error Message:

Failed to connect to server via SSH

Logs:

[ERROR] connecting: Failed to connect to server via SSH

Causes:

  • Incorrect SSH credentials
  • Server IP unreachable
  • Firewall blocking SSH port (22)
  • SSH key passphrase incorrect

Troubleshooting:

  1. Verify server IP and SSH port in Server Settings
  2. Test SSH manually: ssh -p 22 root@165.22.65.97
  3. Check server firewall rules: ufw status
  4. Verify SSH key permissions: chmod 600 ~/.ssh/id_rsa

2. Git Clone Failed

Error Message:

Failed to clone repository: Authentication failed

Logs:

[ERROR] cloning_repo: Failed to clone repository
fatal: Authentication failed for 'https://github.com/org/private-repo.git'

Causes:

  • Repository is private and no Git connection configured
  • OAuth token expired or invalid
  • Repository URL incorrect
  • Git provider (GitHub/GitLab) down

Troubleshooting:

  1. For Private Repos: Go to Organization Settings → Git Connections → Add GitHub/GitLab connection
  2. Verify repository URL is correct
  3. Test Git clone manually: git clone https://github.com/org/repo.git
  4. Check Git provider status page

Fix for Private Repositories:

  • Navigate to: Dashboard → Settings → Git Connections
  • Click "Connect GitHub" or "Connect GitLab"
  • Authorize OEC.SH to access repositories
  • Redeploy environment

3. Docker Image Pull Failed

Error Message:

Failed to pull Docker image: Authentication required

Logs:

[ERROR] pulling_image: Failed to pull Docker image
Error response from daemon: pull access denied for registry.example.com/odoo:18.0

Causes:

  • Private registry requires authentication
  • Registry credentials incorrect
  • Image tag doesn't exist
  • Registry unreachable

Troubleshooting:

  1. For Private Registries: Portal Admin must configure registry credentials in Odoo Versions settings
  2. Verify image tag exists: docker pull odoo:18.0
  3. Check registry URL format: registry.example.com/namespace/image:tag
  4. Test registry access: docker login registry.example.com

4. Database Initialization Timeout

Error Message:

Database initialization failed: Connection timeout

Logs:

[WARNING] initializing_database: Database initialization failed (120.0s)

Causes:

  • PostgreSQL container crashed
  • Insufficient server resources (CPU/RAM)
  • Database migration taking too long
  • Port conflict on server

Troubleshooting:

  1. Check PostgreSQL container logs: docker logs {env_uuid}_db
  2. Verify server has sufficient resources: free -h, htop
  3. Increase environment resource allocation
  4. For migrations: Increase timeout in deployment code

5. SSL Certificate Provisioning Failed

Error Message:

SSL certificate request failed: DNS not propagated

Logs:

[WARNING] verifying_dns: DNS verification timed out (180.0s) - proceeding anyway
SSL may take longer to provision

Causes:

  • DNS record not created (no DNS provider configured)
  • DNS propagation delay (can take 5-15 minutes)
  • Domain points to wrong IP
  • Let's Encrypt rate limit hit (5 failures per hour)

Troubleshooting:

  1. Check DNS Resolution:
    dig staging-myproject.oecsh.com +short
    # Should return: 165.22.65.97
  2. Configure DNS Provider: Dashboard → Settings → DNS → Add Cloudflare credentials
  3. Wait for Propagation: DNS can take 5-15 minutes to propagate globally
  4. Check Let's Encrypt Logs: View Traefik logs for ACME challenge details
    docker logs traefik | grep acme
  5. Verify Traefik Labels: Ensure container has correct Traefik labels
    docker inspect {env_uuid}_odoo | grep traefik

Manual DNS Fix:

# Add DNS record manually in Cloudflare dashboard
# Record Type: A
# Name: staging-myproject
# Content: 165.22.65.97
# TTL: Auto
# Proxy: Off (DNS only)

6. Container Health Check Failed

Error Message:

Health check failed, but container is running

Logs:

[WARNING] health_check: Health check failed (15.2s), but container is running

Causes:

  • Odoo still initializing (database migration in progress)
  • High CPU load causing slow response
  • Odoo crashed after container start
  • Port 8069 not accessible

Troubleshooting:

  1. Check Container Status: docker ps | grep {env_uuid}_odoo
  2. View Container Logs: docker logs {env_uuid}_odoo
  3. Wait 2-3 Minutes: Odoo may still be initializing database
  4. Manual Health Check:
    docker exec {env_uuid}_odoo curl http://localhost:8069/web/health
    # Should return HTTP 200
  5. Check Resource Usage: docker stats {env_uuid}_odoo

Common Odoo Startup Issues:

# Check Odoo logs for errors
docker logs {env_uuid}_odoo --tail 50
 
# Common errors:
# - "Database does not exist" → Database initialization failed
# - "Module not found" → Addon path configuration issue
# - "OperationalError: FATAL: password authentication failed" → Database password mismatch

Error Log Examples

SSH Authentication Error:

{
  "id": "log-042",
  "level": "error",
  "message": "Failed to connect to server via SSH",
  "timestamp": "2025-01-08T14:33:12Z",
  "step": "connecting",
  "data": {
    "vm_ip": "165.22.65.97",
    "ssh_port": 22,
    "error_code": "AUTH_FAILED",
    "ssh_method": "password"
  }
}

Git Clone Error:

{
  "id": "log-089",
  "level": "error",
  "message": "Failed to clone repository: Authentication failed",
  "timestamp": "2025-01-08T14:34:25Z",
  "step": "cloning_repo",
  "data": {
    "git_url": "https://github.com/myorg/private-repo.git",
    "branch": "main",
    "error": "fatal: Authentication failed for 'https://github.com/myorg/private-repo.git'",
    "git_provider": "github"
  }
}

Database Timeout Error:

{
  "id": "log-132",
  "level": "error",
  "message": "Database initialization failed: Connection timeout",
  "timestamp": "2025-01-08T14:36:45Z",
  "step": "initializing_database",
  "data": {
    "db_name": "3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c",
    "elapsed_seconds": 120.0,
    "max_retries": 30,
    "last_error": "FATAL: the database system is starting up"
  }
}

Permissions

Required Permissions

View Deployment Progress:

  • Permission: project.deployments.view
  • Scope: Project level
  • Who: Project members, organization admins, portal admins

List Deployments:

  • Permission: project.deployments.list
  • Scope: Project level
  • Who: Project members, organization admins, portal admins

Create Deployment:

  • Permission: project.environments.deploy
  • Scope: Project level
  • Who: Project admins, organization admins, portal admins

Cancel Deployment:

  • Permission: project.deployments.cancel
  • Scope: Project level
  • Who: Project admins, organization admins, portal admins

View Deployment Logs:

  • Permission: project.deployments.view
  • Scope: Project level
  • Who: Project members, organization admins, portal admins

Organization vs Project Access

Organization-Level Access:

  • Organization admins can view deployments for all projects in their organization
  • Organization owners have full deployment management access

Project-Level Access:

  • Project members can view deployments for their assigned projects only
  • Project admins can manage deployments (deploy, cancel)
  • Project viewers can view deployment progress but cannot trigger deployments

Permission Hierarchy:

Portal Admin (portal.admin)
  ↓ All permissions globally
Organization Owner (org.owner)
  ↓ All permissions in organization
Organization Admin (org.admin)
  ↓ All project permissions in organization
Project Admin (project.admin)
  ↓ Deploy, cancel, view deployments for project
Project Member (project.member)
  ↓ View deployments only

API Reference

Get Deployment Details

Endpoint:

GET /api/v1/deployments/{deployment_id}

Authentication: Required (JWT token)

Response:

{
  "id": "d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5",
  "project_id": "1a2b3c4d-5e6f-7g8h-9i0j-1k2l3m4n5o6p",
  "environment_id": "3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c",
  "vm_id": "vm-uuid",
  "version": 5,
  "status": "success",
  "trigger": "manual",
  "triggered_by": "user-uuid",
  "git_commit": "a3b2c1d4e5f6g7h8",
  "git_branch": "main",
  "git_message": "Add customer portal module",
  "started_at": "2025-01-08T14:32:15Z",
  "completed_at": "2025-01-08T14:35:22Z",
  "duration_seconds": 187.3,
  "container_id": "f2e3d4c5b6a7...",
  "image_tag": "odoo:18.0",
  "error_message": null,
  "created_at": "2025-01-08T14:32:15Z",
  "updated_at": "2025-01-08T14:35:22Z"
}

Status Codes:

  • 200 OK: Deployment found
  • 404 Not Found: Deployment doesn't exist
  • 403 Forbidden: No permission to view deployment

Get Deployment Logs

Endpoint:

GET /api/v1/deployments/{deployment_id}/logs

Query Parameters:

  • level (optional): Filter by log level (debug, info, warning, error)
  • skip (default: 0): Pagination offset
  • limit (default: 500): Max logs to return

Example:

GET /api/v1/deployments/{id}/logs?level=error&limit=100

Response:

{
  "deployment_id": "d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5",
  "logs": [
    {
      "id": "log-001",
      "level": "info",
      "message": "Deployment created and queued",
      "timestamp": "2025-01-08T14:32:15Z",
      "data": null
    }
  ]
}

Status Codes:

  • 200 OK: Logs retrieved
  • 404 Not Found: Deployment doesn't exist
  • 403 Forbidden: No permission to view logs

Get Deployment Progress

Endpoint:

GET /api/v1/deployments/{deployment_id}/progress

Response:

{
  "deployment_id": "d4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5",
  "status": "running",
  "progress_percent": 72,
  "steps": [
    {
      "id": "initializing",
      "name": "Initializing",
      "description": "Preparing deployment configuration",
      "status": "completed",
      "logs": [
        {
          "message": "Deployment created and queued",
          "timestamp": "2025-01-08T14:32:15Z",
          "level": "info"
        }
      ],
      "latest_message": "Configuration loaded (0.5s)"
    },
    {
      "id": "pulling_image",
      "name": "Pulling Image",
      "description": "Downloading Odoo Docker image",
      "status": "running",
      "logs": [
        {
          "message": "Pulling Odoo 18.0 image...",
          "timestamp": "2025-01-08T14:33:48Z",
          "level": "info"
        }
      ],
      "latest_message": "Pulling Odoo 18.0 image..."
    },
    {
      "id": "starting_container",
      "name": "Starting Container",
      "description": "Launching Odoo container",
      "status": "pending",
      "logs": [],
      "latest_message": null
    }
  ],
  "current_step": "pulling_image",
  "error_message": null,
  "started_at": "2025-01-08T14:32:15Z",
  "completed_at": null,
  "duration_seconds": null
}

Status Codes:

  • 200 OK: Progress retrieved
  • 404 Not Found: Deployment doesn't exist
  • 403 Forbidden: No permission to view progress
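
One plausible way the progress_percent value could be derived from the steps array (a sketch; the server may weight stages differently):

```python
def progress_percent(steps: list[dict]) -> int:
    """Percentage of pipeline steps that have completed."""
    if not steps:
        return 0
    done = sum(1 for step in steps if step["status"] == "completed")
    return round(100 * done / len(steps))
```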

SSE Event Stream

Endpoint:

GET /api/v1/events/stream?token={jwt_token}

Authentication: JWT token as query parameter (required for EventSource API)

Response: Server-Sent Events stream

event: connected
data: {"status": "connected"}

event: deployment_progress
data: {"type":"deployment_progress","data":{"deployment_id":"d4e5f6a7...","step":"pulling_image","message":"Image ready (12.4s)"}}

: keepalive

event: deployment_progress
data: {"type":"deployment_progress","data":{"deployment_id":"d4e5f6a7...","step":"starting_container","message":"Container started (3.2s)"}}

Client Implementation:

const token = getAccessToken();
const eventSource = new EventSource(
  `https://api.oecsh.com/api/v1/events/stream?token=${token}`
);
 
eventSource.addEventListener("deployment_progress", (e) => {
  const data = JSON.parse(e.data);
  console.log("Deployment update:", data);
});
 
eventSource.onerror = () => {
  eventSource.close();
  // Reconnect with exponential backoff
};
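
For the reconnect noted in the onerror handler, exponential backoff with jitter is the usual approach. A sketch of the delay calculation (backoff_delay is a hypothetical helper):

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with jitter: base * 2^attempt, capped, then jittered."""
    return min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)
```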

List Deployments

Endpoint:

GET /api/v1/deployments

Query Parameters:

  • project_id (optional): Filter by project
  • status (optional): Filter by status (pending, running, success, failed)
  • skip (default: 0): Pagination offset
  • limit (default: 50): Max results per page

Example:

GET /api/v1/deployments?project_id={uuid}&status=failed&limit=20

Response:

[
  {
    "id": "d4e5f6a7...",
    "status": "success",
    "...": "..."
  }
]

Cancel Deployment

Endpoint:

POST /api/v1/deployments/{deployment_id}/cancel

Permission Required: project.deployments.cancel

Response:

{
  "message": "Deployment cancelled"
}

Status Codes:

  • 200 OK: Deployment cancelled
  • 400 Bad Request: Cannot cancel (already completed/failed)
  • 403 Forbidden: No permission
  • 404 Not Found: Deployment doesn't exist

Troubleshooting

SSE Connection Issues

Problem: SSE stream not delivering updates

Symptoms:

  • Deployment progress stuck on "Connecting..."
  • No real-time log updates
  • Console error: EventSource failed

Causes:

  1. JWT token expired
  2. Nginx buffering not disabled
  3. Firewall blocking SSE endpoint
  4. CORS configuration issue
  5. Redis connection failed

Solutions:

1. Check JWT Token:

const token = getAccessToken();
console.log("Token valid?", !!token);
// If expired, refresh token via /api/v1/auth/refresh

2. Verify Nginx Configuration:

# nginx/conf.d/devsh.openeducat.ai.conf
location /api/v1/events/ {
    proxy_buffering off;  # MUST be disabled
    proxy_cache off;
    proxy_read_timeout 86400s;
    add_header X-Accel-Buffering "no";
}

Test:

curl -N -H "Accept: text/event-stream" \
  "https://api.oecsh.com/api/v1/events/stream?token={token}"
# Should see: event: connected

3. Check Browser Console:

// Monitor SSE connection state
console.log("EventSource state:", eventSource.readyState);
// 0 = CONNECTING, 1 = OPEN, 2 = CLOSED

4. Test Redis Connection:

# On backend server
redis-cli ping
# Should return: PONG
 
# Check pub/sub
redis-cli
> SUBSCRIBE sse:events
> # Wait for messages during deployment

5. Restart Services:

# Restart backend (to reconnect Redis)
docker compose -f docker-compose.prod.yml restart backend
 
# Restart nginx (to apply config changes)
sudo systemctl restart nginx

Missing Logs

Problem: Deployment logs incomplete or missing

Symptoms:

  • Deployment shows "success" but no logs
  • Some stages missing from progress
  • Empty logs array in API response

Causes:

  1. Database connection lost during deployment
  2. Deployment record created but task failed to start
  3. Logs not committed to database (transaction rollback)

Solutions:

1. Check Deployment Status:

SELECT id, status, started_at, completed_at, error_message
FROM deployments
WHERE id = 'd4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5';

2. Verify Logs in Database:

SELECT COUNT(*), level
FROM deployment_logs
WHERE deployment_id = 'd4e5f6a7-b8c9-4d3e-a2b1-c0d9e8f7a6b5'
GROUP BY level;

If count is 0, logs were never written (database transaction issue).

3. Check Docker Container Logs:

# View OdooDeployer output
docker logs backend --tail 200 | grep deployment_id

4. Check ARQ Worker Logs:

# Deployment runs as background task via ARQ
docker logs worker --tail 200 | grep deployment

5. Database Transaction Isolation: Ensure deployment logs are committed immediately:

# backend/services/odoo_deployer.py
log_entry = DeploymentLog(...)
self.db.add(log_entry)
await self.db.commit()  # Commit immediately, don't wait for end

Log Streaming Delays

Problem: Real-time logs delayed by 30-60 seconds

Symptoms:

  • SSE events arrive in bursts
  • Logs appear long after step completes
  • Progress bar jumps instead of smooth updates

Cause: Nginx/proxy buffering enabled

Solution:

Disable Nginx Buffering:

location /api/v1/events/ {
    proxy_buffering off;  # Critical!
    proxy_cache off;
    chunked_transfer_encoding off;
    add_header X-Accel-Buffering "no";
}

Test Buffering:

# Should see events immediately (no delay)
curl -N -H "Accept: text/event-stream" \
  "https://api.oecsh.com/api/v1/events/stream?token={token}"
 
# If delayed, check nginx config
sudo nginx -t
sudo systemctl reload nginx

Check Headers:

curl -I "https://api.oecsh.com/api/v1/events/stream?token={token}"
# Should include:
# X-Accel-Buffering: no
# Cache-Control: no-cache

Deployment Progress Stuck

Problem: Deployment stuck at specific stage for >5 minutes

Symptoms:

  • Progress shows "running" but no updates
  • Same step for extended period
  • No new logs appearing

Causes:

  1. SSH connection lost mid-deployment
  2. Docker daemon unresponsive
  3. Long-running operation (database restore, large Git clone)
  4. Background task crashed

Solutions:

1. Check Environment Status:

SELECT id, name, status, updated_at
FROM project_environments
WHERE id = '3f4a5e2b-9c1d-4e8a-b2c3-7d6e5f4a3b2c';
 
-- If status = "deploying" for >10 minutes, deployment likely stuck

2. Check Deployment Step:

GET /api/v1/deployments/{id}/progress

Look at current_step and latest log message.

3. Check ARQ Worker:

# View background task queue
docker logs worker --tail 50
 
# Check if task is still running
docker exec worker ps aux | grep deployment

4. Check Server Resources:

# SSH into server
ssh root@165.22.65.97
 
# Check CPU/RAM
htop
 
# Check disk space
df -h
 
# Check Docker daemon
systemctl status docker

5. Manually Inspect Deployment:

# On target server
docker ps | grep {env_uuid}
 
# Check container logs
docker logs {env_uuid}_odoo --tail 50
docker logs {env_uuid}_db --tail 50
 
# Check if Git clone in progress
ps aux | grep git

6. Cancel and Retry: If stuck for >10 minutes:

POST /api/v1/deployments/{id}/cancel

Then trigger new deployment:

POST /api/v1/deployments


Summary

OEC.SH provides comprehensive deployment monitoring through:

  • 16-Stage Pipeline: Granular visibility into every deployment phase
  • Real-Time Streaming: SSE-powered live log updates with Redis pub/sub
  • Historical Logs: Persistent storage with filtering and search
  • Performance Metrics: Stage timing breakdown and duration tracking
  • Error Diagnostics: Detailed error messages with troubleshooting context
  • Scalable Architecture: Redis-backed SSE supports multiple backend workers

Monitor deployments in real-time, troubleshoot issues with detailed logs, and track performance trends with comprehensive metrics.