Database Management
Feature ID: DB-001
Category: Database & Storage
Required Permission: environment.manage (read replicas require environment.replicas.*)
API Endpoint: Base /api/v1/environments/{id}/
Overview
OEC.SH provides enterprise-grade PostgreSQL database management for Odoo environments with automatic optimization, connection pooling, and high-availability read replicas. The platform uses PostgreSQL 15 with PgBouncer for connection pooling and streaming replication for read replicas.
What's included:
- PostgreSQL 15 - Containerized database with automatic optimization
- PgBouncer Connection Pooling - Efficient connection management (optional)
- Read Replicas - PostgreSQL streaming replication for Odoo 18.0 and 19.0
- Automated Backups - pg_dump integration with 6 cloud storage providers
- Real-time Monitoring - Replication lag tracking with SSE updates
- Resource Allocation - Configurable CPU, RAM, and disk quotas
Architecture:
- Container-based: Isolated PostgreSQL containers per environment
- Network isolation: Private Docker networks per environment
- Automatic failover: Health monitoring with alerts
- Quota integration: Replica resources counted toward organization limits
Database Architecture
Container Naming Convention
Each environment has dedicated database containers with predictable naming:
| Container | Name Pattern | Purpose | Port |
|---|---|---|---|
| Primary PostgreSQL | postgres-primary-{env_id} | Main database server | Internal only |
| Replica PostgreSQL | postgres-replica-{env_id} | Read-only replica (Odoo 18/19 only) | Internal only |
| PgBouncer Primary | pgbouncer-{env_id} | Connection pooler for primary | 6432 |
| PgBouncer Replica | pgbouncer-replica-{env_id} | Connection pooler for replica | 6433 |
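For scripting, the naming convention above can be captured in a small helper (hypothetical; the platform derives these names server-side):

```python
def container_names(env_id: str, replica: bool = False) -> dict:
    """Return the documented container names for an environment.

    Illustrative helper only; not part of the platform codebase.
    """
    names = {
        "primary": f"postgres-primary-{env_id}",
        "pgbouncer_primary": f"pgbouncer-{env_id}",
    }
    if replica:
        names["replica"] = f"postgres-replica-{env_id}"
        names["pgbouncer_replica"] = f"pgbouncer-replica-{env_id}"
    return names
```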
Example: For environment abc123-def456:
postgres-primary-abc123-def456 # Primary database
postgres-replica-abc123-def456 # Read replica (if enabled)
pgbouncer-abc123-def456 # Primary connection pooler
pgbouncer-replica-abc123-def456 # Replica connection pooler
Network Architecture
┌─────────────────────────────────────────────────────────────┐
│ Environment Network: paasportal_net_{env_id} │
│ │
│ ┌──────────────┐ ┌──────────────────┐ │
│ │ Odoo │─────▶│ PgBouncer :6432 │ │
│ │ Container │ │ (Primary) │ │
│ └──────────────┘ └──────────────────┘ │
│ │ │ │
│ │ ┌────────▼─────────┐ │
│ │ │ PostgreSQL │ │
│ │ │ Primary │ │
│ │ └────────┬─────────┘ │
│ │ │ │
│ │ │ Streaming │
│ │ │ Replication │
│ │ ▼ │
│ │ ┌──────────────────┐ │
│ │ │ PostgreSQL │ │
│ │ │ Replica │ │
│ │ └────────▲─────────┘ │
│ │ │ │
│ │ ┌────────────────┘ │
│ │ │ │
│ └─────▶│ PgBouncer :6433 │ │
│ │ (Replica) │ │
│ └──────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Database Access
Connection Strings
Primary Database (read-write):
# With PgBouncer (recommended)
postgresql://odoo:{password}@pgbouncer-{env_id}:6432/{db_name}
# Direct connection (without PgBouncer)
postgresql://odoo:{password}@postgres-primary-{env_id}:5432/{db_name}
Read Replica (read-only, Odoo 18/19 only):
# With PgBouncer (recommended)
postgresql://odoo:{password}@pgbouncer-replica-{env_id}:6433/{db_name}
# Direct connection (without PgBouncer)
postgresql://odoo:{password}@postgres-replica-{env_id}:5432/{db_name}
Credentials
Database credentials are auto-generated during deployment:
| Field | Value | Location |
|---|---|---|
| Username | odoo | Fixed |
| Password | Auto-generated (24 chars) | Retrieved from container env: POSTGRES_PASSWORD |
| Database Name | Environment ID | Fixed: {environment_id} |
| Port | 5432 (PostgreSQL), 6432/6433 (PgBouncer) | Container ports |
Retrieve password:
# From container environment variable
docker exec postgres-primary-{env_id} printenv POSTGRES_PASSWORD
# From Odoo config (fallback)
grep -oP 'db_password = \K.*' /opt/paasportal/{env_id}/odoo.conf
Ports
| Service | Port | Access | Purpose |
|---|---|---|---|
| PostgreSQL Primary | 5432 | Internal | Direct database connection |
| PostgreSQL Replica | 5432 | Internal | Direct read-only connection |
| PgBouncer Primary | 6432 | Internal | Pooled primary connection |
| PgBouncer Replica | 6433 | Internal | Pooled replica connection |
Note: All database ports are internal only (no external exposure). Access is limited to containers within the same environment network.
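The connection-string patterns above can be assembled programmatically. A sketch (the `build_dsn` helper and its defaults are illustrative, not part of the platform API):

```python
def build_dsn(env_id: str, password: str, db_name: str,
              use_pgbouncer: bool = True, replica: bool = False) -> str:
    """Assemble the documented connection string for an environment.

    Follows the host/port conventions in the tables above.
    """
    if use_pgbouncer:
        host = f"pgbouncer-replica-{env_id}" if replica else f"pgbouncer-{env_id}"
        port = 6433 if replica else 6432
    else:
        host = (f"postgres-replica-{env_id}" if replica
                else f"postgres-primary-{env_id}")
        port = 5432
    return f"postgresql://odoo:{password}@{host}:{port}/{db_name}"
```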
PostgreSQL Read Replicas
Overview
PostgreSQL read replicas provide horizontal scalability for read-heavy Odoo workloads using PostgreSQL streaming replication. Replicas are read-only and continuously sync from the primary database.
Available for: Odoo 18.0 and 19.0 ONLY
Use cases:
- Offload reporting queries from production database
- Improve dashboard performance with parallel reads
- Support analytics and BI tools without impacting primary
- Provide read-only access for external integrations
- Enable high-availability architecture
Key features:
- ✅ Real-time streaming replication (near-zero lag)
- ✅ Automatic health monitoring every 2 minutes
- ✅ Lag alerts (Warning: 50MB/15s, Critical: 100MB/30s)
- ✅ SSE events for real-time UI updates
- ✅ Automatic quota integration (30% CPU/RAM, 100% disk)
- ✅ One-click rebuild from primary
Version Requirements
Supported Versions: Odoo 18.0 and 19.0 ONLY
| Odoo Version | Read Replica Support | Reason |
|---|---|---|
| 18.0 | ✅ Supported | Tested and validated |
| 19.0 | ✅ Supported | Tested and validated |
| 17.0 or earlier | ❌ Not Supported | Not tested, compatibility unknown |
| Saas versions | ❌ Not Supported | Version-specific limitations |
Version validation:
- Backend: validate_replica_support() in /backend/utils/replica_validation.py
- Frontend: isReplicaSupported() in /frontend/src/lib/replica-validation.ts
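The version gate itself is simple enough to sketch (illustrative; the real checks live in the files listed above):

```python
# Versions taken from the documentation; the authoritative list lives in
# /backend/utils/replica_validation.py.
SUPPORTED_REPLICA_VERSIONS = {"18.0", "19.0"}

def is_replica_supported(odoo_version: str) -> bool:
    """Return True if read replicas are available for this Odoo version."""
    return odoo_version in SUPPORTED_REPLICA_VERSIONS
```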
Attempting to deploy replica on unsupported version:
{
"error": "unsupported_odoo_version",
"message": "Read replicas are only supported for Odoo versions 18.0, 19.0. Your project is using Odoo 17.0.",
"supported_versions": ["18.0", "19.0"],
"current_version": "17.0",
"hint": "Please upgrade to Odoo 18.0 or 19.0 to use read replica features."
}
Resource Allocation
Replicas consume additional resources counted toward organization quota:
| Resource | Primary | Replica | Total Consumed |
|---|---|---|---|
| CPU | 2.0 cores | 0.6 cores (30%) | 2.6 cores |
| RAM | 4096 MB | 1229 MB (30%) | 5325 MB |
| Disk | 20 GB | 20 GB (100%) | 40 GB |
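The allocation rule in the table can be expressed as a small helper (a sketch; QuotaService performs the real validation):

```python
def replica_quota(primary_cpu: float, primary_ram_mb: int,
                  primary_disk_gb: int) -> dict:
    """Additional quota a replica consumes: 30% CPU/RAM, 100% disk."""
    return {
        "cpu": round(primary_cpu * 0.30, 2),
        "ram_mb": round(primary_ram_mb * 0.30),
        "disk_gb": primary_disk_gb,  # full data copy
    }
```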
Quota check before deployment:
# QuotaService validates organization has capacity
replica_cpu = primary_cpu * 0.30 # 30% of primary
replica_ram = primary_ram * 0.30 # 30% of primary
replica_disk = primary_disk * 1.0 # 100% of primary (full data copy)
# Example: 2 CPU, 4GB RAM, 20GB disk environment
# Total quota needed: 2.6 CPU, 5.2GB RAM, 40GB disk
Quota exceeded example:
{
"detail": "Cannot deploy replica - organization CPU quota exceeded (10.6 > 10.0 cores)"
}
Deploy Replica
Step 1: Verify Prerequisites
Before deploying:
- ✅ Environment running Odoo 18.0 or 19.0
- ✅ Environment status is running or stopped (not deploying or pending)
- ✅ Organization has sufficient quota (30% CPU/RAM, 100% disk)
- ✅ User has environment.replicas.create permission (future)
Check from UI:
- Navigate to Environment Details → PostgreSQL Optimization panel
- "Read Replica" section shows "Deploy Replica" button for Odoo 18/19
- Unsupported versions show: "Read replicas are only available for Odoo 18.0 and 19.0"
Step 2: Deploy via API
Endpoint: POST /api/v1/environments/{environment_id}/replicas
Request body:
{
"force_rebuild": false // Optional: Force rebuild if already deployed
}
Success response (200 OK):
{
"success": true,
"message": "Read replica deployed successfully",
"replica_deployed": true,
"replica_status": "online"
}
Error responses:
| Status | Error | Reason |
|---|---|---|
| 400 | unsupported_odoo_version | Project not on Odoo 18.0 or 19.0 |
| 400 | Invalid environment status | Environment is deploying or pending |
| 403 | CPU quota exceeded | Organization out of CPU quota |
| 403 | RAM quota exceeded | Organization out of RAM quota |
| 404 | Environment not found | Invalid environment ID |
Example error (unsupported version):
{
"detail": {
"error": "unsupported_odoo_version",
"message": "Read replicas are only supported for Odoo versions 18.0, 19.0. Your project is using Odoo 17.0.",
"supported_versions": ["18.0", "19.0"],
"current_version": "17.0",
"hint": "Please upgrade to Odoo 18.0 or 19.0 to use read replica features."
}
}
Step 3: Verify Deployment
Check replica status:
GET /api/v1/environments/{environment_id}/replicas/health
Response:
{
"replica_deployed": true,
"replica_status": "online",
"replica_lag_bytes": 0,
"replica_lag_seconds": 0.0,
"replica_last_check": "2025-12-11T14:30:00Z",
"replica_error_message": null
}
Container verification (via SSH):
# Check replica container exists
docker ps | grep postgres-replica-{env_id}
# Verify replication status from primary
docker exec postgres-primary-{env_id} \
psql -U odoo -d postgres -c "SELECT * FROM pg_stat_replication;"
Expected replication output:
application_name | state | sent_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag
------------------+----------+-----------+-----------+------------+-----------+-----------+------------
replica          | streaming| 0/3000000 | 0/3000000 | 0/3000000  | 00:00:00  | 00:00:00  | 00:00:00
Replica Health Monitoring
Monitoring Architecture
Automatic health checks run every 2 minutes via ARQ background worker:
# /backend/tasks/worker.py
cron(
    check_replica_health_task,
    minute={0, 2, 4, 6, 8, ..., 58},  # Every 2 minutes
)
What's monitored:
- Replication State - streaming, catchup, startup, or stopped
- Lag in Bytes - pg_wal_lsn_diff(sent_lsn, flush_lsn) from pg_stat_replication
- Lag in Seconds - write_lag from pg_stat_replication
- Health Status - Derived: online, lagging, offline, or error
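A sketch of how the monitor could assemble the lag check (the exact query text and psql flags here are assumptions; the real query lives in /backend/services/replication_monitor.py):

```python
# Hypothetical lag query built from the pg_stat_replication columns
# named in the list above.
LAG_QUERY = (
    "SELECT state, "
    "pg_wal_lsn_diff(sent_lsn, flush_lsn) AS lag_bytes, "
    "EXTRACT(EPOCH FROM write_lag) AS lag_seconds "
    "FROM pg_stat_replication;"
)

def lag_check_command(env_id: str) -> list[str]:
    """Build the docker exec invocation that runs the lag query on the primary."""
    return [
        "docker", "exec", f"postgres-primary-{env_id}",
        "psql", "-U", "odoo", "-d", "postgres", "-t", "-A", "-c", LAG_QUERY,
    ]
```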
Health Status Definitions
| Status | Condition | Description | Action Required |
|---|---|---|---|
| online | Replication streaming, lag < thresholds | Replica healthy and syncing | None |
| lagging | Lag > 50MB or > 15s (warning) | Replica falling behind | Monitor, consider rebuild if critical |
| lagging | Lag > 100MB or > 30s (critical) | Replica critically behind | Rebuild recommended |
| offline | Replication state not streaming | Replica disconnected | Check network/container, rebuild |
| error | SSH/query failure | Unable to check health | Check SSH access, container status |
Lag Thresholds
Warning Thresholds (triggers lagging status):
LAG_WARNING_BYTES = 50 * 1024 * 1024 # 50 MB
LAG_WARNING_SECONDS = 15 # 15 seconds
Critical Thresholds (triggers lagging status + alerts):
LAG_CRITICAL_BYTES = 100 * 1024 * 1024 # 100 MB
LAG_CRITICAL_SECONDS = 30 # 30 seconds
Health determination logic:
if state != "streaming":
    status = "offline"
elif lag_bytes > LAG_CRITICAL_BYTES or lag_seconds > LAG_CRITICAL_SECONDS:
    status = "lagging"  # Critical
elif lag_bytes > LAG_WARNING_BYTES or lag_seconds > LAG_WARNING_SECONDS:
    status = "lagging"  # Warning
else:
    status = "online"
Real-time Updates via SSE
Event types:
1. replica.health_updated
Emitted on every health check (every 2 minutes) with updated metrics:
// Frontend: useSSEEvent hook
useSSEEvent('replica.health_updated', (event) => {
  if (event.data.environment_id === environmentId) {
    // Update UI with latest health metrics
    setReplicaHealth({
      status: event.data.status,
      lag_bytes: event.data.lag_bytes,
      lag_seconds: event.data.lag_seconds,
      last_check: event.data.last_check,
    });
  }
});
Event payload:
{
"event_type": "replica.health_updated",
"data": {
"environment_id": "abc123-def456",
"status": "online",
"lag_bytes": 2048,
"lag_seconds": 0.5,
"last_check": "2025-12-11T14:32:00Z"
}
}
2. replica.lag_alert
Emitted when lag exceeds warning or critical thresholds:
{
"event_type": "replica.lag_alert",
"data": {
"environment_id": "abc123-def456",
"severity": "critical",
"lag_bytes": 105906176,
"lag_seconds": 35.2,
"threshold_bytes": 104857600,
"threshold_seconds": 30
}
}
Severity levels:
"warning": Lag > 50MB or > 15s"critical": Lag > 100MB or > 30s
Get Current Health Status
Endpoint: GET /api/v1/environments/{environment_id}/replicas/health
Response (replica deployed and healthy):
{
"replica_deployed": true,
"replica_status": "online",
"replica_lag_bytes": 1024,
"replica_lag_seconds": 0.2,
"replica_last_check": "2025-12-11T14:30:00Z",
"replica_error_message": null
}
Response (replica not deployed):
{
"replica_deployed": false,
"replica_status": "offline",
"replica_lag_bytes": null,
"replica_lag_seconds": null,
"replica_last_check": null,
"replica_error_message": null
}
Response (replica in error state):
{
"replica_deployed": true,
"replica_status": "error",
"replica_lag_bytes": null,
"replica_lag_seconds": null,
"replica_last_check": "2025-12-11T14:30:00Z",
"replica_error_message": "SSH error: Connection timeout"
}
Frontend Health Widget
ReplicaHealthWidget component displays real-time health:
Location: /frontend/src/components/environments/ReplicaHealthWidget.tsx
Features:
- Real-time updates via SSE (replica.health_updated)
- Auto-refresh every 30 seconds
- Color-coded status badges
- Lag metrics (bytes and seconds)
- Last check timestamp
Status badges:
- 🟢 Online: Green badge, lag within limits
- 🟡 Lagging: Yellow badge, lag > warning threshold
- ⚫ Offline: Gray badge, replication not streaming
- 🔴 Error: Red badge, health check failed
Usage:
import { ReplicaHealthWidget } from '@/components/environments/ReplicaHealthWidget';
<ReplicaHealthWidget
  projectId={projectId}
  environmentId={environmentId}
  replicaDeployed={environment.replica_deployed}
/>
Auto-hides when:
- replicaDeployed is false
- Odoo version is not 18.0 or 19.0 (via isReplicaSupported())
Replica Rebuild
When to Rebuild
Rebuild replica from primary when:
- ⚠️ Replication lag is critical (> 100MB or > 30s)
- ⚠️ Replica status is offline or error
- ⚠️ Data corruption suspected
- ⚠️ Network interruption caused replication break
- ℹ️ Manual rebuild for testing/maintenance
Rebuild process:
- Stop existing replica container
- Clear replica data directory
- Create fresh base backup from primary (pg_basebackup)
- Restart replica with replication configuration
- Verify streaming replication is working
- Update health metrics to online
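The base-backup step typically boils down to a pg_basebackup invocation; a hedged sketch (the flags shown are assumptions, not taken from the platform code):

```python
def basebackup_command(env_id: str) -> str:
    """Sketch of a pg_basebackup call to reseed the replica from the primary.

    -R writes replication config, -X stream includes WAL, -P shows progress.
    Flags are illustrative; the platform's actual invocation may differ.
    """
    primary = f"postgres-primary-{env_id}"
    return (f"pg_basebackup -h {primary} -p 5432 -U odoo "
            "-D /var/lib/postgresql/data -R -X stream -P")
```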
Rebuild via API
Endpoint: POST /api/v1/environments/{environment_id}/replicas/rebuild
Request body:
{
"force": false // Optional: Rebuild even if replica is healthy
}
Success response (200 OK):
{
"success": true,
"message": "Read replica rebuilt successfully from primary",
"replica_deployed": true,
"replica_status": "online"
}
Smart rebuild logic:
If force: false (default), the rebuild is skipped when the replica is healthy:
{
"success": true,
"message": "Replica is healthy. Use force=true to rebuild anyway.",
"replica_deployed": true,
"replica_status": "online"
}
Healthy criteria:
- Replication status is online
- Lag < 10 MB
- Lag < 60 seconds
Error responses:
| Status | Error | Reason |
|---|---|---|
| 400 | unsupported_odoo_version | Not Odoo 18.0 or 19.0 |
| 400 | No replica deployed | Must deploy replica first |
| 404 | Environment not found | Invalid environment ID |
Rebuild via UI
Location: Environment Details → PostgreSQL Optimization → Read Replica panel
Steps:
- Click "Rebuild Replica" button
- Confirm rebuild action (warns about temporary downtime)
- Monitor rebuild progress (typically 1-5 minutes depending on database size)
- Health widget shows online status when complete
During rebuild:
- Replica is temporarily unavailable (read queries fail)
- Primary database continues normal operation
- No data loss occurs (full resync from primary)
Remove Replica
When to Remove
Remove replica when:
- No longer needed (reducing costs)
- Switching to different high-availability strategy
- Troubleshooting replication issues
- Freeing up organization quota for other environments
What happens on removal:
- Replica container is stopped and removed
- Replica data is deleted (primary unaffected)
- Quota is released (30% CPU/RAM, 100% disk)
- Database fields reset: replica_deployed = false, replica_status = "offline"
Remove via API
Endpoint: DELETE /api/v1/environments/{environment_id}/replicas
No request body required
Success response (200 OK):
{
"success": true,
"message": "Read replica removed successfully",
"replica_deployed": false,
"replica_status": "offline"
}
Error responses:
| Status | Error | Reason |
|---|---|---|
| 400 | unsupported_odoo_version | Not Odoo 18.0 or 19.0 |
| 400 | No replica deployed | Replica not currently deployed |
| 404 | Environment not found | Invalid environment ID |
Remove via UI
Location: Environment Details → PostgreSQL Optimization → Read Replica panel
Steps:
- Click "Remove Replica" button
- Confirm removal action (warns replica will be deleted)
- Replica container removed (typically instant)
- Quota released and available for other resources
Safety notes:
- ✅ Primary database is never affected by replica removal
- ✅ Removal can be reversed by re-deploying replica
- ✅ No data loss on primary (replica is read-only copy)
- ⚠️ All read queries to replica will fail after removal
PgBouncer Configuration
Overview
PgBouncer is a lightweight connection pooler for PostgreSQL that reduces connection overhead and improves performance for high-traffic Odoo environments.
Benefits:
- ✅ Reduced connection overhead (connection reuse)
- ✅ Protection against connection exhaustion
- ✅ Lower memory usage on PostgreSQL server
- ✅ Better performance for web/API traffic patterns
- ✅ Separate pooling for primary and replica
When to enable:
- Production environments with > 50 concurrent users
- Environments experiencing connection limit errors
- High-traffic e-commerce or SaaS deployments
- Environments with frequent connection spikes
Pool Size Calculation
Automatic pool size based on environment RAM:
| RAM (MB) | Pool Size | Reasoning |
|---|---|---|
| ≤ 2048 | 20 | Small environment, conservative pooling |
| 2049 - 4096 | 40 | Medium environment, moderate pooling |
| 4097 - 8192 | 80 | Large environment, aggressive pooling |
| > 8192 | 100 | Enterprise environment, maximum pooling |
Formula:
def calculate_pgbouncer_pool_size(ram_mb: int) -> int:
    if ram_mb <= 2048:
        return 20
    elif ram_mb <= 4096:
        return 40
    elif ram_mb <= 8192:
        return 80
    else:
        return 100
Configuration Files
pgbouncer.ini (primary):
[databases]
{db_name} = host=postgres-primary-{env_id} port=5432 dbname={db_name}
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = {pool_size}
default_pool_size = {pool_size}
reserve_pool_size = 5
reserve_pool_timeout = 3
log_connections = 0
log_disconnections = 0
pgbouncer.ini (replica):
[databases]
{db_name} = host=postgres-replica-{env_id} port=5432 dbname={db_name}
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6433 # Different port for replica
# ... (rest same as primary)
userlist.txt:
"odoo" "{password_plaintext}"Ports
| PgBouncer Instance | Port | Target | Purpose |
|---|---|---|---|
| Primary | 6432 | postgres-primary-{env_id}:5432 | Pooled primary connection |
| Replica | 6433 | postgres-replica-{env_id}:5432 | Pooled replica connection |
Odoo configuration (with PgBouncer):
# /opt/paasportal/{env_id}/odoo.conf
db_host = pgbouncer-{env_id}
db_port = 6432
db_user = odoo
db_password = {auto_generated}
db_name = {db_name}
Enable/Disable PgBouncer
Via Environment Settings:
- Field: pgbouncer_pool_size
- null or 0 = PgBouncer disabled
- > 0 (e.g., 40) = PgBouncer enabled with specified pool size
Backend logic (DatabaseOptimizerService):
# User-controlled setting
if environment.pgbouncer_pool_size and environment.pgbouncer_pool_size > 0:
    enable_pgbouncer = True
    pool_size = environment.pgbouncer_pool_size
else:
    enable_pgbouncer = False
    pool_size = None
Database port selection:
def get_database_port(pgbouncer_pool_size: int | None) -> int:
    if pgbouncer_pool_size and pgbouncer_pool_size > 0:
        return 6432  # PgBouncer listen port
    else:
        return 5432  # Direct PostgreSQL port
Database Backup & Restore
Backup Process
Automated backups use pg_dump to create SQL dumps:
Backup components:
- PostgreSQL dump - pg_dump -U odoo -d {db_name} -Fp (plain-text format)
- Filestore archive - tar -czf filestore.tar.gz /var/lib/odoo/filestore
- Manifest metadata - JSON with checksums, timestamps, Odoo version
Backup execution (via SSH):
# Dump database
docker exec postgres-primary-{env_id} \
pg_dump -U odoo -d {db_name} -Fp > /tmp/dump.sql
# Archive filestore
docker exec {odoo_container} \
tar -czf /tmp/filestore.tar.gz -C /var/lib/odoo filestore
# Create ZIP archive
zip backup.zip dump.sql filestore.tar.gz manifest.json
Storage providers:
- AWS S3
- Cloudflare R2
- Backblaze B2
- MinIO (self-hosted)
- SFTP
- FTP
For detailed backup instructions: See Create Manual Backup
Restore Process
Point-in-time recovery from backups:
Restore steps:
- Download backup ZIP from cloud storage
- Extract dump.sql and filestore.tar.gz
- Stop target environment containers
- Drop existing database: DROP DATABASE "{db_name}"
- Create fresh database: CREATE DATABASE "{db_name}" OWNER odoo
- Restore SQL dump: psql -U odoo -d {db_name} < dump.sql
- Restore filestore: tar -xzf filestore.tar.gz -C /var/lib/odoo
- Restart environment containers
- Verify database integrity
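The drop/create/restore steps above can be expressed as shell commands (a sketch; the platform executes the restore server-side over SSH):

```python
def restore_commands(env_id: str, db_name: str) -> list[str]:
    """Shell equivalents of the drop/create/restore steps.

    Illustrative only; not the platform's actual restore code.
    """
    pg = f"postgres-primary-{env_id}"
    return [
        f"docker exec {pg} psql -U odoo -d postgres "
        f"-c 'DROP DATABASE \"{db_name}\"'",
        f"docker exec {pg} psql -U odoo -d postgres "
        f"-c 'CREATE DATABASE \"{db_name}\" OWNER odoo'",
        f"docker exec -i {pg} psql -U odoo -d {db_name} < dump.sql",
    ]
```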
Restore modes:
- Same environment - Rollback to previous state
- Different environment - Clone production to staging
- New environment - Create fresh environment from backup
- Cross-server - Migrate to different server
For detailed restore instructions: See Restore from Backup
Backup Verification
Checksum validation ensures backup integrity:
manifest.json:
{
"environment_id": "abc123-def456",
"backup_id": "def789-ghi012",
"created_at": "2025-12-11T14:00:00Z",
"odoo_version": "18.0",
"database_size_bytes": 524288000,
"filestore_size_bytes": 104857600,
"checksums": {
"dump.sql": "sha256:a1b2c3d4...",
"filestore.tar.gz": "sha256:e5f6g7h8..."
}
}Verification process:
- Download backup from storage
- Calculate SHA-256 checksum of each file
- Compare with manifest.json checksums
- Reject backup if checksums don't match
- Log verification failure for audit
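Checksum validation can be sketched with hashlib (illustrative; `verify_backup` is not a platform function):

```python
import hashlib

def verify_backup(manifest: dict, files: dict) -> bool:
    """Compare SHA-256 digests of backup files against manifest.json checksums.

    `files` maps file name to its raw bytes; checksums use the documented
    "sha256:<hex>" format.
    """
    for name, expected in manifest["checksums"].items():
        digest = "sha256:" + hashlib.sha256(files[name]).hexdigest()
        if digest != expected:
            return False  # reject backup, log for audit
    return True
```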
Connection Limits
PostgreSQL max_connections
Default configuration:
max_connections = 100
Why 100 connections:
- Standard for Odoo deployments
- Sufficient for 50-200 concurrent users
- Leaves headroom for maintenance connections
- Balances memory usage vs availability
Increase max_connections (if needed):
# Edit PostgreSQL config
docker exec postgres-primary-{env_id} \
sh -c 'echo "max_connections = 200" >> /var/lib/postgresql/data/postgresql.conf'
# Restart PostgreSQL
docker restart postgres-primary-{env_id}
Connection limit errors:
FATAL: sorry, too many clients already
Solutions:
- ✅ Enable PgBouncer (recommended) - Connection pooling prevents exhaustion
- ✅ Increase max_connections - Allocate more connection slots
- ✅ Optimize queries - Reduce connection duration
- ✅ Close idle connections - Set idle_in_transaction_session_timeout
PgBouncer Pooling Strategy
Pool mode: transaction (best for Odoo)
How it works:
- Connection returned to pool after each transaction
- Multiple clients can share same PostgreSQL connection
- Reduces total connections to PostgreSQL
- Transparent to application (Odoo doesn't notice)
Example scenario:
- 100 concurrent Odoo users
- Without PgBouncer: 100 PostgreSQL connections
- With PgBouncer (pool_size=40): Only 40 PostgreSQL connections
Parameters:
pool_mode = transaction
max_client_conn = 100 # Total client connections PgBouncer accepts
default_pool_size = 40 # Connections to PostgreSQL per database
reserve_pool_size = 5 # Emergency reserve
reserve_pool_timeout = 3 # Timeout for reserve pool
Performance Tuning
Automatic PostgreSQL Optimization
DatabaseOptimizerService provides tier-based optimization using a PGTune-style algorithm:
Optimization tiers:
| Tier | RAM Range | Use Case | Tuning Profile |
|---|---|---|---|
| Small | ≤ 2GB | Dev/staging | Conservative, low memory |
| Medium | 2-8GB | Small production | Balanced for typical workloads |
| Large | > 8GB | Enterprise production | Aggressive, high performance |
Optimized parameters:
| Parameter | Small | Medium | Large | Description |
|---|---|---|---|---|
| shared_buffers | 512MB | 2GB | 4GB | RAM for caching data blocks |
| effective_cache_size | 1.5GB | 6GB | 24GB | Total RAM for disk caching |
| work_mem | 16MB | 64MB | 128MB | RAM per sort/hash operation |
| maintenance_work_mem | 128MB | 512MB | 1GB | RAM for VACUUM, CREATE INDEX |
| max_connections | 100 | 100 | 100 | Maximum concurrent connections |
| random_page_cost | 1.1 | 1.1 | 1.1 | Cost estimate for random I/O (SSD) |
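The tier mapping can be sketched as a lookup (values mirror the table above; the real implementation is DatabaseOptimizerService):

```python
def tuned_parameters(ram_mb: int) -> dict:
    """Return the documented tuning profile for an environment's RAM tier.

    Sketch only; values follow the Small/Medium/Large table above.
    """
    if ram_mb <= 2048:  # Small
        return {"shared_buffers": "512MB", "work_mem": "16MB",
                "maintenance_work_mem": "128MB"}
    if ram_mb <= 8192:  # Medium
        return {"shared_buffers": "2GB", "work_mem": "64MB",
                "maintenance_work_mem": "512MB"}
    return {"shared_buffers": "4GB", "work_mem": "128MB",  # Large
            "maintenance_work_mem": "1GB"}
```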
Example calculation (4GB RAM environment):
# Tier: Medium (2-8GB)
shared_buffers = "2GB" # 25% of total RAM
effective_cache_size = "6GB" # 75% of total RAM
work_mem = "64MB" # Per-operation memory
maintenance_work_mem = "512MB" # Maintenance operations
Manual Tuning
Override auto-optimization by editing postgresql.conf:
Step 1: Edit configuration:
docker exec -it postgres-primary-{env_id} bash
# Edit config
vi /var/lib/postgresql/data/postgresql.conf
# Add custom parameters
shared_buffers = 3GB
work_mem = 128MB
maintenance_work_mem = 1GB
Step 2: Restart PostgreSQL:
docker restart postgres-primary-{env_id}
Step 3: Verify changes:
SHOW shared_buffers;
SHOW work_mem;
SHOW maintenance_work_mem;
Performance Monitoring
Check current configuration:
-- Connect to database
docker exec -it postgres-primary-{env_id} psql -U odoo -d {db_name}
-- Show all configuration
SHOW ALL;
-- Show specific parameters
SHOW shared_buffers;
SHOW effective_cache_size;
SHOW work_mem;
SHOW max_connections;
Check active connections:
SELECT count(*) FROM pg_stat_activity;
Check slow queries:
SELECT pid, usename, application_name, state, query, query_start
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY query_start;
Check database size:
SELECT pg_size_pretty(pg_database_size(current_database()));
Permissions
Permission Codes
Read replica management (Sprint 2E40 - future implementation):
| Permission | Code | Description | Scope |
|---|---|---|---|
| Create Replica | environment.replicas.create | Deploy read replica | Environment |
| Remove Replica | environment.replicas.delete | Remove read replica | Environment |
| Rebuild Replica | environment.replicas.rebuild | Rebuild replica from primary | Environment |
| View Health | environment.replicas.view | View replica health metrics | Environment |
Current implementation: Replica endpoints use get_organization_member() for access control (verifies user is org member). Future sprints will add granular permission checks.
Related permissions:
- environment.manage - General environment management (includes database access)
- project.backups.create - Create backups (uses pg_dump)
- project.backups.restore - Restore from backups (affects database)
RBAC Integration
Permission matrix (future):
| Role | Create Replica | Remove Replica | Rebuild Replica | View Health |
|---|---|---|---|---|
| Portal Admin | ✅ | ✅ | ✅ | ✅ |
| Org Owner | ✅ | ✅ | ✅ | ✅ |
| Org Admin | ✅ | ✅ | ✅ | ✅ |
| Org Member | ❌ | ❌ | ❌ | ✅ |
| Project Admin | ✅ | ✅ | ✅ | ✅ |
| Project Member | ❌ | ❌ | ❌ | ✅ |
Frontend protection (example):
<AbilityGate permission="environment.replicas.create" organizationId={orgId}>
  <Button onClick={handleDeployReplica}>Deploy Replica</Button>
</AbilityGate>
API Reference
Deploy Replica
Endpoint: POST /api/v1/environments/{environment_id}/replicas
Request:
{
"force_rebuild": false
}
Response (200 OK):
{
"success": true,
"message": "Read replica deployed successfully",
"replica_deployed": true,
"replica_status": "online"
}
Errors: 400 (unsupported version), 403 (quota exceeded), 404 (not found)
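For scripting, the deploy call can be assembled like this (a sketch; authentication headers and the HTTP client are left to the caller):

```python
def deploy_replica_request(base_url: str, environment_id: str,
                           force_rebuild: bool = False) -> dict:
    """Assemble the deploy-replica request per the endpoint above.

    Returns a plain dict describing the call; pass it to any HTTP client
    along with your auth headers.
    """
    return {
        "method": "POST",
        "url": f"{base_url}/api/v1/environments/{environment_id}/replicas",
        "json": {"force_rebuild": force_rebuild},
    }
```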
Remove Replica
Endpoint: DELETE /api/v1/environments/{environment_id}/replicas
No request body
Response (200 OK):
{
"success": true,
"message": "Read replica removed successfully",
"replica_deployed": false,
"replica_status": "offline"
}
Errors: 400 (not deployed), 404 (not found)
Get Replica Health
Endpoint: GET /api/v1/environments/{environment_id}/replicas/health
No request body
Response (200 OK):
{
"replica_deployed": true,
"replica_status": "online",
"replica_lag_bytes": 1024,
"replica_lag_seconds": 0.2,
"replica_last_check": "2025-12-11T14:30:00Z",
"replica_error_message": null
}
Errors: 404 (not found)
Rebuild Replica
Endpoint: POST /api/v1/environments/{environment_id}/replicas/rebuild
Request:
{
"force": false
}
Response (200 OK):
{
"success": true,
"message": "Read replica rebuilt successfully from primary",
"replica_deployed": true,
"replica_status": "online"
}
Errors: 400 (not deployed), 404 (not found)
Troubleshooting
Replication Lag Issues
Symptom: Replica health shows lagging status with high lag_bytes or lag_seconds
Common causes:
- High write volume on primary - Replica can't keep up
- Network congestion - Slow replication stream
- Insufficient replica resources - Replica CPU/RAM too low
- Long-running transactions - Blocks replication apply
Solutions:
1. Check current lag:
docker exec postgres-primary-{env_id} \
psql -U odoo -d postgres -c "SELECT application_name, state, pg_wal_lsn_diff(sent_lsn, flush_lsn) AS lag_bytes, write_lag FROM pg_stat_replication;"
2. Check replica apply lag:
docker exec postgres-replica-{env_id} \
psql -U odoo -d postgres -c "SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()) AS replay_lag;"
3. Rebuild replica if lag is critical:
POST /api/v1/environments/{env_id}/replicas/rebuild
{ "force": true }
4. Increase replica resources (if consistently lagging):
- Replica inherits 30% of primary resources
- Solution: Increase primary environment resources (replica scales automatically)
Connection Failures
Symptom: Odoo cannot connect to database, shows connection errors
Error messages:
FATAL: sorry, too many clients already
could not connect to server: Connection refused
Solutions:
1. Check PostgreSQL is running:
docker ps | grep postgres-primary-{env_id}
docker logs postgres-primary-{env_id}
2. Verify container network:
docker network inspect paasportal_net_{env_id}
3. Check max_connections limit:
docker exec postgres-primary-{env_id} psql -U odoo -d postgres -c "SHOW max_connections;"
docker exec postgres-primary-{env_id} psql -U odoo -d postgres -c "SELECT count(*) FROM pg_stat_activity;"
4. Enable PgBouncer to prevent connection exhaustion:
- Edit environment: Set pgbouncer_pool_size = 40
- Redeploy environment
5. Verify credentials:
docker exec postgres-primary-{env_id} printenv POSTGRES_PASSWORD
grep db_password /opt/paasportal/{env_id}/odoo.conf
Health Check Failures
Symptom: Replica status shows error with error_message
Common causes:
- SSH connection failed - Cannot reach VM
- Container not found - Replica container stopped/removed
- Query timeout - PostgreSQL not responding
- Permission denied - Authentication failure
Solutions:
1. Check SSH access:
# From PaaSPortal server
ssh -i /path/to/key root@{vm_ip} "docker ps"
2. Verify replica container exists:
docker ps -a | grep postgres-replica-{env_id}
docker logs postgres-replica-{env_id}
3. Check replication user permissions:
docker exec postgres-primary-{env_id} \
psql -U odoo -d postgres -c "SELECT * FROM pg_stat_replication;"
4. Rebuild replica to fix broken replication:
POST /api/v1/environments/{env_id}/replicas/rebuild
{ "force": true }
5. Check ARQ worker logs (health check task):
docker logs worker-container | grep "check_replica_health"
Replica Offline
Symptom: Replica status is offline, replication state not streaming
Common causes:
- Replica container stopped - Manual stop or crash
- Replication slot deleted - Primary lost replica connection
- Network partition - Replica can't reach primary
- WAL files purged - Replica too far behind
Solutions:
1. Check replica container status:
docker ps -a | grep postgres-replica-{env_id}
docker start postgres-replica-{env_id} # If stopped
2. Check replication connection from primary:
docker exec postgres-primary-{env_id} \
psql -U odoo -d postgres -c "SELECT * FROM pg_stat_replication;"
3. Check pg_hba.conf allows replication:
docker exec postgres-primary-{env_id} cat /var/lib/postgresql/data/pg_hba.conf
# Should include: host replication odoo 0.0.0.0/0 md5
4. Rebuild replica (recreates replication from scratch):
POST /api/v1/environments/{env_id}/replicas/rebuild
{ "force": true }
5. Check network connectivity:
docker exec postgres-replica-{env_id} ping postgres-primary-{env_id}
docker exec postgres-replica-{env_id} nc -zv postgres-primary-{env_id} 5432
Performance Degradation
Symptom: Database queries are slow, Odoo performance degraded
Diagnostic queries:
1. Check slow queries:
SELECT pid, usename, application_name, state, query, query_start, now() - query_start AS duration
FROM pg_stat_activity
WHERE state != 'idle' AND now() - query_start > interval '10 seconds'
ORDER BY duration DESC;
2. Check table bloat:
SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 20;
3. Check missing indexes:
SELECT schemaname, tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
AND n_distinct > 100
AND correlation < 0.1
ORDER BY n_distinct DESC;
Solutions:
1. Enable PgBouncer (if not already enabled):
- Reduces connection overhead
- Improves performance for high-traffic environments
2. Increase shared_buffers:
docker exec postgres-primary-{env_id} \
sh -c 'echo "shared_buffers = 4GB" >> /var/lib/postgresql/data/postgresql.conf'
docker restart postgres-primary-{env_id}
3. Run VACUUM ANALYZE:
VACUUM ANALYZE;
4. Deploy read replica (if supported):
- Offload reporting queries to replica
- Reduces load on primary database
5. Increase environment resources:
- Edit environment: Increase CPU/RAM allocation
- Redeploy environment
Related Documentation
- Create Manual Backup - How to backup databases
- Restore from Backup - How to restore databases
- Environment Deployment - Initial database setup
- Resource Quotas - Understanding quota limits
- Server Management - VM and SSH access
Technical Implementation
Backend files:
- /backend/api/v1/routes/replicas.py - Replica API endpoints
- /backend/services/replication_monitor.py - Health monitoring service
- /backend/services/deployment/database_deployer.py - PostgreSQL deployment
- /backend/services/database_optimizer.py - PGTune optimization
- /backend/services/quota_service.py - Quota validation
- /backend/utils/replica_validation.py - Version validation
- /backend/tasks/worker.py - ARQ cron job for health checks
Frontend files:
- /frontend/src/components/environments/ReplicaHealthWidget.tsx - Health UI
- /frontend/src/lib/replica-validation.ts - Version checks
- /frontend/src/lib/api.ts - API client
Database fields (ProjectEnvironment model):
- replica_deployed: bool - Replica deployment status
- replica_status: str - Health status (online/lagging/offline/error)
- replica_lag_bytes: int - Replication lag in bytes
- replica_lag_seconds: float - Replication lag in seconds
- replica_last_check: datetime - Last health check timestamp
- replica_error_message: str - Error details if unhealthy
Last Updated: December 11, 2025 - Sprint 2E40 (PostgreSQL Read Replica Production Enhancement)