Helm - Service Orchestration & Admin Dashboard

  • Port: 5004
  • Database: PostgreSQL
  • Repository: hivematrix-helm

Table of Contents

  1. Overview
  2. Architecture
  3. Database Schema
  4. Service Management
  5. Centralized Logging
  6. Configuration Management
  7. Installation Management
  8. Keycloak Management
  9. Security Auditing
  10. Backup & Recovery
  11. Web Dashboard
  12. API Reference
  13. Configuration
  14. Installation & Setup
  15. Troubleshooting
  16. See Also

Overview

Helm is HiveMatrix's orchestration hub and admin dashboard. It manages the platform lifecycle: service management, centralized logging, health monitoring, configuration distribution, and administrative controls.

In practice, Helm starts and stops services, aggregates logs from all components, monitors system health, and gives administrators a single dashboard for managing the platform.

Primary Responsibilities

  • Service Orchestration - Start, stop, restart, and monitor all HiveMatrix services
  • Centralized Logging - Aggregate logs from all services into PostgreSQL for querying and analysis
  • Health Monitoring - Track service status, CPU usage, memory consumption, and uptime
  • Configuration Management - Distribute system-wide configuration to all services
  • Installation Management - Install, update, and configure HiveMatrix services
  • Keycloak Administration - Manage users, groups, and permissions in Keycloak
  • Security Auditing - Verify port bindings, generate firewall rules, audit system security
  • Backup & Recovery - Automated database and configuration backups with disaster recovery

Key Features

✅ Service Lifecycle Management - Process control with PID tracking and health checks
✅ Centralized Log Aggregation - All service logs in one queryable PostgreSQL database
✅ Real-Time Monitoring - CPU, memory, uptime tracking for all services
✅ Admin Dashboard - Web UI for managing the entire platform
✅ Configuration Distribution - Master configuration synced to all services
✅ Automated Installation - One-command service installation from Git
✅ Keycloak Integration - User/group management via admin API
✅ Security Auditing - Automated security scans and firewall generation
✅ CLI Tools - Command-line interface for all operations
✅ Log File Watcher - Automatic log file ingestion with background monitoring


Architecture

System Architecture

┌──────────────────────────────────────────────────────────┐
│                      Helm Service (5004)                  │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐         │
│  │  Service   │  │ Config     │  │ Install    │         │
│  │  Manager   │  │ Manager    │  │ Manager    │         │
│  └────────────┘  └────────────┘  └────────────┘         │
│                                                           │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐         │
│  │ Log        │  │ Health     │  │ Keycloak   │         │
│  │ Aggregator │  │ Monitor    │  │ Manager    │         │
│  └────────────┘  └────────────┘  └────────────┘         │
│                                                           │
│  PostgreSQL Database:                                     │
│  - LogEntry (centralized logs)                            │
│  - ServiceStatus (health tracking)                        │
│  - ServiceMetric (time-series metrics)                    │
└──────────────────────────────────────────────────────────┘
    ┌───────────────────────┼───────────────────────┐
    ▼                       ▼                       ▼
┌─────────┐           ┌─────────┐             ┌─────────┐
│  Core   │           │  Codex  │   ...       │  Nexus  │
│  (5000) │           │  (5010) │             │  (443)  │
└─────────┘           └─────────┘             └─────────┘
    │                       │                       │
    └───────────────────────┴───────────────────────┘
              Logs sent to Helm /api/logs/ingest

Component Overview

Service Manager (app/service_manager.py)

  • Process lifecycle control (start/stop/restart)
  • PID tracking and process monitoring
  • Health check execution
  • CPU and memory metrics collection
  • Service status database updates

Config Manager (config_manager.py)

  • Master configuration (master_config.json)
  • Per-service .flaskenv generation
  • Configuration synchronization
  • System-wide settings distribution

Install Manager (install_manager.py)

  • Service installation from Git repositories
  • Dependency checking (PostgreSQL, Neo4j, Java, etc.)
  • services.json auto-generation
  • Service discovery and registration

Log Aggregator (log_watcher.py, /api/logs/ingest)

  • Centralized log collection endpoint
  • Log file watching (background thread)
  • PostgreSQL storage with indexing
  • Query API for log retrieval

Health Monitor

  • Periodic service health checks
  • Resource usage tracking (CPU/memory)
  • Uptime calculation
  • Health status database updates

Keycloak Manager (app/api_routes.py)

  • Admin token caching and refresh
  • User CRUD operations
  • Group membership management
  • Password reset functionality


Database Schema

LogEntry

Immutable log entries from all HiveMatrix services.

Columns:

  • id (BigInteger) - Auto-incrementing primary key
  • timestamp (DateTime) - When log was created (indexed)
  • service_name (String) - Service that generated log (indexed)
  • level (String) - DEBUG, INFO, WARNING, ERROR, CRITICAL (indexed)
  • message (Text) - Log message
  • context (JSONB) - Additional structured data (optional)
  • trace_id (String) - Request tracing ID (indexed)
  • user_id (String) - User who triggered action (indexed, optional)
  • hostname (String) - Server hostname (optional)
  • process_id (Integer) - Process ID (optional)

Indexes:

  • idx_log_service_timestamp on (service_name, timestamp)
  • idx_log_level_timestamp on (level, timestamp)
  • Individual indexes on timestamp, service_name, level, trace_id, user_id

Purpose: Centralized, queryable audit trail for all system activities.

Example:

{
  "id": 12345,
  "timestamp": "2024-11-22T10:30:00Z",
  "service_name": "codex",
  "level": "INFO",
  "message": "Company synced from PSA",
  "context": {
    "account_number": "620547",
    "company_name": "Acme Corporation"
  },
  "trace_id": "req-abc123",
  "user_id": "admin@example.com",
  "hostname": "hivematrix-server",
  "process_id": 12345
}


ServiceStatus

Tracks real-time status and health of each service.

Columns:

  • id (Integer) - Primary key
  • service_name (String) - Service identifier (unique)
  • status (String) - running, stopped, error, unknown
  • pid (Integer) - Process ID (optional)
  • port (Integer) - Service port (optional)
  • started_at (DateTime) - When service started (optional)
  • last_checked (DateTime) - Last health check timestamp
  • health_status (String) - healthy, unhealthy, degraded (optional)
  • health_message (Text) - Health check details (optional)
  • cpu_percent (Float) - CPU usage percentage (optional)
  • memory_mb (Float) - Memory usage in MB (optional)

Purpose: Real-time service monitoring and status tracking.

Example:

{
  "service_name": "codex",
  "status": "running",
  "pid": 12345,
  "port": 5010,
  "started_at": "2024-11-22T08:00:00Z",
  "last_checked": "2024-11-22T10:30:00Z",
  "health_status": "healthy",
  "health_message": "All checks passed",
  "cpu_percent": 12.5,
  "memory_mb": 256.3
}


ServiceMetric

Time-series metrics for performance monitoring.

Columns:

  • id (BigInteger) - Auto-incrementing primary key
  • service_name (String) - Service identifier (indexed)
  • timestamp (DateTime) - Metric timestamp (indexed)
  • metric_name (String) - Metric type (e.g., "cpu_percent", "memory_mb")
  • metric_value (Float) - Metric value
  • tags (JSONB) - Additional tags (optional)

Indexes:

  • idx_metric_service_timestamp on (service_name, timestamp)

Purpose: Historical performance data for trending and analysis.

Example:

{
  "service_name": "codex",
  "timestamp": "2024-11-22T10:30:00Z",
  "metric_name": "cpu_percent",
  "metric_value": 15.3,
  "tags": {
    "host": "server1",
    "environment": "production"
  }
}


Service Management

Service Manager

File: app/service_manager.py

Key Functions:

Start Service

ServiceManager.start_service(service_name, mode='dev')

Process:

  1. Resolves service directory path
  2. Checks if already running (PID check)
  3. Syncs master services.json to service directory
  4. Activates virtual environment
  5. Starts service in background (dev or prod mode)
  6. Captures PID and updates database
  7. Returns success/failure with PID and port

Development Mode:

python run.py  # Flask development server

Production Mode:

gunicorn run:app --workers 4 --bind 127.0.0.1:5010
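
The choice between the two launch commands can be sketched as a small helper. This is an illustration only: `build_start_command` is a hypothetical name, and the worker count and bind address simply mirror the commands shown above rather than the actual code in app/service_manager.py.

```python
import sys
from typing import List


def build_start_command(mode: str, port: int, workers: int = 4) -> List[str]:
    """Return the argv for launching a service in 'dev' or 'prod' mode."""
    if mode == "dev":
        # Flask development server, equivalent to "python run.py".
        return [sys.executable, "run.py"]
    if mode == "prod":
        # Gunicorn bound to localhost only, as in the production command above.
        return [
            "gunicorn", "run:app",
            "--workers", str(workers),
            "--bind", f"127.0.0.1:{port}",
        ]
    raise ValueError(f"unknown mode: {mode!r}")
```

Returning an argv list rather than a shell string lets the caller pass it straight to `subprocess.Popen` without shell quoting concerns.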


Stop Service

ServiceManager.stop_service(service_name)

Process:

  1. Finds service PID from database or process list
  2. Sends SIGTERM for graceful shutdown
  3. Waits up to 10 seconds for process to exit
  4. Sends SIGKILL if still running
  5. Updates database status to "stopped"
  6. Removes PID file
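
The SIGTERM-then-SIGKILL part of this sequence might look roughly like the sketch below. It assumes the caller holds a `subprocess.Popen` handle for the service, whereas Helm is described as looking the PID up from its database; `stop_gracefully` is a hypothetical name, and the database and PID-file steps are left out.

```python
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, timeout: float = 10.0) -> str:
    """Ask a process to exit, escalating to a forced kill after `timeout` seconds."""
    if proc.poll() is not None:
        return "not running"
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=timeout)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"
```

For example, `stop_gracefully(subprocess.Popen([sys.executable, "run.py"]))` would stop a service started in dev mode.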


Restart Service

ServiceManager.restart_service(service_name, mode='dev')

Process:

  1. Stops service (if running)
  2. Waits 2 seconds
  3. Starts service with specified mode
  4. Returns combined result


Get Service Status

ServiceManager.get_service_status(service_name)

Returns:

{
  "service_name": "codex",
  "status": "running",
  "health": "healthy",
  "pid": 12345,
  "configured_port": 5010,
  "started_at": "2024-11-22T08:00:00Z",
  "uptime": "2h 30m",
  "cpu_percent": 12.5,
  "memory_mb": 256.3,
  "health_message": "All checks passed"
}

Health Check:

  • Calls service's /health endpoint
  • Parses response for health status
  • Updates ServiceStatus database record
  • Collects CPU and memory metrics
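
As an illustration of the parsing step, the sketch below maps a /health response to the health_status and health_message fields of ServiceStatus. The function name and the fallback rules (non-200 or malformed responses count as unhealthy) are assumptions, not Helm's documented behavior.

```python
import json
from typing import Tuple

VALID_STATES = {"healthy", "unhealthy", "degraded"}


def interpret_health(http_status: int, body: str) -> Tuple[str, str]:
    """Map a /health response to (health_status, health_message)."""
    if http_status != 200:
        return "unhealthy", f"HTTP {http_status}"
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "unhealthy", "response was not valid JSON"
    if not isinstance(payload, dict):
        return "unhealthy", "unexpected response shape"
    state = payload.get("status")
    if not isinstance(state, str) or state not in VALID_STATES:
        return "unhealthy", f"unexpected status: {state!r}"
    checks = payload.get("checks")
    if not isinstance(checks, dict):
        checks = {}
    # Collect the names of any sub-checks that do not report "healthy".
    failing = sorted(
        name for name, check in checks.items()
        if not isinstance(check, dict) or check.get("status") != "healthy"
    )
    if failing:
        return state, "Failing checks: " + ", ".join(failing)
    return state, "All checks passed"
```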


Get All Service Statuses

ServiceManager.get_all_service_statuses()

Returns:

{
  "core": {
    "status": "running",
    "health": "healthy",
    "pid": 11111,
    "configured_port": 5000
  },
  "codex": {
    "status": "running",
    "health": "healthy",
    "pid": 12345,
    "configured_port": 5010
  }
}


CLI Usage

File: cli.py

Check Service Status:

cd hivematrix-helm
source pyenv/bin/activate
python cli.py status

# Output:
# ================================================================================
# HiveMatrix Service Status
# ================================================================================
#
# CORE
#   Status:  running
#   Health:  healthy
#   Port:    5000
#   PID:     11111
#
# CODEX
#   Status:  running
#   Health:  healthy
#   Port:    5010
#   PID:     12345

Start Service:

python cli.py start codex          # Start in dev mode
python cli.py start codex --mode prod  # Start in production mode

Stop Service:

python cli.py stop codex

Restart Service:

python cli.py restart codex
python cli.py restart core --mode prod

List Configured Services:

python cli.py list

# Output:
# Configured Services:
# ----------------------------------------
#   core            (type: python      port: 5000)
#   codex           (type: python      port: 5010)
#   keycloak        (type: external    port: 8080)


Centralized Logging

Log Aggregation Architecture

All HiveMatrix services send logs to Helm's centralized logging endpoint:

# In any service (using helm_logger.py)
from app.helm_logger import init_helm_logger

helm_logger = init_helm_logger(
    service_name='codex',
    helm_service_url='http://localhost:5004'
)

helm_logger.info("Company synced", extra={'account': '620547'})
helm_logger.error("API call failed", extra={'url': '/api/data'}, exc_info=True)

What Happens:

  1. Service logs message with structured data
  2. helm_logger formats as JSON
  3. Sends HTTP POST to /api/logs/ingest
  4. Helm stores in PostgreSQL log_entries table
  5. Log is immediately queryable via API or CLI
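
Steps 2 and 3 can be approximated with the standard library alone. This is a minimal sketch, not the contents of helm_logger.py: `build_log_payload` and `send_log` are hypothetical names, and the field names follow the ingest payload documented in this section.

```python
import json
import urllib.request
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_log_payload(
    service_name: str,
    level: str,
    message: str,
    context: Optional[Dict[str, Any]] = None,
    trace_id: Optional[str] = None,
    user_id: Optional[str] = None,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build the JSON body expected by POST /api/logs/ingest."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    payload: Dict[str, Any] = {
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "service_name": service_name,
        "level": level,
        "message": message,
    }
    # Optional fields are only included when they carry a value.
    if context:
        payload["context"] = context
    if trace_id:
        payload["trace_id"] = trace_id
    if user_id:
        payload["user_id"] = user_id
    return payload


def send_log(helm_url: str, payload: Dict[str, Any], timeout: float = 5.0) -> int:
    """POST one log entry to Helm and return the HTTP status code."""
    request = urllib.request.Request(
        helm_url.rstrip("/") + "/api/logs/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A caller would typically wrap `send_log` in a try/except so that a logging failure never breaks the service itself.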


Log Ingestion Endpoint

POST /api/logs/ingest
Content-Type: application/json

{
  "timestamp": "2024-11-22T10:30:00Z",
  "service_name": "codex",
  "level": "INFO",
  "message": "Company synced from PSA",
  "context": {
    "account_number": "620547",
    "company_name": "Acme Corporation"
  },
  "trace_id": "req-abc123",
  "user_id": "admin@example.com"
}

Response:

{
  "success": true,
  "log_id": 12345
}

Features:

  • Async processing - Non-blocking log storage
  • Bulk ingestion - Can send multiple logs in array
  • Indexed storage - Fast queries on service, level, timestamp
  • Context preservation - JSONB field for structured data
  • Trace correlation - Link logs by trace_id


Log File Watcher

File: log_watcher.py

Purpose: Automatically ingest logs from file-based logging (backup mechanism).

Process:

  1. Background thread started on Helm startup
  2. Watches logs/*.log files for changes
  3. Reads new lines as they're written
  4. Parses log format (JSON or text)
  5. Inserts into PostgreSQL via bulk insert
  6. Maintains file position to avoid re-reading

Benefits:

  • Backup if HTTP logging fails
  • Historical log file import
  • Zero-configuration for services
  • Resilient to network issues
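
The position-tracking idea behind the watcher (read only what was appended since the last pass) can be sketched as follows. `read_new_lines` is a hypothetical helper, and the truncation handling is an assumption; log_watcher.py may track positions differently.

```python
from typing import List, Tuple


def read_new_lines(path: str, offset: int) -> Tuple[List[str], int]:
    """Return complete lines appended after byte `offset`, plus the new offset."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # jump to end of file to learn its size
        size = handle.tell()
        if size < offset:
            # File shrank (truncated or rotated): start over from the top.
            offset = 0
        handle.seek(offset)
        data = handle.read()
    # Consume only up to the last newline so a half-written line is
    # picked up whole on the next pass.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

A background thread would call this in a loop for each watched file, storing the returned offset between passes.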


Log Query API

Get Logs

GET /api/logs?service={service}&level={level}&limit={limit}&offset={offset}&start_time={iso8601}&end_time={iso8601}

Parameters:

  • service (optional) - Filter by service name
  • level (optional) - Filter by log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
  • limit (optional) - Number of results (default: 100, max: 500)
  • offset (optional) - Pagination offset
  • start_time (optional) - ISO 8601 timestamp (e.g., 2024-11-22T00:00:00Z)
  • end_time (optional) - ISO 8601 timestamp
  • trace_id (optional) - Filter by trace ID
  • user_id (optional) - Filter by user ID

Response:

{
  "logs": [
    {
      "id": 12345,
      "timestamp": "2024-11-22T10:30:00Z",
      "service_name": "codex",
      "level": "INFO",
      "message": "Company synced from PSA",
      "context": {
        "account_number": "620547"
      },
      "trace_id": "req-abc123",
      "user_id": "admin@example.com"
    }
  ],
  "total": 1,
  "limit": 100,
  "offset": 0
}
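
A client can assemble the query string from the parameters listed above. The sketch below uses a hypothetical `build_logs_url` helper; clamping `limit` on the client side is an assumption that mirrors the documented maximum of 500.

```python
from typing import Optional
from urllib.parse import urlencode

MAX_LIMIT = 500  # documented maximum for the limit parameter


def build_logs_url(
    base_url: str,
    service: Optional[str] = None,
    level: Optional[str] = None,
    limit: int = 100,
    offset: int = 0,
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    trace_id: Optional[str] = None,
    user_id: Optional[str] = None,
) -> str:
    """Build a GET /api/logs URL, omitting filters that were not supplied."""
    params = {
        "service": service,
        "level": level,
        "limit": min(max(limit, 1), MAX_LIMIT),
        "offset": max(offset, 0),
        "start_time": start_time,
        "end_time": end_time,
        "trace_id": trace_id,
        "user_id": user_id,
    }
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{base_url.rstrip('/')}/api/logs?{query}"
```

Paging through results then amounts to repeating the request with `offset` increased by `limit` until fewer than `limit` entries come back.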


Log CLI Tool

File: logs_cli.py

View Recent Logs:

cd hivematrix-helm
source pyenv/bin/activate

# View last 50 logs from Codex
python logs_cli.py codex --tail 50

# View only ERROR logs
python logs_cli.py codex --level ERROR --tail 100

# View all services
python logs_cli.py --tail 30

# Real-time log monitoring
watch -n 2 'python logs_cli.py codex --tail 20'

Output:

2024-11-22 10:30:00 [INFO] codex: Company synced from PSA
  Context: {"account_number": "620547", "company_name": "Acme Corporation"}

2024-11-22 10:30:15 [ERROR] codex: API call failed
  Context: {"url": "/api/data", "error": "Connection timeout"}
  Trace ID: req-abc123
  User: admin@example.com

Features:

  • Colored output (INFO=green, WARNING=yellow, ERROR=red)
  • Context display for structured logs
  • Trace ID and user tracking
  • Tail-like behavior for recent logs
  • Level filtering


Configuration Management

Master Configuration

File: instance/configs/master_config.json

Purpose: Single source of truth for system-wide configuration.

Structure:

{
  "system": {
    "environment": "production",
    "log_level": "INFO",
    "secret_key": "your-secret-key-here",
    "hostname": "hivematrix.example.com"
  },
  "keycloak": {
    "url": "http://localhost:8080",
    "realm": "hivematrix",
    "client_id": "core-client",
    "client_secret": "your-keycloak-client-secret",
    "admin_username": "admin",
    "admin_password": "admin"
  },
  "databases": {
    "postgresql": {
      "host": "localhost",
      "port": 5432,
      "admin_user": "postgres",
      "admin_password": "postgres-password"
    },
    "neo4j": {
      "uri": "bolt://localhost:7687",
      "user": "neo4j",
      "password": "neo4j-password"
    }
  },
  "apps": {
    "core": {
      "port": 5000,
      "database": "core_db"
    },
    "codex": {
      "port": 5010,
      "database": "codex_db"
    }
  }
}


Config Manager

File: config_manager.py

Sync All Configurations:

cd hivematrix-helm
source pyenv/bin/activate
python config_manager.py sync-all

What It Does:

  1. Reads master_config.json
  2. For each service in apps_registry.json:
     • Generates service-specific .flaskenv
     • Includes Keycloak URLs, Core URL, service name
     • Includes database connection strings
     • Includes system-wide settings
  3. Writes .flaskenv to each service directory
  4. Prints success/failure for each service

Generated .flaskenv Example:

# Auto-generated by config_manager.py
# DO NOT EDIT MANUALLY - Changes will be overwritten

# Core Service
CORE_SERVICE_URL=http://localhost:5000

# Helm Service
HELM_SERVICE_URL=http://localhost:5004

# Service Identity
SERVICE_NAME=codex

# Keycloak
KEYCLOAK_URL=http://localhost:8080
KEYCLOAK_REALM=hivematrix
KEYCLOAK_CLIENT_ID=core-client
KEYCLOAK_CLIENT_SECRET=your-client-secret

# System
LOG_LEVEL=INFO
ENABLE_JSON_LOGGING=true
SECRET_KEY=your-secret-key
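
To make the generation step concrete, here is a rough sketch of rendering a .flaskenv from the master configuration. `render_flaskenv` is a hypothetical function, not the code in config_manager.py; it assumes services run on localhost and covers only the keys shown in the example above.

```python
from typing import Any, Dict


def render_flaskenv(master: Dict[str, Any], service_name: str) -> str:
    """Render .flaskenv text for one service from the master configuration."""
    system = master["system"]
    keycloak = master["keycloak"]
    core_port = master["apps"]["core"]["port"]
    lines = [
        "# Auto-generated by config_manager.py",
        "# DO NOT EDIT MANUALLY - Changes will be overwritten",
        "",
        f"CORE_SERVICE_URL=http://localhost:{core_port}",
        f"SERVICE_NAME={service_name}",
        "",
        f"KEYCLOAK_URL={keycloak['url']}",
        f"KEYCLOAK_REALM={keycloak['realm']}",
        f"KEYCLOAK_CLIENT_ID={keycloak['client_id']}",
        f"KEYCLOAK_CLIENT_SECRET={keycloak['client_secret']}",
        "",
        f"LOG_LEVEL={system['log_level']}",
        f"SECRET_KEY={system['secret_key']}",
    ]
    return "\n".join(lines) + "\n"
```

Because the file contains secrets, whatever writes it should also restrict its permissions (for example to 600).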


Services Registry

File: apps_registry.json

Purpose: Define all available HiveMatrix services and their properties.

Structure:

{
  "core_apps": {
    "keycloak": {
      "name": "Keycloak Identity Provider",
      "description": "OAuth2 authentication provider",
      "type": "external",
      "port": 8080,
      "required": true,
      "visible": false
    },
    "core": {
      "name": "HiveMatrix Core",
      "description": "Authentication and JWT management",
      "git_url": "https://github.com/skelhammer/hivematrix-core",
      "port": 5000,
      "required": true,
      "visible": false,
      "install_order": 1
    },
    "nexus": {
      "name": "HiveMatrix Nexus",
      "description": "API Gateway and frontend proxy",
      "git_url": "https://github.com/skelhammer/hivematrix-nexus",
      "port": 443,
      "required": true,
      "visible": false,
      "install_order": 2
    },
    "helm": {
      "name": "HiveMatrix Helm",
      "description": "Service orchestration and admin dashboard",
      "git_url": "https://github.com/skelhammer/hivematrix-helm",
      "port": 5004,
      "required": true,
      "admin_only": true,
      "install_order": 0
    }
  },
  "default_apps": {
    "codex": {
      "name": "HiveMatrix Codex",
      "description": "Master data management",
      "git_url": "https://github.com/skelhammer/hivematrix-codex",
      "port": 5010,
      "required": false,
      "dependencies": ["core", "postgresql"],
      "install_order": 3
    },
    "beacon": {
      "name": "HiveMatrix Beacon",
      "description": "Real-time ticket dashboard",
      "git_url": "https://github.com/skelhammer/hivematrix-beacon",
      "port": 5001,
      "required": false,
      "dependencies": ["core", "codex"],
      "install_order": 4
    }
  },
  "system_dependencies": {
    "postgresql": {
      "name": "PostgreSQL",
      "description": "Relational database",
      "required": true
    },
    "neo4j": {
      "name": "Neo4j",
      "description": "Graph database for KnowledgeTree",
      "required": false
    }
  }
}

Properties:

  • name - Display name
  • description - Short description
  • git_url - Git repository URL
  • port - Service port
  • required - Must be installed?
  • visible - Show in navigation sidebar?
  • admin_only - Restrict to admins?
  • billing_or_admin_only - Restrict to billing/admin?
  • dependencies - Required services/dependencies
  • install_order - Installation sequence


Services JSON Generation

File: install_manager.py (method: update_config())

Command:

python install_manager.py update-config

Generates Two Files:

1. services.json (Simplified for services to use)

{
  "core": "http://localhost:5000",
  "codex": "http://localhost:5010",
  "helm": "http://localhost:5004"
}

2. master_services.json (Complete metadata for Helm/Nexus)

{
  "core": {
    "url": "http://localhost:5000",
    "name": "HiveMatrix Core",
    "port": 5000,
    "visible": false,
    "admin_only": false,
    "billing_or_admin_only": false,
    "display_order": 100
  },
  "codex": {
    "url": "http://localhost:5010",
    "name": "HiveMatrix Codex",
    "port": 5010,
    "visible": true,
    "admin_only": false,
    "billing_or_admin_only": false,
    "display_order": 3
  }
}

Display Order Logic:

  • Services in SERVICE_ORDER list get assigned order (0-N)
  • Services not in list get order 100+ (alphabetical)
  • Controls sidebar display order in Nexus
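
One way to read this rule is sketched below. The contents of SERVICE_ORDER are not shown in this document, so the list is passed in as a parameter; numbering unlisted services 100, 101, ... in alphabetical order is an interpretation of "100+ (alphabetical)", and `assign_display_order` is a hypothetical name.

```python
from typing import Dict, Iterable, Sequence


def assign_display_order(
    services: Iterable[str], service_order: Sequence[str]
) -> Dict[str, int]:
    """Give listed services their list index and unlisted ones 100, 101, ..."""
    names = list(services)
    ranked = {name: index for index, name in enumerate(service_order)}
    result: Dict[str, int] = {}
    for name in names:
        if name in ranked:
            result[name] = ranked[name]
    # Services missing from the order list come after, sorted by name.
    unlisted = sorted(name for name in names if name not in ranked)
    for position, name in enumerate(unlisted):
        result[name] = 100 + position
    return result
```

With a hypothetical order list in which codex sits at index 3 and core is absent, this reproduces the display_order values 3 and 100 shown in the master_services.json example.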


Installation Management

Install Manager

File: install_manager.py

Install a Service:

cd hivematrix-helm
source pyenv/bin/activate
python install_manager.py install codex

Process:

  1. Checks if service is in apps_registry.json
  2. Verifies dependencies (PostgreSQL, etc.)
  3. Clones Git repository to parent directory
  4. Creates hivematrix-{service} directory
  5. Runs service's install.sh script:
     • Creates Python virtual environment
     • Installs requirements.txt
     • Creates symlink to services.json
     • Creates instance/ directory
  6. Runs init_db.py for database setup (interactive)
  7. Updates services.json and master_services.json
  8. Syncs configuration via config_manager.py

Example Output:

Installing codex...
✓ Repository cloned
✓ Running install.sh
  Creating virtual environment...
  Installing dependencies...
  ✓ Installation complete

✓ Running init_db.py
  PostgreSQL configuration:
    Host: localhost
    Port: 5432
    Database: codex_db
    Username: postgres
    Password: [hidden]
  ✓ Database configured
  ✓ Tables created

✓ Configuration synced
✓ codex installed successfully


System Dependency Checks

Check Dependencies:

python install_manager.py check-dependencies

Checks:

  • PostgreSQL (psql --version)
  • Python 3.8+ (python --version)
  • Git (git --version)
  • Java, for Keycloak (java -version)
  • Keycloak (directory exists)
  • Neo4j (neo4j version)

Output:

System Dependencies:
----------------------------------------
  ✓ postgresql
  ✓ python (3.10.12)
  ✓ git
  ✓ java
  ✓ keycloak
  ✗ neo4j (not installed)


Uninstall a Service

Command:

python install_manager.py uninstall codex

Process:

  1. Stops service (if running)
  2. Removes service directory
  3. Updates services.json and master_services.json
  4. Optionally drops database (prompts user)

⚠️ Warning: This does NOT back up data. Run a backup first!


Keycloak Management

Keycloak Admin API Integration

Helm provides a unified interface to Keycloak's admin API for user and group management.

Admin Token Caching:

# Helm caches admin token with auto-refresh
admin_token = get_keycloak_admin_token()
# Token cached for 5 minutes, auto-refreshed on expiry
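
The caching behavior described in these comments can be illustrated with a small time-based cache. This is a sketch under assumptions: `TokenCache` is a hypothetical class, the fetch callable stands in for the real Keycloak token request, and the 300-second lifetime comes from the "5 minutes" noted above.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Cache a token and fetch a fresh one once it is older than the TTL."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            # Missing or expired: request a new token and restart the timer.
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

Injecting the clock keeps the expiry logic testable without sleeping; in a multi-threaded server the `get` method would also need a lock.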


User Management API

List Users

GET /api/keycloak/users

Response:

[
  {
    "id": "user-uuid-123",
    "username": "john.doe",
    "email": "john.doe@example.com",
    "firstName": "John",
    "lastName": "Doe",
    "enabled": true,
    "emailVerified": true,
    "createdTimestamp": 1700000000000
  }
]


Create User

POST /api/keycloak/users
Content-Type: application/json

{
  "username": "jane.smith",
  "email": "jane.smith@example.com",
  "firstName": "Jane",
  "lastName": "Smith",
  "enabled": true,
  "emailVerified": true,
  "credentials": [
    {
      "type": "password",
      "value": "temp-password-123",
      "temporary": true
    }
  ]
}

Response:

{
  "success": true,
  "user_id": "user-uuid-456",
  "message": "User created successfully"
}


Update User

PUT /api/keycloak/users/<user_id>
Content-Type: application/json

{
  "firstName": "Jane",
  "lastName": "Doe-Smith",
  "email": "jane.doesmith@example.com"
}

Delete User

DELETE /api/keycloak/users/<user_id>

Reset Password

POST /api/keycloak/users/<user_id>/password
Content-Type: application/json

{
  "password": "new-password-456",
  "temporary": true
}

Response:

{
  "success": true,
  "message": "Password reset successfully"
}


Group Management API

List Groups

GET /api/keycloak/groups

Response:

[
  {
    "id": "group-uuid-123",
    "name": "admins",
    "path": "/admins"
  },
  {
    "id": "group-uuid-456",
    "name": "technicians",
    "path": "/technicians"
  }
]


Get User's Groups

GET /api/keycloak/users/<user_id>/groups

Response:

[
  {
    "id": "group-uuid-123",
    "name": "admins",
    "path": "/admins"
  }
]


Update User's Groups

POST /api/keycloak/users/<user_id>/groups
Content-Type: application/json

{
  "groups": ["admins", "technicians"]
}

Process:

  1. Removes user from all current groups
  2. Adds user to specified groups
  3. Returns success/failure

Response:

{
  "success": true,
  "message": "User groups updated"
}


Security Auditing

Security Audit Script

File: security_audit.py

Run Security Audit:

cd hivematrix-helm
source pyenv/bin/activate
python security_audit.py --audit

Checks:

  1. Port Bindings
     • Verifies services bind to 127.0.0.1 (localhost only)
     • Flags services bound to 0.0.0.0 (public access)
     • Exception: Nexus (443) should bind to 0.0.0.0
  2. File Permissions
     • Checks sensitive files are not world-readable
     • Verifies instance/*.conf permissions
     • Checks SSH keys if present
  3. Configuration Security
     • Checks for default passwords
     • Verifies secret keys are random
     • Checks for hardcoded credentials
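
The port-binding rule can be expressed as a small classification function. This sketch is not taken from security_audit.py: `check_binding` and `PUBLIC_SERVICES` are hypothetical names, and how the bind address is discovered (for example by parsing `ss -tlnp` output) is left out.

```python
from typing import FrozenSet, Tuple

# Services that are expected to be reachable from outside the host.
PUBLIC_SERVICES: FrozenSet[str] = frozenset({"nexus"})


def check_binding(
    service: str,
    address: str,
    public_services: FrozenSet[str] = PUBLIC_SERVICES,
) -> Tuple[bool, str]:
    """Return (ok, message) for a service listening on `address`."""
    is_wildcard = address in ("0.0.0.0", "::")
    is_loopback = address in ("127.0.0.1", "::1")
    if is_wildcard and service in public_services:
        return True, "public - expected for gateway"
    if is_wildcard:
        return False, "should be localhost only"
    if is_loopback:
        return True, "localhost only"
    # A specific non-loopback address is still reachable from the network.
    return False, f"bound to {address} - review manually"
```

Note that this sketch does not flag a gateway that is bound to loopback only; a fuller audit might treat that as a misconfiguration too.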

Output:

================================================================================
HiveMatrix Security Audit Report
================================================================================

Port Binding Check:
✓ core (5000) - Bound to 127.0.0.1 (localhost only)
✓ codex (5010) - Bound to 127.0.0.1 (localhost only)
✓ nexus (443) - Bound to 0.0.0.0 (public - expected for gateway)
✗ keycloak (8080) - Bound to 0.0.0.0 (should be localhost only)

File Permissions:
✓ instance/core.conf - Permissions OK (600)
✗ instance/codex.conf - World-readable (644) - should be 600

Configuration Security:
✓ Secret keys are randomized
✗ Keycloak admin password is default - CHANGE IMMEDIATELY

Overall: 3 issues found


Generate Firewall Rules

Command:

python security_audit.py --generate-firewall

Generates: secure_firewall.sh

Generated Script:

#!/bin/bash
# HiveMatrix Firewall Configuration
# Auto-generated by security_audit.py

# Allow SSH
ufw allow 22/tcp

# Allow HTTPS (Nexus only public-facing service)
ufw allow 443/tcp

# Block all service ports from external access
ufw deny 5000/tcp  # Core
ufw deny 5001/tcp  # Beacon
ufw deny 5004/tcp  # Helm
ufw deny 5010/tcp  # Codex
ufw deny 8080/tcp  # Keycloak

# Allow localhost to access everything
ufw allow from 127.0.0.1

# Enable firewall
ufw --force enable
ufw status

Apply Firewall:

sudo bash secure_firewall.sh


Backup & Recovery

Backup Script

File: backup.py

Create Backup:

cd hivematrix-helm
source pyenv/bin/activate
python backup.py

What Gets Backed Up:

  1. All PostgreSQL Databases
     • Dumps each database to individual SQL file
     • Format: {service}_db-{timestamp}.sql
  2. Configuration Files
     • master_config.json
     • apps_registry.json
     • services.json
     • All instance/*.conf files
  3. Service-Specific Data
     • Neo4j dumps (KnowledgeTree)
     • File uploads (if any)
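
The database dump step could be built along these lines. The helper names are hypothetical and backup.py may work differently; the timestamp format is inferred from file names such as core_db-20241122-103000.sql, and the `pg_dump` options used are its standard long-form flags.

```python
from datetime import datetime
from typing import List


def dump_path(backup_dir: str, database: str, when: datetime) -> str:
    """Build a dump file name such as backups/core_db-20241122-103000.sql."""
    stamp = when.strftime("%Y%m%d-%H%M%S")
    return f"{backup_dir}/{database}-{stamp}.sql"


def pg_dump_command(
    database: str,
    out_file: str,
    host: str = "localhost",
    port: int = 5432,
    user: str = "postgres",
) -> List[str]:
    """Return the argv for dumping one database to a plain SQL file."""
    return [
        "pg_dump",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--file", out_file,
        database,
    ]
```

The resulting list can be handed to `subprocess.run(..., check=True)`, with the password supplied through the `PGPASSWORD` environment variable or a `.pgpass` file rather than on the command line.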

Output:

Creating backup...
✓ Backed up core_db → backups/core_db-20241122-103000.sql
✓ Backed up codex_db → backups/codex_db-20241122-103000.sql
✓ Backed up helm_db → backups/helm_db-20241122-103000.sql
✓ Backed up master_config.json
✓ Backed up services.json
✓ Backup complete: backups/hivematrix-backup-20241122-103000.tar.gz

Scheduled Backups:

# Add to crontab
crontab -e

# Backup daily at 2 AM
# (cron runs commands with /bin/sh, where `source` may be unavailable, so call the venv's Python directly)
0 2 * * * cd /path/to/hivematrix-helm && pyenv/bin/python backup.py >> logs/backup.log 2>&1


Restore Script

File: restore.py

Restore from Backup:

cd hivematrix-helm
source pyenv/bin/activate
python restore.py backups/hivematrix-backup-20241122-103000.tar.gz

Process:

  1. Stops all services
  2. Extracts backup archive
  3. Restores each database:
     • Drops existing database
     • Creates new database
     • Restores from SQL dump
  4. Restores configuration files
  5. Starts all services
  6. Verifies restoration

⚠️ Warning: This is a destructive operation. Current data will be lost.

Confirmation Prompt:

WARNING: This will replace all current data!
Current databases will be dropped and recreated.

Backup file: backups/hivematrix-backup-20241122-103000.tar.gz
Created: 2024-11-22 10:30:00

Continue? (yes/no):


Web Dashboard

Main Dashboard

URL: / (requires authentication)

Features:

  • Service Status Grid - All services with status, health, uptime
  • Quick Actions - Start/stop/restart buttons
  • Log Statistics - Recent log counts by level
  • CPU & Memory Graphs - Resource usage trends
  • Health Alerts - Warnings for unhealthy services

Service Status Card:

┌─────────────────────────────────┐
│ Codex                           │
│                                 │
│ Status: ● Running               │
│ Health: ✓ Healthy               │
│ Uptime: 2h 30m                  │
│ CPU: 12.5%  Memory: 256 MB      │
│                                 │
│ Logs (last hour):               │
│   INFO: 145  WARNING: 2  ERROR: 0│
│                                 │
│ [Stop] [Restart] [View Logs]   │
└─────────────────────────────────┘


Service Detail View

URL: /service/<service_name>

Sections:

1. Service Information

  • Name, description, port
  • Current status, health, PID
  • Started time, uptime
  • Resource usage (CPU, memory)

2. Health Check Details

  • Last check timestamp
  • Health check response
  • Dependency status
  • Disk space, database connection

3. Recent Logs

  • Last 50 log entries
  • Color-coded by level
  • Expandable for full context
  • Link to full log viewer

4. Actions

  • Start / Stop / Restart buttons
  • View full logs
  • Download logs (CSV/JSON)
  • Access service dashboard (link)


Log Viewer

URL: /logs

Features:

  • Service Filter - Dropdown to select specific service
  • Level Filter - DEBUG, INFO, WARNING, ERROR, CRITICAL
  • Time Range - Last hour, 24 hours, 7 days, custom
  • Search - Full-text search in message and context
  • Pagination - Navigate through results
  • Export - Download logs as CSV or JSON

Log Entry Display:

2024-11-22 10:30:00 [INFO] codex
Company synced from PSA
Context: {
  "account_number": "620547",
  "company_name": "Acme Corporation"
}
Trace ID: req-abc123
User: admin@example.com


Settings Page

URL: /settings (admin only)

Sections:

1. System Configuration

  • Environment (dev/prod)
  • Log level
  • Secret key rotation
  • Hostname

2. Keycloak Configuration

  • Keycloak URL
  • Realm name
  • Client credentials
  • Admin credentials
  • Test connection button

3. Database Configuration

  • PostgreSQL host, port, credentials
  • Neo4j URI, credentials
  • Test connections button

4. Service Management

  • Install new service
  • Uninstall service
  • Update service
  • Sync configurations

5. Security

  • Run security audit
  • Generate firewall rules
  • View/rotate secret keys
  • Password policies

6. Backup & Restore

  • Create backup now
  • Schedule automated backups
  • List available backups
  • Restore from backup


API Reference

Service Management API

Get All Service Statuses

GET /api/dashboard/status

Response:

{
  "services": {
    "core": {
      "status": "running",
      "health": "healthy",
      "pid": 11111,
      "port": 5000,
      "started_at": "2024-11-22T08:00:00Z",
      "cpu_percent": 5.2,
      "memory_mb": 128.5
    },
    "codex": {
      "status": "running",
      "health": "healthy",
      "pid": 12345,
      "port": 5010,
      "started_at": "2024-11-22T08:00:30Z",
      "cpu_percent": 12.5,
      "memory_mb": 256.3
    }
  },
  "system": {
    "cpu_percent": 45.2,
    "memory_percent": 62.3,
    "disk_percent": 48.1
  }
}


Start Service

POST /api/services/<service_name>/start
Content-Type: application/json

{
  "mode": "prod"
}

Response:

{
  "success": true,
  "message": "codex started successfully",
  "pid": 12345,
  "port": 5010
}


Stop Service

POST /api/services/<service_name>/stop

Response:

{
  "success": true,
  "message": "codex stopped successfully"
}


Restart Service

POST /api/services/<service_name>/restart
Content-Type: application/json

{
  "mode": "dev"
}

Response:

{
  "success": true,
  "message": "codex restarted successfully",
  "pid": 12346,
  "port": 5010
}


Health Check

GET /health

Response:

{
  "status": "healthy",
  "timestamp": "2024-11-22T10:30:00Z",
  "service": "helm",
  "checks": {
    "database": {
      "status": "healthy",
      "message": "Connected to PostgreSQL"
    },
    "disk": {
      "status": "healthy",
      "usage_percent": 48.1,
      "available_gb": 120.5
    }
  }
}


Configuration

Database Configuration

File: instance/helm.conf

[database]
connection_string = postgresql://helm_user:password@localhost:5432/helm_db

Environment Variables

File: .flaskenv (auto-generated)

# Core Service
CORE_SERVICE_URL=http://localhost:5000

# Service Identity
SERVICE_NAME=helm

# Logging
LOG_LEVEL=INFO
ENABLE_JSON_LOGGING=true

# Flask Secret Key
SECRET_KEY=your-secret-key-here

Installation & Setup

Prerequisites

  1. PostgreSQL 12+ installed and running
  2. Python 3.8+ with pip
  3. Git for cloning repositories

Install Helm

Automated (Recommended):

cd /path/to/parent-directory
git clone https://github.com/skelhammer/hivematrix-helm
cd hivematrix-helm
./install.sh

Manual:

cd hivematrix-helm
python3 -m venv pyenv
source pyenv/bin/activate
pip install -r requirements.txt
python init_db.py

First-Time Setup

Run Initial Configuration:

cd hivematrix-helm
source pyenv/bin/activate
python init_db.py

Prompts for:

  • PostgreSQL host, port, database name
  • PostgreSQL username, password

Then the script:

  • Tests the connection
  • Creates all tables
  • Initializes master_config.json


Troubleshooting

Service Won't Start

Symptom: python cli.py start codex fails

Check:

  1. Port in use:

    ss -tlnp | grep 5010
    # If port is in use, find and stop the process

  2. Virtual environment:

    ls hivematrix-codex/pyenv
    # If missing, run install.sh again

  3. .flaskenv missing:

    ls hivematrix-codex/.flaskenv
    # If missing, run: python config_manager.py sync-all

  4. Database not configured:

    ls hivematrix-codex/instance/codex.conf
    # If missing, run: cd hivematrix-codex && python init_db.py


Logs Not Appearing

Symptom: No logs in Helm database

Check:

  1. Helm service running:

    python cli.py status helm

  2. Service can reach Helm:

    curl http://localhost:5004/health

  3. Service has helm_logger configured:

    # In service's app/__init__.py
    from app.helm_logger import init_helm_logger
    helm_logger = init_helm_logger(SERVICE_NAME, HELM_SERVICE_URL)

  4. Check logs table:

    psql -U postgres -d helm_db -c "SELECT COUNT(*) FROM log_entries;"


Configuration Sync Fails

Symptom: python config_manager.py sync-all errors

Check:

  1. master_config.json exists:

    ls instance/configs/master_config.json

  2. Service directories exist:

    ls ../hivematrix-*/

  3. Permissions:

    ls -la ../hivematrix-*/.flaskenv
    # Should be writable


See Also

Tools & Utilities

  • Service Management: cli.py start|stop|restart|status <service>
  • Log Viewer: logs_cli.py <service> --tail N
  • Config Manager: config_manager.py sync-all
  • Install Manager: install_manager.py install <service>
  • Security Audit: security_audit.py --audit

Questions or issues? Check the troubleshooting section or file an issue on GitHub.