Helm - Service Orchestration & Admin Dashboard¶
Port: 5004
Database: PostgreSQL
Repository: hivematrix-helm
Table of Contents¶
- Overview
- Architecture
- Database Schema
- Service Management
- Centralized Logging
- Configuration Management
- Installation Management
- Keycloak Management
- Security Auditing
- Backup & Recovery
- Web Dashboard
- API Reference
- Configuration
- Installation & Setup
- Troubleshooting
- See Also
Overview¶
Helm is the orchestration hub of HiveMatrix. It manages the platform lifecycle: service management, centralized logging, health monitoring, configuration distribution, and administrative controls for every HiveMatrix service.
In practice, Helm starts and stops services, aggregates their logs, monitors system health, and gives administrators a single dashboard for managing the platform.
Primary Responsibilities¶
- Service Orchestration - Start, stop, restart, and monitor all HiveMatrix services
- Centralized Logging - Aggregate logs from all services into PostgreSQL for querying and analysis
- Health Monitoring - Track service status, CPU usage, memory consumption, and uptime
- Configuration Management - Distribute system-wide configuration to all services
- Installation Management - Install, update, and configure HiveMatrix services
- Keycloak Administration - Manage users, groups, and permissions in Keycloak
- Security Auditing - Verify port bindings, generate firewall rules, audit system security
- Backup & Recovery - Automated database and configuration backups with disaster recovery
Key Features¶
✅ Service Lifecycle Management - Process control with PID tracking and health checks
✅ Centralized Log Aggregation - All service logs in one queryable PostgreSQL database
✅ Real-Time Monitoring - CPU, memory, uptime tracking for all services
✅ Admin Dashboard - Web UI for managing the entire platform
✅ Configuration Distribution - Master configuration synced to all services
✅ Automated Installation - One-command service installation from Git
✅ Keycloak Integration - User/group management via admin API
✅ Security Auditing - Automated security scans and firewall generation
✅ CLI Tools - Command-line interface for all operations
✅ Log File Watcher - Automatic log file ingestion with background monitoring
Architecture¶
System Architecture¶
┌──────────────────────────────────────────────────────────┐
│ Helm Service (5004) │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ Service │ │ Config │ │ Install │ │
│ │ Manager │ │ Manager │ │ Manager │ │
│ └────────────┘ └────────────┘ └────────────┘ │
│ │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ Log │ │ Health │ │ Keycloak │ │
│ │ Aggregator │ │ Monitor │ │ Manager │ │
│ └────────────┘ └────────────┘ └────────────┘ │
│ │
│ PostgreSQL Database: │
│ - LogEntry (centralized logs) │
│ - ServiceStatus (health tracking) │
│ - ServiceMetric (time-series metrics) │
└──────────────────────────────────────────────────────────┘
│
┌───────────────────────┼───────────────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Core │ │ Codex │ ... │ Nexus │
│ (5000) │ │ (5010) │ │ (443) │
└─────────┘ └─────────┘ └─────────┘
│ │ │
└───────────────────────┴───────────────────────┘
Logs sent to Helm /api/logs/ingest
Component Overview¶
Service Manager (app/service_manager.py)
- Process lifecycle control (start/stop/restart)
- PID tracking and process monitoring
- Health check execution
- CPU and memory metrics collection
- Service status database updates
Config Manager (config_manager.py)
- Master configuration (master_config.json)
- Per-service .flaskenv generation
- Configuration synchronization
- System-wide settings distribution
Install Manager (install_manager.py)
- Service installation from Git repositories
- Dependency checking (PostgreSQL, Neo4j, Java, etc.)
- services.json auto-generation
- Service discovery and registration
Log Aggregator (log_watcher.py, /api/logs/ingest)
- Centralized log collection endpoint
- Log file watching (background thread)
- PostgreSQL storage with indexing
- Query API for log retrieval
Health Monitor
- Periodic service health checks
- Resource usage tracking (CPU/memory)
- Uptime calculation
- Health status database updates
Keycloak Manager (app/api_routes.py)
- Admin token caching and refresh
- User CRUD operations
- Group membership management
- Password reset functionality
Database Schema¶
LogEntry¶
Immutable log entries from all HiveMatrix services.
Columns:
- id (BigInteger) - Auto-incrementing primary key
- timestamp (DateTime) - When log was created (indexed)
- service_name (String) - Service that generated log (indexed)
- level (String) - DEBUG, INFO, WARNING, ERROR, CRITICAL (indexed)
- message (Text) - Log message
- context (JSONB) - Additional structured data (optional)
- trace_id (String) - Request tracing ID (indexed)
- user_id (String) - User who triggered action (indexed, optional)
- hostname (String) - Server hostname (optional)
- process_id (Integer) - Process ID (optional)
Indexes:
- idx_log_service_timestamp on (service_name, timestamp)
- idx_log_level_timestamp on (level, timestamp)
- Individual indexes on timestamp, service_name, level, trace_id, user_id
Purpose: Centralized, queryable audit trail for all system activities.
Example:
{
"id": 12345,
"timestamp": "2024-11-22T10:30:00Z",
"service_name": "codex",
"level": "INFO",
"message": "Company synced from PSA",
"context": {
"account_number": "620547",
"company_name": "Acme Corporation"
},
"trace_id": "req-abc123",
"user_id": "admin@example.com",
"hostname": "hivematrix-server",
"process_id": 12345
}
ServiceStatus¶
Tracks real-time status and health of each service.
Columns:
- id (Integer) - Primary key
- service_name (String) - Service identifier (unique)
- status (String) - running, stopped, error, unknown
- pid (Integer) - Process ID (optional)
- port (Integer) - Service port (optional)
- started_at (DateTime) - When service started (optional)
- last_checked (DateTime) - Last health check timestamp
- health_status (String) - healthy, unhealthy, degraded (optional)
- health_message (Text) - Health check details (optional)
- cpu_percent (Float) - CPU usage percentage (optional)
- memory_mb (Float) - Memory usage in MB (optional)
Purpose: Real-time service monitoring and status tracking.
Example:
{
"service_name": "codex",
"status": "running",
"pid": 12345,
"port": 5010,
"started_at": "2024-11-22T08:00:00Z",
"last_checked": "2024-11-22T10:30:00Z",
"health_status": "healthy",
"health_message": "All checks passed",
"cpu_percent": 12.5,
"memory_mb": 256.3
}
ServiceMetric¶
Time-series metrics for performance monitoring.
Columns:
- id (BigInteger) - Auto-incrementing primary key
- service_name (String) - Service identifier (indexed)
- timestamp (DateTime) - Metric timestamp (indexed)
- metric_name (String) - Metric type (e.g., "cpu_percent", "memory_mb")
- metric_value (Float) - Metric value
- tags (JSONB) - Additional tags (optional)
Indexes:
- idx_metric_service_timestamp on (service_name, timestamp)
Purpose: Historical performance data for trending and analysis.
Example:
{
"service_name": "codex",
"timestamp": "2024-11-22T10:30:00Z",
"metric_name": "cpu_percent",
"metric_value": 15.3,
"tags": {
"host": "server1",
"environment": "production"
}
}
Service Management¶
Service Manager¶
File: app/service_manager.py
Key Functions:
Start Service¶
Process:
1. Resolves service directory path
2. Checks if already running (PID check)
3. Syncs master services.json to service directory
4. Activates virtual environment
5. Starts service in background (dev or prod mode)
6. Captures PID and updates database
7. Returns success/failure with PID and port
Development Mode:
Production Mode:
Stop Service¶
Process:
1. Finds service PID from database or process list
2. Sends SIGTERM for graceful shutdown
3. Waits up to 10 seconds for process to exit
4. Sends SIGKILL if still running
5. Updates database status to "stopped"
6. Removes PID file
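The graceful-then-forced shutdown in steps 2 to 4 can be sketched as follows. This is a minimal POSIX-only illustration, not the code in app/service_manager.py: the names `stop_process` and `_is_running`, the return strings, and the polling interval are assumptions. The database and PID-file steps are left out.

```python
import os
import signal
import time


def _is_running(pid: int) -> bool:
    """Return True if a process with this PID still exists."""
    try:
        # If the process is our own child, reap it so that a zombie is not
        # mistaken for a live process.
        reaped_pid, _ = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            return False
    except ChildProcessError:
        pass  # Not our child; fall back to the signal-0 probe below.
    try:
        os.kill(pid, 0)  # Signal 0 checks existence without delivering anything.
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True


def stop_process(pid: int, timeout: float = 10.0, poll: float = 0.1) -> str:
    """Send SIGTERM, wait up to `timeout` seconds, then escalate to SIGKILL."""
    if not _is_running(pid):
        return "not_running"
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "not_running"  # Exited between the check and the signal.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not _is_running(pid):
            return "terminated"
        time.sleep(poll)
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return "terminated"  # Exited just before the forced kill.
    return "killed"
```

A caller would map "terminated" and "killed" to the "stopped" status in ServiceStatus. Logging the "killed" case separately may help, since it suggests the service did not shut down cleanly.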
Restart Service¶
Process:
1. Stops service (if running)
2. Waits 2 seconds
3. Starts service with specified mode
4. Returns combined result
Get Service Status¶
Returns:
{
"service_name": "codex",
"status": "running",
"health": "healthy",
"pid": 12345,
"configured_port": 5010,
"started_at": "2024-11-22T08:00:00Z",
"uptime": "2h 30m",
"cpu_percent": 12.5,
"memory_mb": 256.3,
"health_message": "All checks passed"
}
Health Check:
- Calls service's /health endpoint
- Parses response for health status
- Updates ServiceStatus database record
- Collects CPU and memory metrics
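As a rough sketch of the "parse the response" step, the function below maps a /health reply to the `health_status` and `health_message` columns of ServiceStatus. It assumes services answer in the same shape as Helm's own /health response (a top-level `status` plus a `checks` object). The function name and the mapping rules (non-200 or non-JSON means unhealthy, a failing check under a healthy top-level status means degraded) are illustrative assumptions, not documented behaviour.

```python
import json
from typing import Tuple


def summarize_health(http_status: int, body: str) -> Tuple[str, str]:
    """Map a /health response to (health_status, health_message)."""
    if http_status != 200:
        return "unhealthy", f"HTTP {http_status} from /health"
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "unhealthy", "Response was not valid JSON"
    if not isinstance(payload, dict):
        return "unhealthy", "Response was not a JSON object"

    checks = payload.get("checks") or {}
    # Names of individual checks that did not report "healthy".
    failing = sorted(
        name for name, check in checks.items()
        if not isinstance(check, dict) or check.get("status") != "healthy"
    )
    if payload.get("status") != "healthy":
        return "unhealthy", f"Service reported status {payload.get('status')!r}"
    if failing:
        return "degraded", "Failing checks: " + ", ".join(failing)
    return "healthy", "All checks passed"
```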
Get All Service Statuses¶
Returns:
{
"core": {
"status": "running",
"health": "healthy",
"pid": 11111,
"configured_port": 5000
},
"codex": {
"status": "running",
"health": "healthy",
"pid": 12345,
"configured_port": 5010
}
}
CLI Usage¶
File: cli.py
Check Service Status:
cd hivematrix-helm
source pyenv/bin/activate
python cli.py status
# Output:
# ================================================================================
# HiveMatrix Service Status
# ================================================================================
#
# CORE
# Status: running
# Health: healthy
# Port: 5000
# PID: 11111
#
# CODEX
# Status: running
# Health: healthy
# Port: 5010
# PID: 12345
Start Service:
python cli.py start codex # Start in dev mode
python cli.py start codex --mode prod # Start in production mode
Stop Service:
python cli.py stop codex
Restart Service:
python cli.py restart codex
List Configured Services:
python cli.py list
# Output:
# Configured Services:
# ----------------------------------------
# core (type: python port: 5000)
# codex (type: python port: 5010)
# keycloak (type: external port: 8080)
Centralized Logging¶
Log Aggregation Architecture¶
All HiveMatrix services send logs to Helm's centralized logging endpoint:
# In any service (using helm_logger.py)
from app.helm_logger import init_helm_logger
helm_logger = init_helm_logger(
service_name='codex',
helm_service_url='http://localhost:5004'
)
helm_logger.info("Company synced", extra={'account': '620547'})
helm_logger.error("API call failed", extra={'url': '/api/data'}, exc_info=True)
What Happens:
1. Service logs message with structured data
2. helm_logger formats as JSON
3. Sends HTTP POST to /api/logs/ingest
4. Helm stores in PostgreSQL log_entries table
5. Log is immediately queryable via API or CLI
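Steps 1 to 3 can be pictured as a standard `logging.Handler` that turns each record into the ingest payload and POSTs it. The sketch below is not the real helm_logger.py: the class name `HelmHTTPHandler`, the injectable `post` callable, and the rule that every `extra=` key ends up in `context` are assumptions made for illustration.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Any, Callable, Dict
from urllib import request

# Attributes every LogRecord carries; anything else was supplied via `extra=`.
_STANDARD_ATTRS = set(vars(logging.makeLogRecord({}))) | {"message", "asctime"}


def _post_json(url: str, payload: Dict[str, Any]) -> None:
    """POST a JSON document using only the standard library."""
    data = json.dumps(payload, default=str).encode("utf-8")
    req = request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    with request.urlopen(req, timeout=2):
        pass


class HelmHTTPHandler(logging.Handler):
    """Hypothetical handler that forwards records to /api/logs/ingest."""

    def __init__(self, service_name: str, helm_service_url: str,
                 post: Callable[[str, Dict[str, Any]], None] = _post_json):
        super().__init__()
        self.service_name = service_name
        self.ingest_url = helm_service_url.rstrip("/") + "/api/logs/ingest"
        self._post = post

    def build_payload(self, record: logging.LogRecord) -> Dict[str, Any]:
        context = {k: v for k, v in vars(record).items() if k not in _STANDARD_ATTRS}
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        return {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "service_name": self.service_name,
            "level": record.levelname,
            "message": record.getMessage(),
            "context": context,
        }

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self._post(self.ingest_url, self.build_payload(record))
        except Exception:
            self.handleError(record)  # Never let logging crash the service.
```

Passing `post` as a parameter keeps the handler testable without a running Helm instance. A production logger would probably also queue or batch requests so that a slow Helm does not block the calling service.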
Log Ingestion Endpoint¶
POST /api/logs/ingest
Content-Type: application/json
{
"timestamp": "2024-11-22T10:30:00Z",
"service_name": "codex",
"level": "INFO",
"message": "Company synced from PSA",
"context": {
"account_number": "620547",
"company_name": "Acme Corporation"
},
"trace_id": "req-abc123",
"user_id": "admin@example.com"
}
Response:
Features:
- Async processing - Non-blocking log storage
- Bulk ingestion - Can send multiple logs in array
- Indexed storage - Fast queries on service, level, timestamp
- Context preservation - JSONB field for structured data
- Trace correlation - Link logs by trace_id
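Because the endpoint accepts either a single log object or an array, a server-side handler would likely normalize both shapes before storing anything. The sketch below shows one way to do that. The function name, the set of required fields, and the choice to upper-case the level are assumptions, since the source does not spell out the validation rules.

```python
from typing import Any, Dict, List

VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
REQUIRED_FIELDS = ("service_name", "level", "message")  # assumed minimum


def normalize_ingest_body(body: Any) -> List[Dict[str, Any]]:
    """Accept one log object or a list of them; return validated entries."""
    entries = body if isinstance(body, list) else [body]
    normalized = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index}: expected a JSON object")
        missing = [field for field in REQUIRED_FIELDS if not entry.get(field)]
        if missing:
            raise ValueError(f"entry {index}: missing {', '.join(missing)}")
        level = str(entry["level"]).upper()
        if level not in VALID_LEVELS:
            raise ValueError(f"entry {index}: unknown level {entry['level']!r}")
        # Copy the entry so the caller's data is not mutated.
        normalized.append({**entry, "level": level, "context": entry.get("context") or {}})
    return normalized
```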
Log File Watcher¶
File: log_watcher.py
Purpose: Automatically ingest logs from file-based logging (backup mechanism).
Process:
1. Background thread started on Helm startup
2. Watches logs/*.log files for changes
3. Reads new lines as they're written
4. Parses log format (JSON or text)
5. Inserts into PostgreSQL via bulk insert
6. Maintains file position to avoid re-reading
Benefits:
- Backup if HTTP logging fails
- Historical log file import
- Zero-configuration for services
- Resilient to network issues
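The core of the watcher, reading only what was appended since the last poll, can be sketched as a function that takes a byte offset and returns the new lines plus the updated offset. This is an illustration under assumptions rather than the code in log_watcher.py: the name `read_new_lines`, resetting to the start when the file shrinks, and holding back a half-written last line are choices made here.

```python
import os
from typing import List, Tuple


def read_new_lines(path: str, offset: int) -> Tuple[List[str], int]:
    """Return complete lines appended after `offset` and the new offset."""
    if os.path.getsize(path) < offset:
        offset = 0  # File was truncated or rotated; start over.
    with open(path, "rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Consume only up to the last newline so a partially written line is
    # picked up on the next poll instead of being split in two.
    end = chunk.rfind(b"\n")
    if end == -1:
        return [], offset
    complete = chunk[: end + 1]
    lines = complete.decode("utf-8", errors="replace").splitlines()
    return lines, offset + len(complete)
```

A background thread would call this for each file in logs/ on a short interval, parse each line, and bulk-insert the results, keeping one offset per file.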
Log Query API¶
Get Logs¶
GET /api/logs?service={service}&level={level}&limit={limit}&offset={offset}&start_time={iso8601}&end_time={iso8601}
Parameters:
- service (optional) - Filter by service name
- level (optional) - Filter by log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- limit (optional) - Number of results (default: 100, max: 500)
- offset (optional) - Pagination offset
- start_time (optional) - ISO 8601 timestamp (e.g., 2024-11-22T00:00:00Z)
- end_time (optional) - ISO 8601 timestamp
- trace_id (optional) - Filter by trace ID
- user_id (optional) - Filter by user ID
Response:
{
"logs": [
{
"id": 12345,
"timestamp": "2024-11-22T10:30:00Z",
"service_name": "codex",
"level": "INFO",
"message": "Company synced from PSA",
"context": {
"account_number": "620547"
},
"trace_id": "req-abc123",
"user_id": "admin@example.com"
}
],
"total": 1,
"limit": 100,
"offset": 0
}
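A small client-side helper makes the query parameters concrete. The sketch below builds the request URL from the documented parameters and clamps `limit` to the documented maximum of 500. The function name and the decision to clamp rather than reject out-of-range values are assumptions.

```python
from typing import Optional
from urllib.parse import urlencode

MAX_LIMIT = 500  # Documented maximum for `limit`.


def build_logs_url(base_url: str, service: Optional[str] = None,
                   level: Optional[str] = None, limit: int = 100, offset: int = 0,
                   start_time: Optional[str] = None, end_time: Optional[str] = None,
                   trace_id: Optional[str] = None, user_id: Optional[str] = None) -> str:
    """Build a GET /api/logs URL, omitting any filter that was not given."""
    params = {
        "service": service,
        "level": level.upper() if level else None,
        "limit": max(1, min(limit, MAX_LIMIT)),
        "offset": max(0, offset),
        "start_time": start_time,
        "end_time": end_time,
        "trace_id": trace_id,
        "user_id": user_id,
    }
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{base_url.rstrip('/')}/api/logs?{query}"
```

For example, `build_logs_url("http://localhost:5004", service="codex", level="error")` yields a URL that requests the 100 most recent ERROR entries for Codex.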
Log CLI Tool¶
File: logs_cli.py
View Recent Logs:
cd hivematrix-helm
source pyenv/bin/activate
# View last 50 logs from Codex
python logs_cli.py codex --tail 50
# View only ERROR logs
python logs_cli.py codex --level ERROR --tail 100
# View all services
python logs_cli.py --tail 30
# Real-time log monitoring
watch -n 2 'python logs_cli.py codex --tail 20'
Output:
2024-11-22 10:30:00 [INFO] codex: Company synced from PSA
Context: {"account_number": "620547", "company_name": "Acme Corporation"}
2024-11-22 10:30:15 [ERROR] codex: API call failed
Context: {"url": "/api/data", "error": "Connection timeout"}
Trace ID: req-abc123
User: admin@example.com
Features:
- Colored output (INFO=green, WARNING=yellow, ERROR=red)
- Context display for structured logs
- Trace ID and user tracking
- Tail-like behavior for recent logs
- Level filtering
Configuration Management¶
Master Configuration¶
File: instance/configs/master_config.json
Purpose: Single source of truth for system-wide configuration.
Structure:
{
"system": {
"environment": "production",
"log_level": "INFO",
"secret_key": "your-secret-key-here",
"hostname": "hivematrix.example.com"
},
"keycloak": {
"url": "http://localhost:8080",
"realm": "hivematrix",
"client_id": "core-client",
"client_secret": "your-keycloak-client-secret",
"admin_username": "admin",
"admin_password": "admin"
},
"databases": {
"postgresql": {
"host": "localhost",
"port": 5432,
"admin_user": "postgres",
"admin_password": "postgres-password"
},
"neo4j": {
"uri": "bolt://localhost:7687",
"user": "neo4j",
"password": "neo4j-password"
}
},
"apps": {
"core": {
"port": 5000,
"database": "core_db"
},
"codex": {
"port": 5010,
"database": "codex_db"
}
}
}
Config Manager¶
File: config_manager.py
Sync All Configurations:
What It Does:
1. Reads master_config.json
2. For each service in apps_registry.json:
- Generates service-specific .flaskenv
- Includes Keycloak URLs, Core URL, service name
- Includes database connection strings
- Includes system-wide settings
3. Writes .flaskenv to each service directory
4. Prints success/failure for each service
Generated .flaskenv Example:
# Auto-generated by config_manager.py
# DO NOT EDIT MANUALLY - Changes will be overwritten
# Core Service
CORE_SERVICE_URL=http://localhost:5000
# Helm Service
HELM_SERVICE_URL=http://localhost:5004
# Service Identity
SERVICE_NAME=codex
# Keycloak
KEYCLOAK_URL=http://localhost:8080
KEYCLOAK_REALM=hivematrix
KEYCLOAK_CLIENT_ID=core-client
KEYCLOAK_CLIENT_SECRET=your-client-secret
# System
LOG_LEVEL=INFO
ENABLE_JSON_LOGGING=true
SECRET_KEY=your-secret-key
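The generation step amounts to projecting parts of master_config.json into KEY=value lines. The sketch below illustrates this for the keys that can be derived from the master configuration structure shown earlier. The function name `render_flaskenv` is hypothetical. HELM_SERVICE_URL, ENABLE_JSON_LOGGING, and database connection strings are omitted because the sample master configuration does not show where those values come from.

```python
from typing import Any, Dict


def render_flaskenv(master: Dict[str, Any], service_name: str) -> str:
    """Render .flaskenv text for one service from the master configuration."""
    system = master["system"]
    keycloak = master["keycloak"]
    core_port = master["apps"]["core"]["port"]
    lines = [
        "# Auto-generated by config_manager.py",
        "# DO NOT EDIT MANUALLY - Changes will be overwritten",
        "# Core Service",
        f"CORE_SERVICE_URL=http://localhost:{core_port}",
        "# Service Identity",
        f"SERVICE_NAME={service_name}",
        "# Keycloak",
        f"KEYCLOAK_URL={keycloak['url']}",
        f"KEYCLOAK_REALM={keycloak['realm']}",
        f"KEYCLOAK_CLIENT_ID={keycloak['client_id']}",
        f"KEYCLOAK_CLIENT_SECRET={keycloak['client_secret']}",
        "# System",
        f"LOG_LEVEL={system['log_level']}",
        f"SECRET_KEY={system['secret_key']}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to `.flaskenv` in each service directory would complete the sync for that service.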
Services Registry¶
File: apps_registry.json
Purpose: Define all available HiveMatrix services and their properties.
Structure:
{
"core_apps": {
"keycloak": {
"name": "Keycloak Identity Provider",
"description": "OAuth2 authentication provider",
"type": "external",
"port": 8080,
"required": true,
"visible": false
},
"core": {
"name": "HiveMatrix Core",
"description": "Authentication and JWT management",
"git_url": "https://github.com/skelhammer/hivematrix-core",
"port": 5000,
"required": true,
"visible": false,
"install_order": 1
},
"nexus": {
"name": "HiveMatrix Nexus",
"description": "API Gateway and frontend proxy",
"git_url": "https://github.com/skelhammer/hivematrix-nexus",
"port": 443,
"required": true,
"visible": false,
"install_order": 2
},
"helm": {
"name": "HiveMatrix Helm",
"description": "Service orchestration and admin dashboard",
"git_url": "https://github.com/skelhammer/hivematrix-helm",
"port": 5004,
"required": true,
"admin_only": true,
"install_order": 0
}
},
"default_apps": {
"codex": {
"name": "HiveMatrix Codex",
"description": "Master data management",
"git_url": "https://github.com/skelhammer/hivematrix-codex",
"port": 5010,
"required": false,
"dependencies": ["core", "postgresql"],
"install_order": 3
},
"beacon": {
"name": "HiveMatrix Beacon",
"description": "Real-time ticket dashboard",
"git_url": "https://github.com/skelhammer/hivematrix-beacon",
"port": 5001,
"required": false,
"dependencies": ["core", "codex"],
"install_order": 4
}
},
"system_dependencies": {
"postgresql": {
"name": "PostgreSQL",
"description": "Relational database",
"required": true
},
"neo4j": {
"name": "Neo4j",
"description": "Graph database for KnowledgeTree",
"required": false
}
}
}
Properties:
- name - Display name
- description - Short description
- git_url - Git repository URL
- port - Service port
- required - Must be installed?
- visible - Show in navigation sidebar?
- admin_only - Restrict to admins?
- billing_or_admin_only - Restrict to billing/admin?
- dependencies - Required services/dependencies
- install_order - Installation sequence
Services JSON Generation¶
File: install_manager.py (method: update_config())
Command:
Generates Two Files:
1. services.json (Simplified for services to use)
{
"core": "http://localhost:5000",
"codex": "http://localhost:5010",
"helm": "http://localhost:5004"
}
2. master_services.json (Complete metadata for Helm/Nexus)
{
"core": {
"url": "http://localhost:5000",
"name": "HiveMatrix Core",
"port": 5000,
"visible": false,
"admin_only": false,
"billing_or_admin_only": false,
"display_order": 100
},
"codex": {
"url": "http://localhost:5010",
"name": "HiveMatrix Codex",
"port": 5010,
"visible": true,
"admin_only": false,
"billing_or_admin_only": false,
"display_order": 3
}
}
Display Order Logic:
- Services in SERVICE_ORDER list get assigned order (0-N)
- Services not in list get order 100+ (alphabetical)
- Controls sidebar display order in Nexus
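The ordering rule can be sketched in a few lines. The function name and the contents of the order list are placeholders, since the real SERVICE_ORDER list in install_manager.py is not shown here. Only the rule itself comes from the description above: listed services take their list index, and the rest are numbered from 100 in alphabetical order.

```python
from typing import Dict, Iterable, Sequence


def assign_display_order(service_names: Iterable[str],
                         service_order: Sequence[str]) -> Dict[str, int]:
    """Give listed services their list index; number the rest from 100, A to Z."""
    names = list(service_names)
    ranked = {name: index for index, name in enumerate(service_order)}
    order = {name: ranked[name] for name in names if name in ranked}
    unlisted = sorted(name for name in names if name not in ranked)
    for position, name in enumerate(unlisted):
        order[name] = 100 + position
    return order
```

With a placeholder list in which "codex" sits at index 3 and "core" is absent, this reproduces the sample values shown earlier: codex gets 3 and core gets 100.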
Installation Management¶
Install Manager¶
File: install_manager.py
Install a Service:
Process:
1. Checks if service is in apps_registry.json
2. Verifies dependencies (PostgreSQL, etc.)
3. Clones Git repository to parent directory
4. Creates hivematrix-{service} directory
5. Runs service's install.sh script:
- Creates Python virtual environment
- Installs requirements.txt
- Creates symlink to services.json
- Creates instance/ directory
6. Runs init_db.py for database setup (interactive)
7. Updates services.json and master_services.json
8. Syncs configuration via config_manager.py
Example Output:
Installing codex...
✓ Repository cloned
✓ Running install.sh
Creating virtual environment...
Installing dependencies...
✓ Installation complete
✓ Running init_db.py
PostgreSQL configuration:
Host: localhost
Port: 5432
Database: codex_db
Username: postgres
Password: [hidden]
✓ Database configured
✓ Tables created
✓ Configuration synced
✓ codex installed successfully
System Dependency Checks¶
Check Dependencies:
Checks:
- PostgreSQL (psql --version)
- Python 3.8+ (python --version)
- Git (git --version)
- Java (for Keycloak) (java -version)
- Keycloak (directory exists)
- Neo4j (neo4j version)
Output:
System Dependencies:
----------------------------------------
✓ postgresql
✓ python (3.10.12)
✓ git
✓ java
✓ keycloak
✗ neo4j (not installed)
Uninstall a Service¶
Command:
Process:
1. Stops service (if running)
2. Removes service directory
3. Updates services.json and master_services.json
4. Optionally drops database (prompts user)
⚠️ Warning: This does NOT back up data. Run a backup first!
Keycloak Management¶
Keycloak Admin API Integration¶
Helm provides a unified interface to Keycloak's admin API for user and group management.
Admin Token Caching:
# Helm caches admin token with auto-refresh
admin_token = get_keycloak_admin_token()
# Token cached for 5 minutes, auto-refreshed on expiry
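A cache with a five-minute lifetime and refresh on expiry could look roughly like the sketch below. It is not Helm's implementation: the class name `CachedToken`, the injectable `fetch` and `clock` callables, and the use of a fixed TTL (instead of the expiry Keycloak reports with the token) are assumptions.

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Cache a token for `ttl_seconds`, fetching a new one once it expires."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            # Missing or expired: call the (possibly slow) fetch function.
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

In Helm's case, `fetch` would wrap the request to Keycloak's token endpoint using the admin credentials from master_config.json. If requests are served from multiple threads, a lock around `get` would prevent duplicate fetches.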
User Management API¶
List Users¶
Response:
[
{
"id": "user-uuid-123",
"username": "john.doe",
"email": "john.doe@example.com",
"firstName": "John",
"lastName": "Doe",
"enabled": true,
"emailVerified": true,
"createdTimestamp": 1700000000000
}
]
Create User¶
POST /api/keycloak/users
Content-Type: application/json
{
"username": "jane.smith",
"email": "jane.smith@example.com",
"firstName": "Jane",
"lastName": "Smith",
"enabled": true,
"emailVerified": true,
"credentials": [
{
"type": "password",
"value": "temp-password-123",
"temporary": true
}
]
}
Response:
Update User¶
PUT /api/keycloak/users/<user_id>
Content-Type: application/json
{
"firstName": "Jane",
"lastName": "Doe-Smith",
"email": "jane.doesmith@example.com"
}
Delete User¶
Reset Password¶
POST /api/keycloak/users/<user_id>/password
Content-Type: application/json
{
"password": "new-password-456",
"temporary": true
}
Response:
Group Management API¶
List Groups¶
Response:
[
{
"id": "group-uuid-123",
"name": "admins",
"path": "/admins"
},
{
"id": "group-uuid-456",
"name": "technicians",
"path": "/technicians"
}
]
Get User's Groups¶
Response:
Update User's Groups¶
POST /api/keycloak/users/<user_id>/groups
Content-Type: application/json
{
"groups": ["admins", "technicians"]
}
Process:
1. Removes user from all current groups
2. Adds user to specified groups
3. Returns success/failure
Response:
Security Auditing¶
Security Audit Script¶
File: security_audit.py
Run Security Audit:
Checks:
1. Port Bindings
- Verifies services bind to 127.0.0.1 (localhost only)
- Flags services bound to 0.0.0.0 (public access)
- Exception: Nexus (443) should bind to 0.0.0.0
2. File Permissions
- Checks sensitive files are not world-readable
- Verifies instance/*.conf permissions
- Checks SSH keys if present
3. Configuration Security
- Checks for default passwords
- Verifies secret keys are random
- Checks for hardcoded credentials
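Two of these checks are easy to illustrate with the standard library. The sketch below classifies a bind address and tests whether a file is world-readable. The function names are hypothetical and this is not the code in security_audit.py. How the audit discovers the bound address of each service is out of scope here.

```python
import os
import stat

PUBLIC_SERVICES = {"nexus"}  # Documented exception: the gateway is public.


def binding_is_acceptable(service: str, address: str) -> bool:
    """Internal services must bind to 127.0.0.1; the gateway may be public."""
    if service in PUBLIC_SERVICES:
        return True
    return address == "127.0.0.1"


def is_world_readable(path: str) -> bool:
    """Return True if the 'other' read bit is set on the file (POSIX)."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)
```

A file with mode 644 is reported as world-readable, while mode 600 passes, which matches the sample report's treatment of instance/codex.conf and instance/core.conf.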
Output:
================================================================================
HiveMatrix Security Audit Report
================================================================================
Port Binding Check:
✓ core (5000) - Bound to 127.0.0.1 (localhost only)
✓ codex (5010) - Bound to 127.0.0.1 (localhost only)
✓ nexus (443) - Bound to 0.0.0.0 (public - expected for gateway)
✗ keycloak (8080) - Bound to 0.0.0.0 (should be localhost only)
File Permissions:
✓ instance/core.conf - Permissions OK (600)
✗ instance/codex.conf - World-readable (644) - should be 600
Configuration Security:
✓ Secret keys are randomized
✗ Keycloak admin password is default - CHANGE IMMEDIATELY
Overall: 3 issues found
Generate Firewall Rules¶
Command:
Generates: secure_firewall.sh
Generated Script:
#!/bin/bash
# HiveMatrix Firewall Configuration
# Auto-generated by security_audit.py
# Allow SSH
ufw allow 22/tcp
# Allow HTTPS (Nexus only public-facing service)
ufw allow 443/tcp
# Block all service ports from external access
ufw deny 5000/tcp # Core
ufw deny 5001/tcp # Beacon
ufw deny 5004/tcp # Helm
ufw deny 5010/tcp # Codex
ufw deny 8080/tcp # Keycloak
# Allow localhost to access everything
ufw allow from 127.0.0.1
# Enable firewall
ufw --force enable
ufw status
Apply Firewall:
Backup & Recovery¶
Backup Script¶
File: backup.py
Create Backup:
What Gets Backed Up:
1. All PostgreSQL Databases
- Dumps each database to individual SQL file
- Format: {service}_db-{timestamp}.sql
2. Configuration Files
- master_config.json
- apps_registry.json
- services.json
- All instance/*.conf files
3. Service-Specific Data
- Neo4j dumps (KnowledgeTree)
- File uploads (if any)
Output:
Creating backup...
✓ Backed up core_db → backups/core_db-20241122-103000.sql
✓ Backed up codex_db → backups/codex_db-20241122-103000.sql
✓ Backed up helm_db → backups/helm_db-20241122-103000.sql
✓ Backed up master_config.json
✓ Backed up services.json
✓ Backup complete: backups/hivematrix-backup-20241122-103000.tar.gz
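The file names in this output follow a `YYYYMMDD-HHMMSS` timestamp pattern. A minimal sketch of the naming and archiving steps is shown below, assuming the SQL dumps and configuration files have already been written to disk. The function names are hypothetical, and the database dump itself (for example via pg_dump) is not shown.

```python
import os
import tarfile
from datetime import datetime
from typing import Iterable


def backup_stamp(now: datetime) -> str:
    """Timestamp used in backup file names, e.g. 20241122-103000."""
    return now.strftime("%Y%m%d-%H%M%S")


def dump_filename(database: str, now: datetime) -> str:
    """Name of the SQL dump for one database, e.g. core_db-20241122-103000.sql."""
    return f"{database}-{backup_stamp(now)}.sql"


def create_archive(files: Iterable[str], backup_dir: str, now: datetime) -> str:
    """Bundle already-written files into one gzip-compressed tar archive."""
    archive_path = os.path.join(backup_dir, f"hivematrix-backup-{backup_stamp(now)}.tar.gz")
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in files:
            # Store files flat so a restore does not depend on absolute paths.
            archive.add(path, arcname=os.path.basename(path))
    return archive_path
```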
Scheduled Backups:
# Add to crontab
crontab -e
# Backup daily at 2 AM
0 2 * * * cd /path/to/hivematrix-helm && source pyenv/bin/activate && python backup.py >> logs/backup.log 2>&1
Restore Script¶
File: restore.py
Restore from Backup:
cd hivematrix-helm
source pyenv/bin/activate
python restore.py backups/hivematrix-backup-20241122-103000.tar.gz
Process:
1. Stops all services
2. Extracts backup archive
3. Restores each database:
- Drops existing database
- Creates new database
- Restores from SQL dump
4. Restores configuration files
5. Starts all services
6. Verifies restoration
⚠️ Warning: This is a destructive operation. Current data will be lost.
Confirmation Prompt:
WARNING: This will replace all current data!
Current databases will be dropped and recreated.
Backup file: backups/hivematrix-backup-20241122-103000.tar.gz
Created: 2024-11-22 10:30:00
Continue? (yes/no):
Web Dashboard¶
Main Dashboard¶
URL: / (requires authentication)
Features:
- Service Status Grid - All services with status, health, uptime
- Quick Actions - Start/stop/restart buttons
- Log Statistics - Recent log counts by level
- CPU & Memory Graphs - Resource usage trends
- Health Alerts - Warnings for unhealthy services
Service Status Card:
┌─────────────────────────────────┐
│ Codex │
│ │
│ Status: ● Running │
│ Health: ✓ Healthy │
│ Uptime: 2h 30m │
│ CPU: 12.5% Memory: 256 MB │
│ │
│ Logs (last hour): │
│ INFO: 145 WARNING: 2 ERROR: 0│
│ │
│ [Stop] [Restart] [View Logs] │
└─────────────────────────────────┘
Service Detail View¶
URL: /service/<service_name>
Sections:
1. Service Information
- Name, description, port
- Current status, health, PID
- Started time, uptime
- Resource usage (CPU, memory)
2. Health Check Details
- Last check timestamp
- Health check response
- Dependency status
- Disk space, database connection
3. Recent Logs
- Last 50 log entries
- Color-coded by level
- Expandable for full context
- Link to full log viewer
4. Actions
- Start / Stop / Restart buttons
- View full logs
- Download logs (CSV/JSON)
- Access service dashboard (link)
Log Viewer¶
URL: /logs
Features:
- Service Filter - Dropdown to select specific service
- Level Filter - DEBUG, INFO, WARNING, ERROR, CRITICAL
- Time Range - Last hour, 24 hours, 7 days, custom
- Search - Full-text search in message and context
- Pagination - Navigate through results
- Export - Download logs as CSV or JSON
Log Entry Display:
2024-11-22 10:30:00 [INFO] codex
Company synced from PSA
Context: {
"account_number": "620547",
"company_name": "Acme Corporation"
}
Trace ID: req-abc123
User: admin@example.com
Settings Page¶
URL: /settings (admin only)
Sections:
1. System Configuration
- Environment (dev/prod)
- Log level
- Secret key rotation
- Hostname
2. Keycloak Configuration
- Keycloak URL
- Realm name
- Client credentials
- Admin credentials
- Test connection button
3. Database Configuration
- PostgreSQL host, port, credentials
- Neo4j URI, credentials
- Test connections button
4. Service Management
- Install new service
- Uninstall service
- Update service
- Sync configurations
5. Security
- Run security audit
- Generate firewall rules
- View/rotate secret keys
- Password policies
6. Backup & Restore
- Create backup now
- Schedule automated backups
- List available backups
- Restore from backup
API Reference¶
Service Management API¶
Get All Service Statuses¶
Response:
{
"services": {
"core": {
"status": "running",
"health": "healthy",
"pid": 11111,
"port": 5000,
"started_at": "2024-11-22T08:00:00Z",
"cpu_percent": 5.2,
"memory_mb": 128.5
},
"codex": {
"status": "running",
"health": "healthy",
"pid": 12345,
"port": 5010,
"started_at": "2024-11-22T08:00:30Z",
"cpu_percent": 12.5,
"memory_mb": 256.3
}
},
"system": {
"cpu_percent": 45.2,
"memory_percent": 62.3,
"disk_percent": 48.1
}
}
Start Service¶
Response:
Stop Service¶
Response:
Restart Service¶
Response:
Health Check¶
Response:
{
"status": "healthy",
"timestamp": "2024-11-22T10:30:00Z",
"service": "helm",
"checks": {
"database": {
"status": "healthy",
"message": "Connected to PostgreSQL"
},
"disk": {
"status": "healthy",
"usage_percent": 48.1,
"available_gb": 120.5
}
}
}
Configuration¶
Database Configuration¶
File: instance/helm.conf
Environment Variables¶
File: .flaskenv (auto-generated)
# Core Service
CORE_SERVICE_URL=http://localhost:5000
# Service Identity
SERVICE_NAME=helm
# Logging
LOG_LEVEL=INFO
ENABLE_JSON_LOGGING=true
# Flask Secret Key
SECRET_KEY=your-secret-key-here
Installation & Setup¶
Prerequisites¶
- PostgreSQL 12+ installed and running
- Python 3.8+ with pip
- Git for cloning repositories
Install Helm¶
Automated (Recommended):
cd /path/to/parent-directory
git clone https://github.com/skelhammer/hivematrix-helm
cd hivematrix-helm
./install.sh
Manual:
cd hivematrix-helm
python3 -m venv pyenv
source pyenv/bin/activate
pip install -r requirements.txt
python init_db.py
First-Time Setup¶
Run Initial Configuration:
Prompts for:
- PostgreSQL host, port, database name
- PostgreSQL username, password
After the prompts, the script:
- Tests the connection
- Creates all tables
- Initializes master_config.json
Troubleshooting¶
Service Won't Start¶
Symptom: python cli.py start codex fails
Check:
1. Port in use:
2. Virtual environment:
3. .flaskenv missing:
4. Database not configured:
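For the port-in-use check, a quick probe from Python is one option when no other tooling is at hand. The helper below is a sketch with a hypothetical name. It only reports whether something accepts TCP connections on the port, not which process owns it.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use(5010)` returns True while `python cli.py status` lists Codex as stopped, another process is probably holding the port.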
Logs Not Appearing¶
Symptom: No logs in Helm database
Check:
1. Helm service running:
2. Service can reach Helm:
3. Service has helm_logger configured:
4. Check logs table:
Configuration Sync Fails¶
Symptom: python config_manager.py sync-all errors
Check:
1. master_config.json exists:
2. Service directories exist:
3. Permissions:
See Also¶
Related Services¶
- Core - Authentication - JWT authentication and session management
- Nexus - Gateway - Frontend proxy and routing
- All Services - Complete service inventory
Architecture & Design¶
- Service Orchestration Pattern
- Centralized Logging Architecture
- Configuration Management
- Monitoring & Health Checks
Configuration & Setup¶
- Installation Guide - Complete installation walkthrough
- Configuration Guide - Configuration management details
- Security Guide - Security auditing and firewall setup
Tools & Utilities¶
- Service Management: cli.py start|stop|restart|status <service>
- Log Viewer: logs_cli.py <service> --tail N
- Config Manager: config_manager.py sync-all
- Install Manager: install_manager.py install <service>
- Security Audit: security_audit.py --audit
Questions or issues? Check the troubleshooting section or file an issue on GitHub.