HiveMatrix Architecture - Development & Operations

Version 4.2 Part 2 of 3: Development Practices & Operational Tools

This document covers development workflows, security architecture, debugging tools, and operational best practices for HiveMatrix. Read Architecture - Core first to understand the foundational patterns.

Other Parts:

  • Architecture - Core - Core concepts, authentication, service communication
  • Architecture - Services - Service-specific deep dives (Brainhair, Ledger, KnowledgeTree)


8. Running the Development Environment

HiveMatrix provides a unified startup script that handles installation, configuration, and service orchestration.

Quick Start

From the hivematrix-helm directory:

./start.sh

Note: This script requires the jq command-line JSON processor to be installed.

This script will:

  1. Check and install system dependencies (Python, Git, Java, PostgreSQL, jq)
  2. Download and set up Keycloak
  3. Clone and install Core and Nexus if not present
  4. Set up databases
  5. Auto-detect and register all services (via install_manager.py update-config)
  6. Automatically configure the Keycloak realm and users (if needed)
  7. Sync configurations to all apps (via config_manager.py)
  8. Start all services (Keycloak, Core, Nexus, and any additional installed apps)
  9. Launch the Helm web interface on port 5004

Keycloak Auto-Configuration

HiveMatrix includes intelligent Keycloak setup automation that ensures proper synchronization between Keycloak and the system configuration.

Automatic Configuration Detection:

The startup script (start.sh) automatically detects when Keycloak needs to be configured by checking:

  1. Fresh Keycloak Installation: If Keycloak was just downloaded (directory didn't exist)
  2. Missing Configuration: If client_secret is missing from master_config.json
  3. Configuration Sync: If master_config.json is missing but Keycloak exists, it removes Keycloak to force reinstallation
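That detection can be sketched in shell; the function name, paths, and exact checks here are illustrative, not the actual contents of start.sh:

```shell
#!/bin/bash
# Hypothetical sketch of the detection logic in start.sh -- names and
# paths are illustrative, not the real script contents.
KEYCLOAK_DIR="../keycloak-26.4.0"
MASTER_CONFIG="instance/configs/master_config.json"

needs_keycloak_config() {
    # Fresh install: the Keycloak directory doesn't exist yet
    if [ ! -d "$KEYCLOAK_DIR" ]; then
        return 0
    fi
    # Missing configuration: no client_secret stored in master_config.json
    if ! jq -e '.keycloak.client_secret' "$MASTER_CONFIG" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

if needs_keycloak_config; then
    echo "Keycloak configuration required"
fi
```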

What Gets Configured:

When Keycloak configuration runs (configure_keycloak.sh), it creates:

  • Realm: hivematrix realm with proper frontend URL settings
  • Client: core-client with OAuth2 authorization code flow
  • Admin User: Default admin user (admin/admin)
  • Permission Groups:
      • admins - Full system access
      • technicians - Technical operations
      • billing - Financial operations
      • client - Limited access (default for new users)
  • Group Mapper: OIDC mapper to include group membership in JWT tokens

Configuration Synchronization:

The system maintains synchronization between Keycloak and master_config.json:

# If Keycloak is reinstalled (directory deleted)
1. start.sh detects Keycloak is missing
2. Downloads and extracts Keycloak
3. Clears old keycloak section from master_config.json
4. Runs configure_keycloak.sh to set up realm and users
5. Saves new client_secret to master_config.json

# If master_config.json is deleted but Keycloak exists
1. start.sh detects config is missing
2. Removes Keycloak directory to force clean state
3. Re-downloads Keycloak
4. Runs full configuration

Manual Reconfiguration:

To force Keycloak reconfiguration:

# Delete Keycloak directory - will trigger full reinstall and config
rm -rf ../keycloak-26.4.0
./start.sh

# Or delete master config - will force resync
rm instance/configs/master_config.json
./start.sh

Keycloak Configuration Files:

  • Master Config: hivematrix-helm/instance/configs/master_config.json - Stores client_secret and URLs
  • Service Configs: Each service's .flaskenv gets Keycloak settings from master config
  • Keycloak Config: ../keycloak-26.4.0/conf/keycloak.conf - Auto-configured for proxy mode with hostname

The configuration process is idempotent - running it multiple times is safe and updates existing configurations rather than creating duplicates.

Development Mode

For development with Flask's auto-reload:

./start.sh --dev

This uses Flask's development server instead of Gunicorn.

Manual Service Management

You can also manage services individually using the Helm CLI:

cd hivematrix-helm
source pyenv/bin/activate

# Start individual services
python cli.py start keycloak
python cli.py start core
python cli.py start nexus
python cli.py start codex
python cli.py start ledger

# Check service status
python cli.py status

# Stop services
python cli.py stop ledger
python cli.py stop codex
python cli.py stop nexus
python cli.py stop core
python cli.py stop keycloak

# Restart a service
python cli.py restart core

Access Points

After running ./start.sh, access the platform at:

  • HiveMatrix: https://localhost:443 (or http://localhost:8000 if port 443 binding failed)
  • Helm Dashboard: http://localhost:5004
  • Keycloak Admin: http://localhost:8080
  • Core Service: http://localhost:5000

Default credentials:

  • Username: admin
  • Password: admin

Important: Change the default password in Keycloak admin console after first login.

Installing Additional Services

Via the Helm web interface (http://localhost:5004):

  1. Navigate to the "Apps" or "Services" section
  2. Click "Install" next to the desired service
  3. Wait for the installation to complete
  4. The service will start automatically

Via command line:

cd hivematrix-helm
source pyenv/bin/activate
python install_manager.py install codex
python cli.py start codex

Configuration Updates

After modifying Helm's master configuration, sync to all apps:

cd hivematrix-helm
source pyenv/bin/activate
python config_manager.py sync-all

Or restart the platform with ./start.sh which automatically syncs configs.

9. Backup & Restore System

HiveMatrix includes comprehensive backup and restore tools that export and import all platform data. These tools run standalone with sudo access and are not integrated with the web interface.

Overview

The backup system creates timestamped ZIP archives containing:

  • PostgreSQL databases - All application databases and global objects (roles, tablespaces)
  • Neo4j databases - Graph database used by KnowledgeTree
  • Keycloak directory - Authentication server data, configuration, and themes

Backups default to /tmp and are owned by the user who ran sudo (not root).

Architecture & Technical Details

PostgreSQL Backup

  • Uses sudo -u postgres pg_dump for peer authentication (no passwords needed)
  • Dumps each database to individual .sql files
  • Backs up global objects with pg_dumpall -g
  • Discovers databases by scanning service configs in instance/*.conf
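A minimal sketch of that discovery step, assuming each instance/*.conf keeps its database name under a [database] section with a db_name key (the real section and key names may differ):

```python
import configparser
import glob
import os

def discover_databases(instance_dir="instance"):
    """Scan service config files for database names (hypothetical key layout)."""
    databases = []
    for conf_path in glob.glob(os.path.join(instance_dir, "*.conf")):
        config = configparser.RawConfigParser()
        config.read(conf_path)
        # Assumed layout: [database] section with a db_name key
        if config.has_option("database", "db_name"):
            databases.append(config.get("database", "db_name"))
    return sorted(set(databases))
```

Each discovered name would then be handed to sudo -u postgres pg_dump.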

Neo4j Backup

  • Uses sudo neo4j-admin database dump for proper backup
  • Requires Neo4j to be stopped before backup
  • Creates .dump files that preserve all graph data
  • Cannot use directory copy - must use neo4j-admin tools

Keycloak Backup

  • Uses tar to create compressed archive of entire Keycloak directory
  • Preserves all permissions, ownership, and executable bits
  • Excludes log files and temporary directories
  • Includes themes, configuration, and data

Usage

Creating a Backup

cd hivematrix-helm
sudo python3 backup.py [output_file]

Important:

  1. Stop Neo4j before backup: sudo systemctl stop neo4j
  2. Run with sudo for database access
  3. The output file is optional - defaults to /tmp/hivematrix_backup_YYYY-MM-DD_HH-MM-SS.zip
  4. The final zip file will be owned by the user who ran sudo

Example:

sudo systemctl stop neo4j
sudo python3 backup.py
# Creates: /tmp/hivematrix_backup_2025-10-29_14-30-00.zip
sudo systemctl start neo4j

Restoring from Backup

cd hivematrix-helm
./stop.sh  # Stop all HiveMatrix services
sudo systemctl stop neo4j
sudo python3 restore.py /tmp/hivematrix_backup_2025-10-29_14-30-00.zip

Important:

  1. Stop all services first - use ./stop.sh to stop HiveMatrix services
  2. Stop Neo4j - sudo systemctl stop neo4j
  3. Run with sudo for database access
  4. Restore will prompt for confirmation before overwriting data
  5. Start services after restore: ./start.sh and sudo systemctl start neo4j

Warning: Restore will OVERWRITE existing data. Make sure you have a current backup before restoring.

Backup Flow

1. User runs: sudo python3 backup.py
2. Script creates temp directory: /tmp/hivematrix_backup_XXXXXX/
3. PostgreSQL Backup:
   - Scan instance/*.conf for database names
   - For each database: sudo -u postgres pg_dump > database.sql
   - Backup globals: sudo -u postgres pg_dumpall -g > globals.sql
4. Neo4j Backup:
   - Check if any service uses Neo4j
   - For each database: sudo neo4j-admin database dump neo4j --to-path=...
5. Keycloak Backup:
   - Find Keycloak directory (../keycloak-*)
   - Create tar archive: tar --exclude=*.log -czf keycloak.tar.gz
6. Create ZIP Archive:
   - Zip temp directory contents
   - Move to /tmp with timestamp
   - Change ownership to SUDO_USER
7. Cleanup temp directory
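The flow above can be sketched as command construction in Python (the database list, paths, and exact flags are illustrative; the real backup.py differs):

```python
import datetime

def build_backup_commands(databases, temp_dir="/tmp/hivematrix_backup_XXXXXX"):
    """Build the shell commands the backup flow would run (sketch only)."""
    commands = []
    for db in databases:
        # One pg_dump per discovered database
        commands.append(
            f"sudo -u postgres pg_dump {db} > {temp_dir}/postgresql/{db}.sql")
    # Global objects (roles, tablespaces) in a single dump
    commands.append(
        f"sudo -u postgres pg_dumpall -g > {temp_dir}/postgresql/globals.sql")
    # Neo4j must be stopped before this runs
    commands.append(
        f"sudo neo4j-admin database dump neo4j --to-path={temp_dir}/neo4j")
    # Keycloak directory archived with permissions preserved
    commands.append(
        f"tar --exclude=*.log -czf {temp_dir}/keycloak/keycloak.tar.gz ../keycloak-*")
    return commands

# Timestamped final archive name, matching the default naming scheme
stamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
archive = f"/tmp/hivematrix_backup_{stamp}.zip"
```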

Restore Flow

1. User runs: sudo python3 restore.py backup.zip
2. Extract ZIP to temp directory: /tmp/hivematrix_restore_XXXXXX/
3. Set permissions (755 dirs, 644 files) for postgres user access
4. PostgreSQL Restore:
   - Restore globals first: sudo -u postgres psql < globals.sql
   - For each database:
     - Drop if exists: sudo -u postgres dropdb database
     - Create: sudo -u postgres createdb database
     - Restore: sudo -u postgres psql -d database < database.sql
5. Neo4j Restore:
   - For each .dump file:
     - sudo neo4j-admin database load neo4j --from-path=... --overwrite-destination=true
6. Keycloak Restore:
   - Extract tar archive to ../keycloak-*/
   - Fix ownership: chown -R user:group
   - Fix permissions: chmod +x bin/*.sh
7. Cleanup temp directory

Best Practices

  1. Regular Backups - Schedule daily backups with cron
  2. Stop Neo4j - Always stop Neo4j before backup/restore
  3. Test Restores - Periodically test restore process on development system
  4. Off-site Storage - Copy backups to remote location
  5. Backup Before Updates - Always backup before updating HiveMatrix or dependencies
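For example, a nightly backup could be scheduled with a root crontab entry like this (the install path is illustrative; note that Neo4j must be stopped around the dump):

```
# Hypothetical root crontab entry: stop Neo4j, back up, restart Neo4j at 02:00
0 2 * * * systemctl stop neo4j && cd /opt/hivematrix/hivematrix-helm && python3 backup.py && systemctl start neo4j
```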

Troubleshooting

"Neo4j backup failed" - Make sure Neo4j is stopped: sudo systemctl stop neo4j - Check Neo4j is installed: which neo4j-admin

"Permission denied" on PostgreSQL - Make sure running with sudo - Check postgres user exists: id postgres

"Keycloak won't start after restore" - Check ownership: ls -la ../keycloak-*/ - Check bin scripts are executable: ls -la ../keycloak-*/bin/ - Fix manually if needed: sudo chown -R $USER:$USER ../keycloak-*/ && chmod +x ../keycloak-*/bin/*.sh

"Data didn't come back after restore" - For Neo4j: Make sure you stopped Neo4j before backup AND before restore - For PostgreSQL: Check the backup.zip contains .sql files in postgresql/ directory - For Keycloak: Check the backup.zip contains keycloak/keycloak.tar.gz

10. Development & Debugging Tools

HiveMatrix includes CLI tools in the hivematrix-helm repository to streamline development and debugging workflows. These tools eliminate the need to manually navigate web interfaces or query databases during development.

Centralized Logging System

All HiveMatrix services send logs to Helm's PostgreSQL database for centralized storage and analysis. This allows viewing logs from all services in one place.

logs_cli.py - View Service Logs

Quick command-line access to centralized logs from any service.

Usage:

cd hivematrix-helm
source pyenv/bin/activate

# View recent logs from a specific service
python logs_cli.py knowledgetree --tail 50

# Filter by log level
python logs_cli.py core --level ERROR --tail 100

# View all services
python logs_cli.py --tail 30

Features:

  • Color-coded output by log level (ERROR=red, WARNING=yellow, INFO=green, DEBUG=blue)
  • Filters by service name and log level
  • Configurable tail count
  • Reads directly from Helm's PostgreSQL database

Implementation:

  • Location: hivematrix-helm/logs_cli.py
  • Database: Reads from the log_entries table in Helm's PostgreSQL database
  • Config: Uses instance/helm.conf for the database connection

create_test_token.py - Generate JWT Tokens

Creates valid JWT tokens for testing authenticated endpoints without browser login.

Usage:

cd hivematrix-helm
source pyenv/bin/activate

# Generate a test token
TOKEN=$(python create_test_token.py 2>/dev/null)

# Use token to test an endpoint
curl -H "Authorization: Bearer $TOKEN" http://127.0.0.1:5020/knowledgetree/browse/

Features:

  • Generates tokens signed with Core's RSA private key
  • Creates admin-level user tokens with 24-hour expiration
  • Includes proper JWT headers (kid, alg) matching Core's JWKS
  • No server interaction required - works offline

Token Payload:

{
  "sub": "admin",
  "username": "admin",
  "preferred_username": "admin",
  "email": "admin@hivematrix.local",
  "permission_level": "admin",
  "iss": "hivematrix-core",
  "groups": ["admin"],
  "exp": "<24_hours_from_now>"
}

Implementation:

  • Location: hivematrix-helm/create_test_token.py
  • Requires: Core's private key at ../hivematrix-core/keys/jwt_private.pem
  • Output: Raw JWT token to stdout
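To sanity-check a generated token, its payload can be base64url-decoded with the standard library alone (inspection only - this does not verify the RSA signature):

```python
import base64
import json

def jwt_claims(token):
    """Decode a JWT's payload segment without verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWTs strip base64 padding; restore it before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

For a token from create_test_token.py, jwt_claims(token)["permission_level"] should come back as "admin".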

test_with_token.sh - Quick Endpoint Testing

Convenience wrapper that generates a token and tests an endpoint in one command.

Usage:

cd hivematrix-helm

# Test KnowledgeTree browse endpoint
./test_with_token.sh

# Or modify to test any endpoint:
# Edit test_with_token.sh and change the curl URL

Script Contents:

#!/bin/bash
cd ~/hivematrix/hivematrix-helm
source pyenv/bin/activate
TOKEN=$(python create_test_token.py 2>/dev/null)

echo "Testing KnowledgeTree /browse with auth token..."
curl -s -H "Authorization: Bearer $TOKEN" http://127.0.0.1:5020/knowledgetree/browse/

Development Workflow

Debugging a Service Error:

  1. Check Recent Logs:

    cd hivematrix-helm
    source pyenv/bin/activate
    python logs_cli.py myservice --tail 50
    

  2. Test Authenticated Endpoint:

    TOKEN=$(python create_test_token.py 2>/dev/null)
    curl -s -H "Authorization: Bearer $TOKEN" http://127.0.0.1:5010/myservice/api/data | jq
    

  3. Monitor Logs During Testing:

    # Terminal 1: Watch logs
    watch -n 2 'python logs_cli.py myservice --tail 20'
    
    # Terminal 2: Make test requests
    ./test_with_token.sh
    

Testing Service-to-Service Communication:

# Generate token and test from calling service
TOKEN=$(python create_test_token.py 2>/dev/null)

# Simulate service call to target service
curl -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     http://127.0.0.1:5010/codex/api/companies

Adding Logging to Your Service

All services should use the Helm logger for centralized logging:

In your service's app/__init__.py:

from app.helm_logger import init_helm_logger

# Initialize logger
app.config["SERVICE_NAME"] = os.environ.get("SERVICE_NAME", "myservice")
app.config["HELM_SERVICE_URL"] = os.environ.get("HELM_SERVICE_URL", "http://localhost:5004")

helm_logger = init_helm_logger(
    app.config["SERVICE_NAME"],
    app.config["HELM_SERVICE_URL"]
)

# Log service startup
helm_logger.info(f"{app.config['SERVICE_NAME']} service started")

In your routes:

from app import helm_logger

@app.route('/api/data')
@token_required
def api_data():
    try:
        helm_logger.info("Fetching data", extra={'user': g.user.get('username')})
        # ... your code ...
        return jsonify({'data': result})
    except Exception as e:
        helm_logger.error(f"Failed to fetch data: {e}", exc_info=True)
        return {'error': 'Internal error'}, 500

Troubleshooting Tools

Check if logs are being stored:

cd hivematrix-helm
source pyenv/bin/activate
python -c "
import psycopg2
import configparser

config = configparser.ConfigParser()
config.read('instance/helm.conf')
conn_str = config.get('database', 'connection_string')

conn = psycopg2.connect(conn_str)
cursor = conn.cursor()
cursor.execute('SELECT COUNT(*) FROM log_entries')
print(f'Total logs: {cursor.fetchone()[0]}')
conn.close()
"

Clear old logs:

# Not yet implemented - logs currently accumulate
# TODO: Add log rotation/cleanup tool

11. Security Architecture

HiveMatrix follows a zero-trust internal network model where only the Nexus gateway should be accessible externally. All other services operate on localhost and are accessed through the Nexus proxy.

Security Principles

  1. Single Entry Point: Only port 443 (Nexus with HTTPS) should be exposed externally
  2. Localhost Binding: All backend services (Core, Keycloak, apps) bind to 127.0.0.1
  3. Proxy Access: All services are accessed via Nexus proxy with authentication
  4. Firewall Protection: Host firewall blocks direct access to internal services
  5. Automated Auditing: Security checks run automatically on startup

Service Binding Requirements

Backend Services (bind to localhost only):

# Correct - secure binding
app.run(host='127.0.0.1', port=5040)

Services that MUST bind to localhost:

  • Keycloak (8080)
  • Core (5000)
  • Helm (5004)
  • All application services (Codex, Ledger, Template, etc.)

Frontend Gateway (bind to all interfaces):

# Nexus - public entry point
app.run(host='0.0.0.0', port=443)

Only Nexus (443) should bind to 0.0.0.0 as it's the authenticated entry point.

Security Audit Tool

Helm includes a security audit tool that checks service port bindings:

cd hivematrix-helm
source pyenv/bin/activate
python security_audit.py --audit

This automatically runs during ./start.sh and reports:

  • ✓ Services properly bound to localhost
  • ✗ Services exposed to the network (security risk)
  • ○ Services not running
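The heart of the audit is a loopback check. A minimal sketch of how one report line could be rendered (the real security_audit.py inspects live sockets rather than taking arguments):

```python
def classify_binding(service, host, running=True):
    """Render one audit report line from a service's bind address (sketch)."""
    if not running:
        return f"○ {service} not running"
    # Only loopback addresses count as properly secured
    if host in ("127.0.0.1", "localhost", "::1"):
        return f"✓ {service} bound to localhost"
    return f"✗ {service} exposed to network ({host})"
```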

Firewall Configuration

Generate and apply firewall rules to block direct access to internal services:

# Generate firewall script
python security_audit.py --generate-firewall

# Apply firewall rules (requires sudo)
sudo bash secure_firewall.sh

This configures Ubuntu's UFW firewall to:

  • Allow SSH (port 22)
  • Allow HTTPS (port 443 - Nexus)
  • Block all internal service ports from external access

Alternative with iptables:

python security_audit.py --generate-iptables
sudo bash secure_iptables.sh

Security Checklist

Before deploying to production:

  • [ ] Run security audit: python security_audit.py --audit
  • [ ] All services bound to localhost (except Nexus on 443)
  • [ ] Firewall configured and enabled
  • [ ] Change default Keycloak admin password
  • [ ] Change default HiveMatrix admin password
  • [ ] Use valid SSL certificate (not self-signed) for Nexus
  • [ ] Review Keycloak security settings
  • [ ] Test external access blocked
  • [ ] Test internal access via Nexus works

Common Security Issues

Issue: Service exposed to network

Symptom: Security audit shows service on 0.0.0.0 instead of 127.0.0.1

Fix: Update service's run.py to bind to localhost:

# Change from:
app.run(host='0.0.0.0', port=5040)

# To:
app.run(host='127.0.0.1', port=5040)

Issue: Keycloak exposed

Keycloak (Java-based) may bind to all interfaces. This is acceptable if protected by firewall. The security audit will flag this - apply firewall rules to block external access:

sudo bash secure_firewall.sh

Issue: Can't access service after fixing binding

This is expected behavior! Services on localhost are accessed via:

  • Internal: http://localhost:PORT (from the server)
  • External: https://SERVER_IP:443/service-name (via the Nexus proxy)

Never access services directly by their port. Always use Nexus.

Production Deployment Security

Additional security measures for production:

  1. SSL Certificates: Use Let's Encrypt or commercial SSL for Nexus
  2. Fail2ban: Protect SSH from brute force attacks
  3. Rate Limiting: Configure in Nexus or load balancer
  4. Security Updates: Regular system and application updates
  5. Monitoring: Log analysis and intrusion detection
  6. Backups: Regular encrypted backups of databases and configs

See SECURITY.md for detailed security configuration and best practices.

12. Design System & BEM Classes

(This section will be expanded with more components as they are built.)

Component: Card (.card)

  • Block: .card - The main container.
  • Elements: .card__header, .card__title, .card__body

Component: Button (.btn)

  • Block: .btn
  • Elements: .btn__icon, .btn__label
  • Modifiers: .btn--primary, .btn--danger

Component: Table

  • Block: table - Standard HTML table element
  • Elements: thead, tbody, th, td
  • Styling is provided globally by Nexus

Component: Form Elements

  • Input: Standard input, select, textarea elements
  • Label: Standard label element
  • Styling is provided globally by Nexus

13. Database Best Practices

Configuration Storage

  • Use configparser.RawConfigParser() instead of ConfigParser() to handle special characters in passwords
  • Store database credentials in instance/[service].conf
  • Never commit config files to version control (they're in .gitignore)
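The interpolation pitfall is easy to demonstrate: plain ConfigParser() treats % in values as interpolation syntax, so a password containing % raises an error, while RawConfigParser() passes it through untouched:

```python
import configparser

raw_text = "[database]\npassword = p%ssw0rd\n"

# ConfigParser interpolates '%' and raises on get() for values like 'p%ssw0rd'
strict = configparser.ConfigParser()
strict.read_string(raw_text)
try:
    strict.get("database", "password")
except configparser.InterpolationSyntaxError:
    pass  # '%s' is not valid interpolation syntax

# RawConfigParser performs no interpolation, so the value survives intact
raw = configparser.RawConfigParser()
raw.read_string(raw_text)
assert raw.get("database", "password") == "p%ssw0rd"
```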

Models

  • Each service owns its own database tables
  • Use SQLAlchemy for ORM
  • Define models in models.py
  • Use appropriate data types:
      • db.String(50) for short strings (IDs, codes)
      • db.String(150) for names
      • db.String(255) for URLs, domains
      • db.Text for long text fields
      • BigInteger for large numeric IDs (like PSA external IDs)

Relationships

  • Use association tables for many-to-many relationships
  • Use db.relationship() with back_populates for bidirectional relationships
  • Add cascade="all, delete-orphan" for proper cleanup
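These model conventions can be sketched with plain SQLAlchemy; the table and column names below are illustrative, not an actual HiveMatrix schema:

```python
from sqlalchemy import BigInteger, Column, ForeignKey, String, Table, Text
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

# Association table for a hypothetical many-to-many Company <-> Location link
company_locations = Table(
    "company_locations", Base.metadata,
    Column("company_id", ForeignKey("companies.id"), primary_key=True),
    Column("location_id", ForeignKey("locations.id"), primary_key=True),
)

class Company(Base):
    __tablename__ = "companies"
    id = Column(String(50), primary_key=True)   # short ID/code
    name = Column(String(150))                  # name
    domain = Column(String(255))                # URL/domain
    notes = Column(Text)                        # long text
    psa_id = Column(BigInteger)                 # large external PSA ID
    locations = relationship("Location", secondary=company_locations,
                             back_populates="companies")
    # One-to-many children cleaned up along with the parent
    contacts = relationship("Contact", back_populates="company",
                            cascade="all, delete-orphan")

class Location(Base):
    __tablename__ = "locations"
    id = Column(String(50), primary_key=True)
    companies = relationship("Company", secondary=company_locations,
                             back_populates="locations")

class Contact(Base):
    __tablename__ = "contacts"
    id = Column(String(50), primary_key=True)
    company_id = Column(String(50), ForeignKey("companies.id"))
    company = relationship("Company", back_populates="contacts")
```

Note that cascade="all, delete-orphan" belongs on one-to-many links like contacts; for many-to-many relationships through a secondary table, SQLAlchemy removes the association rows automatically.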

14. External System Integration

Sync Scripts

Services that integrate with external systems (like Codex with PSA systems and Datto) should:

  • Have standalone Python scripts (e.g., sync_psa.py, pull_datto.py)
  • Import the Flask app and models directly
  • Use the app context: with app.app_context():
  • Be runnable via cron for automated syncing
  • Include proper error handling and logging
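A sync script skeleton following those rules might look like this (the inline Flask app stands in for the service's real app, which you would normally import from app/__init__.py, and the sync body is a placeholder):

```python
#!/usr/bin/env python
"""Hypothetical sync_psa.py skeleton, runnable standalone or from cron."""
import logging

from flask import Flask, current_app

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("sync_psa")

# Stand-in for the service's real Flask app; a real script would import it
# (e.g. `from app import app`) so models and extensions are configured.
app = Flask("myservice")

def run_sync():
    """Run one sync pass inside the Flask application context."""
    with app.app_context():
        # The app context is what lets extensions like SQLAlchemy resolve
        # their configuration outside of a normal HTTP request.
        log.info("Starting PSA sync for %s", current_app.name)
        # ... fetch from the external system and upsert models here ...
        return current_app.name

if __name__ == "__main__":
    run_sync()
```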

API Credentials

  • Store API credentials in the service's config file
  • Never hardcode credentials
  • Provide interactive setup via init_db.py

15. Common Patterns

Service Directory Structure

hivematrix-myservice/
├── app/
│   ├── __init__.py           # Flask app initialization
│   ├── auth.py               # @token_required decorator
│   ├── routes.py             # Main web routes
│   ├── service_client.py     # Service-to-service helper
│   ├── middleware.py         # URL prefix middleware
│   └── templates/            # HTML templates (BEM styled)
│       └── admin/            # Admin-only templates
├── routes/                   # Blueprint routes (optional)
│   ├── __init__.py
│   ├── entities.py
│   └── admin.py
├── instance/
│   └── myservice.conf        # Configuration (not in git)
├── extensions.py             # Flask extensions (db)
├── models.py                 # SQLAlchemy models
├── init_db.py                # Database initialization script
├── run.py                    # Application entry point
├── services.json             # Service discovery config (symlinked from Helm)
├── requirements.txt          # Python dependencies
├── .flaskenv                 # Environment variables (not in git)
├── .gitignore
└── README.md

Required Files

Every service must have:

  • .flaskenv - Environment configuration
  • requirements.txt - Python dependencies
  • run.py - Entry point
  • app/__init__.py - Flask app setup
  • app/auth.py - Authentication decorators (copy from template)
  • app/service_client.py - Service-to-service helper (copy from template)
  • app/middleware.py - URL prefix middleware (copy from template)
  • extensions.py - Flask extensions
  • models.py - Database models
  • init_db.py - Interactive database setup
  • services.json - Service discovery (symlinked to ../hivematrix-helm/services.json)

Note: The services.json file should be a symlink created by install.sh, not a regular file. This ensures all services see the same service registry.

Service Configuration Files in Helm

Helm maintains two service configuration files with different purposes:

1. master_services.json - Minimal service registry

  • Purpose: Used by Nexus for service discovery and URL routing
  • Format: Simplified, with just url and port
  • Synced to: Individual services via service_manager
  • When to update: When adding/removing services

2. services.json - Complete service configuration

  • Purpose: Used by Helm for service management (start/stop/status)
  • Format: Extended, with path, python_bin, run_script, visible
  • Used by: Helm's service_manager.py and cli.py
  • When to update: When adding/removing services

Adding a new service workflow:

Instead of manually editing both master_services.json and services.json, you should:

  1. Add the service to apps_registry.json:

    {
      "default_apps": {
        "archive": {
          "name": "HiveMatrix Archive",
          "description": "Document and file archival system",
          "git_url": "https://github.com/skelhammer/hivematrix-archive",
          "port": 5012,
          "required": false,
          "dependencies": ["core", "codex"],
          "install_order": 8
        }
      }
    }
    

  2. Run update-config to generate both files automatically:

    cd hivematrix-helm
    source pyenv/bin/activate
    python install_manager.py update-config
    

This will automatically create entries in both master_services.json and services.json:

Generated master_services.json entry:

{
  "archive": {
    "url": "http://localhost:5012",
    "port": 5012
  }
}

Generated services.json entry:

{
  "archive": {
    "url": "http://localhost:5012",
    "path": "../hivematrix-archive",
    "port": 5012,
    "python_bin": "pyenv/bin/python",
    "run_script": "run.py",
    "visible": true
  }
}

Important: Always use apps_registry.json as the source of truth and let install_manager.py update-config generate the other files. Manual edits to services.json or master_services.json will be overwritten on the next config update.


Continue to:

  • Architecture - Core - Core concepts, authentication, service communication
  • Architecture - Services - Brainhair, Ledger, KnowledgeTree, production features