HiveMatrix Backup & Restore Guide

Overview

HiveMatrix includes comprehensive backup and restore capabilities that protect all your data:

  • PostgreSQL databases: codex_db, ledger_db, core_db, helm_db, brainhair_db
  • Neo4j graph database: KnowledgeTree knowledge base
  • Redis: Session storage and rate limiting data
  • Keycloak: User authentication and SSO data
  • Configuration files: Service configurations and environment settings

Quick Start

Manual Backup (One-Time)

cd hivematrix-helm
source pyenv/bin/activate
sudo python3 backup.py /path/to/backup/directory

Automated Backups (Scheduled)

cd hivematrix-helm
sudo ./install_backup_cron.sh

This installs a cron job that runs daily at 2:00 AM with automatic retention policies.

Backup Components

What Gets Backed Up

  1. PostgreSQL Databases (5 databases)
     • codex_db - Master data management
     • ledger_db - Billing and financial data
     • core_db - Core authentication and user data
     • helm_db - Orchestration and system data
     • brainhair_db - AI assistant conversation data

  2. Neo4j Graph Database
     • KnowledgeTree graph database with all knowledge base content
     • Includes nodes, relationships, properties, and indexes

  3. Redis Database
     • User session data
     • Rate limiting data
     • Cache data

  4. Keycloak Directory
     • User authentication data
     • SSO configuration
     • Client registrations

  5. Configuration Files
     • Service configuration files
     • Environment settings
     • SSL certificates (if present)

Backup Format

Backups are created as ZIP archives with the naming format:

hivematrix_backup_YYYYMMDD_HHMMSS.zip

Example: hivematrix_backup_20251122_020000.zip
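
The naming format can be produced and parsed with Python's standard datetime formatting. The following is a minimal sketch, not taken from backup.py; the helper names `backup_filename` and `backup_timestamp` are hypothetical.

```python
from datetime import datetime

PREFIX = "hivematrix_backup_"
STAMP_FORMAT = "%Y%m%d_%H%M%S"  # YYYYMMDD_HHMMSS


def backup_filename(moment: datetime) -> str:
    """Build an archive name in the documented format."""
    return f"{PREFIX}{moment.strftime(STAMP_FORMAT)}.zip"


def backup_timestamp(filename: str) -> datetime:
    """Recover the creation time encoded in an archive name."""
    if not (filename.startswith(PREFIX) and filename.endswith(".zip")):
        raise ValueError(f"not a HiveMatrix backup name: {filename!r}")
    stamp = filename[len(PREFIX):-len(".zip")]
    return datetime.strptime(stamp, STAMP_FORMAT)
```

Because the timestamp is zero-padded and ordered from year down to second, sorting the file names alphabetically also sorts the backups chronologically.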

Manual Backup

Basic Usage

cd hivematrix-helm
source pyenv/bin/activate
sudo python3 backup.py /var/backups/hivematrix/manual

Dry Run (Test Without Actually Backing Up)

sudo python3 backup.py /tmp/test --dry-run

This will:

  • Check that all services are accessible
  • Verify that required tools are available (pg_dump, neo4j-admin, redis-cli)
  • Show what would be backed up
  • Report any issues without creating a backup

Backup Output

A successful backup will create:

/var/backups/hivematrix/manual/
└── hivematrix_backup_20251122_143022.zip
    ├── postgresql/
    │   ├── codex_db.sql
    │   ├── ledger_db.sql
    │   ├── core_db.sql
    │   ├── helm_db.sql
    │   └── brainhair_db.sql
    ├── neo4j/
    │   ├── hivematrix.dump
    │   └── neo4j_version.txt
    ├── redis/
    │   ├── dump.rdb
    │   └── redis_info.txt
    ├── keycloak/
    │   └── [keycloak data directory contents]
    ├── configs/
    │   ├── [service config files]
    │   └── [environment files]
    └── backup_metadata.json

Backup Metadata

Each backup includes a backup_metadata.json file with:

{
  "timestamp": "2025-11-22T14:30:22",
  "hostname": "hivematrix-server",
  "version": "1.0",
  "components": {
    "postgresql": ["codex_db", "ledger_db", "core_db", "helm_db", "brainhair_db"],
    "neo4j": ["hivematrix"],
    "redis": true,
    "keycloak": true,
    "configs": true
  }
}
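
To check what an archive contains without extracting it, the metadata file can be read straight from the ZIP. This is a sketch under two assumptions: that backup_metadata.json sits at the root of the archive, as the tree above suggests, and that the helper names `read_backup_metadata` and `included_components` are illustrative rather than part of HiveMatrix.

```python
import json
import zipfile


def read_backup_metadata(archive) -> dict:
    """Return the parsed backup_metadata.json from a backup ZIP.

    `archive` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        # Assumption: the file sits at the archive root.
        with zf.open("backup_metadata.json") as fh:
            return json.load(fh)


def included_components(metadata: dict) -> list:
    """List the component names that the metadata marks as present."""
    # A component counts as present when its value is true or a non-empty list.
    return [name for name, value in metadata["components"].items() if value]
```

A call such as `included_components(read_backup_metadata("/path/to/backup.zip"))` would then show which of the five components a given archive can restore.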

Automated Backups

Installation

cd hivematrix-helm
sudo ./install_backup_cron.sh

This will:

  1. Install a cron job that runs daily at 2:00 AM
  2. Create backup directories: /var/backups/hivematrix/{daily,weekly,monthly}
  3. Set up logging to /var/backups/hivematrix/backup.log

Backup Schedule and Retention

Automated backups use a tiered retention policy:

Type     When                  Retention  Location
Daily    Every day at 2 AM     7 days     /var/backups/hivematrix/daily/
Weekly   Sunday at 2 AM        4 weeks    /var/backups/hivematrix/weekly/
Monthly  1st of month at 2 AM  12 months  /var/backups/hivematrix/monthly/

Example:

  • On a Tuesday: Creates daily backup, keeps last 7 daily backups
  • On a Sunday: Creates weekly backup, keeps last 4 weekly backups
  • On the 1st: Creates monthly backup, keeps last 12 monthly backups
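
The tiered policy can be illustrated with a short sketch. It is not the logic of backup_automated.sh: the function names are hypothetical, and the rule that the 1st of the month takes precedence over a Sunday is an assumption, since the guide does not say what happens when both fall on the same day.

```python
from datetime import date

# Retention counts from the schedule above.
RETENTION = {"daily": 7, "weekly": 4, "monthly": 12}


def backup_tier(day: date) -> str:
    """Pick the tier for a backup taken on `day`.

    Assumption: the 1st of the month takes precedence over Sunday.
    """
    if day.day == 1:
        return "monthly"
    if day.weekday() == 6:  # Monday is 0, Sunday is 6
        return "weekly"
    return "daily"


def backups_to_delete(filenames: list, tier: str) -> list:
    """Return the archives that fall outside the tier's retention count."""
    # Timestamped names sort chronologically, so the oldest come first.
    ordered = sorted(filenames)
    keep = RETENTION[tier]
    return ordered[:-keep] if len(ordered) > keep else []
```

With nine archives in daily/, `backups_to_delete` returns the two oldest, leaving the seven most recent in place.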

Viewing Cron Job

sudo crontab -l

You should see:

0 2 * * * /home/user/hivematrix-helm/backup_automated.sh >> /var/backups/hivematrix/backup.log 2>&1

Viewing Backup Logs

# View entire log
sudo cat /var/backups/hivematrix/backup.log

# Follow log in real-time
sudo tail -f /var/backups/hivematrix/backup.log

# View last backup
sudo tail -50 /var/backups/hivematrix/backup.log

Testing Automated Backup

Run the automated backup script manually to test:

cd hivematrix-helm
sudo ./backup_automated.sh

Uninstalling Automated Backups

cd hivematrix-helm
sudo ./uninstall_backup_cron.sh

This removes the cron job but preserves existing backups.

To delete backups:

sudo rm -rf /var/backups/hivematrix/

Restore

Prerequisites

  • Stop all HiveMatrix services before restoring
  • Root/sudo access required
  • Backup ZIP file must be accessible

Full Restore

cd hivematrix-helm

# Activate virtual environment
source pyenv/bin/activate

# Stop all services
python cli.py stop

# Run restore
sudo python3 restore.py /var/backups/hivematrix/daily/hivematrix_backup_20251122_020000.zip

# Start services
python cli.py start

Partial Restore (Specific Components)

Restore only specific components using command-line options:

# Restore only PostgreSQL databases
sudo python3 restore.py backup.zip --postgresql-only

# Restore only Neo4j (KnowledgeTree)
sudo python3 restore.py backup.zip --neo4j-only

# Restore only Redis (sessions)
sudo python3 restore.py backup.zip --redis-only

# Restore only Keycloak (auth)
sudo python3 restore.py backup.zip --keycloak-only

# Restore only configuration files
sudo python3 restore.py backup.zip --configs-only

Force Restore (Skip Confirmations)

⚠️ DANGEROUS - Use with caution!

sudo python3 restore.py backup.zip --force

Restore Process

  1. Confirmation: You'll be asked to confirm (unless using --force)
  2. Extraction: Backup ZIP is extracted to temporary directory
  3. Service Shutdown: Required services are stopped (Neo4j, Redis)
  4. Database Restoration:
     • PostgreSQL: Drops and recreates databases, restores data
     • Neo4j: Stops service, restores dump, starts service
     • Redis: Stops service, copies dump.rdb, starts service
  5. Directory Restoration: Keycloak and configs copied
  6. Cleanup: Temporary files removed
  7. Verification: Services checked for proper operation
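
The extraction step, combined with a component filter like the one the partial-restore options imply, might look like the sketch below. It illustrates the idea only and is not how restore.py is implemented; `extract_component` is a hypothetical helper, and the per-component folder names are taken from the archive layout shown under Backup Output.

```python
import zipfile
from pathlib import Path


def extract_component(archive, component: str, dest: Path) -> list:
    """Extract only the members under `<component>/` into `dest`.

    Returns the names of the extracted members.
    """
    prefix = component.rstrip("/") + "/"
    with zipfile.ZipFile(archive) as zf:
        members = [name for name in zf.namelist() if name.startswith(prefix)]
        zf.extractall(dest, members=members)
    return members
```

Using a `tempfile.TemporaryDirectory()` as `dest` covers the cleanup step too, because the directory is removed when the context manager exits.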

What Gets Overwritten

⚠️ WARNING: Restore will OVERWRITE existing data:

  • All PostgreSQL databases are DROPPED and RECREATED
  • Neo4j database is COMPLETELY REPLACED
  • Redis data is OVERWRITTEN
  • Keycloak directory is REPLACED
  • Configuration files are OVERWRITTEN

Make a backup before restoring if you want to preserve current data!

Backup Best Practices

1. Regular Testing

Test your backups regularly:

# Create a test backup
sudo python3 backup.py /tmp/test-backup

# Verify backup integrity
unzip -t /tmp/test-backup/hivematrix_backup_*.zip

# Test restore in a non-production environment
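
The same integrity check as `unzip -t` can be scripted with Python's standard zipfile module, which is convenient when the check should run from a monitoring job. A minimal sketch; `verify_backup` is a hypothetical name.

```python
import zipfile


def verify_backup(archive) -> bool:
    """Return True when every member of the ZIP passes its CRC check."""
    try:
        with zipfile.ZipFile(archive) as zf:
            # testzip() returns the name of the first corrupt member, or None.
            return zf.testzip() is None
    except zipfile.BadZipFile:
        # The file is truncated or is not a ZIP archive at all.
        return False
```

A passing CRC check only shows that the archive is readable. It does not replace a test restore in a non-production environment.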

2. Off-Site Backups

Copy backups to a remote location:

# Using rsync to remote server
rsync -avz /var/backups/hivematrix/ user@backup-server:/backups/hivematrix/

# Using rclone to cloud storage
rclone sync /var/backups/hivematrix/ remote:hivematrix-backups/

3. Monitor Backup Size

Track backup growth:

# Check total backup disk usage
du -sh /var/backups/hivematrix/

# List all backups with sizes
ls -lh /var/backups/hivematrix/{daily,weekly,monthly}/

4. Encryption (Optional)

Encrypt sensitive backups:

# Encrypt backup
gpg --symmetric --cipher-algo AES256 hivematrix_backup_20251122_020000.zip

# Decrypt when needed
gpg --decrypt hivematrix_backup_20251122_020000.zip.gpg > backup.zip

5. Backup Notifications

Add email notifications to automated backups:

Edit /home/user/hivematrix-helm/backup_automated.sh and uncomment the notification section:

# Around line 149, uncomment and configure:
if [ "$BACKUP_SUCCESS" = false ]; then
    echo "HiveMatrix backup failed on $(hostname)" | mail -s "Backup Failed" admin@example.com
fi

Troubleshooting

Backup Fails with Permission Errors

Problem: Permission denied when accessing databases

Solution: Run backup with sudo:

sudo python3 backup.py /var/backups/hivematrix/manual

Neo4j Backup Fails

Problem: Neo4j must be stopped to create database dump

Solution: This is expected - the backup script automatically stops/starts Neo4j. Ensure you have sudo privileges.

Redis Backup Empty

Problem: Redis backup creates empty dump.rdb

Solution: Check if Redis has data:

redis-cli DBSIZE

If empty, this is normal. Redis will be restored correctly even if empty.

Restore Fails - Database Already Exists

Problem: ERROR: database "codex_db" already exists

Solution: The restore script should handle this. If it doesn't, manually drop databases:

sudo -u postgres psql -c "DROP DATABASE IF EXISTS codex_db;"

Disk Space Issues

Problem: Backups failing due to insufficient disk space

Solution:

  1. Check disk usage: df -h
  2. Reduce retention periods in /home/user/hivematrix-helm/backup_automated.sh:

    RETENTION_DAILY=3    # Reduce from 7 to 3
    RETENTION_WEEKLY=2   # Reduce from 4 to 2
    RETENTION_MONTHLY=6  # Reduce from 12 to 6

  3. Move backups to a larger storage volume
  4. Set up off-site backup sync and delete local copies

Cron Job Not Running

Problem: Automated backups not running

Check cron job:

sudo crontab -l

Check cron service:

sudo systemctl status cron     # Ubuntu/Debian
sudo systemctl status crond    # RHEL/Fedora

Check logs:

sudo grep CRON /var/log/syslog    # Ubuntu/Debian
sudo grep -i cron /var/log/cron   # RHEL (when rsyslog is in use)

Recovery Scenarios

Scenario 1: Complete Server Failure

  1. Install fresh HiveMatrix on new server
  2. Copy backup files to new server
  3. Run restore:
    cd hivematrix-helm
    source pyenv/bin/activate
    sudo python3 restore.py /path/to/backup.zip
    
  4. Verify all services start correctly

Scenario 2: Accidental Data Deletion

  1. Stop affected service
  2. Restore only affected component:
    # Example: Restore only KnowledgeTree
    sudo python3 restore.py backup.zip --neo4j-only
    
  3. Restart service

Scenario 3: Database Corruption

  1. Stop all services: python cli.py stop
  2. Restore PostgreSQL databases:
    sudo python3 restore.py latest-backup.zip --postgresql-only
    
  3. Start services: python cli.py start

Scenario 4: Rolling Back After Failed Update

  1. Stop all services
  2. Restore from backup before update
  3. Verify services work correctly
  4. Investigate update failure

Security Considerations

Backup Permissions

Backups contain sensitive data. Protect them:

# Set restrictive permissions
sudo chmod 600 /var/backups/hivematrix/daily/*.zip
sudo chown root:root /var/backups/hivematrix/daily/*.zip
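
To audit the result, a short script can list any archive that still grants permissions to group or other users. This is a sketch, assuming a POSIX file system; `loosely_permissioned` is a hypothetical helper, and it needs to run as a user that can read the backup directory.

```python
import stat
from pathlib import Path


def loosely_permissioned(root) -> list:
    """Return backup archives under `root` that group or others can access."""
    too_open = stat.S_IRWXG | stat.S_IRWXO  # any group/other permission bit
    return sorted(
        path
        for path in Path(root).rglob("*.zip")
        if path.stat().st_mode & too_open
    )
```

An empty result for /var/backups/hivematrix/ means every archive is restricted to its owner, which matches the chmod 600 setting above.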

Backup Contents

Backups include:

  • User credentials (hashed passwords in PostgreSQL)
  • Authentication tokens (in Redis - short-lived)
  • Keycloak data (SSO configuration and user data)
  • Business data (all application data)

Access Control

  • Only root should access /var/backups/hivematrix/
  • Use encryption for off-site backups
  • Implement audit logging for backup access

Disk Space Requirements

Estimating Backup Size

Backup size depends on your data:

Component   Typical Size  Notes
PostgreSQL  10-500 MB     Depends on data volume
Neo4j       50-2000 MB    KnowledgeTree size varies greatly
Redis       1-50 MB       Session data is usually small
Keycloak    10-100 MB     Relatively static
Configs     1-5 MB        Small and static

Example with retention policy:

  • Average backup size: 200 MB
  • Daily (7): 1.4 GB
  • Weekly (4): 800 MB
  • Monthly (12): 2.4 GB
  • Total: ~4.6 GB
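
The arithmetic behind this estimate is the average archive size multiplied by the number of archives each tier retains. A small sketch for plugging in your own average; the function names are illustrative, and sizes use decimal units (1 GB = 1000 MB) to match the figures above.

```python
# Number of archives kept per tier once the retention policy is saturated.
RETENTION = {"daily": 7, "weekly": 4, "monthly": 12}


def storage_per_tier_mb(average_backup_mb: float) -> dict:
    """Estimate steady-state disk usage of each tier in MB."""
    return {tier: count * average_backup_mb for tier, count in RETENTION.items()}


def total_storage_mb(average_backup_mb: float) -> float:
    """Estimate total steady-state disk usage across all tiers in MB."""
    return sum(storage_per_tier_mb(average_backup_mb).values())
```

For a 200 MB average this gives 1400, 800, and 2400 MB per tier, or 4600 MB in total across the 23 retained archives.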

Monitoring Disk Usage

Add to your monitoring:

# Check backup directory size
du -sh /var/backups/hivematrix/

# Alert if > 10GB (du -sk reports KiB; 10485760 KiB = 10 GiB)
BACKUP_SIZE=$(du -sk /var/backups/hivematrix/ | awk '{print $1}')
if [ "$BACKUP_SIZE" -gt 10485760 ]; then
    echo "WARNING: Backups using more than 10GB"
fi
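
If your monitoring is written in Python rather than shell, the same threshold check can be sketched as follows. The helper names are hypothetical. This version sums apparent file sizes, so its result can differ slightly from `du`, which counts allocated disk blocks.

```python
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the apparent sizes of all regular files under `root`."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def over_limit(root, limit_gb: float = 10) -> bool:
    """Return True when the files under `root` exceed `limit_gb` GiB."""
    return directory_size_bytes(Path(root)) > limit_gb * 1024**3
```

A monitoring job could call `over_limit("/var/backups/hivematrix/")` and raise an alert when it returns True.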

Migration and Upgrades

Before Upgrading HiveMatrix

# Always backup before upgrades
cd hivematrix-helm
sudo python3 backup.py /var/backups/hivematrix/pre-upgrade

Migrating to New Server

  1. On old server: Create backup

    sudo python3 backup.py /tmp/migration-backup
    

  2. Transfer backup: Copy to new server

    scp /tmp/migration-backup/hivematrix_backup_*.zip user@new-server:/tmp/
    

  3. On new server: Install HiveMatrix

    cd hivematrix-helm
    ./install.sh
    

  4. On new server: Restore backup

    source pyenv/bin/activate
    sudo python3 restore.py /tmp/hivematrix_backup_*.zip
    

  5. Verify: Test all services

Support

Backup/Restore Issues

If you encounter issues:

  1. Check logs:
     • Backup: /var/backups/hivematrix/backup.log
     • Service logs: hivematrix-helm/logs/

  2. Run dry-run to diagnose:

    sudo python3 backup.py /tmp/test --dry-run

  3. Verify all prerequisites are met (see Requirements section)

  4. Check disk space: df -h

Getting Help

  • GitHub Issues: https://github.com/skelhammer/hivematrix-helm/issues
  • Documentation: https://github.com/skelhammer/hivematrix-docs

Appendix

Backup Script Location

  • Manual backup: /home/user/hivematrix-helm/backup.py
  • Automated backup: /home/user/hivematrix-helm/backup_automated.sh
  • Restore script: /home/user/hivematrix-helm/restore.py

Cron Installation Scripts

  • Install: /home/user/hivematrix-helm/install_backup_cron.sh
  • Uninstall: /home/user/hivematrix-helm/uninstall_backup_cron.sh

Default Backup Location

/var/backups/hivematrix/
├── daily/          # Last 7 days
├── weekly/         # Last 4 weeks
├── monthly/        # Last 12 months
└── backup.log      # Automated backup logs

Backup Metadata Fields

{
  "timestamp": "ISO 8601 timestamp",
  "hostname": "Server hostname",
  "version": "Backup format version",
  "components": {
    "postgresql": ["array", "of", "databases"],
    "neo4j": ["array", "of", "databases"],
    "redis": true/false,
    "keycloak": true/false,
    "configs": true/false
  }
}