KnowledgeTree - Knowledge Base Service

Port: 5020
Database: Neo4j (Graph Database)
Repository: hivematrix-knowledgetree

Table of Contents

  1. Overview
  2. Architecture
  3. Data Synchronization
  4. Ticket Information
  5. Context System
  6. API Reference
  7. Admin Operations
  8. User Interface
  9. Configuration
  10. Installation & Setup
  11. Development
  12. Monitoring & Logging
  13. Troubleshooting
  14. Security
  15. Backup & Recovery

Overview

KnowledgeTree is HiveMatrix's graph-based knowledge management system that organizes institutional knowledge in a hierarchical, filesystem-like structure. Using Neo4j's graph database, it provides fast traversal, flexible relationships, and powerful context-building capabilities for both human users and AI assistants.

Unlike other HiveMatrix services that use PostgreSQL, KnowledgeTree leverages Neo4j's graph database to efficiently represent hierarchical knowledge structures, track relationships between topics, and quickly gather contextual information across multiple related articles.

Primary Responsibilities

  • Hierarchical Knowledge Organization - Filesystem-like structure with sections, categories, and topics
  • Markdown Content Management - Rich text documentation with code blocks, tables, and formatting
  • External Data Integration - Syncs companies, contacts, assets, and tickets from Codex
  • Context Building - Gathers related knowledge for AI assistants (Brainhair)
  • Search & Discovery - Full-text search across all articles and folders
  • File Attachments - Upload and serve files linked to knowledge articles

Key Features

✅ Hierarchical Organization - Three-level structure (sections → categories → topics)
✅ Graph Relationships - Neo4j enables flexible connections between knowledge items
✅ Markdown Support - Full markdown rendering with syntax highlighting
✅ Data Sync - Automated sync from Codex for companies, users, assets, tickets
✅ Context System - Gathers related knowledge for AI queries
✅ Attached Folders - Special folders that are automatically included in context
✅ Read-Only Nodes - Synced external data is marked read-only
✅ Multiple Views - Grid, list, and tree views for browsing
✅ Full-Text Search - Search across titles and content
✅ Export/Import - Backup and restore user-created knowledge


Architecture

Database: Neo4j Graph Database

KnowledgeTree uses Neo4j 5.14.0, a native graph database optimized for hierarchical data and relationship traversal.

Why Neo4j?

  1. Natural Hierarchy - Perfect fit for section → category → topic structure
  2. Fast Traversal - Efficiently navigate deep folder structures
  3. Flexible Relationships - Easy to link related articles, prerequisites, dependencies
  4. Graph Queries - Find connections and build context across multiple paths
  5. Schema-less - Easy to evolve the data model

Graph Schema

Node Labels:

(:ContextItem)  // Folders and articles
(:File)         // File attachments

ContextItem Properties:

  • id (string, unique) - Node identifier (UUID or deterministic path-based)
  • name (string) - Display name
  • content (string) - Markdown content (for articles)
  • is_folder (boolean) - True for folders, false for articles
  • is_attached (boolean) - True for special folders that are included in context
  • read_only (boolean) - True for synced data (from Codex)

File Properties:

  • id (string, unique) - File identifier (UUID)
  • filename (string) - Original filename

Relationships:

(:ContextItem)-[:PARENT_OF]->(:ContextItem)  // Hierarchy
(:ContextItem)-[:HAS_FILE]->(:File)           // Attachments

Schema Initialization

On startup, KnowledgeTree ensures the root node exists:

MERGE (r:ContextItem {id: 'root', name: 'KnowledgeTree Root'})
ON CREATE SET r.content = '# Welcome to KnowledgeTree',
              r.is_folder = true,
              r.is_attached = false,
              r.read_only = false

Hierarchical Structure

root (KnowledgeTree Root)
├── IT Documentation (Section)
│   ├── Network (Category)
│   │   ├── Router Configuration.md (Topic)
│   │   ├── VLAN Setup.md (Topic)
│   │   └── Firewall Rules.md (Topic)
│   ├── Server Management (Category)
│   │   ├── Windows Updates.md (Topic)
│   │   ├── Backup Procedures.md (Topic)
│   │   └── Active Directory.md (Topic)
│   └── End User Support (Category)
│       ├── Password Resets.md (Topic)
│       ├── Email Configuration.md (Topic)
│       └── VPN Setup.md (Topic)
└── Companies (Synced from Codex)
    ├── Acme Corporation
    │   ├── Users
    │   │   ├── John Doe
    │   │   │   ├── Contact.md
    │   │   │   └── Tickets (attached)
    │   │   │       └── Ticket_12345.md
    │   │   └── Jane Smith
    │   │       ├── Contact.md
    │   │       └── Tickets (attached)
    │   └── Assets
    │       ├── ACME-PC-001.md
    │       └── ACME-SRV-001.md
    └── Wayne Enterprises
        ├── Users
        └── Assets

Node Types

1. User-Created Nodes

  • Created via UI or API
  • Editable by users
  • read_only: false
  • Uses UUID for id

2. Synced Nodes (Read-Only)

  • Created by sync_codex.py or sync_tickets.py
  • Not editable via UI
  • read_only: true
  • Uses deterministic path-based id (e.g., root_Companies_Acme_Corporation)

3. Attached Folders

  • Special folders with is_attached: true
  • Automatically included when building context
  • Example: User's "Tickets" folder

4. Regular Folders

  • Organizational containers
  • is_folder: true, is_attached: false

5. Articles (Files)

  • Markdown content
  • is_folder: false
  • Can have file attachments via HAS_FILE relationship


Data Synchronization

KnowledgeTree syncs all external data from Codex, which acts as the single source of truth for companies, contacts, assets, and tickets.

Sync Architecture

PSA System (Freshservice) → Codex → KnowledgeTree
Datto RMM              → Codex → KnowledgeTree

Sync Scripts

1. sync_codex.py

Purpose: Syncs company structure, users, and assets from Codex

Creates:

/Companies/
  /{Company Name}/
    /Users/
      /{User Name}/
        Contact.md          (user details)
        /Tickets/           (attached folder, empty until ticket sync)
    /Assets/
      {hostname}.md         (asset details)

Run Command:

cd hivematrix-knowledgetree
source pyenv/bin/activate
python sync_codex.py

What It Does:

  1. Fetches all companies from Codex API
  2. For each company:
       • Creates company folder under /Companies/
       • Creates Users and Assets subfolders
       • Fetches company users from Codex
       • Creates user folder with Contact.md containing user details
       • Creates empty Tickets attached folder (populated by ticket sync)
       • Fetches company assets from Codex
       • Creates markdown file for each asset with specs

Contact.md Example:

# Contact Information for John Doe

- **Email:** john.doe@acme.com
- **Title:** IT Manager
- **Mobile Phone:** (555) 123-4567
- **Work Phone:** (555) 123-4568
- **Active:** Yes
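
As a rough illustration of how a file like the one above could be produced, here is a minimal sketch that renders a Contact.md body from a contact record. The function name `render_contact_md` and the dictionary keys (`name`, `email`, `title`, `mobile_phone`, `work_phone`, `active`) are assumptions made for this example; the actual field names returned by Codex and used by sync_codex.py may differ.

```python
def render_contact_md(contact: dict) -> str:
    """Render a Contact.md body from a contact record.

    The dictionary keys used here are assumed for illustration only.
    """
    lines = [f"# Contact Information for {contact['name']}", ""]
    fields = [
        ("Email", contact.get("email")),
        ("Title", contact.get("title")),
        ("Mobile Phone", contact.get("mobile_phone")),
        ("Work Phone", contact.get("work_phone")),
    ]
    for label, value in fields:
        # Missing values are rendered as "N/A" rather than dropped (an assumption).
        lines.append(f"- **{label}:** {value or 'N/A'}")
    lines.append(f"- **Active:** {'Yes' if contact.get('active') else 'No'}")
    return "\n".join(lines)
```

Keeping the rendering in one pure function makes the output easy to compare between sync runs, which matters if unchanged files should not be rewritten.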

Asset.md Example:

# Computer Information: ACME-PC-001

- **Operating System:** Windows 11 Pro
- **Hardware Type:** Desktop
- **Internal IP:** 192.168.1.100
- **External IP:** 203.0.113.45
- **Last Logged In User:** john.doe
- **Status:** ✓ Online
- **Last Seen:** 2024-11-22 15:30:00
- **Domain:** acme.local

2. sync_tickets.py

Purpose: Syncs support tickets from Codex (originally from PSA system)

Creates:

/Companies/{Company}/Users/{User}/Tickets/
  Ticket_12345.md
  Ticket_12346.md

Run Command:

cd hivematrix-knowledgetree
source pyenv/bin/activate
python sync_tickets.py

What It Does:

  1. Fetches all companies from Codex
  2. For each company:
       • Fetches tickets from Codex /api/companies/{account}/tickets
       • Fetches contacts to map ticket requesters to users
       • For each ticket:
           • Finds the user's Tickets attached folder
           • Creates Ticket_{id}.md with full ticket details, conversations, and notes
           • Marks as read_only: true

Ticket.md Example:

# Ticket #12345: Email Configuration Issues

## Ticket Information
- **Requester:** John Doe (john.doe@acme.com)
- **Status:** Closed
- **Priority:** High
- **Created:** 2024-11-20 09:15:00
- **Last Updated:** 2024-11-21 14:30:00
- **Closed:** 2024-11-21 16:00:00
- **Hours Spent:** 2.50 hours

## Description
User unable to connect Outlook to Exchange server. Error: "Cannot connect to server."

## Conversation History

### Message 1 - → Incoming
**From:** john.doe@acme.com
**Date:** 2024-11-20 09:15:00

I'm getting an error when trying to set up my email in Outlook...

---

### Message 2 - ← Outgoing
**From:** support@msp.com
**Date:** 2024-11-20 10:00:00

Thank you for contacting support. Let's try reconfiguring your profile...

---

## Internal Notes

### Note 1
**From:** tech@msp.com
**Date:** 2024-11-20 10:30:00

Checked Exchange logs - user account was locked. Reset password and unlocked.

---

*Ticket data synced from Codex/PSA*

Sync Utilities

sync_utils.py - Shared helper function:

def ensure_node(session, parent_id, name, is_folder=True,
                is_attached=False, content='', read_only=True):
    """
    Creates or updates a node in Neo4j.

    Intelligently handles node creation/updates:
    1. Checks if node with same name exists under parent
    2. If found, reuses existing ID (preserves manually created nodes)
    3. If not found, generates deterministic path-based ID
    4. Creates or updates node with MERGE

    Returns: node_id
    """

Why This Matters:

  • Prevents duplicate folders when re-running sync
  • Preserves manually created nodes with UUIDs
  • Uses deterministic IDs for synced nodes (e.g., root_Companies_Acme)
  • Idempotent - safe to run multiple times
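
The ID-selection part of this behavior can be sketched without a database. The example below is an assumption-laden illustration, not the code in sync_utils.py: `path_based_id` and `resolve_node_id` are hypothetical names, and the rule of joining the parent ID and a sanitized name with underscores is inferred from the example IDs shown in this document (such as root_Companies_Acme_Corporation).

```python
import re


def path_based_id(parent_id: str, name: str) -> str:
    """Build a deterministic ID from the parent ID and the node name."""
    # Collapse every run of non-word characters (spaces, slashes, ...) into "_".
    safe = re.sub(r"\W+", "_", name.strip()).strip("_")
    return f"{parent_id}_{safe}"


def resolve_node_id(existing_children: dict, parent_id: str, name: str) -> str:
    """Pick the ID to MERGE on.

    existing_children maps child name -> id for nodes already under the parent.
    Reusing an existing ID keeps manually created (UUID) nodes intact.
    """
    if name in existing_children:
        return existing_children[name]
    return path_based_id(parent_id, name)
```

Because the same parent and name always yield the same ID, re-running a sync merges onto the existing node instead of creating a duplicate.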

Scheduling Syncs

Recommended Cron Jobs:

# cron runs jobs with /bin/sh, which may not provide `source`; `.` is the portable form
# Run company/asset sync daily at 2 AM
0 2 * * * cd /path/to/hivematrix-knowledgetree && . pyenv/bin/activate && python sync_codex.py >> logs/sync_codex.log 2>&1

# Run ticket sync every 4 hours
0 */4 * * * cd /path/to/hivematrix-knowledgetree && . pyenv/bin/activate && python sync_tickets.py >> logs/sync_tickets.log 2>&1

Order of Operations:

  1. PSA/Datto → Codex sync (see Codex documentation)
  2. Codex → KnowledgeTree company sync (sync_codex.py)
  3. Codex → KnowledgeTree ticket sync (sync_tickets.py)


Context System

The context system gathers the knowledge related to a node and returns it as a single markdown document for AI assistants such as Brainhair.

How Context Works

When you request context for a node, KnowledgeTree:

  1. Traverses the Path - Finds all ancestors from root to the target node
  2. Gathers Articles - Collects articles at each level of the path
  3. Includes Attached Folders - Automatically includes content from folders marked is_attached: true
  4. Excludes Specified Folders - Allows excluding certain attached folders
  5. Builds Hierarchical Context - Organizes by depth with markdown headers
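
The steps above can be sketched against an in-memory tree. This is a simplified illustration rather than the service's implementation: the function name `build_context`, the dictionary layout of `nodes`, and the exact line formats are assumptions modeled on the sample response shown later in this section. The real service reads the path and children from Neo4j.

```python
def build_context(nodes, path_ids, excluded_ids=()):
    """Assemble context markdown for a root-to-target path.

    nodes maps id -> dict with keys: name, is_folder, is_attached,
    content, children (list of child ids). path_ids lists the node ids
    from the root down to the target node.
    """
    out = []
    for depth, node_id in enumerate(path_ids, start=1):
        node = nodes[node_id]
        # Header level follows the depth in the path: "#", "##", "###", ...
        out.append(f"{'#' * depth} Context: {node['name']}")
        for child_id in node["children"]:
            child = nodes[child_id]
            if not child["is_folder"]:
                # Article directly under a node on the path.
                out.append(f"File: {child['name']}")
                out.append(child["content"])
            elif child["is_attached"] and child_id not in excluded_ids:
                # Attached folder: pull in its articles unless excluded.
                for article_id in child["children"]:
                    article = nodes[article_id]
                    if not article["is_folder"]:
                        out.append(
                            f"File: {article['name']} "
                            f"(from attached folder: {child['name']})"
                        )
                        out.append(article["content"])
    return "\n\n".join(out)
```

Regular (non-attached) subfolders are skipped on purpose in this sketch, so only the path itself and its attached folders contribute content.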

Example: Context for a User

If Brainhair asks about user "John Doe" at path:

/Companies/Acme Corporation/Users/John Doe/

Context API Call (excluded_ids is optional and lists attached folders to leave out):

POST /api/context/node_id_for_john_doe
Content-Type: application/json

{
  "excluded_ids": []
}

Context Response:

# Context: KnowledgeTree Root

## Context: Companies

### Context: Acme Corporation

File: README.md
> Company overview and important notes...

#### Context: Users

##### Context: John Doe

File: Contact.md

# Contact Information for John Doe
- **Email:** john.doe@acme.com
- **Title:** IT Manager
...

File: Ticket_12345.md (from attached folder: Tickets)

# Ticket #12345: Email Configuration Issues
...

File: Ticket_12346.md (from attached folder: Tickets)

# Ticket #12346: Password Reset Request
...

Attached Folders

Purpose: Special folders whose content is automatically included when building context for child nodes.

Use Cases:

  • Tickets Folder - Include all of a user's tickets in their context
  • Procedures Folder - Include standard procedures for all items in a category
  • Notes Folder - Include general notes for all company assets

Creating Attached Folders:

Via API:

POST /api/node
{
  "parent_id": "user_folder_id",
  "name": "Tickets",
  "is_folder": true,
  "is_attached": true
}

Via UI:

  • Attached folders are created automatically by sync scripts
  • Not currently creatable via UI (feature request)

Context API Usage

Endpoint: GET/POST /api/context/<node_id>

Parameters:

  • excluded_ids (array, optional) - List of attached folder IDs to exclude

Response:

{
  "context": "# Context: Root\n\n## Context: Section1\n\n..."
}

Example Use Case - Brainhair:

When a user asks Brainhair "What tickets does John Doe have?", Brainhair:

  1. Searches KnowledgeTree for "John Doe"
  2. Gets the node ID
  3. Calls /api/context/<node_id>
  4. Receives all of John's contact info and ticket history
  5. Uses this context to answer the question accurately
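
A client following this flow needs two requests: a search and a context call. The sketch below only constructs the requests with the Python standard library so it can be read and run without a live service. The helper names `search_request` and `context_request` are hypothetical, and `BASE_URL` assumes the direct service port listed in this document. How Brainhair actually calls KnowledgeTree is not shown in this document.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:5020"  # direct service port; adjust when going through a proxy


def search_request(token, query, start_node_id=None):
    """Build a GET /api/search request."""
    params = {"query": query}
    if start_node_id:
        params["start_node_id"] = start_node_id
    return Request(
        f"{BASE_URL}/api/search?{urlencode(params)}",
        headers={"Authorization": f"Bearer {token}"},
    )


def context_request(token, node_id, excluded_ids=None):
    """Build a context request: GET by default, POST when folders are excluded."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if excluded_ids:
        headers["Content-Type"] = "application/json"
        data = json.dumps({"excluded_ids": excluded_ids}).encode("utf-8")
    # urllib sends a POST automatically when a request body is present.
    return Request(f"{BASE_URL}/api/context/{node_id}", data=data, headers=headers)
```

Each request can then be sent with `urllib.request.urlopen(req)` and the response body parsed with `json.load`. The search result supplies the `id` that goes into `context_request`.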


API Reference

Search & Browse

Search Knowledge Base

GET /api/search?query={query}&start_node_id={node_id}

Description: Full-text search across node names and content.

Parameters:

  • query (required) - Search string (case-insensitive)
  • start_node_id (optional) - Limit search to descendants of this node (default: root)

Response:

[
  {
    "id": "node-uuid-123",
    "name": "Email Configuration Guide",
    "is_folder": false,
    "folder_path": "IT Documentation / Email / Configuration",
    "url_path": "IT%20Documentation/Email/Configuration"
  }
]

Example:

curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:5020/api/search?query=email+configuration"


Browse Folder

GET /api/browse?path={path}

Description: Browse the knowledge tree at a specific path. Returns categories (subfolders) and articles.

Parameters:

  • path (optional) - Path to browse (default: /)

Response:

{
  "path": "/IT Documentation",
  "current_node": {
    "id": "node-it-docs"
  },
  "categories": [
    {
      "name": "Network",
      "path": "/IT Documentation/Network"
    },
    {
      "name": "Server Management",
      "path": "/IT Documentation/Server Management"
    }
  ],
  "articles": [
    {
      "id": "node-readme",
      "title": "README.md",
      "summary": "This section contains IT documentation..."
    }
  ]
}

Use Case: Service-to-service calls to browse knowledge structure.


Node Management

Create Node

POST /api/node
Content-Type: application/json

{
  "parent_id": "parent-node-id",
  "name": "New Article",
  "is_folder": false,
  "is_attached": false
}

Description: Creates a new folder or article under a parent node.

Request Body:

  • parent_id (required) - Parent node ID
  • name (required) - Node name
  • is_folder (optional) - True for folders, false for articles (default: false)
  • is_attached (optional) - True for attached folders (default: false)

Response:

{
  "success": true,
  "id": "new-node-uuid"
}

Error Response (409):

{
  "error": "A node with this name already exists in this location",
  "existing_id": "existing-node-uuid"
}


Get Node Details

GET /api/node/<node_id>

Description: Retrieves node details including content and attached files.

Response:

{
  "id": "node-uuid",
  "name": "Email Setup Guide",
  "content": "# Email Setup\n\nFollow these steps...",
  "content_html": "<h1>Email Setup</h1><p>Follow these steps...</p>",
  "is_folder": false,
  "is_attached": false,
  "read_only": false,
  "files": [
    {
      "id": "file-uuid",
      "filename": "screenshot.png"
    }
  ]
}

Features:

  • Markdown content converted to HTML
  • Supports strikethrough syntax (~~text~~ → <del>text</del>)
  • Returns list of attached files


Update Node

PUT /api/node/<node_id>
Content-Type: application/json

{
  "name": "Updated Article Title",
  "content": "# Updated Content\n\nNew markdown here..."
}

Description: Updates node name and/or content.

Request Body:

  • name (optional) - New node name
  • content (optional) - New markdown content

Response:

{
  "success": true
}


Delete Node

DELETE /api/node/<node_id>

Description: Deletes a node and all its descendants.

⚠️ Warning: This is a recursive delete! All children will be removed.

Response:

{
  "success": true
}


Move Node

POST /api/node/<node_id>/move
Content-Type: application/json

{
  "new_parent_id": "target-parent-id"
}

Description: Moves a node to a new parent folder.

Validations:

  • Target must be a folder
  • Cannot move root node
  • Cannot move folder into itself or descendants (prevents cycles)

Response:

{
  "success": true
}

Error Response (400):

{
  "error": "Cannot move a folder into itself or its descendants"
}
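
The three validations listed for this endpoint can be illustrated with plain dictionaries. This is a sketch under stated assumptions, not the service's code: `validate_move` and its `nodes` and `parent_of` arguments are hypothetical, and only the cycle error message is taken from the response above. The other two messages are invented for the example. The real service would perform the equivalent checks with Cypher queries.

```python
def validate_move(nodes, parent_of, node_id, new_parent_id):
    """Return an error message, or None if the move is allowed.

    nodes maps id -> {"is_folder": bool}; parent_of maps child id -> parent id.
    """
    if node_id == "root":
        return "Cannot move the root node"
    if not nodes[new_parent_id]["is_folder"]:
        return "Target must be a folder"
    # Walk up from the target. Meeting node_id on the way means the target
    # is the node itself or one of its descendants, so the move would
    # create a cycle.
    current = new_parent_id
    while current is not None:
        if current == node_id:
            return "Cannot move a folder into itself or its descendants"
        current = parent_of.get(current)
    return None
```

Walking upward from the target costs at most the depth of the tree, which is cheaper than enumerating every descendant of the node being moved.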


Get Node Children

GET /api/node/<node_id>/children

Description: Get immediate children of a node.

Response:

[
  {
    "id": "child-1",
    "name": "Subfolder",
    "is_folder": true,
    "is_attached": false,
    "read_only": false
  },
  {
    "id": "child-2",
    "name": "Article.md",
    "is_folder": false,
    "is_attached": false,
    "read_only": false
  }
]


Folder Tree

Get Folder Tree

GET /api/folders/tree

Description: Returns the entire folder hierarchy as a nested tree structure (folders only, no articles).

Response:

{
  "id": "root",
  "name": "KnowledgeTree Root",
  "is_attached": false,
  "children": [
    {
      "id": "it-docs",
      "name": "IT Documentation",
      "is_attached": false,
      "children": [
        {
          "id": "network",
          "name": "Network",
          "is_attached": false,
          "children": []
        }
      ]
    }
  ]
}

Use Case: Building folder picker UI, navigation trees, etc.


File Management

Upload File

POST /api/upload/<node_id>
Content-Type: multipart/form-data

file: <file_data>

Description: Upload a file attachment to a node.

Response:

{
  "success": true,
  "filename": "screenshot.png"
}

File Storage:

  • Files saved to instance/uploads/
  • Original filename preserved
  • Creates HAS_FILE relationship in Neo4j


Download File

GET /uploads/<filename>

Description: Serves an uploaded file.

Example:

<img src="/knowledgetree/uploads/screenshot.png">


Context Management

Get Context Tree

GET /api/context/tree/<node_id>

Description: Get list of attached folders in the path from root to this node.

Response:

{
  "attached_folders": [
    {
      "id": "tickets-folder",
      "name": "Tickets"
    }
  ]
}

Use Case: Display which attached folders will be included in context.


Get Full Context

GET /api/context/<node_id>
POST /api/context/<node_id>
Content-Type: application/json

{
  "excluded_ids": ["folder-id-to-exclude"]
}

Description: Gathers full context for a node by traversing from root and including attached folders.

Request Body (POST only):

  • excluded_ids (array, optional) - List of attached folder IDs to exclude from context

Response:

{
  "context": "# Context: Root\n\n## Context: Section\n\nFile: Article.md\n\nContent here..."
}

How It Works:

  1. Finds path from root to target node
  2. For each node in path:
       • Collects direct child articles
       • Collects articles from attached folders (unless excluded)
  3. Builds hierarchical markdown with depth-based headers

Example Usage:

# Get full context
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:5020/api/context/node-uuid

# Exclude specific attached folder
curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"excluded_ids": ["tickets-folder-id"]}' \
  http://localhost:5020/api/context/node-uuid


Admin Operations

Admin Settings Page

GET /admin/settings

Description: Admin control panel for KnowledgeTree.

Permission: Requires admin permission level.

Features:

  • View Neo4j configuration
  • View Codex integration status
  • Trigger data syncs
  • View sync statistics
  • Database management tools


Trigger Codex Sync

POST /admin/sync/codex

Description: Manually trigger sync_codex.py to sync companies, users, and assets from Codex.

Permission: Admin only.

Response:

{
  "success": true,
  "message": "Codex sync completed successfully",
  "output": "--- KnowledgeTree Codex Sync ---\n..."
}

Timeout: 5 minutes


Trigger Ticket Sync

POST /admin/sync/tickets
Content-Type: application/json

{
  "overwrite": false
}

Description: Manually trigger sync_tickets.py to sync tickets from Codex.

Parameters:

  • overwrite (optional) - Whether to overwrite existing tickets (default: false)

Permission: Admin only.

Response:

{
  "success": true,
  "message": "Ticket sync completed successfully",
  "output": "--- KnowledgeTree Ticket Sync ---\n..."
}

Timeout: 10 minutes


Get Sync Status

GET /admin/sync/status

Description: Get statistics about synced data.

Permission: Admin only.

Response:

{
  "company_items": 150,
  "ticket_count": 500
}


Export Data

GET /admin/export

Description: Export all user-created (non-read-only) data to JSON file.

Permission: Admin only.

Response: File download knowledgetree_export.json

Export Format:

[
  {
    "path": "IT Documentation/Network/Router Config.md",
    "content": "# Router Configuration\n...",
    "is_folder": false,
    "is_attached": false
  }
]

What Gets Exported:

  • User-created nodes only (read_only: false)
  • Full path from root
  • Content and metadata
  • Does NOT export synced data from Codex


Import Data

POST /admin/import
Content-Type: multipart/form-data

file: <json_export_file>

Description: Import data from a previously exported JSON file.

Permission: Admin only.

Process:

  1. Sorts items by path (ensures parents are created before children)
  2. Traverses from root to find parent node
  3. Uses MERGE to create or update nodes
  4. Preserves existing node IDs

Response:

{
  "success": true,
  "message": "Import successful."
}

Use Cases:

  • Restore from backup
  • Migrate knowledge between environments
  • Bulk import documentation
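
The ordering step of the import process is what guarantees that a parent folder exists before its children are merged. A minimal sketch of that step follows. `order_for_import` is a hypothetical helper operating on records in the export format shown earlier, and sorting by depth first is one way to get the parent-before-child order, not necessarily the one the service uses.

```python
def order_for_import(items):
    """Sort export records so every parent path precedes its children."""

    def depth(item):
        # "A/B/C.md" has two separators, so it sits at depth 2.
        return item["path"].count("/")

    # Shallower paths come first; the path itself breaks ties deterministically.
    return sorted(items, key=lambda item: (depth(item), item["path"]))
```

With this order, an importer can walk the list once and always find the parent of the current item already in the graph.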


Wipe Database

POST /admin/wipe

Description: ⚠️ DANGER! Deletes all nodes and re-initializes the root node.

Permission: Admin only.

Process:

MATCH (n) DETACH DELETE n

Then recreates:

MERGE (r:ContextItem {id: 'root', name: 'KnowledgeTree Root'})

Response:

{
  "success": true,
  "message": "Database wiped and re-initialized."
}

⚠️ Use with extreme caution! This cannot be undone.


User Interface

Browse View

URL: /browse/ or /browse/<path>

Features:

  • Breadcrumb Navigation - Shows current path with clickable links
  • View Modes:
      • Grid view - Icon-based display
      • List view - Compact table
      • Tree view - Hierarchical tree
  • Search:
      • Scope: Current folder or all items
      • Real-time results
      • Shows full path to results
  • Context Menu:
      • Open (double-click or menu)
      • Rename
      • Move
      • Delete
      • Create folder
      • Create article
  • Visual Indicators:
      • 📁 Folder icon
      • 📄 Article icon
      • 📎 Attached folder icon
      • Read-only badge for synced content

Keyboard Shortcuts:

  • Enter - Open selected item
  • Delete - Delete selected item
  • Ctrl+N - New article
  • Ctrl+Shift+N - New folder


Article Viewer

URL: /view/<node_id>

Features:

  • Markdown Rendering:
      • Headers, lists, tables
      • Code blocks with syntax highlighting
      • Inline images
      • Links
      • Strikethrough support
  • Edit Mode:
      • Live markdown editor
      • Preview toggle
      • Auto-save
  • File Attachments:
      • Upload files
      • Download attachments
      • Preview images
  • Breadcrumb Navigation - Return to folder view
  • Metadata Display:
      • Read-only indicator for synced content
      • Last modified (if available)


Admin Dashboard

URL: /admin/settings

Sections:

1. Configuration

  • Neo4j URI and connection status
  • Codex integration URL
  • Configuration file location

2. Data Sync

  • Trigger company/asset sync from Codex
  • Trigger ticket sync from Codex
  • View sync statistics (item counts)

3. Database Management

  • Export user-created data
  • Import from JSON backup
  • Wipe database (danger zone)


Configuration

Database Configuration

File: instance/knowledgetree.conf

Format: INI-style configuration (use RawConfigParser)

[database]
neo4j_uri = bolt://localhost:7687
neo4j_user = neo4j
neo4j_password = your-secure-password

[services]
codex_url = http://localhost:5010

Environment Variables

File: .flaskenv (auto-generated by Helm's config_manager.py)

# Core Service (for JWT validation)
CORE_SERVICE_URL=http://localhost:5000

# Service Identity
SERVICE_NAME=knowledgetree

# Logging
LOG_LEVEL=INFO
ENABLE_JSON_LOGGING=true

# Neo4j (fallback if not in config file)
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=
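
The relationship between the config file and the environment fallback can be sketched as follows. This is an illustration under assumptions: `load_neo4j_settings` is a hypothetical helper, the defaults mirror the values shown in this section, and the precedence (file first, then environment) is inferred from the "fallback if not in config file" comment above. RawConfigParser is used because it performs no `%` interpolation, so passwords containing `%` are read back unchanged.

```python
from configparser import RawConfigParser


def load_neo4j_settings(conf_text, env):
    """Resolve Neo4j settings from INI text, falling back to environment values.

    conf_text is the content of instance/knowledgetree.conf (may be empty);
    env is a mapping such as os.environ.
    """
    parser = RawConfigParser()
    parser.read_string(conf_text)

    def pick(option, env_name, default=""):
        # has_option returns False when the [database] section is missing.
        if parser.has_option("database", option):
            return parser.get("database", option)
        return env.get(env_name, default)

    return {
        "uri": pick("neo4j_uri", "NEO4J_URI", "bolt://localhost:7687"),
        "user": pick("neo4j_user", "NEO4J_USER", "neo4j"),
        "password": pick("neo4j_password", "NEO4J_PASSWORD"),
    }
```

In an application this might be called as `load_neo4j_settings(open("instance/knowledgetree.conf").read(), os.environ)`.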

Upload Configuration

Upload Directory: instance/uploads/

Created automatically on startup if it doesn't exist.

File Serving: Files accessible at /uploads/<filename>


Installation & Setup

Prerequisites

  1. Neo4j 5.14.0+ installed and running
  2. Python 3.8+ with pip
  3. Codex service running and configured (for sync)

Install Neo4j

Automatic (via Helm):

cd hivematrix-helm
./start.sh  # Installs Neo4j if not present

Manual:

# Ubuntu/Debian
wget -O - https://debian.neo4j.com/neotechnology.gpg.key | sudo apt-key add -
echo 'deb https://debian.neo4j.com stable latest' | sudo tee /etc/apt/sources.list.d/neo4j.list
sudo apt-get update
sudo apt-get install neo4j

# Start Neo4j
sudo systemctl enable neo4j
sudo systemctl start neo4j

# Set initial password
cypher-shell -u neo4j -p neo4j
# Change password when prompted

Verify:

sudo systemctl status neo4j
# Neo4j browser: http://localhost:7474

Install KnowledgeTree

Via Helm:

cd hivematrix-helm
source pyenv/bin/activate
python install_manager.py install knowledgetree
./start.sh

Manual:

cd hivematrix-knowledgetree

# Create virtual environment
python3 -m venv pyenv
source pyenv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Symlink services.json
ln -sf ../hivematrix-helm/services.json services.json

# Run interactive setup
python init_db.py

Interactive Setup (init_db.py)

The setup wizard prompts for:

1. Neo4j Configuration

  • URI: bolt://localhost:7687
  • Username: neo4j
  • Password: Your Neo4j password
  • Tests connection before saving

2. Codex Integration

  • Codex Service URL: http://localhost:5010

3. Configuration Sync

  • Saves to instance/knowledgetree.conf
  • Updates Helm's master_config.json
  • Syncs to .flaskenv

4. Database Initialization

  • Creates root node
  • Primes schema with dummy nodes (deleted immediately)

First-Time Data Sync

After installation, sync data from Codex:

source pyenv/bin/activate

# Sync companies, users, assets
python sync_codex.py

# Sync tickets
python sync_tickets.py

Development

Running Locally

Development Server:

cd hivematrix-knowledgetree
source pyenv/bin/activate
python run.py

Access:

  • Direct: http://localhost:5020/
  • Via Nexus: https://your-server/knowledgetree/

Development Mode:

# run.py
if __name__ == '__main__':
    app.run(host='127.0.0.1', port=5020, debug=True)  # Enable debug mode

Code Structure

hivematrix-knowledgetree/
├── app/
│   ├── __init__.py              # Flask app setup, Neo4j initialization
│   ├── auth.py                  # @token_required, @admin_required
│   ├── routes.py                # All endpoints and UI routes
│   ├── service_client.py        # call_service() for Codex integration
│   ├── rate_limit_key.py        # Per-user rate limiting
│   ├── error_responses.py       # RFC 7807 error responses
│   ├── structured_logger.py     # JSON logging with correlation IDs
│   ├── helm_logger.py           # Centralized logging to Helm
│   ├── version.py               # Git-based version generation
│   └── templates/
│       ├── index.html           # Main browse interface
│       ├── view.html            # Article viewer/editor
│       ├── error.html           # Error page
│       └── admin/
│           └── settings.html    # Admin dashboard
├── instance/
│   ├── knowledgetree.conf       # Database config
│   └── uploads/                 # File attachments
├── sync_codex.py                # Company/user/asset sync script
├── sync_tickets.py              # Ticket sync script
├── sync_utils.py                # Shared sync utilities (ensure_node)
├── init_db.py                   # Interactive setup wizard
├── run.py                       # Application entry point
├── health_check.py              # Health check library
├── requirements.txt             # Python dependencies
├── .flaskenv                    # Environment variables (auto-generated)
└── services.json                # Symlink to Helm's registry

Key Components

app/__init__.py:

  • Initializes Flask app
  • Connects to Neo4j with driver pooling
  • Sets up ProxyFix middleware
  • Configures rate limiter (10000/hour, 500/minute)
  • Registers error handlers (RFC 7807)
  • Initializes Swagger documentation
  • Creates root node if missing

app/routes.py:

  • Main routes: /, /browse/<path>, /view/<node_id>
  • API endpoints: /api/search, /api/node, /api/context, etc.
  • Admin routes: /admin/settings, /admin/sync/*
  • Health check: /health

app/auth.py:

  • @token_required - Validates JWT from Core
  • @admin_required - Checks admin permission
  • Supports both user tokens and service tokens
  • Sets g.user, g.service, g.is_service_call

sync_codex.py:

  • Fetches companies from Codex /api/companies
  • For each company, fetches users and assets
  • Creates hierarchical folder structure
  • Generates Contact.md for each user
  • Generates Asset.md for each device
  • Uses ensure_node() for idempotent creation

sync_tickets.py:

  • Fetches tickets from Codex /api/companies/{account}/tickets
  • Creates full ticket markdown with conversations and notes
  • Stores in user's Tickets attached folder
  • Marks tickets as read-only

sync_utils.py:

  • ensure_node() - Creates or updates a node
  • Prevents duplicates by checking existing nodes
  • Reuses UUIDs from manually created nodes
  • Generates deterministic IDs for synced nodes

Adding New Features

Example: Add Related Articles Feature

  1. Update Graph Schema (add relationship):

    (:ContextItem)-[:RELATED_TO]->(:ContextItem)
    

  2. Create API Endpoint (app/routes.py):

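    # Assumes app/routes.py already imports request and jsonify from flask
    @app.route('/api/node/<node_id>/related', methods=['POST'])
    @token_required
    def add_related_article(node_id):
        # silent=True yields None instead of raising on a missing or invalid JSON body
        data = request.get_json(silent=True) or {}
        related_id = data.get('related_id')
        if not related_id:
            # Without this check the MATCH finds nothing and the call still reports success
            return jsonify({'error': 'related_id is required'}), 400

        driver, error = get_neo4j_driver()
        if error:
            return error

        with driver.session() as session:
            # MERGE keeps the relationship unique if the same pair is linked twice
            session.run("""
                MATCH (source:ContextItem {id: $source_id})
                MATCH (target:ContextItem {id: $target_id})
                MERGE (source)-[:RELATED_TO]->(target)
            """, source_id=node_id, target_id=related_id)

        return jsonify({'success': True})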
    

  3. Query Related Articles:

    @app.route('/api/node/<node_id>/related', methods=['GET'])
    @token_required
    def get_related_articles(node_id):
        driver, error = get_neo4j_driver()
        if error:
            return error
    
        with driver.session() as session:
            result = session.run("""
                MATCH (source:ContextItem {id: $node_id})-[:RELATED_TO]->(related)
                RETURN related.id as id, related.name as name
            """, node_id=node_id)
    
            related = [dict(record) for record in result]
    
        return jsonify(related)
    

  4. Update UI (templates/view.html):

    <div class="related-articles">
      <h3>Related Articles</h3>
      <ul id="related-list"></ul>
    </div>
    
    <script>
    fetch(`/api/node/${nodeId}/related`, {credentials: 'same-origin'})
      .then(r => r.json())
      .then(related => {
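        related.forEach(item => {
          const li = document.createElement('li');
          const link = document.createElement('a');
          link.href = `/view/${encodeURIComponent(item.id)}`;
          // textContent keeps article names from being interpreted as HTML
          link.textContent = item.name;
          li.appendChild(link);
          document.getElementById('related-list').appendChild(li);
        });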
      });
    </script>
    

Testing

Manual Testing with JWT:

cd hivematrix-helm
source pyenv/bin/activate

# Generate test token
TOKEN=$(python create_test_token.py 2>/dev/null)

# Test search
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:5020/api/search?query=email"

# Test browse
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:5020/api/browse?path=/IT%20Documentation"

# Test create node
curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"parent_id":"root","name":"Test Folder","is_folder":true}' \
  http://localhost:5020/api/node

# Test get context
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:5020/api/context/some-node-id

Neo4j Cypher Testing:

cypher-shell -u neo4j -p your-password

// List all folders
MATCH (n:ContextItem {is_folder: true})
RETURN n.name, n.id
LIMIT 20;

// Find path to a node
MATCH p = (:ContextItem {id: 'root'})-[:PARENT_OF*..]->(target {name: 'Contact.md'})
RETURN [n IN nodes(p) | n.name];

// Count articles vs folders
MATCH (n:ContextItem)
RETURN n.is_folder, count(n);

// Find all attached folders
MATCH (n:ContextItem {is_attached: true})
RETURN n.name, n.id;


Monitoring & Logging

Health Check

Endpoint: GET /health

Checks:

  • Neo4j database connectivity
  • Disk space availability
  • Core service availability
  • Codex service availability

Response (200 - Healthy):

{
  "status": "healthy",
  "timestamp": "2024-11-22T10:30:00Z",
  "service": "knowledgetree",
  "checks": {
    "neo4j": {
      "status": "healthy",
      "message": "Connected to Neo4j"
    },
    "disk": {
      "status": "healthy",
      "usage_percent": 45.2,
      "available_gb": 120.5
    },
    "dependencies": {
      "core": {
        "status": "healthy",
        "response_time_ms": 15
      },
      "codex": {
        "status": "healthy",
        "response_time_ms": 23
      }
    }
  }
}

Response (503 - Unhealthy):

{
  "status": "degraded",
  "timestamp": "2024-11-22T10:30:00Z",
  "service": "knowledgetree",
  "checks": {
    "neo4j": {
      "status": "unhealthy",
      "error": "Connection refused"
    }
  }
}

Monitoring:

# Check health
curl http://localhost:5020/health | jq

# Monitor in loop
watch -n 5 'curl -s http://localhost:5020/health | jq .status'
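
A monitoring script may need to know which check failed, not just the top-level status. The following is a minimal sketch, assuming the response shape shown in the examples above; `failing_checks` is a hypothetical helper and not part of KnowledgeTree.

```python
def _walk(checks, prefix):
    """Collect dotted names of checks whose status is not 'healthy'."""
    failures = []
    for name, value in checks.items():
        if not isinstance(value, dict):
            continue
        if "status" in value:
            if value["status"] != "healthy":
                failures.append(prefix + name)
        else:
            # Grouping entry such as "dependencies": descend into it.
            failures.extend(_walk(value, prefix + name + "."))
    return failures


def failing_checks(health):
    """Return the failing checks from a parsed /health response (a dict)."""
    return _walk(health.get("checks", {}), "")
```

Feeding it the parsed JSON body of a 503 response yields names such as `neo4j` or `dependencies.codex`, which can be passed straight into an alert message.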

Structured Logging

Log Format: JSON with correlation IDs

Example Log Entry:

{
  "timestamp": "2024-11-22T10:30:00Z",
  "level": "INFO",
  "service": "knowledgetree",
  "correlation_id": "req-abc123",
  "user": "john.doe@example.com",
  "message": "Article created",
  "extra": {
    "node_id": "new-article-uuid",
    "parent_path": "/IT Documentation/Network"
  }
}
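
The entry above can be produced with the standard `logging` module. This is a sketch under assumptions, not the service's actual logging code: the `JsonFormatter` class and the `extra_fields` attribute name are hypothetical, and the field names are copied from the example entry.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": timestamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "service": self.service,
            # Values passed through logging's `extra=` become record attributes.
            "correlation_id": getattr(record, "correlation_id", None),
            "user": getattr(record, "user", None),
            "message": record.getMessage(),
            "extra": getattr(record, "extra_fields", {}),
        }
        return json.dumps(entry)
```

A handler configured with `JsonFormatter("knowledgetree")` would then accept calls like `logger.info("Article created", extra={"correlation_id": "req-abc123", "extra_fields": {"node_id": "new-article-uuid"}})`.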

View Centralized Logs:

cd hivematrix-helm
source pyenv/bin/activate

# View KnowledgeTree logs
python logs_cli.py knowledgetree --tail 50

# Filter by level
python logs_cli.py knowledgetree --level ERROR --tail 100

# Real-time monitoring
watch -n 2 'python logs_cli.py knowledgetree --tail 20'

Rate Limiting

Configuration:

  • Per-user limits: 10000 requests/hour, 500 requests/minute
  • Key: JWT subject (sub claim) or IP address fallback
  • Storage: In-memory (resets on restart)

Rate Limit Headers:

X-RateLimit-Limit: 500
X-RateLimit-Remaining: 498
X-RateLimit-Reset: 1700654400

Rate Limit Exceeded (429):

{
  "type": "https://httpstatuses.com/429",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "Rate limit exceeded. Try again later."
}
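
A client that receives a 429 can use the rate limit headers to decide how long to wait. The sketch below assumes `X-RateLimit-Reset` carries a Unix timestamp in seconds, which matches the example value above but should be confirmed against the running service. `seconds_until_reset` and the one-second fallback are illustrative choices.

```python
import time


def seconds_until_reset(headers, now=None, fallback=1.0):
    """Return how long to wait before retrying, based on X-RateLimit-Reset."""
    now = time.time() if now is None else now
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return fallback  # header missing: use the assumed default delay
    try:
        return max(0.0, float(reset) - now)
    except ValueError:
        return fallback  # header present but not numeric
```

A caller would sleep for the returned number of seconds and then retry the request once.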

Metrics to Monitor

Application:

  • Request rate and response times
  • Rate limit violations
  • Authentication failures
  • API error rates

Neo4j:

  • Database size and growth rate
  • Query execution times
  • Connection pool usage
  • Memory usage

Business:

  • Number of articles created per day
  • Search query volume
  • Popular search terms
  • Sync job success/failure rates


Troubleshooting

Neo4j Connection Issues

Symptom: Database not configured error

Check Neo4j Status:

sudo systemctl status neo4j

Common Issues:

  1. Neo4j not running:

    sudo systemctl start neo4j
    sudo systemctl enable neo4j  # Auto-start on boot
    

  2. Wrong credentials in config:

    cd hivematrix-knowledgetree
    cat instance/knowledgetree.conf
    
    # Test connection manually
    cypher-shell -u neo4j -p your-password
    

  3. Firewall blocking bolt port:

    sudo ufw allow 7687/tcp  # Bolt protocol
    sudo ufw allow 7474/tcp  # Browser UI
    

  4. Re-run setup:

    python init_db.py
    

Search Not Working

Symptom: Search returns no results for known content

Create Full-Text Index:

# Connect to Neo4j
cypher-shell -u neo4j -p your-password

// Create full-text search index (no-op if it already exists)
CREATE FULLTEXT INDEX article_search IF NOT EXISTS
FOR (n:ContextItem)
ON EACH [n.name, n.content];

// Verify index
SHOW INDEXES;

Query Index:

// Search using full-text index
CALL db.index.fulltext.queryNodes("article_search", "email configuration")
YIELD node, score
RETURN node.name, score;
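
The full-text index is queried with Lucene syntax, so characters such as `:` or `/` in a search term are treated as operators and can cause parse errors or empty results. A minimal sketch of escaping user input before passing it to `db.index.fulltext.queryNodes`; `escape_lucene` is a hypothetical helper, and whether KnowledgeTree already escapes its search input is not stated here.

```python
import re

# Characters with special meaning in Lucene's classic query syntax.
_LUCENE_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape_lucene(term):
    """Prefix every Lucene special character with a backslash."""
    return _LUCENE_SPECIAL.sub(r"\\\1", term)
```

The escaped string is then passed as a query parameter, so a search for `C:/temp` matches that literal text instead of being parsed as a field query.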

Sync Scripts Failing

Symptom: sync_codex.py or sync_tickets.py errors

Common Issues:

  1. Codex not running:

    cd hivematrix-helm
    python cli.py status codex
    python cli.py start codex
    

  2. Service token expired:

    # Tokens expire after 5 minutes
    # Sync scripts request new token for each run
    # Check Codex logs for auth failures
    python logs_cli.py codex --tail 50
    

  3. Codex has no data:

    # Ensure Codex has synced from PSA/Datto first
    cd hivematrix-codex
    python pull_freshservice.py  # or your PSA sync script
    

  4. Neo4j constraint violations:

    // Check for constraint errors
    SHOW CONSTRAINTS;
    
    // Drop problematic constraints if needed
    DROP CONSTRAINT constraint_name;
    

  5. Permissions issues:

    # Ensure the sync scripts can read the Neo4j credentials stored in instance/
    ls -la instance/
    chmod 755 instance/
    

Debug Sync:

# Run with verbose output
python sync_codex.py 2>&1 | tee sync.log

# Check for errors
grep -i error sync.log
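
Because service tokens expire after 5 minutes, a sync that runs longer than that can start failing partway through with authentication errors. One way to avoid this is to refresh the token shortly before it expires. The sketch below is illustrative only: `ServiceTokenCache`, the 30-second safety margin, and the injected `fetch_token` callable are assumptions, not part of the sync scripts.

```python
import time


class ServiceTokenCache:
    """Reuse a service token until shortly before its expiry."""

    def __init__(self, fetch_token, ttl_seconds=300, margin_seconds=30,
                 clock=time.monotonic):
        self._fetch = fetch_token      # callable that requests a new token
        self._ttl = ttl_seconds        # 5 minutes, per the note above
        self._margin = margin_seconds  # refresh this long before expiry
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

Each outgoing request calls `cache.get()` instead of holding on to one token for the whole run.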

Slow Performance

Symptom: Slow searches, browsing, or context building

Create Indexes:

// Index on node IDs (should exist automatically)
CREATE CONSTRAINT context_item_id IF NOT EXISTS
FOR (n:ContextItem) REQUIRE n.id IS UNIQUE;

// Index on folder flag for faster queries
CREATE INDEX folder_index IF NOT EXISTS
FOR (n:ContextItem) ON (n.is_folder);

// Index on read_only flag
CREATE INDEX readonly_index IF NOT EXISTS
FOR (n:ContextItem) ON (n.read_only);

Check Query Performance:

// PROFILE runs the query and reports actual rows and db hits per operator
PROFILE MATCH p = (:ContextItem {id: 'root'})-[:PARENT_OF*..]->(n)
WHERE n.name CONTAINS 'search term'
RETURN n;

// EXPLAIN shows the planned execution without running the query
EXPLAIN MATCH (n:ContextItem)
WHERE n.content CONTAINS 'keyword'
RETURN n;

Optimize Neo4j Configuration:

# Edit Neo4j config
sudo nano /etc/neo4j/neo4j.conf

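# Increase memory (adjust based on available RAM)
# Neo4j 5.x setting names; on 4.x use dbms.memory.heap.initial_size,
# dbms.memory.heap.max_size, and dbms.memory.pagecache.size
server.memory.heap.initial_size=1G
server.memory.heap.max_size=2G
server.memory.pagecache.size=1G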

# Restart Neo4j
sudo systemctl restart neo4j

Check Database Size:

// Count nodes
MATCH (n) RETURN count(n);

// Count relationships
MATCH ()-[r]->() RETURN count(r);

// Find large content nodes
MATCH (n:ContextItem)
WHERE size(n.content) > 10000
RETURN n.name, size(n.content) as size
ORDER BY size DESC;

Missing Data After Sync

Symptom: Companies or tickets not appearing in KnowledgeTree

Verify Codex Has Data:

TOKEN=$(python create_test_token.py 2>/dev/null)

# Check companies
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:5010/codex/api/companies | jq

# Check tickets for a company
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:5010/codex/api/companies/12345/tickets | jq

Check Neo4j for Synced Data:

// Count synced companies
MATCH (root:ContextItem {id: 'root'})-[:PARENT_OF]->(companies {name: 'Companies'})
MATCH (companies)-[:PARENT_OF]->(company)
RETURN count(company);

// List companies
MATCH (:ContextItem {id: 'root'})-[:PARENT_OF]->(:ContextItem {name: 'Companies'})-[:PARENT_OF]->(c)
RETURN c.name, c.id;

// Count tickets
MATCH (n:ContextItem)
WHERE n.name STARTS WITH 'Ticket_'
RETURN count(n);

Re-run Sync:

# Delete synced data and re-sync
cypher-shell -u neo4j -p your-password

// Delete Companies folder and all children
MATCH (root:ContextItem {id: 'root'})-[:PARENT_OF]->(companies {name: 'Companies'})
MATCH (companies)-[:PARENT_OF*0..]->(child)
DETACH DELETE companies, child;

# Re-run sync
python sync_codex.py
python sync_tickets.py

Rate Limit Errors

Symptom: 429 Too Many Requests

Check Limits:

# app/__init__.py
limiter = Limiter(
    app=app,
    key_func=get_user_id_or_ip,
    default_limits=["10000 per hour", "500 per minute"],
    storage_uri="memory://"
)

Increase Limits (if needed):

# For specific endpoints
@app.route('/api/search')
@limiter.limit("1000 per minute")  # Override default
@token_required
def search_nodes():
    ...

Exempt Endpoint from Rate Limiting:

@app.route('/api/public-endpoint')
@limiter.exempt
def public_endpoint():
    ...

File Upload Failures

Symptom: File upload returns 500 error

Check Upload Directory:

ls -la instance/uploads/
chmod 755 instance/uploads/

Check Disk Space:

df -h

Check File Size Limits:

# app/__init__.py
app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024  # 16 MB limit

# Increase if needed
app.config['MAX_CONTENT_LENGTH'] = 100 * 1024 * 1024  # 100 MB

Check File Permissions:

# Ensure Flask can write to uploads folder
ls -la instance/uploads/
sudo chown -R $USER:$USER instance/uploads/


Security

Authentication

All endpoints require JWT authentication except:

  • /health - Public health check

Token Validation:

  • Validates signature against Core's JWKS endpoint
  • Checks expiration timestamp
  • Verifies issuer (hivematrix-core)
  • Extracts user/service identity
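
The claim checks in the list above can be sketched as follows. This covers only the expiration, issuer, and identity steps on an already-decoded claims dictionary. Signature verification against Core's JWKS must happen first and is not shown. `check_claims` and `TokenRejected` are hypothetical names, and using the `sub` claim as the identity is an assumption based on the rate limiting key.

```python
import time


class TokenRejected(Exception):
    """Raised when a token's claims fail validation."""


def check_claims(claims, expected_issuer="hivematrix-core", now=None):
    """Validate decoded JWT claims and return the caller's identity."""
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise TokenRejected("token expired or missing exp")
    if claims.get("iss") != expected_issuer:
        raise TokenRejected("unexpected issuer")
    identity = claims.get("sub")
    if not identity:
        raise TokenRejected("missing subject")
    return identity
```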

Authorization

Permission Levels:

  • Admin - Full access including admin endpoints
  • Technician - Create/edit user nodes, view all data
  • Billing - View access only
  • Client - View scoped to company data

Endpoint Protection:

@app.route('/admin/wipe')
@admin_required  # Only admins
def admin_wipe():
    ...

@app.route('/api/node', methods=['POST'])
@token_required  # All authenticated users
def create_node():
    # Additional checks inside route
    if g.user.get('permission_level') not in ['admin', 'technician']:
        abort(403)
    ...

Data Security

Read-Only Nodes:

  • Synced data marked read_only: true
  • UI prevents editing
  • API should validate before updates (feature request)

Input Sanitization:

  • Markdown content sanitized during rendering
  • Prevents XSS via malicious markdown
  • File uploads: validate file types, scan for malware (recommended)

Network Security:

  • Localhost binding: Service listens on 127.0.0.1 only
  • Nexus proxy: Only entry point to KnowledgeTree
  • Neo4j: Should be firewalled, accessible only from localhost
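
Since server-side enforcement of read-only nodes is listed as a feature request, the following sketch shows one possible shape for that check. It is not existing KnowledgeTree code: `ensure_writable` and `ReadOnlyNodeError` are hypothetical, and the node is assumed to be a dictionary carrying the `read_only` flag described above.

```python
class ReadOnlyNodeError(Exception):
    """Raised when an update targets a node synced from an external source."""


def ensure_writable(node):
    """Return the node unchanged, or raise if it is marked read-only."""
    if node.get("read_only", False):
        raise ReadOnlyNodeError(
            f"node {node.get('id')} is synced data and cannot be edited"
        )
    return node
```

An update route would call this after loading the node and translate the exception into a 403 response.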

Neo4j Security

Authentication:

  • Use strong password for Neo4j user
  • Change default neo4j password immediately
  • Store password securely in instance/knowledgetree.conf

Network Access:

# Allow localhost first (ufw evaluates rules in the order they were added)
sudo ufw allow from 127.0.0.1 to any port 7687

# Block external access to Neo4j
sudo ufw deny 7687/tcp  # Bolt protocol
sudo ufw deny 7474/tcp  # Browser UI

Backup Encryption:

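# Encrypt Neo4j dumps (Neo4j 5.x syntax; writes /backup/neo4j.dump)
# On Neo4j 4.x: neo4j-admin dump --database=neo4j --to=/backup/neo4j.dump
neo4j-admin database dump neo4j --to-path=/backup
gpg --encrypt --recipient admin@example.com /backup/neo4j.dump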

Security Best Practices

  1. Rotate Neo4j Password Regularly
  2. Use TLS for Neo4j (bolt+s:// instead of bolt://)
  3. Enable Neo4j Auth (never disable authentication)
  4. Validate File Uploads (check file types, scan for malware)
  5. Audit Logs (track who created/modified what)
  6. Backup Regularly (export user data, dump Neo4j database)
  7. Monitor Access Patterns (detect unusual activity)
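
Item 4 can be approached with an allowlist check before a file is saved. This is a sketch only: the extension list is an assumption to be adapted, `validate_upload` is a hypothetical helper, and an extension check does not replace malware scanning.

```python
import os

# Assumed allowlist; adjust to the file types your team actually attaches.
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".txt", ".md"}
# Mirrors the 16 MB MAX_CONTENT_LENGTH default shown under File Upload Failures.
MAX_UPLOAD_BYTES = 16 * 1024 * 1024


def validate_upload(filename, size_bytes):
    """Return (ok, reason) for a candidate upload."""
    # Drop any directory components, including Windows-style separators.
    name = os.path.basename(filename.replace("\\", "/"))
    if not name or name.startswith("."):
        return False, "missing or hidden file name"
    ext = os.path.splitext(name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"extension {ext or '(none)'} not allowed"
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "file too large"
    return True, "ok"
```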

Backup & Recovery

Export User Data

Via Admin UI:

  1. Navigate to /admin/settings
  2. Click "Export Data"
  3. Download JSON file

Via API:

TOKEN=$(python create_test_token.py 2>/dev/null)
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:5020/admin/export \
  -o knowledgetree_backup.json

What Gets Exported:

  • User-created nodes only (read_only: false)
  • Full hierarchical paths
  • Markdown content
  • Metadata (is_folder, is_attached)

Does NOT Export:

  • Synced data from Codex (can be re-synced)
  • File attachments (separate backup needed)
  • Neo4j system data

Backup Neo4j Database

Full Database Dump:

# Stop Neo4j
sudo systemctl stop neo4j

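# Create dump (Neo4j 5.x syntax; writes /backup/neo4j.dump, then renames it)
# On Neo4j 4.x: neo4j-admin dump --database=neo4j --to=/backup/neo4j-$(date +%Y%m%d).dump
DUMP=/backup/neo4j-$(date +%Y%m%d).dump
sudo neo4j-admin database dump neo4j --to-path=/backup
sudo mv /backup/neo4j.dump "$DUMP"

# Start Neo4j
sudo systemctl start neo4j

# Compress and encrypt
gzip "$DUMP"
gpg --encrypt --recipient admin@example.com "$DUMP.gz"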

Online Backup (Enterprise Edition):

# Requires Neo4j Enterprise (Neo4j 5.x syntax)
# On Neo4j 4.x: neo4j-admin backup --backup-dir=/backup --database=neo4j
neo4j-admin database backup neo4j --to-path=/backup

Backup File Attachments

# Backup uploads folder
tar -czf uploads-$(date +%Y%m%d).tar.gz instance/uploads/

# Sync to remote backup
rsync -avz instance/uploads/ backup-server:/backups/knowledgetree/uploads/

Restore from Export

Via Admin UI:

  1. Navigate to /admin/settings
  2. Click "Import Data"
  3. Select JSON export file
  4. Click "Import"

Via API:

TOKEN=$(python create_test_token.py 2>/dev/null)
curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@knowledgetree_backup.json" \
  http://localhost:5020/admin/import

Process:

  1. Sorts items by path (parents before children)
  2. Traverses from root to find parent nodes
  3. Uses MERGE to create or update nodes
  4. Preserves existing node IDs
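
The first step of the process matters because a child cannot be attached until its parent exists. A minimal sketch of that ordering, assuming each exported item is a dictionary with a `/`-separated `path` key. The key name and `sort_parents_first` are assumptions, not the actual import code.

```python
def sort_parents_first(items):
    """Order export items so every folder precedes its descendants."""

    def depth(item):
        # Number of non-empty path segments, e.g. "/A/B" has depth 2.
        return len([part for part in item["path"].split("/") if part])

    # Sorting by depth guarantees parents come first; the path itself
    # is a tie-breaker that keeps the order deterministic.
    return sorted(items, key=lambda item: (depth(item), item["path"]))
```

Sorting by depth instead of plain string order avoids edge cases where a sibling's name sorts between a parent and its child.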

Restore Neo4j Database

# Stop Neo4j
sudo systemctl stop neo4j

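# Restore from dump (Neo4j 5.x syntax; reads the dump from stdin)
# On Neo4j 4.x: neo4j-admin load --from=/backup/neo4j-20241122.dump --database=neo4j --force
sudo neo4j-admin database load neo4j \
  --from-stdin \
  --overwrite-destination=true < /backup/neo4j-20241122.dump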

# Start Neo4j
sudo systemctl start neo4j

# Verify data
cypher-shell -u neo4j -p your-password
MATCH (n:ContextItem) RETURN count(n);

Automated Backup Script

#!/bin/bash
# backup_knowledgetree.sh

BACKUP_DIR="/backups/knowledgetree"
DATE=$(date +%Y%m%d_%H%M%S)

# Export user data
cd /path/to/hivematrix-knowledgetree
source pyenv/bin/activate
TOKEN=$(python ../hivematrix-helm/create_test_token.py 2>/dev/null)
curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:5020/admin/export \
  -o "$BACKUP_DIR/export-$DATE.json"

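# Backup Neo4j (Neo4j 5.x syntax; the dump is written as neo4j.dump, then renamed)
sudo systemctl stop neo4j
sudo neo4j-admin database dump neo4j --to-path="$BACKUP_DIR"
sudo mv "$BACKUP_DIR/neo4j.dump" "$BACKUP_DIR/neo4j-$DATE.dump"
sudo systemctl start neo4j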

# Backup uploads
tar -czf "$BACKUP_DIR/uploads-$DATE.tar.gz" instance/uploads/

# Cleanup old backups (keep last 30 days)
find "$BACKUP_DIR" -name "*.json" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.dump" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +30 -delete

echo "Backup completed: $DATE"

Schedule Daily Backup:

# Add to crontab
crontab -e

# Run daily at 2 AM
0 2 * * * /path/to/backup_knowledgetree.sh >> /var/log/knowledgetree_backup.log 2>&1


Integration with Other Services

Codex Integration

Purpose: KnowledgeTree syncs all external data from Codex

Data Flow:

PSA (Freshservice) → Codex → KnowledgeTree (companies, contacts, tickets)
Datto RMM          → Codex → KnowledgeTree (assets)

API Calls Made to Codex:

# Get all companies
response = call_service('codex', '/api/companies')

# Get users for a company
response = call_service('codex', f'/api/companies/{account}/users')

# Get assets for a company
response = call_service('codex', f'/api/companies/{account}/assets')

# Get tickets for a company
response = call_service('codex', f'/api/companies/{account}/tickets')

Configuration:

# instance/knowledgetree.conf
[services]
codex_url = http://localhost:5010

Brainhair Integration

Purpose: AI assistant searches and uses KnowledgeTree for context

API Calls Made by Brainhair:

# Search for relevant articles
response = call_service('knowledgetree', '/api/search?query=email+setup')

# Browse knowledge structure
response = call_service('knowledgetree', '/api/browse?path=/IT+Documentation')

# Get full context for a node (most important)
response = call_service('knowledgetree', f'/api/context/{node_id}')

Example Workflow:

  1. User asks Brainhair: "How do I reset John Doe's password?"
  2. Brainhair searches KnowledgeTree:
     • /api/search?query=John+Doe → finds user node
     • /api/search?query=password+reset → finds procedure article
  3. Brainhair gets context:
     • /api/context/{john_doe_node_id} → includes user's tickets, company info
     • /api/node/{password_reset_article_id} → gets procedure details
  4. Brainhair combines context and generates answer

Context Building:

Brainhair uses the context API to gather all relevant knowledge:

  • User's contact information
  • User's ticket history (from attached Tickets folder)
  • Company-specific procedures
  • General password reset documentation

With this context, Brainhair can ground its answer in the user's own records and the documented procedure.
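
The example workflow can be sketched as a single function. This is an illustration under assumptions, not Brainhair's actual code: `build_answer_context` is hypothetical, `call_service` is injected and assumed to return decoded JSON, and search results are assumed to be a list of objects with an `id` field.

```python
from urllib.parse import quote_plus


def build_answer_context(call_service, user_query, topic_query):
    """Gather user context and a procedure article for one question."""
    users = call_service(
        "knowledgetree", "/api/search?query=" + quote_plus(user_query))
    articles = call_service(
        "knowledgetree", "/api/search?query=" + quote_plus(topic_query))

    context = {"user": None, "procedure": None}
    if users:
        # Full context for the best user match (tickets, company info).
        context["user"] = call_service(
            "knowledgetree", f"/api/context/{users[0]['id']}")
    if articles:
        # Procedure details for the best article match.
        context["procedure"] = call_service(
            "knowledgetree", f"/api/node/{articles[0]['id']}")
    return context
```

Taking only the first search hit keeps the sketch short. A real assistant would likely rank or merge several results.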

Core Integration

Purpose: JWT authentication

Flow:

  1. User requests KnowledgeTree page via Nexus
  2. Nexus includes JWT token in cookie/header
  3. KnowledgeTree validates JWT:
     • Fetches JWKS from Core: GET /.well-known/jwks.json
     • Validates signature, expiration, issuer
     • Extracts user identity and permissions
  4. KnowledgeTree renders page with user context

Service-to-Service:

import requests

# KnowledgeTree (or any service) calls another service
# 1. Request service token from Core
response = requests.post(
    f"{CORE_SERVICE_URL}/service-token",
    json={'calling_service': 'knowledgetree'},
    timeout=10
)
response.raise_for_status()  # fail early if Core rejects the request
service_token = response.json()['token']

# 2. Use token to call target service
response = requests.get(
    f"{CODEX_URL}/api/companies",
    headers={'Authorization': f'Bearer {service_token}'},
    timeout=10
)
response.raise_for_status()

Nexus Integration

Purpose: Frontend proxy and global CSS injection

Proxy Configuration:

Nexus proxies /knowledgetree/ to KnowledgeTree service:

# In Nexus
@app.route('/knowledgetree/', defaults={'path': ''})
@app.route('/knowledgetree/<path:path>')
def proxy_knowledgetree(path):
    return proxy_service('knowledgetree', path, inject_html=True)

CSS Injection:

Nexus injects global.css into all KnowledgeTree HTML responses:

<head>
  ...
  <link rel="stylesheet" href="/static/global.css">
</head>

URL Handling:

KnowledgeTree uses ProxyFix to handle X-Forwarded-Prefix:

# app/__init__.py
from werkzeug.middleware.proxy_fix import ProxyFix
app.wsgi_app = ProxyFix(
    app.wsgi_app,
    x_for=1, x_proto=1, x_host=1, x_prefix=1
)

This ensures url_for() generates correct URLs like /knowledgetree/browse/...


Performance Optimization

Neo4j Optimization

1. Create Indexes:

// Unique constraint on node IDs (auto-created)
CREATE CONSTRAINT context_item_id IF NOT EXISTS
FOR (n:ContextItem) REQUIRE n.id IS UNIQUE;

// Index for folder queries
CREATE INDEX folder_index IF NOT EXISTS
FOR (n:ContextItem) ON (n.is_folder);

// Index for read-only flag
CREATE INDEX readonly_index IF NOT EXISTS
FOR (n:ContextItem) ON (n.read_only);

// Full-text search index
CREATE FULLTEXT INDEX article_search IF NOT EXISTS
FOR (n:ContextItem) ON EACH [n.name, n.content];

2. Increase Memory:

# Edit /etc/neo4j/neo4j.conf
# Neo4j 5.x setting names; on 4.x use the dbms.memory.* equivalents
server.memory.heap.initial_size=2G
server.memory.heap.max_size=4G
server.memory.pagecache.size=2G

3. Enable Query Logging:

# Monitor slow queries
# Neo4j 5.x setting names; on 4.x use dbms.logs.query.enabled and dbms.logs.query.threshold
db.logs.query.enabled=INFO
db.logs.query.threshold=1s

Application Optimization

1. Connection Pooling:

Neo4j driver uses connection pooling by default:

from neo4j import GraphDatabase, basic_auth

driver = GraphDatabase.driver(
    uri,
    auth=basic_auth(user, password),
    max_connection_pool_size=50,
    connection_acquisition_timeout=60
)

2. Caching:

Add Redis caching for frequently accessed nodes:

import json
import redis

cache = redis.Redis(host='localhost', port=6379)

@app.route('/api/node/<node_id>')
@token_required
def get_node(node_id):
    # Check cache first
    cached = cache.get(f'node:{node_id}')
    if cached:
        return jsonify(json.loads(cached))

    # Query Neo4j
    with driver.session() as session:
        result = session.run(...)
        record = result.single()  # read the row while the session is still open

    if record is None:
        abort(404)
    data = dict(record)

    # Cache result (5 minute TTL)
    cache.setex(f'node:{node_id}', 300, json.dumps(data))
    return jsonify(data)

3. Lazy Loading:

Load child nodes on-demand instead of entire tree:

// Load children when folder is expanded
folder.addEventListener('click', async () => {
  const response = await fetch(`/api/node/${folderId}/children`, {
    credentials: 'same-origin'
  });
  const children = await response.json();
  renderChildren(children);
});

4. Pagination:

Add pagination to large result sets:

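@app.route('/api/search')
@token_required
def search_nodes():
    query = request.args.get('query', '')
    page = max(request.args.get('page', 1, type=int), 1)
    # Cap of 100 per page is an assumed limit; adjust as needed
    per_page = min(max(request.args.get('per_page', 20, type=int), 1), 100)
    skip = (page - 1) * per_page

    with driver.session() as session:
        # Parameters go in a dict: run() already has a positional argument
        # named "query", so query=... cannot be passed as a keyword
        result = session.run("""
            MATCH (node:ContextItem)
            WHERE toLower(node.name) CONTAINS toLower($query)
            RETURN node.id AS id, node.name AS name
            ORDER BY name
            SKIP $skip
            LIMIT $limit
        """, {'query': query, 'skip': skip, 'limit': per_page})
        items = [dict(record) for record in result]

    return jsonify({'page': page, 'per_page': per_page, 'results': items})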

Query Optimization

Before (Slow):

// Searches all nodes, then filters by path
MATCH (node:ContextItem)
WHERE toLower(node.name) CONTAINS 'search'
MATCH p = (:ContextItem {id: 'root'})-[:PARENT_OF*..]->(node)
RETURN node, [n IN nodes(p) | n.name];

After (Fast):

// Uses full-text index, limits results early
CALL db.index.fulltext.queryNodes("article_search", "search")
YIELD node, score
WITH node, score
LIMIT 15
MATCH p = (:ContextItem {id: 'root'})-[:PARENT_OF*..]->(node)
RETURN node, [n IN nodes(p) | n.name], score
ORDER BY score DESC;


Changelog

Version 2.0.0 (2024-11-22)

  • Complete graph-based knowledge management system
  • Neo4j integration with hierarchical node structure
  • Codex sync for companies, users, assets, tickets
  • Context system for AI assistant integration
  • Attached folders feature
  • Multiple view modes (grid, list, tree)
  • Full-text search across all content
  • Export/import functionality
  • Admin dashboard for database management
  • Markdown rendering with syntax highlighting
  • File attachment support
  • Per-user rate limiting (10000/hour, 500/minute)
  • Structured JSON logging with correlation IDs
  • RFC 7807 error responses
  • Health check monitoring
  • Swagger/OpenAPI documentation

See Also

Tools & Utilities

  • Codex Sync: /sync/companies, /sync/users, /sync/assets, /sync/tickets
  • Database Admin: Admin Dashboard → Database Management
  • Export/Import: Bulk export/import functionality
  • Service CLI: ../hivematrix-helm/cli.py start|stop|restart knowledgetree
  • Log Viewer: ../hivematrix-helm/logs_cli.py knowledgetree --tail 50

Questions or issues? Check the troubleshooting section or file an issue on GitHub.