Core - Authentication & JWT Management

Port: 5000
Database: Redis (session storage) + PostgreSQL (optional)
Repository: hivematrix-core
Version: 1.0


Overview

Core is the central authentication authority for the entire HiveMatrix platform. It serves as the bridge between Keycloak (OAuth2 identity provider) and HiveMatrix services, converting OAuth tokens into platform-specific JWTs with session management and revocation support.

Core Responsibilities

  1. OAuth2 Token Exchange
     • Receives Keycloak OAuth2 access tokens from Nexus
     • Validates tokens with Keycloak's userinfo endpoint
     • Converts to HiveMatrix-signed JWTs

  2. JWT Token Management
     • Issues RS256-signed JWTs with embedded session IDs
     • Validates token signatures and expiration
     • Tracks active sessions with revocation capability
     • Distributes public keys via JWKS endpoint

  3. Session Management
     • Creates persistent sessions in Redis (with in-memory fallback)
     • Tracks session expiration (1-hour default)
     • Supports explicit revocation (logout)
     • Probabilistic cleanup of expired sessions

  4. Permission Management
     • Maps Keycloak groups to permission levels
     • Embeds permission levels in JWT claims
     • Supports four-tier permission model

  5. Service-to-Service Authentication
     • Issues short-lived service tokens (5 minutes)
     • Validates calling and target services
     • Enables trusted inter-service communication

Architecture

Technology Stack

  • Framework: Flask 3.0.0
  • Session Storage: Redis (with in-memory fallback)
  • OAuth Library: Authlib
  • JWT Library: PyJWT with cryptography
  • Rate Limiting: Flask-Limiter with Redis backend
  • Logging: Structured JSON logging with correlation IDs
  • API Documentation: Flasgger (OpenAPI/Swagger)
  • Health Checks: Custom HealthChecker library

Production Features

Core includes several production-ready features introduced in version 4.1:

Redis Session Persistence

  • Sessions stored in Redis survive service restarts
  • Automatic TTL expiration (1 hour)
  • Graceful fallback to thread-safe in-memory storage if Redis unavailable
  • Session count tracking for monitoring

Per-User Rate Limiting

  • Rate limits applied per user (JWT subject) instead of per IP address
  • Keeps users who share an IP address (for example, behind NAT or a proxy) from exhausting each other's limits
  • Configurable limits per endpoint:
      • /login: 10 requests/minute
      • /auth: 20 requests/minute
      • /api/token/exchange: 20 requests/minute
      • Internal endpoints (validate, service-token): exempt
  • Falls back to IP-based limiting for unauthenticated requests
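The per-user keying with IP fallback described above can be sketched as a small key function. This is a minimal illustration, not Core's actual code: the names rate_limit_key and _unverified_subject are hypothetical, and reading the sub claim without verifying the signature is an assumption that is only acceptable for choosing a rate-limit bucket.

```python
import base64
import json
from typing import Optional


def _unverified_subject(token: str) -> Optional[str]:
    """Read the `sub` claim WITHOUT verifying the signature.

    Hypothetical helper: fine for picking a rate-limit bucket,
    never acceptable for authorization decisions.
    """
    try:
        payload_b64 = token.split(".")[1]
        # JWT segments are base64url without padding; restore it before decoding
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        payload = json.loads(base64.urlsafe_b64decode(padded))
        sub = payload.get("sub")
        return sub if isinstance(sub, str) and sub else None
    except (IndexError, ValueError, AttributeError):
        return None


def rate_limit_key(authorization_header: Optional[str], remote_addr: str) -> str:
    """Return a per-user key when a bearer token is present, else a per-IP key."""
    if authorization_header and authorization_header.startswith("Bearer "):
        sub = _unverified_subject(authorization_header[len("Bearer "):])
        if sub:
            return f"user:{sub}"
    return f"ip:{remote_addr}"
```

A function of this shape could be passed to a rate limiter as its key function, so that two users behind the same address land in separate buckets.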

Structured Logging

  • JSON-formatted logs with correlation IDs for distributed tracing
  • Request/response logging with timing
  • Error context preservation
  • Centralized logging to Helm service
  • Configurable log levels (DEBUG, INFO, WARNING, ERROR)
  • Can be disabled for development (ENABLE_JSON_LOGGING=false)

RFC 7807 Problem Details

  • Standardized machine-readable error responses
  • Consistent error format across all endpoints
  • Includes error type, title, detail, status, and instance
  • Enables automated error handling by clients
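As an illustration of the five fields listed above, the following sketch builds an RFC 7807 body. The helper name problem_response and the example values are hypothetical; only the field names and the application/problem+json media type come from the RFC.

```python
import json
from typing import Dict, Tuple

PROBLEM_JSON = "application/problem+json"  # media type defined by RFC 7807


def problem_response(
    status: int,
    title: str,
    detail: str,
    instance: str,
    type_uri: str = "about:blank",  # RFC 7807 default when no specific type applies
) -> Tuple[str, int, Dict[str, str]]:
    """Return (body, status, headers) for a problem-details error response."""
    body = {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
    }
    return json.dumps(body), status, {"Content-Type": PROBLEM_JSON}


# Hypothetical usage: a request that arrived without an access token
body, status, headers = problem_response(
    400, "Bad Request", "No access token provided", "/api/token/exchange"
)
```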

OpenAPI/Swagger Documentation

  • Auto-generated API documentation at /docs
  • Interactive API testing interface
  • Request/response schemas
  • Authentication examples
  • Available at http://localhost:5000/docs

Comprehensive Health Checks

  • Component-level health monitoring at /health
  • Checks: Redis, disk space, response latency
  • Status: healthy (200), degraded (503), or unhealthy (503)
  • Useful for Kubernetes readiness/liveness probes

API Reference

Core exposes a RESTful API with the following endpoints:

Authentication Endpoints

GET /login

Initiates the OAuth2 login flow by redirecting to Keycloak.

Query Parameters:

  • next (optional): URL to redirect to after successful login

Rate Limit: 10 requests/minute

Response: 302 Redirect to Keycloak authorization endpoint

Example:

curl -L "http://localhost:5000/login?next=/dashboard"


GET /auth

OAuth2 callback endpoint. Handles authorization code from Keycloak.

Query Parameters:

  • code (required, provided by Keycloak): Authorization code
  • error (optional): Error code if authentication failed

Rate Limit: 20 requests/minute

Response:

  • Success: 302 Redirect to next_url or user's preferred home page with JWT token
  • Error: 302 Redirect back to /login

User Home Page Logic:

  1. Attempts to fetch user's preferred home page from Codex (/api/public/user/home-page)
  2. Falls back through: beacon → knowledgetree → codex → helm
  3. Redirects to {nexus_url}/{service}/?token={jwt}

Example Response:

302 Found
Location: https://localhost/beacon/?token=eyJhbGciOiJSUzI1NiIs...
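The home page logic above can be sketched as two small functions. This is an assumption-laden illustration: the source does not say how Core decides that a fallback service is usable, so the `available` argument (the set of installed services) and both function names are hypothetical.

```python
from typing import Iterable, Optional
from urllib.parse import quote

# Fallback order taken from the text above
FALLBACK_ORDER = ("beacon", "knowledgetree", "codex", "helm")


def choose_home_service(preferred: Optional[str], available: Iterable[str]) -> Optional[str]:
    """Pick the user's preferred service if usable, else the first usable fallback."""
    available_set = set(available)
    if preferred in available_set:
        return preferred
    for service in FALLBACK_ORDER:
        if service in available_set:
            return service
    return None


def home_redirect_url(nexus_url: str, service: str, token: str) -> str:
    """Build {nexus_url}/{service}/?token={jwt} with the token URL-encoded."""
    return f"{nexus_url.rstrip('/')}/{service}/?token={quote(token, safe='')}"
```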


POST /logout

Ends the user session and revokes tokens.

Query Parameters:

  • redirect (optional): URL to redirect after logout (default: /)

Actions Performed:

  1. Revokes refresh token with Keycloak
  2. Revokes access token with Keycloak
  3. Clears Flask session
  4. Redirects to Keycloak logout endpoint
  5. Sets cache control headers to prevent back button issues

Response: 302 Redirect with cleared session cookie

Example:

curl -X POST "http://localhost:5000/logout?redirect=/"


Token Management

POST /api/token/exchange

Exchanges a Keycloak OAuth2 access token for a HiveMatrix JWT.

Rate Limit: 20 requests/minute

Request Body:

{
  "access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..."
}

Alternative: Send access token in Authorization: Bearer <token> header

Response (200 OK):

{
  "token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6Imhpdm..."
}

JWT Payload:

{
  "iss": "hivematrix-core",
  "sub": "user-uuid-1234-5678-9abc-def012345678",
  "jti": "session-abc123...",
  "name": "John Doe",
  "email": "john.doe@example.com",
  "preferred_username": "johndoe",
  "permission_level": "admin",
  "groups": ["/admins", "/technicians"],
  "iat": 1700000000,
  "exp": 1700003600
}

Process:

  1. Validates Keycloak token with userinfo endpoint (via Nexus proxy)
  2. Extracts user information and group membership
  3. Determines permission level based on groups
  4. Creates session in Redis with session ID
  5. Mints HiveMatrix JWT with session ID as jti claim
  6. Returns JWT to caller (typically Nexus)

Error Responses:

  • 400 Bad Request: No access token provided
  • 401 Unauthorized: Invalid Keycloak token
  • 500 Internal Server Error: Token exchange failed

Example:

TOKEN=$(curl -s -X POST http://localhost:5000/api/token/exchange \
  -H "Content-Type: application/json" \
  -d '{"access_token": "keycloak_token_here"}' | jq -r '.token')


POST /api/token/validate

Validates a HiveMatrix JWT and checks session status.

Rate Limit: Exempt (called frequently by services)

Request Body:

{
  "token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..."
}

Alternative: Send token in Authorization: Bearer <token> header

Response (200 OK):

{
  "valid": true,
  "user": {
    "sub": "user-uuid-1234",
    "name": "John Doe",
    "email": "john.doe@example.com",
    "preferred_username": "johndoe",
    "permission_level": "admin",
    "groups": ["/admins"]
  }
}

Validation Checks:

  1. JWT signature verification (RS256)
  2. Token expiration check
  3. Session ID (jti) existence in Redis
  4. Session not expired (1-hour TTL)
  5. Session not revoked (logout)

Error Responses:

  • 400 Bad Request: No token provided
  • 401 Unauthorized: Token invalid, expired, or session revoked

{
  "valid": false,
  "error": "Session expired or revoked"
}

Example:

curl -X POST http://localhost:5000/api/token/validate \
  -H "Content-Type: application/json" \
  -d '{"token": "'"$TOKEN"'"}'


POST /api/token/revoke

Revokes a session (logout). Token will fail validation after revocation.

Request Body:

{
  "token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..."
}

Response (200 OK):

{
  "message": "Session revoked successfully"
}

Process:

  1. Decodes JWT to extract session ID (jti claim)
  2. Marks session as revoked in Redis
  3. Session remains in Redis until TTL expires (for audit)
  4. Subsequent validation requests will fail

Error Responses:

  • 400 Bad Request: No token provided or no session ID
  • 401 Unauthorized: Invalid token
  • 404 Not Found: Session not found or already revoked

Example:

curl -X POST http://localhost:5000/api/token/revoke \
  -H "Content-Type: application/json" \
  -d '{"token": "'"$TOKEN"'"}'


Service-to-Service

POST /service-token

Generates a short-lived JWT for service-to-service communication.

Rate Limit: Exempt (protected by token caching in service_client.py)

Request Body:

{
  "calling_service": "codex",
  "target_service": "ledger"
}

Validation:

  • Service names must match pattern: ^[a-z0-9_-]{1,50}$
  • Both services must exist in services.json
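The two validation rules can be sketched as follows. The function name validate_service_request and the shape of known_services (a mapping keyed by service name, as it might be loaded from services.json) are assumptions made for illustration.

```python
import re
from typing import Mapping, Optional

# Pattern quoted from the validation rules above
SERVICE_NAME_RE = re.compile(r"^[a-z0-9_-]{1,50}$")


def validate_service_request(
    calling: object, target: object, known_services: Mapping[str, object]
) -> Optional[str]:
    """Return an error message for a 400 response, or None if the request is valid."""
    for label, name in (("calling_service", calling), ("target_service", target)):
        # fullmatch (rather than match) also rejects a trailing newline,
        # which `$` alone would tolerate
        if not isinstance(name, str) or not SERVICE_NAME_RE.fullmatch(name):
            return f"{label} has an invalid format"
        if name not in known_services:
            return f"unknown service: {name}"
    return None
```

Checking the format before the lookup keeps malformed input away from any code that uses the name as a key or path component.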

Response (200 OK):

{
  "token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..."
}

Service Token Payload:

{
  "iss": "hivematrix-core",
  "sub": "service:codex",
  "calling_service": "codex",
  "target_service": "ledger",
  "type": "service",
  "iat": 1700000000,
  "exp": 1700000300
}

Token Characteristics:

  • Expiration: 5 minutes (300 seconds)
  • Type: service (distinguishes from user tokens)
  • Subject: service:{calling_service}
  • Purpose: Enables trusted inter-service API calls

Error Responses:

  • 400 Bad Request: Missing parameters, invalid format, or unknown service

Example:

SERVICE_TOKEN=$(curl -s -X POST http://localhost:5000/service-token \
  -H "Content-Type: application/json" \
  -d '{"calling_service": "codex", "target_service": "ledger"}' | jq -r '.token')


Public Endpoints

GET /.well-known/jwks.json

JSON Web Key Set (JWKS) endpoint. Publishes Core's public RSA key.

Rate Limit: Exempt (public endpoint)

Response (200 OK):

{
  "keys": [
    {
      "kty": "RSA",
      "alg": "RS256",
      "kid": "hivematrix-signing-key-1",
      "use": "sig",
      "n": "0vx7agoebGcQSuuPiLJXZptN9nndrQmbXEps2aiAFbWhM78LhWx...",
      "e": "AQAB"
    }
  ]
}

Purpose:

  • Services fetch this on startup to verify JWT signatures
  • Standard RFC 7517 JWKS format
  • No authentication required (public key is public information)

Usage by Services:

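import jwt
from jwt import PyJWKClient

# `token` is the HiveMatrix JWT taken from the incoming request
jwks_client = PyJWKClient('http://localhost:5000/.well-known/jwks.json')
signing_key = jwks_client.get_signing_key_from_jwt(token)
# Verify the RS256 signature and expiration, then read the claims
claims = jwt.decode(token, signing_key.key, algorithms=['RS256'])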
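For readers curious how the n and e values in the JWKS response are produced, the sketch below encodes RSA public numbers as unpadded base64url big-endian integers, as RFC 7518 specifies. The helper names are hypothetical, and how Core actually serializes its key is not shown in this document.

```python
import base64


def b64url_uint(value: int) -> str:
    """Encode a non-negative integer as unpadded base64url of its big-endian bytes."""
    if value < 0:
        raise ValueError("value must be non-negative")
    length = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(length, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def rsa_public_jwk(n: int, e: int, kid: str) -> dict:
    """Build a JWK entry with the same fields as the response above."""
    return {
        "kty": "RSA",
        "alg": "RS256",
        "kid": kid,
        "use": "sig",
        "n": b64url_uint(n),
        "e": b64url_uint(e),
    }


# The common public exponent 65537 (0x010001) encodes to "AQAB"
print(b64url_uint(65537))
```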


GET /health

Comprehensive health check endpoint.

Rate Limit: Exempt

Response (200 OK - Healthy):

{
  "service": "core",
  "status": "healthy",
  "timestamp": "2025-11-22T10:30:00.000Z",
  "checks": {
    "redis": {
      "status": "healthy",
      "latency_ms": 2,
      "connected_clients": 5,
      "used_memory_mb": 2.45
    },
    "disk": {
      "status": "healthy",
      "usage_percent": 45.67,
      "free_gb": 123.45,
      "total_gb": 250.0
    }
  }
}

Response (503 Service Unavailable - Degraded):

{
  "service": "core",
  "status": "degraded",
  "timestamp": "2025-11-22T10:30:00.000Z",
  "checks": {
    "redis": {
      "status": "unhealthy",
      "error": "Connection refused"
    },
    "disk": {
      "status": "healthy",
      "usage_percent": 45.67,
      "free_gb": 123.45,
      "total_gb": 250.0
    }
  }
}

Health Status Determination:

  • healthy (200): All checks passing
  • degraded (503): Non-critical issues (e.g., Redis down but in-memory fallback working)
  • unhealthy (503): Critical component failure (e.g., disk full)

Checks Performed:

  1. Redis: Connectivity, latency, memory usage, client count
  2. Disk Space: Usage percentage, free space
     • Healthy: < 85%
     • Degraded: 85% to < 95%
     • Unhealthy: >= 95%
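The thresholds and status rules above can be sketched as two functions. Treating Redis as the only non-critical component is an assumption drawn from the degraded example response; the function names are hypothetical and the real HealthChecker library may aggregate differently.

```python
from typing import FrozenSet, Mapping, Tuple


def disk_status(usage_percent: float) -> str:
    """Map disk usage to a status using the documented 85% / 95% thresholds."""
    if usage_percent >= 95:
        return "unhealthy"
    if usage_percent >= 85:
        return "degraded"
    return "healthy"


def overall_status(
    checks: Mapping[str, str],
    non_critical: FrozenSet[str] = frozenset({"redis"}),  # assumption: Redis has a fallback
) -> Tuple[str, int]:
    """Combine per-component statuses into (service status, HTTP status code)."""
    critical_failure = any(
        status == "unhealthy" and name not in non_critical
        for name, status in checks.items()
    )
    if critical_failure:
        return "unhealthy", 503
    if any(status != "healthy" for status in checks.values()):
        return "degraded", 503
    return "healthy", 200
```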


GET /docs

Interactive OpenAPI/Swagger API documentation.

Response: HTML page with interactive API explorer

Features:

  • Browse all endpoints with descriptions
  • View request/response schemas
  • Test API calls directly from browser
  • Authentication examples
  • Download OpenAPI spec at /apispec.json

Access: http://localhost:5000/docs


Authentication Flows

User Login Flow

Complete OAuth2 authorization code flow:

┌─────────┐                ┌───────┐                ┌──────────┐                ┌──────┐
│ Browser │                │ Nexus │                │ Keycloak │                │ Core │
└────┬────┘                └───┬───┘                └────┬─────┘                └───┬──┘
     │                         │                         │                          │
     │  1. GET /               │                         │                          │
     ├────────────────────────>│                         │                          │
     │                         │                         │                          │
     │  2. No session          │                         │                          │
     │  Redirect to Core login │                         │                          │
     │<────────────────────────┤                         │                          │
     │                         │                         │                          │
     │  3. GET /login?next=/   │                         │                          │
     ├──────────────────────────────────────────────────────────────────────────────>│
     │                         │                         │                          │
     │  4. Redirect to Keycloak                          │                          │
     │<──────────────────────────────────────────────────────────────────────────────┤
     │                         │                         │                          │
     │  5. GET /realms/.../auth                          │                          │
     ├──────────────────────────────────────────────────>│                          │
     │                         │                         │                          │
     │  6. Login form (username/password)                │                          │
     │<──────────────────────────────────────────────────┤                          │
     │                         │                         │                          │
     │  7. POST credentials    │                         │                          │
     ├──────────────────────────────────────────────────>│                          │
     │                         │                         │                          │
     │  8. Redirect to /auth with code                   │                          │
     │<──────────────────────────────────────────────────┤                          │
     │                         │                         │                          │
     │  9. GET /auth?code=xxx  │                         │                          │
     ├──────────────────────────────────────────────────────────────────────────────>│
     │                         │                         │                          │
     │                         │                         │  10. Exchange code       │
     │                         │                         │  for access token        │
     │                         │                         │<─────────────────────────┤
     │                         │                         │                          │
     │                         │                         │  11. Access token        │
     │                         │                         ├─────────────────────────>│
     │                         │                         │                          │
     │                         │                         │  12. Validate with       │
     │                         │                         │  /userinfo endpoint      │
     │                         │                         │<─────────────────────────┤
     │                         │                         │                          │
     │                         │                         │  13. User info + groups  │
     │                         │                         ├─────────────────────────>│
     │                         │                         │                          │
     │                         │                         │  14. Create session      │
     │                         │                         │  in Redis, mint JWT      │
     │                         │                         │  with jti=session_id     │
     │                         │                         │                          │
     │                         │  15. Get user's home    │                          │
     │                         │  page from Codex        │                          │
     │                         │<─────────────────────────────────────────────────────┤
     │                         │                         │                          │
     │  16. Redirect to home page with JWT               │                          │
     │<──────────────────────────────────────────────────────────────────────────────┤
     │                         │                         │                          │
     │  17. GET /beacon/?token=jwt                       │                          │
     ├────────────────────────>│                         │                          │
     │                         │                         │                          │
     │                         │  18. Store JWT in session cookie                   │
     │                         │                         │                          │
     │  19. Authenticated page │                         │                          │
     │<────────────────────────┤                         │                          │

Key Points:

  • User never enters credentials on HiveMatrix pages (handled by Keycloak)
  • Core never stores passwords (delegated to Keycloak)
  • JWT includes session ID for revocation capability
  • Permission level determined from Keycloak group membership


Token Validation Flow

On every authenticated request to any HiveMatrix service:

┌─────────┐         ┌───────┐         ┌──────────┐         ┌──────┐
│ Browser │         │ Nexus │         │  Codex   │         │ Core │
└────┬────┘         └───┬───┘         └────┬─────┘         └───┬──┘
     │                  │                   │                   │
     │  1. GET /codex/  │                   │                   │
     ├─────────────────>│                   │                   │
     │                  │                   │                   │
     │                  │  2. Extract JWT   │                   │
     │                  │  from session     │                   │
     │                  │                   │                   │
     │                  │  3. POST /api/token/validate          │
     │                  │  {token: jwt}     │                   │
     │                  ├───────────────────────────────────────>│
     │                  │                   │                   │
     │                  │                   │  4. Verify JWT    │
     │                  │                   │  signature with   │
     │                  │                   │  public key       │
     │                  │                   │                   │
     │                  │                   │  5. Check session │
     │                  │                   │  in Redis:        │
     │                  │                   │  - Exists?        │
     │                  │                   │  - Not expired?   │
     │                  │                   │  - Not revoked?   │
     │                  │                   │                   │
     │                  │  6. {valid: true, user: {...}}        │
     │                  │<───────────────────────────────────────┤
     │                  │                   │                   │
     │                  │  7. Proxy to Codex with               │
     │                  │  Authorization: Bearer <jwt>          │
     │                  ├──────────────────>│                   │
     │                  │                   │                   │
     │                  │  8. Codex verifies│                   │
     │                  │  JWT independently│                   │
     │                  │  using public key │                   │
     │                  │                   │                   │
     │                  │  9. Response      │                   │
     │                  │<──────────────────┤                   │
     │                  │                   │                   │
     │  10. Response    │                   │                   │
     │<─────────────────┤                   │                   │

Validation Sequence:

  1. Nexus validates token with Core
  2. Core checks signature, expiration, and session status
  3. If valid, Nexus proxies request with JWT to backend service
  4. Backend service independently verifies JWT signature using Core's public key
  5. Backend service can trust JWT claims (permission_level, groups, etc.)

Why Dual Validation?

  • Nexus → Core: Checks session status (revocation, expiration)
  • Backend → JWT: Verifies signature (prevents tampering)
  • Provides defense in depth and distributed trust model


Logout Flow

Complete session termination:

┌─────────┐         ┌───────┐         ┌──────────┐         ┌──────┐
│ Browser │         │ Nexus │         │ Keycloak │         │ Core │
└────┬────┘         └───┬───┘         └────┬─────┘         └───┬──┘
     │                  │                   │                   │
     │  1. Click Logout │                   │                   │
     ├─────────────────>│                   │                   │
     │                  │                   │                   │
     │                  │  2. Extract JWT   │                   │
     │                  │  from session     │                   │
     │                  │                   │                   │
     │                  │  3. POST /api/token/revoke            │
     │                  │  {token: jwt}     │                   │
     │                  ├───────────────────────────────────────>│
     │                  │                   │                   │
     │                  │                   │  4. Extract jti   │
     │                  │                   │  from JWT         │
     │                  │                   │                   │
     │                  │                   │  5. Mark session  │
     │                  │                   │  as revoked in    │
     │                  │                   │  Redis            │
     │                  │                   │                   │
     │                  │  6. {message: "Session revoked"}      │
     │                  │<───────────────────────────────────────┤
     │                  │                   │                   │
     │                  │  7. Redirect to   │                   │
     │                  │  Core /logout     │                   │
     │                  ├───────────────────────────────────────>│
     │                  │                   │                   │
     │                  │                   │  8. Revoke tokens │
     │                  │                   │  with Keycloak    │
     │                  │                   │<──────────────────┤
     │                  │                   │                   │
     │                  │                   │  9. Redirect to   │
     │                  │                   │  Keycloak logout  │
     │                  │                   │  endpoint         │
     │                  │<───────────────────────────────────────┤
     │                  │                   │                   │
     │  10. Redirect to Keycloak logout     │                   │
     │<─────────────────┤                   │                   │
     │                  │                   │                   │
     │  11. GET /realms/.../logout          │                   │
     ├──────────────────────────────────────>│                   │
     │                  │                   │                   │
     │  12. Clear Keycloak session          │                   │
     │  Redirect to post_logout_redirect_uri│                   │
     │<──────────────────────────────────────┤                   │
     │                  │                   │                   │
     │  13. Back to home page (logged out)  │                   │

Logout Steps:

  1. Nexus revokes HiveMatrix session with Core
  2. Core marks session as revoked in Redis (jti)
  3. Core revokes Keycloak tokens (refresh + access)
  4. Core clears Flask session cookie
  5. Core redirects to Keycloak logout endpoint
  6. Keycloak clears its session
  7. User redirected to home page (logged out)

Security Considerations:

  • Session revocation is immediate (subsequent validation fails)
  • Revoked sessions remain in Redis until TTL expires (audit trail)
  • Cache control headers prevent back button from showing cached pages
  • Session cookie explicitly deleted


Service-to-Service Flow

How services authenticate with each other:

┌────────┐                              ┌──────┐                              ┌────────┐
│ Codex  │                              │ Core │                              │ Ledger │
└───┬────┘                              └───┬──┘                              └───┬────┘
    │                                       │                                     │
    │  1. Need to call Ledger API           │                                     │
    │                                       │                                     │
    │  2. POST /service-token               │                                     │
    │  {calling_service: "codex",           │                                     │
    │   target_service: "ledger"}           │                                     │
    ├──────────────────────────────────────>│                                     │
    │                                       │                                     │
    │                                       │  3. Validate service names          │
    │                                       │  against services.json              │
    │                                       │                                     │
    │                                       │  4. Mint service token              │
    │                                       │  (5 min expiry, type="service")     │
    │                                       │                                     │
    │  5. {token: <service_jwt>}            │                                     │
    │<──────────────────────────────────────┤                                     │
    │                                       │                                     │
    │  6. GET /api/invoices                 │                                     │
    │  Authorization: Bearer <service_jwt>  │                                     │
    ├─────────────────────────────────────────────────────────────────────────────>│
    │                                       │                                     │
    │                                       │                                     │
    │                                       │                                     │
    │                                       │  7. Verify JWT signature            │
    │                                       │  using Core's public key            │
    │                                       │                                     │
    │                                       │  8. Check type="service"            │
    │                                       │  Set g.is_service_call=True         │
    │                                       │                                     │
    │                                       │  9. Bypass user permission checks   │
    │                                       │  (services are trusted)             │
    │                                       │                                     │
    │  10. {invoices: [...]}                │                                     │
    │<─────────────────────────────────────────────────────────────────────────────┤

Service Token Characteristics:

  • Short-lived: 5 minutes (prevents long-term abuse if leaked)
  • Scoped: Includes calling and target service names
  • Trusted: Backend services bypass user permission checks
  • Validated: Service names must exist in services.json

Usage in Code:

from app.service_client import call_service

# Automatically handles token request and API call
response = call_service('ledger', '/api/invoices')
invoices = response.json()
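Because service tokens live for 5 minutes, a client can reuse one token across many calls. The sketch below shows one way such caching might work; it is not the contents of service_client.py. The class name ServiceTokenCache, the fetch_token callback (standing in for the POST to /service-token), and the 30-second refresh margin are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class ServiceTokenCache:
    """Hypothetical per-target cache that refreshes shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[str], str],  # assumed to request a token from Core
        ttl_seconds: float = 300,           # documented service-token lifetime
        refresh_margin: float = 30,         # assumption: refresh 30 s early
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_token
        self._ttl = ttl_seconds
        self._margin = refresh_margin
        self._clock = clock
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def get(self, target_service: str) -> str:
        """Return a cached token for the target, fetching a new one when stale."""
        now = self._clock()
        cached = self._tokens.get(target_service)
        if cached and cached[1] - self._margin > now:
            return cached[0]
        token = self._fetch(target_service)
        self._tokens[target_service] = (token, now + self._ttl)
        return token
```

Refreshing slightly early avoids sending a token that expires while the request is in flight.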


Session Management

Redis-Backed Sessions

Core uses Redis for persistent session storage with automatic fallback to in-memory storage.

Why Redis?

  • Sessions survive Core service restarts
  • Shared session storage (for multi-instance deployments)
  • Automatic TTL expiration
  • High performance (sub-millisecond latency)

Fallback Behavior:

  • If Redis unavailable at startup → uses thread-safe in-memory dict
  • If Redis fails during operation → switches to in-memory storage
  • In-memory mode logs warning: "SessionManager: Using in-memory sessions"
  • Sessions lost on service restart when using in-memory mode

Configuration:

# Redis runs on standard port
redis-server --port 6379

# Check Redis connectivity
redis-cli ping
# Expected: PONG


Session Lifecycle

Creation

When a user logs in (via /api/token/exchange):

  1. Session Data Created:

    {
        'user_data': {
            'sub': 'user-uuid',
            'name': 'John Doe',
            'email': 'john@example.com',
            'preferred_username': 'johndoe',
            'permission_level': 'admin',
            'groups': ['/admins']
        },
        'created_at': 1700000000,
        'expires_at': 1700003600,  # 1 hour later
        'revoked': False
    }
    

  2. Stored in Redis:
     • Key: session:{session_id}
     • Value: JSON-serialized session data
     • TTL: 3600 seconds (1 hour)
     • Session ID: 32-byte URL-safe token (e.g., a1b2c3d4...)

  3. Session ID Embedded in JWT:
     • jti claim contains session ID
     • Links JWT to revocable session

Validation

On every request (via /api/token/validate):

  1. Decode JWT to extract jti (session ID)
  2. Fetch session from Redis: GET session:{jti}
  3. Check if session exists
  4. Check if session expired: expires_at < current_time
  5. Check if session revoked: revoked == True
  6. Return user data if all checks pass
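The creation and validation steps can be sketched with an in-memory store. This is a simplified stand-in, not Core's session manager: the class name InMemorySessionStore and the injectable clock are assumptions, and a Redis-backed version would replace the dict with GET/SET calls on session:{jti}.

```python
import secrets
import threading
import time
from typing import Any, Callable, Dict, Optional

SESSION_TTL_SECONDS = 3600  # 1-hour default described above


class InMemorySessionStore:
    """Hypothetical in-memory session store mirroring the documented checks."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._sessions: Dict[str, Dict[str, Any]] = {}
        self._lock = threading.Lock()
        self._clock = clock

    def create(self, user_data: Dict[str, Any]) -> str:
        """Create a session and return its ID (used as the JWT jti claim)."""
        session_id = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe
        now = int(self._clock())
        with self._lock:
            self._sessions[session_id] = {
                "user_data": user_data,
                "created_at": now,
                "expires_at": now + SESSION_TTL_SECONDS,
                "revoked": False,
            }
        return session_id

    def validate(self, session_id: str) -> Optional[Dict[str, Any]]:
        """Return user data if the session exists, is unexpired, and is not revoked."""
        with self._lock:
            session = self._sessions.get(session_id)
        if session is None:
            return None
        if session["expires_at"] < self._clock():
            return None
        if session["revoked"]:
            return None
        return session["user_data"]
```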

Performance:

  • Redis lookup: ~1-2ms
  • In-memory lookup: ~0.1ms
  • Total validation time: <5ms

Expiration

Sessions automatically expire after 1 hour:

  • Redis Mode: TTL handled by Redis (automatic deletion)
  • In-Memory Mode: Probabilistic cleanup on validation (1% chance)
  • Manual Cleanup: session_manager.cleanup_expired() (returns count)
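Probabilistic cleanup for the in-memory mode might look like the sketch below: each validation has a 1% chance of sweeping expired entries, which spreads the cost without a background thread. The function names and the injectable random source are assumptions; only the 1% figure and the count-returning cleanup come from the text above.

```python
import random
from typing import Any, Callable, Dict

CLEANUP_PROBABILITY = 0.01  # the 1% chance described above


def cleanup_expired(sessions: Dict[str, Dict[str, Any]], now: float) -> int:
    """Delete expired sessions in place and return how many were removed."""
    expired = [sid for sid, data in sessions.items() if data["expires_at"] < now]
    for sid in expired:
        del sessions[sid]
    return len(expired)


def maybe_cleanup(
    sessions: Dict[str, Dict[str, Any]],
    now: float,
    rand: Callable[[], float] = random.random,
) -> int:
    """Run cleanup on roughly 1 in 100 calls; return the number removed (0 if skipped)."""
    if rand() < CLEANUP_PROBABILITY:
        return cleanup_expired(sessions, now)
    return 0
```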

Monitoring:

# Get active session count
count = session_manager.get_active_session_count()
print(f"Active sessions: {count}")


Session Revocation

Explicit Revocation (Logout)

Via /api/token/revoke:

  1. Extract jti from JWT
  2. Fetch session from Redis
  3. Mark revoked: True
  4. Update session in Redis with remaining TTL
  5. Session remains in Redis until TTL expires (audit trail)
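Step 4 (rewriting the session with its remaining TTL) is the subtle part, so here is a minimal sketch of that calculation. The function revoke_session is hypothetical and operates on the JSON string that would be read from Redis; with redis-py the caller could then write the result back using setex(key, remaining, updated).

```python
import json
from typing import Optional, Tuple


def revoke_session(raw_session: Optional[str], now: float) -> Optional[Tuple[str, int]]:
    """Return (updated JSON, remaining TTL in seconds), or None if nothing to revoke.

    None covers the documented 404 case: session not found or already revoked.
    An already-expired session is treated the same way (assumption).
    """
    if raw_session is None:
        return None
    session = json.loads(raw_session)
    if session.get("revoked"):
        return None
    remaining = int(session["expires_at"] - now)
    if remaining <= 0:
        return None
    session["revoked"] = True
    # Reusing the remaining TTL keeps the original expiry instead of extending it
    return json.dumps(session), remaining
```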

Why Keep Revoked Sessions?

  • Audit logging (track logout events)
  • Debug investigations (why did user get logged out?)
  • Forensics (detect suspicious logout patterns)

Cleanup

Expired sessions are automatically removed:

  • Redis: Automatic via TTL (no action needed)
  • In-Memory: Manual cleanup via cleanup_expired()
      • Called probabilistically (1% of validations)
      • Can be called manually via health endpoint or cron job

Permission Levels

HiveMatrix uses a four-tier permission model based on Keycloak group membership.

Permission Tiers

| Level      | Keycloak Group              | Description          | Typical Use Cases                          |
|------------|-----------------------------|----------------------|--------------------------------------------|
| admin      | admins or /admins           | Full system access   | System administrators, platform owners     |
| technician | technicians or /technicians | Technical operations | MSP technicians, engineers, helpdesk staff |
| billing    | billing or /billing         | Financial operations | Billing department, accountants            |
| client     | (default)                   | Limited access       | End-user clients, read-only access         |

Permission Determination

During token exchange (/api/token/exchange), Core:

  1. Fetches user info from Keycloak (includes groups array)
  2. Checks group membership in priority order:
    if 'admins' in groups or '/admins' in groups:
        permission_level = 'admin'
    elif 'technicians' in groups or '/technicians' in groups:
        permission_level = 'technician'
    elif 'billing' in groups or '/billing' in groups:
        permission_level = 'billing'
    else:
        permission_level = 'client'
    
  3. Embeds permission_level in JWT claims
  4. Services enforce permissions using decorators
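
The priority check in step 2 can also be written as a small standalone function. This is a sketch for illustration; the name `permission_level_for` is hypothetical and the actual code in Core may be structured differently.

```python
def permission_level_for(groups):
    """Map Keycloak group names to a HiveMatrix permission level."""
    # Treat "/admins" and "admins" as the same name
    names = {g.removeprefix("/") for g in groups}
    # Highest-privilege match wins, mirroring the if/elif order above
    for group, level in (
        ("admins", "admin"),
        ("technicians", "technician"),
        ("billing", "billing"),
    ):
        if group in names:
            return level
    return "client"  # default when no known group matches


print(permission_level_for(["/technicians", "billing"]))  # technician
```

A user in several groups receives the highest-priority level, and a user in none falls back to client.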

Service-Level Enforcement

Services use the permission_level claim from JWT:

from flask import abort, g, render_template

@app.route('/admin/settings')
@token_required
def admin_settings():
    # Check permission level
    if g.user.get('permission_level') != 'admin':
        abort(403, "Admin access required")

    # Admin-only logic here
    return render_template('admin/settings.html')

Built-in Decorators:

  • @token_required - Any authenticated user
  • @admin_required - Admin-level only (auto-checks permission)

Group Format

Keycloak groups may appear in two formats:

  • Without leading slash: admins, technicians, billing
  • With leading slash: /admins, /technicians, /billing

Core handles both formats for compatibility.


Configuration

Environment Variables

Core is configured entirely via environment variables (loaded from .flaskenv).

⚠️ Important: .flaskenv is auto-generated by config_manager.py from Helm's master_config.json. Do not edit manually.

Flask Configuration

# Flask application settings
FLASK_APP=run.py
FLASK_ENV=development  # or production (Flask itself ignores FLASK_ENV since 2.3; set FLASK_DEBUG=1 for debug mode)
SECRET_KEY=<generated-secret>  # Flask session encryption key
SERVICE_NAME=core

Keycloak Configuration

# Keycloak OAuth2 settings
KEYCLOAK_SERVER_URL=http://localhost:8080
KEYCLOAK_REALM=hivematrix
KEYCLOAK_CLIENT_ID=core-client
KEYCLOAK_CLIENT_SECRET=<client-secret>

Note: KEYCLOAK_CLIENT_SECRET is generated during Keycloak setup by configure_keycloak.sh and stored in master_config.json.

JWT Configuration

# JWT signing configuration
JWT_PRIVATE_KEY_FILE=keys/jwt_private.pem
JWT_PUBLIC_KEY_FILE=keys/jwt_public.pem
JWT_ISSUER=hivematrix-core
JWT_ALGORITHM=RS256

Service URLs

# Inter-service communication
CORE_SERVICE_URL=http://localhost:5000
NEXUS_SERVICE_URL=https://localhost
HELM_SERVICE_URL=http://localhost:5004
CODEX_SERVICE_URL=http://localhost:5010  # For user home page preference

Logging Configuration

# Logging settings (production features)
LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR
ENABLE_JSON_LOGGING=true  # Structured JSON logs with correlation IDs

Security Configuration

# SSL/TLS settings
VERIFY_SSL=False  # Set to True in production with valid certificates

⚠️ Security Warning: VERIFY_SSL=False is common in HiveMatrix deployments using self-signed certificates. For production with valid SSL certificates, set to True.

Session Configuration

Session settings are hardcoded in app/__init__.py:

app.config['SESSION_COOKIE_SECURE'] = False  # Set to True in production
app.config['SESSION_COOKIE_HTTPONLY'] = True  # Prevents JavaScript access
app.config['SESSION_COOKIE_SAMESITE'] = 'Lax'  # CSRF protection
app.config['PERMANENT_SESSION_LIFETIME'] = 3600  # 1 hour

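For Production:

  • Set SESSION_COOKIE_SECURE = True (requires HTTPS)
  • Keep SESSION_COOKIE_HTTPONLY = True (stops JavaScript from reading the cookie, limiting session theft via XSS)
  • Keep SESSION_COOKIE_SAMESITE = 'Lax' (mitigates CSRF by withholding the cookie on cross-site POSTs and subresource requests)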


RSA Key Generation

Core uses RSA key pairs for JWT signing and verification.

Generate Keys

cd hivematrix-core

# Create keys directory
mkdir -p keys

# Generate private key (2048-bit RSA)
openssl genrsa -out keys/jwt_private.pem 2048

# Extract public key
openssl rsa -in keys/jwt_private.pem -pubout -out keys/jwt_public.pem

# Set proper permissions
chmod 600 keys/jwt_private.pem
chmod 644 keys/jwt_public.pem

Key Rotation

To rotate keys (e.g., annually or after compromise):

  1. Generate new key pair
  2. Update .flaskenv to point to new keys
  3. Restart Core service
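  4. Services fetch the new public key from /.well-known/jwks.json (a service that has cached the old key needs a restart to pick it up)
  5. Tokens signed with the old key stop verifying once a service uses the new key, so affected users must re-authenticate; sessions otherwise expire within 1 hour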

Gradual Rollover: For zero-downtime rotation, Core can publish multiple keys in JWKS (future enhancement).

Key Security

Private Key (jwt_private.pem):

  • Never commit to git (included in .gitignore)
  • Permissions: 600 (owner read/write only)
  • Backup: Store securely (password manager, secrets vault)
  • Compromise: Rotate immediately if leaked

Public Key (jwt_public.pem):

  • Safe to distribute (published via JWKS endpoint)
  • Services cache this key for JWT verification
  • Automatically fetched on service startup


Security

Rate Limiting

Core implements per-user rate limiting using Flask-Limiter with Redis backend.

Per-User Limits

Rate limits are applied per user (JWT subject) instead of per IP address:

# In app/rate_limit_key.py
import jwt
from flask import request


def get_user_id_or_ip():
    """
    Extract user ID from JWT if available, otherwise use IP address.
    Prevents shared IP abuse (e.g., corporate NAT, VPN users).
    """
    auth_header = request.headers.get('Authorization', '')
    if auth_header.startswith('Bearer '):
        token = auth_header[7:]
        try:
            # The signature is not verified here; the result is only used
            # as a rate-limit key, not as proof of identity.
            payload = jwt.decode(token, options={"verify_signature": False})
            return f"user:{payload.get('sub', 'unknown')}"
        except jwt.InvalidTokenError:
            # Malformed token: fall back to the client IP
            pass
    return f"ip:{request.remote_addr}"

Benefits:

  • Users behind shared IPs (corporate NAT) don't affect each other
  • Prevents one abusive user from blocking entire organization
  • More accurate rate limiting for distributed systems

Endpoint Limits

| Endpoint | Limit | Reason |
|---|---|---|
| /login | 10/minute | Prevent credential stuffing |
| /auth | 20/minute | Prevent OAuth callback abuse |
| /api/token/exchange | 20/minute | Prevent token brute force |
| /service-token | Exempt | Protected by token caching |
| /api/token/validate | Exempt | Called frequently by services |
| /.well-known/jwks.json | Exempt | Public endpoint |
| /health | Exempt | Monitoring endpoint |
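
To make a limit such as "10/minute" concrete, here is a minimal fixed-window counter. In Core the limiting is done by Flask-Limiter; this sketch is not its implementation, and the class name, the fixed-window strategy, and the example key are assumptions chosen for illustration.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` hits per `window` seconds for each key."""

    def __init__(self, limit, window=60):
        self.limit = limit
        self.window = window
        self._counts = defaultdict(int)  # (key, window index) -> hits

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        bucket = (key, int(now // self.window))
        if self._counts[bucket] >= self.limit:
            return False   # caller would answer 429 Too Many Requests
        self._counts[bucket] += 1
        return True


login_limiter = FixedWindowLimiter(limit=10)  # "/login: 10/minute"
results = [login_limiter.allow("ip:203.0.113.7", now=0) for _ in range(12)]
print(results.count(True), results.count(False))  # 10 2
```

The sketch never discards old windows, so it is unsuitable for long-running use; a shared store with expiring keys, such as Redis, avoids that and also lets several instances share one count.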

Storage Backend

  • Primary: Redis (shared state for multi-instance deployments)
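  • Fallback: In-memory (if Redis unavailable)

# In app/__init__.py
import redis

try:
    redis_client = redis.Redis(host='localhost', port=6379)
    redis_client.ping()  # raises if Redis is unreachable
    storage_uri = "redis://localhost:6379"
except redis.RedisError:
    # Per-process counters: limits are not shared across instances
    storage_uri = "memory://"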

Response Headers

When a rate limit is exceeded, Core returns:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1700003600
Retry-After: 60

{
  "error": "Rate limit exceeded"
}
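
A client can use these headers to decide how long to back off. The helper below is a hypothetical client-side sketch. It assumes `Retry-After` carries a number of seconds and `X-RateLimit-Reset` a Unix timestamp, as in the example response; an HTTP-date in `Retry-After` is not parsed and falls through to the next hint.

```python
import time


def retry_delay(headers, now=None):
    """Seconds a client should wait after a 429, based on response headers."""
    now = time.time() if now is None else now
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.isdigit():
        return int(retry_after)
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.isdigit():
        return max(0, int(reset) - int(now))
    return 1  # conservative default when no hint is present


headers = {"Retry-After": "60", "X-RateLimit-Reset": "1700003600"}
print(retry_delay(headers))  # 60
```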

Error Handling

Core implements RFC 7807 Problem Details for standardized error responses.

Error Response Format

All errors follow this schema:

{
  "type": "https://tools.ietf.org/html/rfc7807",
  "title": "Unauthorized",
  "status": 401,
  "detail": "Invalid access token",
  "instance": "/api/token/exchange"
}

Fields:

  • type: URI reference identifying problem type
  • title: Short, human-readable summary
  • status: HTTP status code
  • detail: Specific explanation for this occurrence
  • instance: URI reference to specific occurrence
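
A body with these five fields can be assembled with the standard library alone. The helper name `problem_details` is hypothetical, and deriving `title` from the standard HTTP reason phrase is an assumption that happens to match the examples in this section.

```python
import json
from http import HTTPStatus


def problem_details(status, detail, instance):
    """Build an error body with the five fields described above."""
    return {
        "type": "https://tools.ietf.org/html/rfc7807",
        "title": HTTPStatus(status).phrase,  # e.g. 401 -> "Unauthorized"
        "status": status,
        "detail": detail,
        "instance": instance,
    }


body = problem_details(401, "Invalid access token", "/api/token/exchange")
print(json.dumps(body, indent=2))
```

RFC 7807 also defines the `application/problem+json` media type for such responses; whether Core sets that content type is not stated here.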

Standard Error Handlers

@app.errorhandler(400)  # Bad Request
@app.errorhandler(401)  # Unauthorized
@app.errorhandler(403)  # Forbidden
@app.errorhandler(404)  # Not Found
@app.errorhandler(500)  # Internal Server Error
@app.errorhandler(503)  # Service Unavailable

Example Error Responses

Bad Request:

{
  "type": "https://tools.ietf.org/html/rfc7807",
  "title": "Bad Request",
  "status": 400,
  "detail": "No access token provided",
  "instance": "/api/token/exchange"
}

Unauthorized:

{
  "type": "https://tools.ietf.org/html/rfc7807",
  "title": "Unauthorized",
  "status": 401,
  "detail": "Session expired or revoked",
  "instance": "/api/token/validate"
}

Internal Server Error:

{
  "type": "https://tools.ietf.org/html/rfc7807",
  "title": "Internal Server Error",
  "status": 500,
  "detail": "An unexpected error occurred",
  "instance": "/api/token/exchange"
}

Benefits

  • Machine-readable: Clients can parse and handle errors programmatically
  • Standardized: RFC 7807 is an internet standard
  • Consistent: All services use the same error format
  • Debuggable: instance field helps trace errors

SSL/TLS

Core supports configurable SSL verification for connecting to Keycloak.

Configuration

# In .flaskenv
VERIFY_SSL=False  # Development with self-signed certs
VERIFY_SSL=True   # Production with valid certificates

Usage

All requests to Keycloak respect this setting:

response = requests.get(
    userinfo_url,
    headers={'Authorization': f'Bearer {access_token}'},
    verify=current_app.config.get('VERIFY_SSL', True),
    timeout=10
)

When to Use:

  • Development: VERIFY_SSL=False (self-signed certs common)
  • Production: VERIFY_SSL=True (valid SSL certificates)

⚠️ Security Warning: Never use VERIFY_SSL=False in production with internet-facing services.


Monitoring & Observability

Health Checks

Core provides comprehensive health monitoring via the /health endpoint.

Health Check Components

1. Redis Connectivity

{
  "redis": {
    "status": "healthy",
    "latency_ms": 2,
    "connected_clients": 5,
    "used_memory_mb": 2.45
  }
}

Checks:

  • Connection test (PING command)
  • Response latency
  • Active client count
  • Memory usage

Status:

  • healthy: Connected, low latency (<10ms)
  • unhealthy: Connection failed or timeout
  • null: Redis not configured (in-memory mode)


2. Disk Space

{
  "disk": {
    "status": "healthy",
    "usage_percent": 45.67,
    "free_gb": 123.45,
    "total_gb": 250.0
  }
}

Thresholds:

  • healthy: <85% used
  • degraded: 85-95% used
  • unhealthy: ≥95% used


Overall Status Logic

if disk >= 95% or database down:
    return "unhealthy" (503)
elif redis down or disk >= 85%:
    return "degraded" (503)
else:
    return "healthy" (200)
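
The pseudocode above translates into runnable Python roughly as follows. The function names and the `redis_ok` and `database_ok` parameters are hypothetical; the thresholds and status codes are the ones given in this section.

```python
import shutil


def disk_status(usage_percent):
    """Classify disk usage with the thresholds listed above."""
    if usage_percent >= 95:
        return "unhealthy"
    if usage_percent >= 85:
        return "degraded"
    return "healthy"


def overall_status(usage_percent, redis_ok, database_ok=True):
    """Return (status, HTTP code) following the logic above."""
    if usage_percent >= 95 or not database_ok:
        return "unhealthy", 503
    if not redis_ok or usage_percent >= 85:
        return "degraded", 503
    return "healthy", 200


usage = shutil.disk_usage("/")                    # total, used, free in bytes
percent = round(usage.used / usage.total * 100, 2)
print(disk_status(percent), overall_status(percent, redis_ok=True))
```

Note that degraded and unhealthy both map to 503, so a caller that only looks at the status code cannot tell them apart and has to read the response body.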

Kubernetes Integration

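Use /health for readiness and liveness probes. Because a degraded status also returns 503, both probes fail while Redis is down or disk usage is at or above 85%: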

apiVersion: v1
kind: Pod
metadata:
  name: hivematrix-core
spec:
  containers:
  - name: core
    image: hivematrix/core:latest
    ports:
    - containerPort: 5000
    livenessProbe:
      httpGet:
        path: /health
        port: 5000
      initialDelaySeconds: 30
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /health
        port: 5000
      initialDelaySeconds: 10
      periodSeconds: 5
      failureThreshold: 2

Logging

Core implements structured JSON logging with correlation IDs for distributed tracing.

Structured Logging

Format:

{
  "timestamp": "2025-11-22T10:30:00.123Z",
  "level": "INFO",
  "service": "core",
  "correlation_id": "a1b2c3d4-e5f6-7890",
  "message": "User admin logged in",
  "user": "admin",
  "endpoint": "/auth",
  "method": "GET",
  "status_code": 302,
  "duration_ms": 145
}

Fields:

  • timestamp: ISO 8601 UTC timestamp
  • level: DEBUG, INFO, WARNING, ERROR
  • service: Always "core"
  • correlation_id: Unique request ID (traces request across services)
  • message: Log message
  • Additional context fields (user, endpoint, etc.)
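
A formatter producing the core fields above can be built on the standard `logging` module. This is a sketch under assumptions, not Core's logging code: the `JsonFormatter` class is hypothetical, and it expects the correlation ID to be passed through `extra`.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": ts.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": record.levelname,
            "service": self.service,
            # Set via extra={"correlation_id": ...}; None if the caller omitted it
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter("core"))
logger = logging.getLogger("core.example")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("User admin logged in", extra={"correlation_id": "a1b2c3d4-e5f6-7890"})
```

Context fields such as user or endpoint could be added to `entry` in the same way, by reading further attributes from the record.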

Correlation IDs

Correlation IDs enable tracing requests across multiple services:

User Request → Nexus (correlation_id: abc123)
  └─> Core validates token (correlation_id: abc123)
      └─> Codex fetches data (correlation_id: abc123)
          └─> Ledger calculates billing (correlation_id: abc123)

Search logs by correlation ID to see entire request flow.
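
Propagation of this kind usually means reusing an incoming ID and forwarding it on every outgoing call. The sketch below shows the idea; the header name `X-Correlation-ID` and both helper functions are assumptions, since this document does not say how HiveMatrix services pass the ID along.

```python
import uuid

HEADER = "X-Correlation-ID"  # assumed header name, not confirmed by this document


def correlation_id_from(headers):
    """Reuse the caller's correlation ID, or start a new trace."""
    incoming = headers.get(HEADER, "").strip()
    return incoming or str(uuid.uuid4())


def outgoing_headers(correlation_id, extra=None):
    """Headers to attach when calling the next service in the chain."""
    headers = dict(extra or {})
    headers[HEADER] = correlation_id
    return headers


cid = correlation_id_from({"X-Correlation-ID": "abc123"})  # e.g. Nexus -> Core
print(outgoing_headers(cid))  # {'X-Correlation-ID': 'abc123'}
```

With a plain dict the header lookup is case-sensitive; web frameworks typically expose case-insensitive header mappings.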

Configuration

# Enable/disable structured logging
ENABLE_JSON_LOGGING=true   # Production (machine-readable)
ENABLE_JSON_LOGGING=false  # Development (human-readable)

# Set log level
LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR

Centralized Logging

Core sends logs to Helm's PostgreSQL database:

from app.helm_logger import init_helm_logger

helm_logger = init_helm_logger('core', 'http://localhost:5004')
helm_logger.info("User logged in", extra={'user': username})

View Logs:

cd hivematrix-helm
source pyenv/bin/activate

# View Core logs
python logs_cli.py core --tail 50

# Filter by level
python logs_cli.py core --level ERROR --tail 100


Metrics

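Core exposes metrics via the health endpoint and logs:

Session Metrics

# Redis client connections (from /health; this is not a session count)
GET /health → checks.redis.connected_clients

# Active session count
session_manager.get_active_session_count()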

# Session operations
helm_logger.info("Session created", extra={'session_id': session_id})
helm_logger.info("Session validated", extra={'session_id': session_id})
helm_logger.info("Session revoked", extra={'session_id': session_id})

Performance Metrics

{
  "redis": {"latency_ms": 2},
  "request_duration_ms": 145
}

Future Enhancements

Planned metrics (not yet implemented):

  • Prometheus endpoint (/metrics)
  • Token exchange rate
  • Authentication success/failure rate
  • Permission level distribution
  • Service token usage


Development

Running Locally

Prerequisites:

  • Python 3.9+
  • Redis server
  • Keycloak 26.4.0+
  • PostgreSQL (optional, for Helm logging)

Setup:

cd hivematrix-core

# Create virtual environment
python3 -m venv pyenv
source pyenv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Generate RSA keys (if they do not already exist)
mkdir -p keys
openssl genrsa -out keys/jwt_private.pem 2048
openssl rsa -in keys/jwt_private.pem -pubout -out keys/jwt_public.pem

# Configure environment
# .flaskenv is auto-generated by config_manager.py
# To regenerate:
cd ../hivematrix-helm
python config_manager.py write-dotenv core
cd ../hivematrix-core

# Start Redis (runs in the foreground; use a separate terminal or add --daemonize yes)
redis-server --port 6379

# Run Core
python run.py

Expected Output:

SessionManager: Using Redis for persistent sessions
Flask-Limiter: Using Redis for rate limiting
 * Running on http://127.0.0.1:5000
 * Debug mode: on

Access Points:

  • API: http://localhost:5000
  • Swagger Docs: http://localhost:5000/docs
  • Health Check: http://localhost:5000/health
  • JWKS: http://localhost:5000/.well-known/jwks.json


Testing

Manual Testing

1. Generate Test Token:

cd ../hivematrix-helm
source pyenv/bin/activate

# Create admin token (24-hour expiry)
TOKEN=$(python create_test_token.py 2>/dev/null)
echo "Token: $TOKEN"

2. Validate Token:

curl -s -X POST http://localhost:5000/api/token/validate \
  -H "Content-Type: application/json" \
  -d "{\"token\": \"$TOKEN\"}" | jq

Expected:

{
  "valid": true,
  "user": {
    "sub": "admin",
    "name": "Admin User",
    "email": "admin@hivematrix.local",
    "preferred_username": "admin",
    "permission_level": "admin",
    "groups": ["admin"]
  }
}

3. Request Service Token:

SERVICE_TOKEN=$(curl -s -X POST http://localhost:5000/service-token \
  -H "Content-Type: application/json" \
  -d '{"calling_service": "codex", "target_service": "ledger"}' | jq -r '.token')

echo "Service Token: $SERVICE_TOKEN"

4. Check Health:

curl -s http://localhost:5000/health | jq


Integration Testing

Test Full OAuth Flow:

# 1. Start services
cd hivematrix-helm
./start.sh

# 2. Open browser to Nexus (macOS; on Linux use xdg-open)
open https://localhost

# 3. Click login → redirected to Keycloak
# 4. Login with admin/admin
# 5. Redirected back to HiveMatrix with token
# 6. Check session in Redis:

redis-cli KEYS "session:*"
redis-cli GET "session:{session_id}"

Load Testing

Test rate limiting and session performance:

# Install Apache Bench
sudo apt install apache2-utils

# Test login endpoint (rate limit: 10/min)
ab -n 20 -c 2 http://localhost:5000/login

# Should see:
# - First 10 requests: 302 (redirect)
# - Next 10 requests: 429 (rate limited)

# Test health endpoint (no rate limit)
ab -n 1000 -c 10 http://localhost:5000/health

API Documentation

Core includes auto-generated OpenAPI/Swagger documentation.

Access: http://localhost:5000/docs

Features:

  • Interactive API testing
  • Request/response examples
  • Authentication configuration
  • Model schemas
  • Download OpenAPI spec: /apispec.json

Example Usage:

  1. Open http://localhost:5000/docs
  2. Click "Authorize" button
  3. Enter JWT token in format: Bearer {token}
  4. Click endpoint to test (e.g., /api/token/validate)
  5. Click "Try it out"
  6. Enter parameters
  7. Click "Execute"
  8. View response

Troubleshooting

"Invalid token" Errors

Symptoms:

  • /api/token/validate returns {"valid": false}
  • Services return 401 Unauthorized

Causes & Solutions:

1. Token Expired (1 hour TTL)

# Decode token to check expiration
python3 -c "import jwt, sys; print(jwt.decode(sys.argv[1], options={'verify_signature': False}))" "$TOKEN"

# Check 'exp' field (Unix timestamp)
# If expired, request new token via /login or /api/token/exchange
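
If PyJWT is not installed on the machine you are debugging from, the payload can be inspected with the standard library alone. This is a debugging sketch with hypothetical helper names; it reads the claims without verifying the signature, so it must not be used to decide whether a token is trustworthy.

```python
import base64
import json
import time


def jwt_payload(token):
    """Decode a JWT's payload segment WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWTs use unpadded base64url; restore padding before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_until_expiry(token, now=None):
    """Positive while the token is still valid, negative once expired."""
    now = time.time() if now is None else now
    return jwt_payload(token)["exp"] - now


def _b64(obj):
    # Helper used only to build an unsigned demo token for this example
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


demo = f'{_b64({"alg": "none"})}.{_b64({"sub": "admin", "exp": 1700003600})}.'
print(seconds_until_expiry(demo, now=1700000000))  # 3600
```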

2. Session Revoked (Logged Out)

# Check session in Redis
redis-cli GET "session:{jti}"

# If revoked: {"revoked": true, ...}
# Solution: Re-authenticate

3. Core Service Restarted (In-Memory Sessions Lost)

# Check if Redis is being used
redis-cli KEYS "session:*"

# If no sessions, check Core logs:
tail -f logs/core.log | grep "SessionManager"

# Expected: "Using Redis for persistent sessions"
# If using in-memory: Sessions lost on restart
# Solution: Start Redis or re-authenticate

4. Invalid JWT Signature

# Check public key matches
curl -s http://localhost:5000/.well-known/jwks.json | jq

# Services cache this key
# If Core keys rotated, restart services to fetch new key


Session Revocation Not Working

Symptoms:

  • User clicks logout but can still access pages
  • /api/token/revoke returns 200 but session still valid

Troubleshooting:

1. Check Session ID in JWT

# Decode token
python3 -c "import jwt, sys; print(jwt.decode(sys.argv[1], options={'verify_signature': False}))" "$TOKEN"

# Verify 'jti' claim exists
# If missing, token was issued before session tracking was added

2. Verify Session Marked as Revoked

SESSION_ID="..." # from jti claim
redis-cli GET "session:$SESSION_ID"

# Should show: {"revoked": true, ...}
# If revoked: false, revocation failed

3. Check Nexus Validation

# Nexus should validate token with Core on every request
# Check Nexus logs for validation calls:
cd ../hivematrix-helm
python logs_cli.py nexus --tail 50 | grep validate

4. Browser Cache

# User may see cached page after logout
# Core sets cache control headers to prevent this:
# Cache-Control: no-cache, no-store, must-revalidate

# Test in incognito/private window
# Force refresh: Ctrl+Shift+R (Chrome/Firefox)


Service Tokens Failing

Symptoms:

  • POST /service-token returns 400 Bad Request
  • Service-to-service calls fail with "Unknown calling_service"

Causes & Solutions:

1. Invalid Service Name Format

# Service names must match: ^[a-z0-9_-]{1,50}$
# ✅ Valid: codex, ledger, brain-hair
# ❌ Invalid: Codex, ledger!, service@123

curl -X POST http://localhost:5000/service-token \
  -H "Content-Type: application/json" \
  -d '{"calling_service": "CODEX", "target_service": "ledger"}'

# Returns: {"error": "Invalid calling_service format"}
# Solution: Use lowercase, alphanumeric + hyphens/underscores only
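
The same format rule can be checked locally before calling Core. The helper below is a hypothetical client-side sketch that applies the pattern quoted above; Core's own validation code may differ in detail.

```python
import re

# Same character set and length as the documented pattern ^[a-z0-9_-]{1,50}$
SERVICE_NAME = re.compile(r"[a-z0-9_-]{1,50}")


def is_valid_service_name(name):
    """True if `name` is 1-50 lowercase letters, digits, hyphens or underscores."""
    # fullmatch anchors both ends, so a trailing newline cannot slip through
    return isinstance(name, str) and SERVICE_NAME.fullmatch(name) is not None


for name in ("codex", "brain-hair", "Codex", "ledger!", "service@123"):
    print(f"{name!r}: {is_valid_service_name(name)}")
```

Passing this check only means the name is well-formed; the service must still be present in the registry, as described in the next step.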

2. Service Not in Registry

# Core validates against services.json
cat services.json | jq 'keys'

# If service missing, add to apps_registry.json and regenerate:
cd ../hivematrix-helm
python install_manager.py update-config

3. Token Expired (5-Minute TTL)

# Service tokens expire quickly
# Check token age:
python3 -c "import jwt, sys, time; \
  payload = jwt.decode(sys.argv[1], options={'verify_signature': False}); \
  age = time.time() - payload['iat']; \
  print(f'Age: {age:.0f}s, Expires in: {payload[\"exp\"] - time.time():.0f}s')" "$SERVICE_TOKEN"

# If expired, request new token
# Note: service_client.py caches tokens automatically


Redis Connection Issues

Symptoms:

  • Core starts but logs: "SessionManager: Using in-memory sessions"
  • Sessions lost after Core restart
  • Rate limiting inconsistent

Troubleshooting:

1. Check Redis Running

# Test Redis connectivity
redis-cli ping
# Expected: PONG

# If connection refused:
sudo systemctl status redis
sudo systemctl start redis

2. Check Redis Port

# Core connects to localhost:6379
# Verify Redis listening:
sudo netstat -tlnp | grep 6379

# Expected: tcp 127.0.0.1:6379 LISTEN

3. Check Redis Authentication

# If Redis requires password:
redis-cli -a yourpassword ping

# Update Core to use password (not currently supported)
# Workaround: Disable Redis auth for localhost

4. Redis Memory Limits

# Check Redis memory usage
redis-cli INFO memory

# If near maxmemory limit, Redis may evict sessions
# Increase limit in /etc/redis/redis.conf:
maxmemory 256mb
maxmemory-policy allkeys-lru

sudo systemctl restart redis


Keycloak Connection Errors

Symptoms:

  • /api/token/exchange fails with "Invalid access token"
  • Login redirect fails
  • OAuth callback errors

Troubleshooting:

1. Verify Keycloak Running

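# Check Keycloak service
curl http://localhost:8080/health

# Expected: 200 OK with health status
# Keycloak 25+ serves /health on the management port (9000 by default) and
# only when health-enabled=true; if this returns 404, check the realm instead:
curl http://localhost:8080/realms/hivematrix/.well-known/openid-configuration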

# If connection refused:
cd ../keycloak-26.4.0
bin/kc.sh start-dev

2. Check Keycloak URL Configuration

# Core should use correct Keycloak URL
grep KEYCLOAK .flaskenv

# Expected:
# KEYCLOAK_SERVER_URL=http://localhost:8080
# KEYCLOAK_REALM=hivematrix
# KEYCLOAK_CLIENT_ID=core-client

3. Validate Client Secret

# Test Keycloak client credentials
KC_URL="http://localhost:8080"
REALM="hivematrix"
CLIENT_ID="core-client"
CLIENT_SECRET="..." # from .flaskenv

# Get token
curl -X POST "$KC_URL/realms/$REALM/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=$CLIENT_ID" \
  -d "client_secret=$CLIENT_SECRET"

# Should return access token
# If error, regenerate client secret via Keycloak admin console

4. SSL Verification Issues

# If using self-signed certificates:
# Set VERIFY_SSL=False in .flaskenv

# Check current setting:
grep VERIFY_SSL .flaskenv

# Test connection:
python3 -c "import requests; \
  r = requests.get('http://localhost:8080/health', verify=False); \
  print(f'Status: {r.status_code}')"


Health Check Failing

Symptoms:

  • /health returns 503 Service Unavailable
  • Kubernetes probes failing
  • Monitoring alerts

Troubleshooting:

1. Check Health Response

curl -s http://localhost:5000/health | jq

# Look for unhealthy components:
# - redis.status != "healthy"
# - disk.status == "unhealthy"

2. Redis Health

# If Redis unhealthy:
redis-cli ping

# Check Core can connect:
redis-cli -h 127.0.0.1 -p 6379 ping

3. Disk Space

# Check disk usage
df -h /

# If >= 95%, clean up:
# - Old logs: rm -f logs/*.log.old
# - Docker images: docker system prune -a
# - Package cache: sudo apt clean

4. Degraded vs Unhealthy

# Degraded (503) = non-critical issues
# - Redis down but in-memory working
# - Disk 85-95% full

# Unhealthy (503) = critical issues
# - Disk >= 95% full

# Core can still serve requests when degraded
# Fix non-critical issues to return to healthy (200)


See Also

Tools & Utilities

  • Token Testing: hivematrix-helm/create_test_token.py - Generate test JWT tokens
  • Token Validation: curl -X POST http://localhost:5000/api/token/validate
  • Log Viewer: hivematrix-helm/logs_cli.py core --tail 50
  • Security Audit: hivematrix-helm/security_audit.py --audit
  • Config Manager: hivematrix-helm/config_manager.py sync-all
  • Health Check: curl http://localhost:5000/health

Last Updated: 2025-11-22 Version: 1.0 Maintained By: HiveMatrix Team