
Copyright 2026 Moonlight Technologies Inc. All Rights Reserved.

Author: Masahiro Aoki

EvoSpikeNet-BrainOS — Implementation Plan

Last updated: 2026-05-30 (v0.2.0 — Phase 1+2 complete, R5 client implemented)


Executive Summary

v0.2.0 key implementations (Phase 1 + Phase 2 + R5):

  • Phase 1 Foundation complete: brainos/config.py, brainos/models.py, brainos/event_bus/, brainos/security/, brainos/identity/, brainos/world_model/, brainos/observability/, brainos/audit/, api/app.py
  • Phase 2 Cognitive Loop complete: brainos/pfc/, brainos/safety/, brainos/degradation/, brainos/memory/, brainos/adapters/
  • R5 Cross-Platform-Client complete: Python SDK (brainos/client/sdk.py) + PWA Web Dashboard (api/static/) + Service Worker
  • Test results: 157 passed, 1 skipped

Next phase (Phase 3 — R1–R4 implementation):

Implementation order: R1 Multi-Platform → R2 Offline-AI → R4 Zero-Disconnection → R3 Genome-Sync, following an 8-week roadmap.


Phase 1: Foundation (complete)

| Task | Module | DoD | Status |
|---|---|---|---|
| Event bus startup | brainos/event_bus/ | All nodes communicating | Complete |
| API authentication | brainos/identity/ | ALLOW_NO_AUTH=false enforced in production, 401 tests pass | Complete |
| Secure communication | brainos/security/ | HMAC signing on all messages, unsigned payloads rejected | Complete |
| World Model (minimal) | brainos/world_model/ | Entity CRUD + snapshot | Complete |
| Observability | brainos/observability/ | Prometheus counters collected | Complete |
| Audit log | brainos/audit/ | SHA-256 chain verify_chain() passes | Complete |

Tests: tests/test_phase1_foundation.py — 52 passed
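The audit-log DoD above depends on a SHA-256 hash chain. The following is a minimal sketch of how such a chain could be built and checked. Only the name verify_chain() comes from the plan; the AuditChain class, its append() method, the record layout, and GENESIS_HASH are assumptions made for illustration, not the contents of brainos/audit/.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


class AuditChain:
    """Hypothetical append-only log; each entry commits to its predecessor."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, payload: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        self.entries.append(
            {"payload": payload, "prev_hash": prev, "hash": _entry_hash(prev, payload)}
        )

    def verify_chain(self) -> bool:
        """Recompute every hash; an edited or reordered entry breaks the chain."""
        prev = GENESIS_HASH
        for entry in self.entries:
            if entry["prev_hash"] != prev:
                return False
            if entry["hash"] != _entry_hash(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True
```

Because each hash covers the previous one, changing any stored payload invalidates that entry and every entry after it, which is what a periodic verify_chain() run would detect.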


Phase 2: Cognitive Loop (complete)

| Task | Module | DoD | Status |
|---|---|---|---|
| PFC node | brainos/pfc/ | make_decision() round-trip stable | Complete |
| Conscience Circuit safety guard | brainos/safety/ | CRITICAL block + Human Approval queue working | Complete |
| Graceful degradation | brainos/degradation/ | ServiceLevel downgrade on node failure | Complete |
| Memory system | brainos/memory/ | Write / retrieve / TTL retention policy round-trip | Complete |
| First domain adapter | brainos/adapters/ | Observe→Decide→Act→Learn full path | Complete |

Tests: tests/test_phase2_cognitive.py — 78 passed
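The memory-system DoD above mentions a write / retrieve / TTL retention round-trip. A minimal sketch of that behavior, assuming a simple in-process key-value store, could look like this. The TTLMemory class, its method names, and the injectable clock are hypothetical and are not taken from brainos/memory/.

```python
import time
from typing import Any, Callable, Optional


class TTLMemory:
    """Hypothetical key-value memory whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._items: dict[str, tuple[float, Any]] = {}

    def write(self, key: str, value: Any) -> None:
        """Store the value together with its absolute expiry time."""
        self._items[key] = (self._clock() + self._ttl, value)

    def retrieve(self, key: str) -> Optional[Any]:
        """Return the value, or None if it is missing or has expired."""
        item = self._items.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._items[key]  # lazy eviction on read
            return None
        return value
```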


R5: Cross-Platform-Client (complete — v0.2.0)

| Task | Module | DoD | Status |
|---|---|---|---|
| Python SDK | brainos/client/sdk.py | Works on Windows / Linux / macOS with httpx only | Complete |
| PWA Web Dashboard | api/static/index.html | All 5 platforms, no external CDN | Complete |
| Service Worker | api/static/sw.js | Offline cache + home screen install | Complete |
| CORS middleware | api/app.py | Controlled via BRAINOS_CORS_ORIGINS | Complete |

Tests: tests/test_phase3_client.py — 27 passed


Phase 3: Multi-App Integration — R1–R4 Implementation

All R1–R4 items are release conditions for BrainOS v1.0 (MUST).

Priority & Dependencies

| Requirement | Phase | Priority | Dependency |
|---|---|---|---|
| R1 Multi-Platform | Phase 3 first half | HIGH | Phase 2 complete (✅ done) |
| R2 Offline-AI | Phase 3 first half | HIGH | R1 BrainOSHAL, LocalModelCache |
| R4 Zero-Disconnection | Phase 3 first half ~ ongoing | HIGH | Phase 2 complete (✅ done) |
| R3 Genome-Sync | Phase 3 second half | MEDIUM | R2 delta-sync protocol |
| R5 Cross-Platform-Client | ✅ Complete (v0.2.0) | | |

Detailed Roadmap (8 weeks)

First Half (weeks 1–4)
├── Week 1-2: brainos/platform/  (R1)
│            · PlatformDetector — detect 7 platforms
│              (CPU / GPU / Jetson / RasPi / EdgeTPU / Loihi / Quantum)
│            · BrainOSHAL — unified load_model / run_inference interface
│            · Dockerfile.raspi + requirements-raspi.txt
├── Week 2-3: brainos/offline/   (R2)
│            · OfflineModeManager — state machine (ONLINE/DEGRADED_ONLINE/OFFLINE)
│            · LocalModelCache — ONNX cache management
│            · Offline decision API integration test
└── Week 3-4: brainos/watchdog/  (R4)
             · BrainOSSupervisor + HealthProbe
             · AutoRecoverySystem integration
             · 3 new API endpoints

Second Half (weeks 5–8)
├── Week 5-6: brainos/genome/    (R3)
│            · GenomeSyncIntegrator — broadcast / receive / merge
│            · Delta sync protocol — export_delta / apply_delta
│            · 3 new Zenoh topics + 3 new REST endpoints
└── Week 7-8: Integration & regression tests
             · Full R1–R4 DoD verification
             · Phase 1+2 regression (157 passed baseline)
             · Chaos tests (Zenoh disconnect, node failure, offline toggle)
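The roadmap lists OfflineModeManager as a state machine over ONLINE / DEGRADED_ONLINE / OFFLINE. The sketch below shows one way those transitions might be driven by connectivity probes. The three state names come from the plan; the report_probe() method, the offline_after threshold, and the transition rule itself are assumptions.

```python
from enum import Enum


class OfflineState(str, Enum):
    ONLINE = "online"
    DEGRADED_ONLINE = "degraded_online"
    OFFLINE = "offline"


class OfflineModeManager:
    """Hypothetical state machine driven by connectivity probe results."""

    def __init__(self, offline_after: int = 3) -> None:
        self.offline_after = offline_after  # assumed: consecutive failures before OFFLINE
        self.failures = 0
        self.state = OfflineState.ONLINE

    def report_probe(self, ok: bool) -> OfflineState:
        """Feed one probe result and return the resulting state."""
        if ok:
            self.failures = 0
            self.state = OfflineState.ONLINE
        else:
            self.failures += 1
            if self.failures >= self.offline_after:
                self.state = OfflineState.OFFLINE
            else:
                self.state = OfflineState.DEGRADED_ONLINE
        return self.state
```

In this sketch a single successful probe returns the node straight to ONLINE; a real implementation might require several successes before leaving OFFLINE to avoid flapping.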

New Modules (Phase 3)

| Module | Requirement | Key Classes |
|---|---|---|
| brainos/platform/detector.py | R1 | PlatformDetector, PlatformKind (7 platforms) |
| brainos/platform/hal.py | R1 | BrainOSHAL: unified load_model(), run_inference() |
| brainos/offline/mode.py | R2 | OfflineModeManager, OfflineState |
| brainos/offline/local_cache.py | R2 | LocalModelCache: ONNX cache |
| brainos/offline/client_agent.py | R2 | OfflineClientAgent: client-side offline inference |
| brainos/offline/llm_router.py | R2 | LLMRouter: BrainOS API / local LLM / rule-based fallback chain |
| brainos/genome/integrator.py | R3 | GenomeSyncIntegrator: broadcast / merge / delta |
| brainos/genome/protocol.py | R3 | Zenoh topic constants (brainos/genome/*) |
| brainos/genome/serializer.py | R3 | GenomeSerializer: delta serialization (MessagePack) |
| brainos/genome/merger.py | R3 | GenomeMerger: conflict resolution, weighted average, crossover |
| brainos/watchdog/supervisor.py | R4 | BrainOSSupervisor, RestartPolicy, SupervisedNode |
| brainos/watchdog/health_probe.py | R4 | HealthProbe: HTTP polling + timeout |
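As an illustration of the PlatformDetector / PlatformKind pair listed above, here is a deliberately partial sketch. The seven enum members mirror the platforms named in the roadmap, but the enum values, the detect_platform() function, and the choice of marker files are assumptions. Only Jetson and Raspberry Pi are probed; the remaining platforms are left out rather than guessed.

```python
from enum import Enum
from pathlib import Path


class PlatformKind(str, Enum):
    CPU = "cpu"
    GPU = "gpu"
    JETSON = "jetson"
    RASPI = "raspi"
    EDGETPU = "edgetpu"
    LOIHI = "loihi"
    QUANTUM = "quantum"


def _read_text(path: str) -> str:
    """Return the file contents, or an empty string if the file is unreadable."""
    try:
        return Path(path).read_text(errors="ignore")
    except OSError:
        return ""


def detect_platform(read_text=_read_text) -> PlatformKind:
    """Probe two board types via marker files; fall back to CPU otherwise."""
    # Assumption: Jetson (L4T) images ship /etc/nv_tegra_release.
    if read_text("/etc/nv_tegra_release"):
        return PlatformKind.JETSON
    # Assumption: Raspberry Pi OS exposes the board name in the device tree.
    if "Raspberry Pi" in read_text("/proc/device-tree/model"):
        return PlatformKind.RASPI
    return PlatformKind.CPU  # GPU / EdgeTPU / Loihi / Quantum probes omitted here
```

Passing the file reader as a parameter lets tests simulate Jetson or RasPi hardware on an ordinary CI machine.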

R2 Detailed Design: Client-Side Offline Operation + Local LLM

The current brainos/offline/ design covers only server-side (BrainOS node) offline continuity. Behavior when a client (Python SDK or PWA) cannot reach the BrainOS server is undefined. The following design fills that gap.

LLM Fallback Chain (brainos/offline/llm_router.py)

Client → BrainOS API (/api/v1/cognitive/decide)
       ↓ timeout / connection failure
       → Local LLM (Ollama / llama.cpp HTTP API, localhost:11434)
       ↓ Ollama not running / model not downloaded
       → Built-in rule engine (decision-tree minimal AI)

# brainos/offline/llm_router.py
from __future__ import annotations  # lets stubs reference types defined elsewhere

from enum import Enum


class LLMBackendKind(str, Enum):
    BRAINOS_API  = "brainos_api"   # BrainOS REST API (normal)
    OLLAMA       = "ollama"        # Local Ollama (http://localhost:11434)
    LLAMACPP     = "llamacpp"      # llama.cpp HTTP server
    RULE_ENGINE  = "rule_engine"   # Fallback rule engine


class LLMRouter:
    """
    Routes inference requests according to backend priority order.
    Automatically falls back to the next backend on connection failure.
    """
    def __init__(self, backends: list[LLMBackendKind], timeout_ms: int = 3000) -> None: ...

    def route(self, prompt: str, context: dict) -> LLMResponse:
        # 1. BRAINOS_API → fail → 2. OLLAMA / LLAMACPP → fail → 3. RULE_ENGINE
        ...

    def get_active_backend(self) -> LLMBackendKind: ...

    def get_latency_stats(self) -> dict: ...  # p50/p95 per backend
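To make the fallback behavior concrete, the following is a minimal runnable sketch of a priority-ordered router. It is not the planned LLMRouter: the FallbackRouter class is hypothetical, backends are modelled as plain callables instead of HTTP clients, and the choice to treat ConnectionError and TimeoutError as "try the next backend" is an assumption.

```python
from enum import Enum
from typing import Callable, Optional


class LLMBackendKind(str, Enum):
    BRAINOS_API = "brainos_api"
    OLLAMA = "ollama"
    LLAMACPP = "llamacpp"
    RULE_ENGINE = "rule_engine"


# Assumption: a backend is any callable (prompt, context) -> answer text.
Backend = Callable[[str, dict], str]


class FallbackRouter:
    """Tries backends in priority order and falls through on connection errors."""

    def __init__(self, backends: list[tuple[LLMBackendKind, Backend]]) -> None:
        self._backends = backends
        self._active: Optional[LLMBackendKind] = None

    def route(self, prompt: str, context: dict) -> str:
        for kind, call in self._backends:
            try:
                answer = call(prompt, context)
            except (ConnectionError, TimeoutError):
                continue  # backend unreachable: try the next one
            self._active = kind  # remember which backend answered
            return answer
        raise RuntimeError("no backend produced an answer")

    def get_active_backend(self) -> Optional[LLMBackendKind]:
        return self._active
```

Placing the rule engine last guarantees an answer as long as it never raises, which matches the intent of the chain: degrade in quality, not in availability.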

Client-Side Offline Agent (brainos/offline/client_agent.py)

# brainos/offline/client_agent.py
from __future__ import annotations  # lets stubs reference types defined elsewhere


class OfflineClientAgent:
    """
    Autonomous agent that operates when the BrainOS SDK client
    (brainos/client/sdk.py) cannot reach the BrainOS server.

    - Local decision cache (saves the last N decide() results)
    - Inference via local LLM
    - Accumulates operation logs in a queue during offline period;
      syncs to BrainOS server on reconnection
    """
    def __init__(self, llm_router: LLMRouter, local_cache: "LocalModelCache") -> None: ...

    def decide(self, objective: str, context: dict) -> DecisionResult:
        # LLMRouter.route() for local inference; enqueue result to offline_queue
        ...

    def flush_queue(self, client: "BrainOSClient") -> SyncReport:
        # After reconnection: batch-send accumulated operation log (idempotent)
        ...

    def get_queue_size(self) -> int: ...

    def is_offline(self) -> bool: ...
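The "idempotent" flush noted in the stub can be approached by tagging each queued operation with a unique ID and removing an entry only after it has been sent successfully. The sketch below illustrates that idea. The OfflineQueue class, the op_id field, and the send callable are hypothetical, and server-side deduplication on op_id is assumed rather than shown.

```python
import uuid
from collections import deque
from typing import Callable


class OfflineQueue:
    """Hypothetical operation log that can be flushed repeatedly without loss."""

    def __init__(self) -> None:
        self._pending: deque[dict] = deque()

    def enqueue(self, operation: dict) -> str:
        # Assumed idempotency key: the server can drop an op_id it has already seen.
        op_id = str(uuid.uuid4())
        self._pending.append({"op_id": op_id, **operation})
        return op_id

    def flush(self, send: Callable[[dict], None]) -> int:
        """Send entries oldest-first; stop at the first failure and keep the rest."""
        sent = 0
        while self._pending:
            try:
                send(self._pending[0])
            except ConnectionError:
                break  # still offline: the entry stays queued for the next flush
            self._pending.popleft()  # remove only after a successful send
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

If the connection drops after the server receives an entry but before the client removes it, the entry is resent on the next flush with the same op_id, so the server can recognize and ignore the duplicate.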

PWA Client Offline Behavior (api/static/)

The Service Worker already caches static assets so the UI works offline. For API access:

| Case | Behavior |
|---|---|
| BrainOS server reachable | Normal API call |
| Server unreachable, Service Worker cache hit | Show cached response (read-only) |
| Ollama running at localhost | Direct fetch to Ollama (/api/chat) |
| All unreachable | Show offline banner + rule engine JSON response |

Offline Zenoh Topics (additional)

| Topic | Direction | Description |
|---|---|---|
| brainos/offline/client_queue | Client → Server | Bulk transfer of operation log after reconnection |
| brainos/offline/sync_ack | Server → Client | Acknowledgement of client queue receipt |

Additional DoD (R2 client-side)

  • [ ] OfflineClientAgent.decide() returns a result from local LLM while BrainOS server is down
  • [ ] LLMRouter falls back to rule engine when Ollama is not running
  • [ ] After reconnection, flush_queue() sends accumulated operation log to BrainOS (integration test)
  • [ ] PWA displays the dashboard while offline (Service Worker + cache verification)

R3 Detailed Design: Genome Send/Receive Protocol

The current plan only mentions broadcast / receive / merge in GenomeSyncIntegrator. The actual data structure, delta format, conflict resolution policy, and client-side storage are undefined. The following adds those details.

Genome Data Structure (using EvoSpikeNet-Core/evospikenet/genome.py)

BrainOS wraps Core's EvoGenome in the following envelope for transport:

from dataclasses import dataclass
from datetime import datetime

@dataclass
class GenomePacket:
    packet_id: str              # UUID v4 (idempotency key)
    sender_node_id: str         # Sending node ID
    genome: dict                # EvoGenome.to_dict() result
    generation: int             # Generation number
    fitness_score: float        # Latest fitness (used for merge priority)
    evolved_since: datetime     # Timestamp of last evolution
    platform: str               # PlatformKind (identifies hardware-dependent evolution)
    signature: bytes            # HMAC-SHA256 via secure_serialization.pack()
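The signature field is described as HMAC-SHA256. The internals of secure_serialization.pack() are not shown in this plan, so the following is only a sketch of the general technique using the Python standard library. The sign_payload() and verify_payload() helpers and the canonical-JSON encoding are assumptions.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, key: bytes) -> bytes:
    """Return an HMAC-SHA256 tag over a canonical JSON encoding of the payload."""
    # Sorted keys and fixed separators make the byte encoding deterministic.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).digest()


def verify_payload(payload: dict, signature: bytes, key: bytes) -> bool:
    """Check the tag with a constant-time comparison to avoid timing leaks."""
    return hmac.compare_digest(sign_payload(payload, key), signature)
```

A receiver that rejects any packet for which verify_payload() returns False gets the "unsigned payloads rejected" behavior required by the Phase 1 DoD.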

Delta Serialization (brainos/genome/serializer.py)

# brainos/genome/serializer.py
class GenomeSerializer:
    """
    MessagePack serializer for compact send/receive of EvoGenome deltas.
    Switches between full serialization and delta based on context.
    """
    @staticmethod
    def serialize_full(genome: "EvoGenome") -> bytes: ...  # Full MessagePack

    @staticmethod
    def serialize_delta(base: "EvoGenome", current: "EvoGenome") -> bytes:
        # Extracts only changed Gene/Chromosome
        # Delta format: {"changed": {gene_id: new_value}, "added": [...], "removed": [...]}
        ...

    @staticmethod
    def deserialize(data: bytes) -> "EvoGenome": ...

    @staticmethod
    def apply_delta(base: "EvoGenome", delta: bytes) -> "EvoGenome": ...
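The delta format in the stub can be illustrated on plain dictionaries. This sketch treats a genome as a flat {gene_id: value} mapping and uses JSON from the standard library in place of MessagePack so that it runs without extra dependencies. Both simplifications are assumptions, as are the function names and the shape of each "added" entry.

```python
import json


def make_delta(base: dict, current: dict) -> dict:
    """Compare two {gene_id: value} mappings and record only the differences."""
    return {
        "changed": {k: v for k, v in current.items() if k in base and base[k] != v},
        "added": [{"gene_id": k, "value": v} for k, v in current.items() if k not in base],
        "removed": [k for k in base if k not in current],
    }


def apply_delta(base: dict, delta: dict) -> dict:
    """Rebuild the current mapping from base plus delta; base is not mutated."""
    result = {k: v for k, v in base.items() if k not in delta["removed"]}
    result.update(delta["changed"])
    for item in delta["added"]:
        result[item["gene_id"]] = item["value"]
    return result


def encode(delta: dict) -> bytes:
    """Stand-in wire encoding (JSON here; the plan specifies MessagePack)."""
    return json.dumps(delta, sort_keys=True).encode("utf-8")
```

The property worth unit-testing is the round trip: applying make_delta(base, current) to base should reproduce current exactly.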

Merge and Conflict Resolution (brainos/genome/merger.py)

# brainos/genome/merger.py
from enum import Enum


class MergeStrategy(str, Enum):
    FED_AVG     = "fed_avg"     # Weighted average based on DynamicFedAvgStrategy
    CROSSOVER   = "crossover"   # GenomePool crossover (high-fitness priority)
    TOURNAMENT  = "tournament"  # Tournament selection (top fitness priority)


class GenomeMerger:
    """
    Merges EvoGenome received from multiple nodes.
    Conflict resolution: prefer the Gene with the higher fitness_score;
    when tied, prefer the one with the more recent evolved_since (LWW).
    """
    def __init__(self, strategy: MergeStrategy = MergeStrategy.FED_AVG) -> None: ...

    def merge(
        self,
        local: "EvoGenome",
        peers: list["EvoGenome"],
    ) -> "EvoGenome":
        # FED_AVG: fitness_score-weighted average of weight tensors
        # CROSSOVER: uses Core GenomePool.crossover()
        # TOURNAMENT: select from top 50% by fitness_score
        ...

    def resolve_conflict(self, gene_a: "Gene", gene_b: "Gene") -> "Gene":
        # Tie: adopt the one with more recent evolved_since (LWW)
        ...
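The conflict rule and the FED_AVG weighting can be sketched on scalar values. The ScoredGene dataclass is a hypothetical stand-in for Core's Gene that carries only the fields the rule needs; real genomes hold weight tensors, and DynamicFedAvgStrategy may weight peers differently. Fitness scores are assumed to be non-negative.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ScoredGene:
    """Hypothetical stand-in for a Gene with the fields used by the merge rules."""
    value: float
    fitness_score: float
    evolved_since: datetime


def resolve_conflict(a: ScoredGene, b: ScoredGene) -> ScoredGene:
    """Higher fitness wins; on a tie the more recently evolved gene wins (LWW)."""
    if a.fitness_score != b.fitness_score:
        return a if a.fitness_score > b.fitness_score else b
    return a if a.evolved_since >= b.evolved_since else b


def fed_avg(genes: list[ScoredGene]) -> float:
    """Fitness-weighted mean of values; plain mean if every weight is zero."""
    total = sum(g.fitness_score for g in genes)
    if total == 0:
        return sum(g.value for g in genes) / len(genes)
    return sum(g.value * g.fitness_score for g in genes) / total
```

For example, values 2.0 and 4.0 with fitness 1.0 and 3.0 average to (2.0 × 1.0 + 4.0 × 3.0) / 4.0 = 3.5, so the fitter peer pulls the result toward its own value.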

Zenoh Topic Details (addition to §10)

| Topic | Direction | Payload Type | Description |
|---|---|---|---|
| brainos/genome/updated | Node → All | GenomePacket | Broadcast after generation evolution completes |
| brainos/genome/sync_request | Node → Peers | {node_id, since: datetime} | Delta request after reconnection |
| brainos/genome/delta | Peer → Node | GenomePacket (delta) | Delta Genome reply |

REST API Additions (Phase 3)

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/genome/status | GET | Current Genome state, generation count, fitness, last sync time |
| /api/v1/genome/evolve | POST | Immediately run one generation of evolution and broadcast |
| /api/v1/genome/sync | POST | Force sync Genome with peers (fetch delta + merge) |

Client-Side Genome Storage

The Python SDK (brainos/client/sdk.py) and PWA client do not evolve Genomes directly; they only reference the Genome managed by the BrainOS server. Only BrainOS nodes running on edge devices (Jetson / RasPi) perform local evolution and send results to the server via brainos/genome/updated.

[BrainOS node on edge device]
  ↓ local evolution (evolution_engine.py / GenomePool)
  → GenomeSyncIntegrator.evolve_and_sync()
  → Zenoh brainos/genome/updated
  → [BrainOS server] GenomeMerger.merge()
  → distribute updated Genome to all clients

Additional DoD (R3 protocol)

  • [ ] GenomeSerializer.serialize_delta() extracts only changed Genes (unit test)
  • [ ] GenomeMerger.merge() works with FED_AVG / CROSSOVER / TOURNAMENT strategies
  • [ ] Conflict resolution applies fitness_score priority; tied scores use evolved_since (LWW)
  • [ ] All 3 Zenoh topics send/receive with HMAC-SHA256 signed payloads
  • [ ] All 3 REST API endpoints work via TestClient (integration test)

Test Plan

| Test File | Requirement | Key Test Cases |
|---|---|---|
| tests/test_phase3_platform.py | R1 | PlatformDetector all 7 platforms; Jetson/RasPi skipped |
| tests/test_phase3_offline.py | R2 | is_ai_operational=True on Zenoh disconnect; offline decide API |
| tests/test_phase3_genome.py | R3 | Genome broadcast/receive/merge; delta export/apply; 3 REST APIs |
| tests/test_phase3_watchdog.py | R4 | Node failure detection; restart; alt-node; EMERGENCY; 3 REST APIs |
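The R4 test cases above (failure detection, restart, EMERGENCY) imply a supervision loop with a bounded restart budget. The sketch below shows one possible shape for that loop. RestartPolicy is named in the plan, but its fields, the Supervisor class, the tick() method, and the escalation rule are all assumptions; alt-node failover is not modelled.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RestartPolicy:
    max_restarts: int = 3        # assumed default restart budget
    failure_threshold: int = 2   # assumed: consecutive failed probes before a restart


class Supervisor:
    """Hypothetical supervisor for one node: probe, restart, then escalate."""

    def __init__(self, probe: Callable[[], bool], restart: Callable[[], None],
                 policy: RestartPolicy) -> None:
        self.probe, self.restart, self.policy = probe, restart, policy
        self.failures = 0
        self.restarts = 0
        self.emergency = False

    def tick(self) -> None:
        """Run one health-check cycle."""
        if self.probe():
            self.failures = 0  # a healthy probe resets the failure streak
            return
        self.failures += 1
        if self.failures < self.policy.failure_threshold:
            return  # tolerate isolated probe failures
        self.failures = 0
        if self.restarts >= self.policy.max_restarts:
            self.emergency = True  # restart budget exhausted: escalate
            return
        self.restarts += 1
        self.restart()
```

Injecting probe and restart as callables keeps the loop testable without real HTTP polling or process control.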

Mandatory Requirements (BrainOS v1.0 Release Conditions)

| # | Requirement | Status | New Module |
|---|---|---|---|
| R1 | Multi-Platform: CPU / GPU / Jetson / Raspberry Pi / Quantum | Phase 3 planned | brainos/platform/ |
| R2 | Offline-AI: AI continues operating even without network | Phase 3 planned | brainos/offline/ |
| R3 | Genome-Sync: models shared online, evolution integrated as Genome | Phase 3 planned | brainos/genome/ |
| R4 | Zero-Disconnection: no functional interruption; monitoring / restart / failover | Phase 3 planned | brainos/watchdog/ |
| R5 | Cross-Platform-Client: Windows / Linux / macOS / Android / iOS | Complete (v0.2.0) | brainos/client/, api/static/ |

See BrainOS.md §21 for gap analysis, module design, and DoD checklists.


Phase 4: Production Hardening

| Task | Implementation |
|---|---|
| Chaos testing | Intentional node failures + Zenoh disconnection + latency injection |
| 4-week soak test | Continuous SLO measurement; confirm an 80% reduction in MTTR |
| Compliance audit | Periodic audit_log.verify_chain() runs + CSV export |
| Auto-recovery hardening | auto_recovery.py playbook expansion (new FailureCategory) |
| Quantum integration | quantum/, advanced_quantum_decision.py, ibm_quantum_plugin.py |
| Federated learning | federated.py, federated_strategy.py |

Detailed design: BrainOS.md