Copyright 2026 Moonlight Technologies Inc. All Rights Reserved.

Author: Masahiro Aoki

EvoSpikeNet-BrainOS — Implementation Plan

Last updated: 2026-05-30 (v0.2.0 — Phase 1+2 complete, R5 client implemented)

Executive Summary

v0.2.0 key implementations (Phase 1 + Phase 2 + R5):

Phase 1 Foundation complete: brainos/config.py, brainos/models.py, brainos/event_bus/, brainos/security/, brainos/identity/, brainos/world_model/, brainos/observability/, brainos/audit/, api/app.py
Phase 2 Cognitive Loop complete: brainos/pfc/, brainos/safety/, brainos/degradation/, brainos/memory/, brainos/adapters/
R5 Cross-Platform-Client complete: Python SDK (brainos/client/sdk.py) + PWA Web Dashboard (api/static/) + Service Worker
Test results: 157 passed, 1 skipped

Next phase (Phase 3 — R1–R4 implementation):

Implementation order: R1 Multi-Platform → R2 Offline-AI → R4 Zero-Disconnection → R3 Genome-Sync, following an 8-week roadmap.

Phase 1: Foundation (complete)

Task	Module	DoD	Status
Event bus startup	`brainos/event_bus/`	All nodes communicating	✅
API authentication	`brainos/identity/`	`ALLOW_NO_AUTH=false` enforced in production, 401 tests pass	✅
Secure communication	`brainos/security/`	HMAC signing on all messages, unsigned payloads rejected	✅
World Model (minimal)	`brainos/world_model/`	Entity CRUD + snapshot	✅
Observability	`brainos/observability/`	Prometheus counters collected	✅
Audit log	`brainos/audit/`	SHA-256 chain `verify_chain()` passes	✅

Tests: tests/test_phase1_foundation.py — 52 passed
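
The audit-log DoD above depends on a SHA-256 hash chain. The following is a minimal sketch of that idea, assuming each entry stores the hash of its predecessor. Only the name `verify_chain()` comes from the plan; the `AuditChain` class, the entry layout, and the genesis value are hypothetical and do not describe the actual `brainos/audit/` code.

```python
import hashlib
import json


class AuditChain:
    """Hypothetical append-only log where each entry hashes its predecessor."""

    GENESIS = "0" * 64  # assumed placeholder hash for the first entry

    def __init__(self) -> None:
        self.entries: list[dict] = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        # Canonical JSON keeps the hash stable regardless of key order.
        body = json.dumps(event, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append(
            {"event": event, "prev_hash": prev_hash, "hash": self._digest(prev_hash, event)}
        )

    def verify_chain(self) -> bool:
        # Recompute every hash from the start; any edit breaks the chain.
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True


chain = AuditChain()
chain.append({"actor": "pfc", "action": "decide"})
chain.append({"actor": "safety", "action": "block"})
intact = chain.verify_chain()

# Editing a stored event invalidates its recorded hash.
chain.entries[0]["event"]["action"] = "tampered"
detected = not chain.verify_chain()
```

Because each hash covers the previous one, changing any earlier entry invalidates every later entry as well, which is what makes periodic verification useful.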

Phase 2: Cognitive Loop (complete)

Task	Module	DoD	Status
PFC node	`brainos/pfc/`	`make_decision()` round-trip stable	✅
Conscience Circuit safety guard	`brainos/safety/`	CRITICAL block + Human Approval queue working	✅
Graceful degradation	`brainos/degradation/`	`ServiceLevel` downgrade on node failure	✅
Memory system	`brainos/memory/`	Write / retrieve / TTL retention policy round-trip	✅
First domain adapter	`brainos/adapters/`	Observe→Decide→Act→Learn full path	✅

Tests: tests/test_phase2_cognitive.py — 78 passed
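
The memory-system DoD above mentions a write / retrieve / TTL round-trip. A minimal sketch of such a round-trip follows, assuming a per-entry time-to-live and an injectable clock so expiry can be tested without sleeping. The `TTLMemory` class and its method names are hypothetical and are not taken from `brainos/memory/`.

```python
import time
from typing import Any, Callable, Optional


class TTLMemory:
    """Hypothetical key-value memory with a per-entry time-to-live."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: dict[str, tuple[Any, float]] = {}

    def write(self, key: str, value: Any, ttl_s: float) -> None:
        # Store the value together with its absolute expiry time.
        self._items[key] = (value, self._clock() + ttl_s)

    def retrieve(self, key: str) -> Optional[Any]:
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._items[key]  # lazily evict expired entries
            return None
        return value


now = [0.0]  # fake clock controlled by the example
memory = TTLMemory(clock=lambda: now[0])
memory.write("last_goal", "dock", ttl_s=10.0)
fresh = memory.retrieve("last_goal")
now[0] = 11.0
expired = memory.retrieve("last_goal")
```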

R5: Cross-Platform-Client (complete — v0.2.0)

Task	Module	DoD	Status
Python SDK	`brainos/client/sdk.py`	Works on Windows / Linux / macOS with httpx only	✅
PWA Web Dashboard	`api/static/index.html`	All 5 platforms, no external CDN	✅
Service Worker	`api/static/sw.js`	Offline cache + home screen install	✅
CORS middleware	`api/app.py`	Controlled via `BRAINOS_CORS_ORIGINS`	✅

Tests: tests/test_phase3_client.py — 27 passed
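
The CORS row above says origins are controlled via `BRAINOS_CORS_ORIGINS`. The sketch below shows one way such a variable could be parsed, assuming a comma-separated list of origins; the value format and the helper name `parse_cors_origins` are assumptions, not details confirmed by the plan.

```python
import os
from typing import Optional


def parse_cors_origins(raw: Optional[str]) -> list[str]:
    """Split a comma-separated origin list; empty or unset means no origins."""
    if not raw:
        return []
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


# Reads the real environment variable; yields [] when it is unset.
origins = parse_cors_origins(os.environ.get("BRAINOS_CORS_ORIGINS"))

# Whitespace and trailing commas are tolerated.
example = parse_cors_origins("https://a.example, https://b.example,")
```

The resulting list could then be handed to whatever CORS middleware `api/app.py` uses. Treating an unset variable as "no origins" is a deliberately conservative default in this sketch.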

Phase 3: Multi-App Integration — R1–R4 Implementation

All R1–R4 items are release conditions for BrainOS v1.0 (MUST).

Priority & Dependencies

Requirement	Phase	Priority	Dependency
R1 Multi-Platform	Phase 3 first half	HIGH	Phase 2 complete (✅ done)
R2 Offline-AI	Phase 3 first half	HIGH	R1 `BrainOSHAL`, `LocalModelCache`
R4 Zero-Disconnection	Phase 3 first half ~ ongoing	HIGH	Phase 2 complete (✅ done)
R3 Genome-Sync	Phase 3 second half	MEDIUM	R2 delta-sync protocol
R5 Cross-Platform-Client	✅ Complete (v0.2.0)	—	—

Detailed Roadmap (8 weeks)

First Half (weeks 1–4)
├── Week 1-2: brainos/platform/  (R1)
│            · PlatformDetector — detect 7 platforms
│              (CPU / GPU / Jetson / RasPi / EdgeTPU / Loihi / Quantum)
│            · BrainOSHAL — unified load_model / run_inference interface
│            · Dockerfile.raspi + requirements-raspi.txt
├── Week 2-3: brainos/offline/   (R2)
│            · OfflineModeManager — state machine (ONLINE/DEGRADED_ONLINE/OFFLINE)
│            · LocalModelCache — ONNX cache management
│            · Offline decision API integration test
└── Week 3-4: brainos/watchdog/  (R4)
             · BrainOSSupervisor + HealthProbe
             · AutoRecoverySystem integration
             · 3 new API endpoints

Second Half (weeks 5–8)
├── Week 5-6: brainos/genome/    (R3)
│            · GenomeSyncIntegrator — broadcast / receive / merge
│            · Delta sync protocol — export_delta / apply_delta
│            · 3 new Zenoh topics + 3 new REST endpoints
└── Week 7-8: Integration & regression tests
             · Full R1–R4 DoD verification
             · Phase 1+2 regression (157 passed baseline)
             · Chaos tests (Zenoh disconnect, node failure, offline toggle)
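
The roadmap above names an `OfflineModeManager` state machine with the states ONLINE / DEGRADED_ONLINE / OFFLINE. A minimal sketch of one possible transition rule follows. The state names come from the plan; the rule itself (consecutive failed connectivity probes), the `offline_after` threshold, and the `report_probe()` method are assumptions made for illustration.

```python
from enum import Enum


class OfflineState(str, Enum):
    ONLINE = "online"
    DEGRADED_ONLINE = "degraded_online"
    OFFLINE = "offline"


class OfflineModeManager:
    """Hypothetical transitions driven by consecutive probe failures."""

    def __init__(self, offline_after: int = 3) -> None:
        self.offline_after = offline_after  # assumed threshold
        self.failures = 0
        self.state = OfflineState.ONLINE

    def report_probe(self, ok: bool) -> OfflineState:
        if ok:
            # Any successful probe returns the node to ONLINE.
            self.failures = 0
            self.state = OfflineState.ONLINE
        else:
            self.failures += 1
            if self.failures >= self.offline_after:
                self.state = OfflineState.OFFLINE
            else:
                self.state = OfflineState.DEGRADED_ONLINE
        return self.state


manager = OfflineModeManager(offline_after=3)
trace = [manager.report_probe(ok) for ok in (True, False, False, False, True)]
```

In this sketch a single success is enough to recover. A real implementation might require several successes in a row to avoid flapping.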

New Modules (Phase 3)

Module	Requirement	Key Classes
`brainos/platform/detector.py`	R1	`PlatformDetector`, `PlatformKind` (7 platforms)
`brainos/platform/hal.py`	R1	`BrainOSHAL` — unified `load_model()`, `run_inference()`
`brainos/offline/mode.py`	R2	`OfflineModeManager`, `OfflineState`
`brainos/offline/local_cache.py`	R2	`LocalModelCache` — ONNX cache
`brainos/offline/client_agent.py`	R2	`OfflineClientAgent` — client-side offline inference
`brainos/offline/llm_router.py`	R2	`LLMRouter` — BrainOS API / local LLM / rule-based fallback chain
`brainos/genome/integrator.py`	R3	`GenomeSyncIntegrator` — broadcast / merge / delta
`brainos/genome/protocol.py`	R3	Zenoh topic constants (`brainos/genome/*`)
`brainos/genome/serializer.py`	R3	`GenomeSerializer` — delta serialization (MessagePack)
`brainos/genome/merger.py`	R3	`GenomeMerger` — conflict resolution, weighted average, crossover
`brainos/watchdog/supervisor.py`	R4	`BrainOSSupervisor`, `RestartPolicy`, `SupervisedNode`
`brainos/watchdog/health_probe.py`	R4	`HealthProbe` — HTTP polling + timeout
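
The watchdog modules above pair an HTTP health probe with a restart policy. The sketch below illustrates one supervision tick under stated assumptions: the class names `RestartPolicy` and `SupervisedNode` come from the table, but their fields, the `http_probe()` helper, and the `supervise_once()` function are hypothetical.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable


def http_probe(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


@dataclass
class RestartPolicy:
    max_restarts: int = 3  # assumed field


@dataclass
class SupervisedNode:
    name: str
    probe: Callable[[], bool]
    restart: Callable[[], None]
    restarts: int = 0


def supervise_once(node: SupervisedNode, policy: RestartPolicy) -> str:
    """One supervision tick: 'healthy', 'restarted', or 'escalate'."""
    if node.probe():
        return "healthy"
    if node.restarts >= policy.max_restarts:
        return "escalate"  # restart budget exhausted; hand over to failover
    node.restart()
    node.restarts += 1
    return "restarted"


# A node whose probe always fails uses up its restart budget, then escalates.
restart_log: list[str] = []
node = SupervisedNode("pfc", probe=lambda: False, restart=lambda: restart_log.append("pfc"))
policy = RestartPolicy(max_restarts=2)
outcomes = [supervise_once(node, policy) for _ in range(3)]
```

Passing the probe in as a callable keeps the supervision logic testable without a live endpoint. In practice it might be bound to a node's health URL with `lambda: http_probe(url)`.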

R2 Detailed Design: Client-Side Offline Operation + Local LLM

The current plan for brainos/offline/ covers only server-side (BrainOS node) offline continuity. Behavior when the client (Python SDK / PWA) cannot reach the BrainOS server is undefined. The following design fills that gap.

LLM Fallback Chain (`brainos/offline/llm_router.py`)

Client → BrainOS API (/api/v1/cognitive/decide)
       ↓ timeout / connection failure
       → Local LLM (Ollama at localhost:11434, or llama.cpp HTTP server)
       ↓ Ollama not running / model not downloaded
       → Built-in rule engine (decision-tree minimal AI)

# brainos/offline/llm_router.py
from enum import Enum

class LLMBackendKind(str, Enum):
    BRAINOS_API  = "brainos_api"   # BrainOS REST API (normal)
    OLLAMA       = "ollama"        # Local Ollama (http://localhost:11434)
    LLAMACPP     = "llamacpp"      # llama.cpp HTTP server
    RULE_ENGINE  = "rule_engine"   # Fallback rule engine

class LLMRouter:
    """
    Routes inference requests according to backend priority order.
    Automatically falls back to the next backend on connection failure.
    """
    def __init__(self, backends: list[LLMBackendKind], timeout_ms: int = 3000) -> None: ...

    def route(self, prompt: str, context: dict) -> "LLMResponse":
        # 1. BRAINOS_API → fail → 2. OLLAMA → fail → 3. RULE_ENGINE
        ...

    def get_active_backend(self) -> LLMBackendKind: ...
    def get_latency_stats(self) -> dict: ...  # p50/p95 per backend
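
The fallback behavior described above can be sketched with injectable backends. This is an illustration under stated assumptions, not the planned `LLMRouter`: the `FallbackRouter` class, the `Backend` callable type, and the two stub backends are hypothetical, and only connection and timeout errors are treated as reasons to fall back.

```python
from enum import Enum
from typing import Callable, Optional


class LLMBackendKind(str, Enum):
    BRAINOS_API = "brainos_api"
    OLLAMA = "ollama"
    RULE_ENGINE = "rule_engine"


Backend = Callable[[str, dict], str]


class FallbackRouter:
    """Hypothetical stand-in for LLMRouter with injectable backend callables."""

    def __init__(self, backends: list[tuple[LLMBackendKind, Backend]]) -> None:
        self.backends = backends
        self.active: Optional[LLMBackendKind] = None

    def route(self, prompt: str, context: dict) -> str:
        for kind, call in self.backends:
            try:
                answer = call(prompt, context)
            except (ConnectionError, TimeoutError):
                continue  # fall through to the next backend
            self.active = kind
            return answer
        raise RuntimeError("no backend available")


def unreachable(prompt: str, context: dict) -> str:
    # Stub simulating a server or local LLM that cannot be reached.
    raise ConnectionError("backend unreachable")


def rule_engine(prompt: str, context: dict) -> str:
    # Toy rule: refuse to proceed when the reported risk is high.
    return "hold" if context.get("risk", 0) > 0.5 else "proceed"


router = FallbackRouter([
    (LLMBackendKind.BRAINOS_API, unreachable),
    (LLMBackendKind.OLLAMA, unreachable),
    (LLMBackendKind.RULE_ENGINE, rule_engine),
])
answer = router.route("move forward?", {"risk": 0.9})
```

Keeping the rule engine last means `route()` always has a backend that needs no network. The `RuntimeError` is only reachable if the chain is configured without one.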

Client-Side Offline Agent (`brainos/offline/client_agent.py`)

# brainos/offline/client_agent.py
class OfflineClientAgent:
    """
    Autonomous agent that operates when the BrainOS SDK client
    (brainos/client/sdk.py) cannot reach the BrainOS server.

    - Local decision cache (saves the last N decide() results)
    - Inference via local LLM
    - Accumulates operation logs in a queue during offline period;
      syncs to BrainOS server on reconnection
    """
    def __init__(self, llm_router: "LLMRouter", local_cache: "LocalModelCache") -> None: ...

    def decide(self, objective: str, context: dict) -> "DecisionResult":
        # LLMRouter.route() for local inference; enqueue result to offline_queue
        ...

    def flush_queue(self, client: "BrainOSClient") -> "SyncReport":
        # After reconnection: batch-send accumulated operation log (idempotent)
        ...

    def get_queue_size(self) -> int: ...
    def is_offline(self) -> bool: ...
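
The queue-and-flush behavior above is described as idempotent. One way to get that property is sketched below, assuming each queued operation carries a client-generated ID that the server uses to drop duplicates. The `OfflineQueue` and `FakeServer` classes and the `op_id` field are hypothetical.

```python
import uuid
from collections import deque
from typing import Callable


class OfflineQueue:
    """Hypothetical operation log; each entry carries an idempotency key."""

    def __init__(self) -> None:
        self._pending: deque[dict] = deque()

    def enqueue(self, operation: dict) -> str:
        op_id = str(uuid.uuid4())
        self._pending.append({"op_id": op_id, **operation})
        return op_id

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Send entries in order; stop at the first failure and keep the rest."""
        sent = 0
        while self._pending:
            if not send(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)


class FakeServer:
    """Test double that ignores operations it has already seen."""

    def __init__(self) -> None:
        self.seen: dict[str, dict] = {}

    def receive(self, entry: dict) -> bool:
        self.seen.setdefault(entry["op_id"], entry)
        return True


queue = OfflineQueue()
server = FakeServer()
first = {"op_id": "fixed-id", "objective": "dock"}
server.receive(first)
server.receive(first)  # a retried delivery does not create a duplicate
queue.enqueue({"objective": "charge"})
queue.enqueue({"objective": "patrol"})
sent = queue.flush(server.receive)
```

An entry is removed only after `send` reports success, so a connection drop mid-flush leaves the remainder queued. Any entry that gets re-sent is deduplicated on the server side by its `op_id`.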

PWA Client Offline Behavior (`api/static/`)

The Service Worker already caches static assets so the UI works offline. For API access:

Case	Behavior
BrainOS server reachable	Normal API call
Server unreachable, Service Worker cache hit	Show cached response (read-only)
Server unreachable, Ollama running at localhost	Direct fetch to Ollama (`/api/chat`)
All unreachable	Show offline banner + rule engine JSON response

Offline Zenoh Topics (additional)

Topic	Direction	Description
`brainos/offline/client_queue`	Client → Server	Bulk transfer of operation log after reconnection
`brainos/offline/sync_ack`	Server → Client	Acknowledgement of client queue receipt

Additional DoD (R2 client-side)

[ ] OfflineClientAgent.decide() returns a result from local LLM while BrainOS server is down
[ ] LLMRouter falls back to rule engine when Ollama is not running
[ ] After reconnection, flush_queue() sends accumulated operation log to BrainOS (integration test)
[ ] PWA displays the dashboard while offline (Service Worker + cache verification)

R3 Detailed Design: Genome Send/Receive Protocol

The current plan only mentions broadcast / receive / merge in GenomeSyncIntegrator. The actual data structure, delta format, conflict resolution policy, and client-side storage are undefined. The following adds those details.

Genome Data Structure (using `EvoSpikeNet-Core/evospikenet/genome.py`)

BrainOS wraps Core's EvoGenome in the following envelope for transport:

from dataclasses import dataclass
from datetime import datetime

@dataclass
class GenomePacket:
    packet_id: str              # UUID v4 (idempotency key)
    sender_node_id: str         # Sending node ID
    genome: dict                # EvoGenome.to_dict() result
    generation: int             # Generation number
    fitness_score: float        # Latest fitness (used for merge priority)
    evolved_since: datetime     # Timestamp of last evolution
    platform: str               # PlatformKind (identifies hardware-dependent evolution)
    signature: bytes            # HMAC-SHA256 via secure_serialization.pack()
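
The `signature` field above is an HMAC-SHA256 value. The sketch below shows signing and verification with Python's standard `hmac` module as a stand-in. It does not reproduce `secure_serialization.pack()`, whose wire format the plan does not specify. The canonical-JSON encoding, the helper names, and the demo key are assumptions.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, key: bytes) -> bytes:
    """HMAC-SHA256 over a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).digest()


def verify_payload(payload: dict, signature: bytes, key: bytes) -> bool:
    # compare_digest avoids leaking timing information during comparison.
    return hmac.compare_digest(sign_payload(payload, key), signature)


key = b"demo-shared-secret"  # illustrative only; real keys come from secure config
packet = {
    "packet_id": "0000",
    "sender_node_id": "edge-1",
    "generation": 7,
    "fitness_score": 0.82,
}
signature = sign_payload(packet, key)
valid = verify_payload(packet, signature, key)

# Changing any signed field invalidates the signature.
tampered = verify_payload({**packet, "fitness_score": 0.99}, signature, key)
```

Signing a canonical encoding matters because two nodes that serialize the same fields in a different order would otherwise compute different signatures.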

Delta Serialization (`brainos/genome/serializer.py`)

# brainos/genome/serializer.py
class GenomeSerializer:
    """
    MessagePack serializer for compact send/receive of EvoGenome deltas.
    Switches between full serialization and delta based on context.
    """
    @staticmethod
    def serialize_full(genome: "EvoGenome") -> bytes: ...      # Full MessagePack

    @staticmethod
    def serialize_delta(base: "EvoGenome", current: "EvoGenome") -> bytes:
        # Extracts only changed Gene/Chromosome
        # Delta format: {"changed": {gene_id: new_value}, "added": [...], "removed": [...]}
        ...

    @staticmethod
    def deserialize(data: bytes) -> "EvoGenome": ...

    @staticmethod
    def apply_delta(base: "EvoGenome", delta: bytes) -> "EvoGenome": ...
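
The delta format above can be illustrated on plain dictionaries. This sketch assumes a genome can be flattened to a `gene_id → value` mapping and that each `added` item is a `{"gene_id", "value"}` record; both are assumptions, since the plan elides the list contents. It also leaves out the MessagePack encoding step.

```python
def compute_delta(base: dict, current: dict) -> dict:
    """Describe how `current` differs from `base` in the changed/added/removed format."""
    return {
        "changed": {k: v for k, v in current.items() if k in base and base[k] != v},
        "added": [{"gene_id": k, "value": v} for k, v in current.items() if k not in base],
        "removed": [k for k in base if k not in current],
    }


def apply_delta(base: dict, delta: dict) -> dict:
    """Rebuild the current mapping from `base` plus a delta; `base` is not mutated."""
    result = {k: v for k, v in base.items() if k not in delta["removed"]}
    result.update(delta["changed"])
    for item in delta["added"]:
        result[item["gene_id"]] = item["value"]
    return result


base = {"g1": 0.1, "g2": 0.2, "g3": 0.3}
current = {"g1": 0.1, "g2": 0.25, "g4": 0.4}
delta = compute_delta(base, current)
roundtrip = apply_delta(base, delta)
```

Here the unchanged gene `g1` does not appear in the delta at all, which is the property the R3 DoD unit test is meant to check.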

Merge and Conflict Resolution (`brainos/genome/merger.py`)

# brainos/genome/merger.py
from enum import Enum

class MergeStrategy(str, Enum):
    FED_AVG     = "fed_avg"     # Weighted average based on DynamicFedAvgStrategy
    CROSSOVER   = "crossover"   # GenomePool crossover (high-fitness priority)
    TOURNAMENT  = "tournament"  # Tournament selection (top fitness priority)

class GenomeMerger:
    """
    Merges EvoGenome received from multiple nodes.
    Conflict resolution: prefer the Gene with the higher fitness_score;
    when tied, prefer the one with the more recent evolved_since (LWW).
    """
    def __init__(self, strategy: MergeStrategy = MergeStrategy.FED_AVG) -> None: ...

    def merge(
        self,
        local: "EvoGenome",
        peers: list["EvoGenome"],
    ) -> "EvoGenome":
        # FED_AVG: fitness_score-weighted average of weight tensors
        # CROSSOVER: uses Core GenomePool.crossover()
        # TOURNAMENT: select from top 50% by fitness_score
        ...

    def resolve_conflict(self, gene_a: "Gene", gene_b: "Gene") -> "Gene":
        # Tie: adopt the one with more recent evolved_since (LWW)
        ...
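
The FED_AVG weighting and the tie-break rule above can be sketched on simple records. This is an illustration under stated assumptions: the `Candidate` dataclass and flat `weights` lists stand in for `EvoGenome` and its tensors, and none of this uses `DynamicFedAvgStrategy` or `GenomePool` from Core.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Candidate:
    """Hypothetical stand-in for a genome with flat weights."""
    weights: list[float]
    fitness_score: float
    evolved_since: datetime


def fitness_weighted_average(candidates: list[Candidate]) -> list[float]:
    """Average each weight position, weighting candidates by fitness_score."""
    total = sum(c.fitness_score for c in candidates)
    if total <= 0:
        raise ValueError("at least one candidate needs a positive fitness_score")
    size = len(candidates[0].weights)
    return [
        sum(c.fitness_score * c.weights[i] for c in candidates) / total
        for i in range(size)
    ]


def resolve_conflict(a: Candidate, b: Candidate) -> Candidate:
    """Higher fitness wins; on a tie, the more recent evolved_since wins (LWW)."""
    if a.fitness_score != b.fitness_score:
        return a if a.fitness_score > b.fitness_score else b
    return a if a.evolved_since >= b.evolved_since else b


local = Candidate([1.0, 0.0], 1.0, datetime(2026, 5, 1))
peer = Candidate([0.0, 2.0], 3.0, datetime(2026, 5, 2))
merged = fitness_weighted_average([local, peer])

tied = Candidate([9.0, 9.0], 3.0, datetime(2026, 5, 3))
winner = resolve_conflict(peer, tied)
```

With fitness scores 1.0 and 3.0 the peer contributes three quarters of each merged weight. In the tie case the candidate evolved on the later date is chosen.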

Zenoh Topic Details (addition to §10)

Topic	Direction	Payload Type	Description
`brainos/genome/updated`	Node → All	`GenomePacket`	Broadcast after generation evolution completes
`brainos/genome/sync_request`	Node → Peers	`{node_id, since: datetime}`	Delta request after reconnection
`brainos/genome/delta`	Peer → Node	`GenomePacket` (delta)	Delta Genome reply

REST API Additions (Phase 3)

Endpoint	Method	Description
`/api/v1/genome/status`	GET	Current Genome state, generation count, fitness, last sync time
`/api/v1/genome/evolve`	POST	Immediately run one generation of evolution and broadcast
`/api/v1/genome/sync`	POST	Force sync Genome with peers (fetch delta + merge)

Client-Side Genome Storage

The Python SDK (brainos/client/sdk.py) and PWA client do not evolve Genomes directly; they only reference the Genome managed by the BrainOS server. Only BrainOS nodes running on edge devices (Jetson / RasPi) perform local evolution and send results to the server via brainos/genome/updated.

[BrainOS node on edge device]
  ↓ local evolution (evolution_engine.py / GenomePool)
  → GenomeSyncIntegrator.evolve_and_sync()
  → Zenoh brainos/genome/updated
  → [BrainOS server] GenomeMerger.merge()
  → distribute updated Genome to all clients

Additional DoD (R3 protocol)

[ ] GenomeSerializer.serialize_delta() extracts only changed Genes (unit test)
[ ] GenomeMerger.merge() works with FED_AVG / CROSSOVER / TOURNAMENT strategies
[ ] Conflict resolution applies fitness_score priority; tied scores use evolved_since (LWW)
[ ] All 3 Zenoh topics send/receive with HMAC-SHA256 signed payloads
[ ] All 3 REST API endpoints work via TestClient (integration test)

Test Plan

Test File	Requirement	Key Test Cases
`tests/test_phase3_platform.py`	R1	PlatformDetector all 7 platforms; Jetson/RasPi skipped
`tests/test_phase3_offline.py`	R2	`is_ai_operational=True` on Zenoh disconnect; offline decide API
`tests/test_phase3_genome.py`	R3	Genome broadcast/receive/merge; delta export/apply; 3 REST APIs
`tests/test_phase3_watchdog.py`	R4	Node failure detection; restart; alt-node; EMERGENCY; 3 REST APIs

Mandatory Requirements (BrainOS v1.0 Release Conditions)

#	Requirement	Status	New Module
R1	Multi-Platform — CPU / GPU / Jetson / Raspberry Pi / Quantum	Phase 3 planned	`brainos/platform/`
R2	Offline-AI — AI continues operating even without network	Phase 3 planned	`brainos/offline/`
R3	Genome-Sync — models shared online, evolution integrated as Genome	Phase 3 planned	`brainos/genome/`
R4	Zero-Disconnection — no functional interruption; monitoring / restart / failover	Phase 3 planned	`brainos/watchdog/`
R5	Cross-Platform-Client — Windows / Linux / macOS / Android / iOS	✅ Complete (v0.2.0)	`brainos/client/`, `api/static/`

See BrainOS.md §21 for gap analysis, module design, and DoD checklists.

Phase 4: Production Hardening

Task	Implementation
Chaos testing	Intentional node failures + Zenoh disconnection + latency injection
4-week soak test	Continuous SLO measurement; confirm an 80% reduction in MTTR
Compliance audit	`audit_log.verify_chain()` periodic + CSV export
Auto-recovery hardening	`auto_recovery.py` playbook expansion (new `FailureCategory`)
Quantum integration	`quantum/`, `advanced_quantum_decision.py`, `ibm_quantum_plugin.py`
Federated learning	`federated.py`, `federated_strategy.py`

Detailed design: BrainOS.md