
Distributed brain system: spatial processing node specifications

> [!NOTE]
> For the latest implementation status, refer to Functional Implementation Status (Remaining Functionality).

Last updated: February 19, 2026
Version: v2.1 (reflects FastAPI/SDK integration)

Author: Masahiro Aoki

Status: Implementation complete

📋 Implementation status

| Item | File | Lines | Status |
| --- | --- | --- | --- |
| Implementation code | spatial_processing.py | 3500+ | ✅ Done |
| Unit tests | test_distributed_brain_simulation.py | 5+ | ✅ Done |
| Integration tests | test_distributed_brain_simulation.py | 6+ | ✅ Done |
| External service/SDK | spatial_generation_service.py / sdk.py | - | /generate, /health (FastAPI) + SDK wrapper |

1. Overview

A spatial perception and generation system implemented on EvoSpikeNet's distributed brain architecture. It consists of four dedicated node types that simulate the brain's occipital lobe (visual cortex V1-V5), superior parietal lobule, temporal cortex, and occipito-parietal junction. A FastAPI-based spatial generation service (/generate, /health) and an SDK wrapper are provided for external clients.
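
For example, a client can exercise the service as in the sketch below. The host/port and the /generate request body are illustrative assumptions; the actual request schema is defined by spatial_generation_service.py and wrapped by sdk.py.

```python
import requests

BASE = "http://localhost:8000"  # deployment-specific assumption

# Liveness probe
assert requests.get(f"{BASE}/health", timeout=5).ok

# Spatial generation request. The body below is illustrative only --
# consult spatial_generation_service.py for the actual schema.
resp = requests.post(
    f"{BASE}/generate",
    json={"description": "a cup on a kitchen table"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```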

2. Node type definition

2.1 Node placement and roles ✅ Implementation completed

| Node type | Rank | Brain region | Processing pathway | Main function | Implementation |
| --- | --- | --- | --- | --- | --- |
| SPATIAL_WHERE | 12 | Dorsal parietal lobe | Where pathway | Spatial position/distance/direction recognition | ✅ SpatialWhereNode |
| SPATIAL_WHAT | 13 | Visual cortex / temporal cortex | What pathway | Visual generation / scene understanding | ✅ SpatialWhatNode |
| SPATIAL_INTEGRATION | 14 | Occipito-parietal junction | Integration pathway | What-Where integration / world model | ✅ SpatialIntegrationNode |
| SPATIAL_ATTENTION | 15 | Fronto-orbital area | Attention control | Spatial attention / task-driven control | ✅ SpatialAttentionControlNode |

Implementation classes (all in spatial_processing.py): ✅ the four node classes above, ✅ the integrated system (DistributedSpatialCortex), and ✅ the coordinate-transformation components (SpatialCoordinateEncoder).

2.2 Relationship with existing nodes

```text
Existing node configuration:
┌─────────────────────────────┐
│ PFC (Executive Control)     │ Rank 0-4
│ - Decision making           │
│ - Task management           │
└────────────┬────────────────┘
             │
┌────────────┴──────────────┐
│                           │
│ LANGUAGE   (Rank 5-8)     │ VISION (Rank 9-11)
│ - Text processing         │ - Feature extraction
│ - Language understanding  │ - Object recognition
└────────────┬──────────────┘
             │
┌────────────┴────────────────────────────────────────┐
│                                                     │
│ SPATIAL_WHERE  SPATIAL_WHAT  INTEGRATION  ATTENTION │
│ (Rank 12)      (Rank 13)     (Rank 14)    (Rank 15) │
│ - Where path   - What gen.   - Integrate  - Attn ctl│
│ - Coord. xfm   - Scene und.  - World mdl  - Priority│
└─────────────────────────────────────────────────────┘
```

2.3 Node definition (Python) ✅ Implemented

Implementation file: spatial_processing.py

Node definition:

```python
# From evospikenet/spatial_processing.py

import torch.nn as nn
from typing import Dict, Any, Optional

class SpatialWhereNode(nn.Module):
    """
    Spatial 'Where' processing (Dorsal Stream - Parietal Cortex).
    Rank 12 in the distributed brain hierarchy.
    """
    def __init__(self, input_channels: int = 3, hidden_dim: int = 64,
                 output_dim: int = 256, num_spatial_scales: int = 4):
        super().__init__()
        # DepthEstimationNetwork: monocular depth estimation
        # SpatialCoordinateEncoder: 3D coordinates -> spikes
        # Retinotopic map encoding
        ...

class SpatialWhatNode(nn.Module):
    """
    Spatial 'What' processing (Ventral Stream - Visual/Temporal Cortex).
    Rank 13 in the distributed brain hierarchy.
    """
    def __init__(self, input_channels: int = 256, hidden_dim: int = 128,
                 output_dim: int = 256, num_object_classes: int = 100):
        super().__init__()
        # Object recognition encoder
        # Class probability output
        # Spike generation
        ...

class SpatialIntegrationNode(nn.Module):
    """
    What/Where integration (Posterior Parietal Junction).
    Rank 14 in the distributed brain hierarchy.
    """
    def __init__(self, what_dim: int = 256, where_dim: int = 256,
                 output_dim: int = 512, num_integration_heads: int = 8):
        super().__init__()
        # SpatialAttentionModule
        # Integration MLP
        ...

class SpatialAttentionControlNode(nn.Module):
    """
    Spatial attention control (Prefrontal Cortex).
    Rank 15 in the distributed brain hierarchy.
    """
    def __init__(self, input_dim: int = 512, hidden_dim: int = 256,
                 num_attention_pools: int = 4):
        super().__init__()
        # Attention priority calculation
        # Saccade planning
        # Dynamic strength modulation
        ...

# Node rank constants
RANK_SPATIAL_WHERE = 12
RANK_SPATIAL_WHAT = 13
RANK_SPATIAL_INTEGRATION = 14
RANK_SPATIAL_ATTENTION = 15

SPATIAL_NODES = [
    ("where", RANK_SPATIAL_WHERE, SpatialWhereNode),
    ("what", RANK_SPATIAL_WHAT, SpatialWhatNode),
    ("integration", RANK_SPATIAL_INTEGRATION, SpatialIntegrationNode),
    ("attention", RANK_SPATIAL_ATTENTION, SpatialAttentionControlNode),
]
```
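
As a hypothetical usage sketch (not code from the repository), a launcher can resolve a node class from its rank through this registry:

```python
# Hypothetical helper, not part of spatial_processing.py: map ranks to
# node classes via the SPATIAL_NODES registry and build with defaults.
nodes_by_rank = {rank: cls for _, rank, cls in SPATIAL_NODES}

def build_node(rank: int) -> nn.Module:
    """Instantiate the spatial node registered for `rank` with default hyperparameters."""
    try:
        return nodes_by_rank[rank]()
    except KeyError:
        raise ValueError(f"rank {rank} is not a spatial node (expected 12-15)")
```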

Integrated system:

```python
import torch

class DistributedSpatialCortex(nn.Module):
    """Complete spatial cognition system integrating all four nodes."""

    def __init__(self, config: Optional[Dict] = None):
        super().__init__()
        # Constructor arguments elided in this excerpt
        self.where_node = SpatialWhereNode(...)
        self.what_node = SpatialWhatNode(...)
        self.integration_node = SpatialIntegrationNode(...)
        self.attention_control_node = SpatialAttentionControlNode(...)

    def forward(self, visual_input: torch.Tensor,
                reward_signal: Optional[torch.Tensor] = None) -> Dict[str, Any]:
        """Process visual input through all spatial cognition nodes."""
        # Where processing -> What processing -> Integration -> Attention control
        ...
```
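
The forward pass chains the four stages in the order given by the comment above. The following self-contained toy illustrates only that dataflow; the _Stage modules are trivial stand-ins, not the actual networks:

```python
import torch
import torch.nn as nn

class _Stage(nn.Module):
    """Trivial stand-in for one spatial node: a single linear projection."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.proj(x))

# Where -> What -> Integration -> Attention control
where, what = _Stage(3 * 8 * 8, 256), _Stage(256, 256)
integration, attention = _Stage(512, 512), _Stage(512, 4)

frame = torch.rand(1, 3, 8, 8)                 # toy visual input
where_out = where(frame.flatten(1))            # location features
what_out = what(where_out)                     # identity features
fused = integration(torch.cat([what_out, where_out], dim=-1))  # what-where fusion
priorities = attention(fused)                  # 4 pools (num_attention_pools=4)
print(priorities.shape)                        # torch.Size([1, 4])
```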

3. Node specification details

3.1 SPATIAL_WHERE (Rank 12) - Where processing path

Brain area: Dorsal parietal pathway (LIP, MT+, V5A)

Biological functions:
- Extract the spatial location of objects from visual input
- Conversion to an allocentric coordinate system
- Input to eye-movement control
- Maintenance of spatial working memory

Computational functions:

| Function | Component | Input | Output |
| --- | --- | --- | --- |
| Spatial attention | SpatialAttentionLayerV2 | Vision features (C,H,W) | Spatial weights (H,W) |
| Depth estimation | DepthEstimationNetwork | RGB image (3,H,W) | Depth map (1,H,W) |
| Coordinate transformation | SpatialCoordinateEncoder | Retinal coordinates + self-position | Allocentric coordinates |
| Optical flow | OpticalFlowNetwork | Frame pair (3,H,W) | Optical flow (2,H,W) |
| Working memory | SpatialMemoryBuffer | Spatial information | Buffered spatial history |

Hyperparameters:

```yaml
spatial_where:
  attention:
    num_heads: 8
    head_dim: 64
    dropout: 0.1

  depth_estimation:
    architecture: "midas"  # or "leres"
    input_size: [480, 640]
    output_scale: 1.0

  coordinate_encoding:
    max_positions: 1000
    coordinate_dim: 64
    use_pe: true  # Positional Encoding

  memory_buffer:
    buffer_size: 30  # 1 second at 30 FPS
    coordinate_dim: 3
```

Zenoh PubSub:

```yaml
spikes/spatial/where/depth:
  topic_type: "depth_map"
  payload:
    dtype: "float32"
    shape: [batch_size, 480, 640]
    range: [0.0, 100.0]  # meters (clipped)
    frequency: 30  # Hz
    qos: "real_time"

spikes/spatial/where/coordinates:
  topic_type: "spatial_coordinates"
  payload:
    dtype: "float32"
    shape: [batch_size, num_objects, 3]  # (x, y, z) in allocentric frame
    frequency: 30
    qos: "real_time"

spikes/spatial/where/optical_flow:
  topic_type: "optical_flow"
  payload:
    dtype: "float32"
    shape: [batch_size, 2, 480, 640]  # (flow_x, flow_y)
    frequency: 30
    qos: "real_time"
```

Subscribe topics:

```yaml
spikes/vision/features:
  # Visual features from Vision (Rank 9)
  shape: [batch_size, 256, 60, 80]  # ResNet-50 default

spikes/pfc/spatial_attention:
  # Spatial attention signals from PFC (top-down)
  dtype: "float32"
  shape: [batch_size, num_attention_targets, 2]  # (x, y) in visual coords

spikes/ego_pose:
  # Self pose (position/attitude) from Rank 0
  dtype: "float32"
  shape: [batch_size, 6]  # (x, y, z, roll, pitch, yaw)
```
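
To make the coordinate transformation concrete, the sketch below maps egocentric points to allocentric coordinates using the (x, y, z, roll, pitch, yaw) pose published on spikes/ego_pose. It assumes a Z-up, right-handed frame and an intrinsic yaw-pitch-roll rotation; the actual SpatialCoordinateEncoder additionally converts the result into a spike code.

```python
import math
import torch

def egocentric_to_allocentric(points: torch.Tensor, ego_pose: torch.Tensor) -> torch.Tensor:
    """Map (N, 3) egocentric points to allocentric coordinates.

    ego_pose: (6,) tensor (x, y, z, roll, pitch, yaw), as on spikes/ego_pose.
    Sketch only: assumes Z-up, right-handed frames, not the repository API.
    """
    x, y, z, roll, pitch, yaw = ego_pose.tolist()
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    Rz = torch.tensor([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = torch.tensor([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    Rx = torch.tensor([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    R = Rz @ Ry @ Rx  # intrinsic yaw-pitch-roll (Z-Y-X) rotation
    return points @ R.T + torch.tensor([x, y, z])

# A point 2 m ahead of an agent standing at (1, 0, 0) and yawed +90 degrees
# lands at (1, 2, 0) in the allocentric frame.
p = egocentric_to_allocentric(torch.tensor([[2.0, 0.0, 0.0]]),
                              torch.tensor([1.0, 0.0, 0.0, 0.0, 0.0, math.pi / 2]))
```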

3.2 SPATIAL_WHAT (Rank 13) - What generation path

Brain area: Visual cortex (V1-V5), temporal cortex (IT)

Biological functions:
- Object-identity-based visual generation
- Memory-based scene reconstruction
- Captioning and scene understanding

Computational functions:

| Function | Component | Input | Output |
| --- | --- | --- | --- |
| Scene graph parsing | SceneGraphParser | Text description | Scene graph (JSON) |
| Spatial VAE | SpatialVAEDecoder | Object embedding | 3D voxel grid |
| Object generation | ObjectGenerativeNetwork | Semantic information | Generative 3D map |
| Time-series prediction | SpatialTemporalPrediction | Spatial history | t+1 frame prediction |

Model configuration:

```python
from typing import Dict, Union

from torch import Tensor
import torch.nn as nn

class SpatialWhat(nn.Module):
    # Component modules are defined elsewhere in spatial_processing.py
    def __init__(self, config):
        super().__init__()
        self.scenegraph_parser = SceneGraphParser(config)
        self.vae_decoder = SpatialVAEDecoder(config)
        self.object_generator = ObjectGenerativeNetwork(config)
        self.temporal_predictor = SpatialTemporalPrediction(config)

    def forward(self, text_or_scene: Union[str, Tensor]) -> Dict:
        if isinstance(text_or_scene, str):
            # Generate from a text description
            scene_graph = self.scenegraph_parser(text_or_scene)
            spatial_repr = self._graph_to_spatial(scene_graph)
        else:
            # Predict the next state from a tensor history
            spatial_repr = self.temporal_predictor(text_or_scene)

        generated_scene = self.vae_decoder(spatial_repr)
        return {
            "spatial_representation": spatial_repr,
            "generated_3d": generated_scene,
            "voxel_grid": generated_scene,
        }
```

Zenoh PubSub:

```yaml
spikes/spatial/what/scene_graph:
  topic_type: "scene_graph"
  payload:
    format: "json"
    content:
      objects: [{"id": int, "category": str, "attributes": dict}, ...]
      relationships: [{"subject": int, "predicate": str, "object": int}, ...]
      attributes: {...}  # global scene attributes
    frequency: 10  # Hz (low frequency)
    qos: "best_effort"

spikes/spatial/what/voxel_grid:
  topic_type: "3d_representation"
  payload:
    dtype: "float32"
    shape: [batch_size, 256, 256, 256]
    encoding: "voxel_occupancy"
    frequency: 10
    qos: "best_effort"

spikes/spatial/what/mesh:
  topic_type: "3d_mesh"
  payload:
    format: "glb"  # glTF binary
    vertices: [N, 3]
    faces: [M, 3]
    frequency: 5  # Hz (very low frequency)
    qos: "best_effort"
```

Subscribe topics:

```yaml
spikes/language/spatial_description:
  # Text description from the Language module (Rank 5-7)
  dtype: "string"
  max_length: 500

spikes/vision/object_embeddings:
  # Object embeddings from Vision
  dtype: "float32"
  shape: [batch_size, num_objects, 256]
```
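
For concreteness, a minimal payload matching the spikes/spatial/what/scene_graph schema above could be built as follows; the specific categories and predicates are examples, not a fixed vocabulary:

```python
import json

# Illustrative scene-graph payload (categories/predicates are examples).
scene_graph = {
    "objects": [
        {"id": 0, "category": "table", "attributes": {"color": "brown"}},
        {"id": 1, "category": "cup", "attributes": {"color": "white"}},
    ],
    "relationships": [
        {"subject": 1, "predicate": "on", "object": 0},
    ],
    "attributes": {"room": "kitchen"},
}
payload = json.dumps(scene_graph).encode("utf-8")  # published at ~10 Hz, best_effort
```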

3.3 SPATIAL_INTEGRATION (Rank 14) - Integration Node

Brain regions: Occipito-parietal junction (OPA, RSC), temporoparietal junction (TPJ)

Biological functions:
- Integration of What and Where information
- Building a unified world model
- Final stage of the egocentric-to-allocentric coordinate conversion
- Spatial reasoning and understanding of relationships between objects

Computational functions:

| Function | Component | Input | Output |
| --- | --- | --- | --- |
| Information fusion | MultiModalSpatialFusion | What + Where | Fused representation |
| World model | WorldModelIntegrator | Fused representation (time series) | Unified world representation |
| Spatial reasoning | SpatialReasoningEngine | World model | Inference results |
| Perspective transformation | PerspectiveTransformer | World coordinates | Representations from different perspectives |

Architecture:

```python
from collections import deque
from typing import Dict, Optional

import torch
from torch import Tensor
import torch.nn as nn

class SpatialIntegration(nn.Module):
    """What-Where integration."""
    # Component modules are defined elsewhere in spatial_processing.py

    def __init__(self, config):
        super().__init__()
        self.fusion = MultiModalSpatialFusion(config)
        self.world_model = WorldModelIntegrator(config)
        self.reasoning_engine = SpatialReasoningEngine(config)
        self.perspective_transform = PerspectiveTransformer(config)

        # Time-series buffer (frame history)
        self.temporal_buffer = deque(maxlen=30)  # 1 sec @ 30 FPS

    def forward(self, what: Tensor, where: Tensor,
                prev_world_model: Optional[Tensor] = None) -> Dict:
        # Information fusion
        fused = self.fusion(what, where)

        # Time-series integration
        self.temporal_buffer.append(fused)
        integrated_history = torch.stack(list(self.temporal_buffer))

        # World-model update
        world_model = self.world_model(integrated_history, prev_world_model)

        # Spatial inference
        reasoning_results = self.reasoning_engine(world_model)

        # Viewpoint conversion (multiple views)
        perspectives = self.perspective_transform(world_model)

        return {
            "world_model": world_model,
            "reasoning": reasoning_results,
            "perspectives": perspectives,
        }
```

Zenoh PubSub:

```yaml
spikes/spatial/integration/world_model:
  topic_type: "world_model"
  payload:
    dtype: "float32"
    representation_type: "voxel_grid"
    shape: [batch_size, 256, 256, 256]
    contains: ["occupancy", "object_id", "confidence"]
    frequency: 10
    qos: "real_time"

spikes/spatial/integration/reasoning:
  topic_type: "spatial_reasoning"
  payload:
    format: "json"
    content:
      relationships: [{"obj1": int, "relation": str, "obj2": int, "confidence": float}, ...]
      containment: [{"container": int, "objects": [int, ...]}, ...]
      reachability: [{"object": int, "reach_score": float}, ...]
    frequency: 10
    qos: "best_effort"

spikes/spatial/integration/perspective:
  topic_type: "egocentric_view"
  payload:
    dtype: "float32"
    shape: [batch_size, 3, 480, 640]  # RGB rendered from agent POV
    camera_params: {fov: 60.0, near: 0.01, far: 100.0}
    frequency: 30
    qos: "real_time"
```

Subscribe topics:

```yaml
spikes/spatial/where/coordinates:
  # Coordinates from the Where node

spikes/spatial/what/voxel_grid:
  # 3D representation from the What node

spikes/vision/semantic_segmentation:
  # Semantic segmentation from Vision
  dtype: "int32"
  shape: [batch_size, 480, 640]
```
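
To make the reasoning output concrete, the sketch below derives relationships entries of the form published on spikes/spatial/integration/reasoning from allocentric coordinates. The predicates, thresholds, and confidence heuristic are illustrative assumptions; the actual SpatialReasoningEngine is a learned module.

```python
import torch

def pairwise_relations(coords: torch.Tensor, eps: float = 0.05) -> list:
    """Derive coarse spatial relations from (N, 3) allocentric coordinates.

    Illustrative only: the predicates ("above", "left_of"), the eps
    threshold, and the confidence heuristic are assumptions.
    """
    relations = []
    n = coords.shape[0]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx, _, dz = (coords[j] - coords[i]).tolist()
            if dz > eps:
                relations.append({"obj1": j, "relation": "above", "obj2": i,
                                  "confidence": min(1.0, abs(dz))})
            if dx < -eps:
                relations.append({"obj1": j, "relation": "left_of", "obj2": i,
                                  "confidence": min(1.0, abs(dx))})
    return relations

coords = torch.tensor([[0.0, 0.0, 0.0], [0.5, 0.0, 1.0]])
print(pairwise_relations(coords))  # object 1 is "above" object 0
```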

3.4 SPATIAL_ATTENTION (Rank 15) - Attention control node

Brain regions: Fronto-orbital field (OFC), anterior supplementary eye field (SEF), superior temporal sulcus (STS)

Biological functions:
- Task-driven spatial attention
- Saliency detection
- Eye-movement planning
- Reward-based attentional shifts

Computational functions:

| Function | Component | Input | Output |
| --- | --- | --- | --- |
| Attention control | SpatialAttentionController | Task signal + spatial context | Attention weights |
| Focus selection | FocusSelector | Attention scores | Processing priority |
| Saliency detection | SaliencyDetector | Spatial information | Saliency map |
| Eye-movement planning | SaccadePlanner | Attention weights | Saccade goal |

Hyperparameters:

```yaml
spatial_attention:
  task_integration:
    # Fusing task signals from PFC
    method: "multiplicative"  # or "additive"
    temperature: 1.0

  saliency_detection:
    # Bottom-up saliency
    method: "center_bias"  # or "entropy", "gradient"
    prior_weight: 0.3  # center-bias strength

  saccade_planning:
    # Eye-movement planning
    delay: 200  # milliseconds
    velocity: 300  # degrees/second
```
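
A minimal sketch of the two task_integration fusion modes follows. Temperature scaling before the softmax is an assumption consistent with the config above; the node's actual normalization may differ.

```python
import torch

def fuse_attention(salience: torch.Tensor, task: torch.Tensor,
                   method: str = "multiplicative", temperature: float = 1.0) -> torch.Tensor:
    """Combine a bottom-up saliency map with a top-down task map (sketch)."""
    if method == "multiplicative":
        combined = salience * task   # task signal gates saliency pointwise
    elif method == "additive":
        combined = salience + task   # task signal biases saliency
    else:
        raise ValueError(f"unknown method: {method}")
    flat = combined.flatten(-2) / temperature     # flatten (H, W) for softmax
    return torch.softmax(flat, dim=-1).view_as(combined)

salience = torch.rand(1, 480, 640)
task = torch.rand(1, 480, 640)
weights = fuse_attention(salience, task, method="multiplicative")
```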

Control flow:

```python
import torch
from torch import Tensor
import torch.nn as nn
from typing import Dict, Optional

class SpatialAttentionController(nn.Module):
    """Spatial attention control."""
    # (__init__ omitted in this excerpt; it defines saliency_detector,
    # task_attention_layer, focus_selector, saccade_planner, and the
    # bottom_up_weight / top_down_weight parameters.)

    def forward(self,
                task_signal: Tensor,  # task-driven signals from PFC
                spatial_context: Tensor,  # world model from the Integration node
                salience_map: Optional[Tensor] = None) -> Dict:

        # Bottom-up saliency
        if salience_map is None:
            salience_map = self.saliency_detector(spatial_context)

        # Task-driven attention
        task_attention = self.task_attention_layer(task_signal)

        # Integration (bottom-up ⊕ top-down)
        combined_attention = (
            self.bottom_up_weight * salience_map +
            self.top_down_weight * task_attention
        )

        # Normalization
        attention_weights = torch.softmax(combined_attention, dim=-1)

        # Focus selection
        focus = self.focus_selector(attention_weights)

        # Saccade planning
        saccade_target = self.saccade_planner(focus)

        return {
            "attention_weights": attention_weights,
            "focus": focus,
            "saccade_target": saccade_target,
            "salience_map": salience_map,
            "task_attention": task_attention,
        }
```

Zenoh PubSub:

```yaml
spikes/spatial/attention/weights:
  topic_type: "attention_map"
  payload:
    dtype: "float32"
    shape: [batch_size, 480, 640]
    value_range: [0.0, 1.0]
    frequency: 30
    qos: "real_time"

spikes/spatial/attention/saliency:
  topic_type: "saliency_map"
  payload:
    dtype: "float32"
    shape: [batch_size, 480, 640]
    frequency: 30
    qos: "real_time"

spikes/spatial/attention/saccade:
  topic_type: "motor_command"
  payload:
    dtype: "float32"
    shape: [batch_size, 2]  # (horizontal, vertical) in degrees
    velocity: float  # degrees/second
    frequency: 10  # Hz (irregular)
    qos: "real_time"
```

Subscribe topics:

```yaml
spikes/pfc/spatial_task:
  # Task-driven signals from PFC
  dtype: "float32"
  shape: [batch_size, 64]  # task embedding

spikes/spatial/integration/world_model:
  # World model from the Integration node
```
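
To illustrate how a focus location becomes the (horizontal, vertical) degree command on spikes/spatial/attention/saccade, the sketch below assumes a 60-degree horizontal field of view over the 640×480 frame (matching the camera_params in §3.3); the actual SaccadePlanner lives inside the node.

```python
def focus_to_saccade(px: float, py: float, width: int = 640, height: int = 480,
                     hfov_deg: float = 60.0) -> tuple:
    """Convert a focus pixel to eye-rotation degrees from the image center.

    Sketch only: assumes a linear pixel-to-angle mapping and a 60-degree
    horizontal FOV; the vertical FOV is scaled by the aspect ratio.
    """
    vfov_deg = hfov_deg * height / width
    horiz = (px / width - 0.5) * hfov_deg
    vert = (0.5 - py / height) * vfov_deg   # image y grows downward
    return horiz, vert

# A focus at the right edge maps to a +30 degree horizontal saccade; at the
# configured 300 deg/s it completes in ~100 ms after the 200 ms planning delay.
h, v = focus_to_saccade(640, 240)
duration_ms = 200 + abs(h) / 300 * 1000
```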

4. Communication protocol

4.1 Message format (Zenoh)

Basic structure:

```proto
// spatial_message.proto
syntax = "proto3";

package evospikenet.spatial;

message SpatialMessageHeader {
    uint32 timestamp_ms = 1;
    uint32 sequence_num = 2;
    string source_node = 3;
    string target_nodes = 4;  // comma-separated
}

message CoordinateMessage {
    SpatialMessageHeader header = 1;
    repeated float x = 2;
    repeated float y = 3;
    repeated float z = 4;
    repeated float confidence = 5;
}

message SpatialAttentionMessage {
    SpatialMessageHeader header = 1;
    repeated float attention_weights = 2;
    repeated int32 focus_indices = 3;
}
```
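
Assuming the schema is compiled with protoc --python_out=. spatial_message.proto (yielding a spatial_message_pb2 module per protoc's naming convention), serializing a coordinate batch looks like this:

```python
# Requires the generated module: protoc --python_out=. spatial_message.proto
from spatial_message_pb2 import CoordinateMessage

msg = CoordinateMessage()
msg.header.timestamp_ms = 1_700_000
msg.header.sequence_num = 42
msg.header.source_node = "spatial_where"
msg.header.target_nodes = "spatial_integration,spatial_attention"
msg.x.extend([1.0, 2.5])             # allocentric x for two detected objects
msg.y.extend([0.0, -1.2])
msg.z.extend([0.8, 0.8])
msg.confidence.extend([0.95, 0.71])

payload = msg.SerializeToString()    # bytes for spikes/spatial/where/coordinates
decoded = CoordinateMessage.FromString(payload)
```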

4.2 Synchronization Mechanism

PTP (Precision Time Protocol) based:

```python
# evospikenet/spatial_processing/synchronization.py

class SpatialTimeSync:
    """Timestamp synchronization across spatial nodes."""

    def __init__(self):
        self.ptp_client = PTPClient()   # PTP client defined elsewhere in the module
        self.clock_offset = 0.0
        self.drift_compensation = 0.0   # linear drift rate relative to the local clock

    def sync_timestamp(self, local_ts: float) -> float:
        """Convert a local timestamp to synchronized global time."""
        return local_ts + self.clock_offset + self.drift_compensation * local_ts
```
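
For reference, a single PTP sync / delay-request exchange yields the standard offset estimate offset = ((t2 - t1) + (t3 - t4)) / 2, which feeds clock_offset above:

```python
def estimate_ptp_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """Standard PTP offset of the local (slave) clock relative to the master.

    t1: master send, t2: local receive, t3: local send, t4: master receive.
    Assumes a symmetric network path.
    """
    return ((t2 - t1) + (t3 - t4)) / 2.0

# Example: the local clock runs 1 ms ahead of the master, so converting local
# to global time means setting SpatialTimeSync.clock_offset to -offset.
offset = estimate_ptp_offset(t1=0.000, t2=0.013, t3=0.020, t4=0.031)  # 0.001 s
```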

4.3 Latency Requirements

| Between nodes | Component | Tolerable delay | Priority |
| --- | --- | --- | --- |
| Vision → Where | Visual feature transfer | < 50 ms | HIGH |
| Where → Integration | Coordinate transfer | < 20 ms | CRITICAL |
| What → Integration | Generated scene transfer | < 100 ms | MEDIUM |
| Integration → Attention | World model transfer | < 50 ms | HIGH |
| Attention → PFC | Attention signal | < 30 ms | CRITICAL |
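
These budgets can be enforced mechanically; a minimal sketch follows (the helper itself is illustrative, the numbers come from the table above):

```python
import time

# Tolerable one-way delays from the table above, in milliseconds.
LATENCY_BUDGET_MS = {
    ("vision", "where"): 50,
    ("where", "integration"): 20,
    ("what", "integration"): 100,
    ("integration", "attention"): 50,
    ("attention", "pfc"): 30,
}

def check_latency(src: str, dst: str, sent_ts: float) -> None:
    """Raise if a message from src to dst exceeded its latency budget."""
    elapsed_ms = (time.monotonic() - sent_ts) * 1000.0
    budget = LATENCY_BUDGET_MS[(src, dst)]
    if elapsed_ms > budget:
        raise RuntimeError(f"{src}->{dst}: {elapsed_ms:.1f} ms (budget {budget} ms)")
```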

5. Testing strategy

5.1 Unit Testing

```text
tests/unit/
├── test_spatial_where.py
│   ├── test_depth_estimation_accuracy()
│   ├── test_coordinate_transformation()
│   └── test_optical_flow_correctness()
├── test_spatial_what.py
│   ├── test_scene_graph_parsing()
│   ├── test_vae_generation_quality()
│   └── test_temporal_consistency()
├── test_spatial_integration.py
│   ├── test_what_where_fusion()
│   ├── test_world_model_updates()
│   └── test_spatial_reasoning()
└── test_spatial_attention.py
    ├── test_attention_control_response()
    ├── test_saliency_detection()
    └── test_saccade_planning()
```
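
As a sketch of what test_coordinate_transformation() might assert, here is a round-trip property test using the illustrative egocentric_to_allocentric helper from §3.1 (not necessarily the repository's API):

```python
import torch

def test_coordinate_transformation_roundtrip():
    """An identity pose must leave points unchanged; pure translation must shift them."""
    points = torch.rand(8, 3)
    identity = torch.zeros(6)   # (x, y, z, roll, pitch, yaw)
    assert torch.allclose(egocentric_to_allocentric(points, identity), points)

    pose = torch.tensor([1.0, 2.0, 3.0, 0.0, 0.0, 0.0])
    shifted = egocentric_to_allocentric(points, pose)
    assert torch.allclose(shifted, points + torch.tensor([1.0, 2.0, 3.0]))
```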

5.2 Integration Testing

```text
tests/integration/
├── test_spatial_pipeline.py
│   └── test_end_to_end_spatial_processing()
├── test_spatial_pfc_integration.py
│   └── test_task_driven_attention()
└── test_spatial_zenoh_communication.py
    └── test_all_topics_publish_subscribe()
```

5.3 Benchmark

```text
tests/performance/
├── test_spatial_latency.py
│   └── latency measurements across all nodes
├── test_spatial_throughput.py
│   └── FPS measurements at different resolutions
└── test_spatial_energy.py
    └── power consumption profiling (GPU/CPU)
```

6. Deployment

6.1 Node initialization

```yaml
# config/spatial_nodes.yaml

spatial_where:
  enabled: true
  rank: 12
  gpu_device: 0
  batch_size: 4
  model_checkpoint: "checkpoints/spatial_where_v1.pt"

spatial_what:
  enabled: true
  rank: 13
  gpu_device: 1
  batch_size: 2
  model_checkpoint: "checkpoints/spatial_what_v1.pt"

spatial_integration:
  enabled: true
  rank: 14
  gpu_device: 0  # CPU also possible
  batch_size: 4

spatial_attention:
  enabled: true
  rank: 15
  gpu_device: -1  # CPU only
  batch_size: 8
```
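
At startup each launcher can consume this file as in the sketch below (assuming PyYAML; build_node is the hypothetical helper from §2.3):

```python
import yaml

# Load per-node settings from the YAML config above (requires PyYAML).
with open("config/spatial_nodes.yaml") as f:
    cfg = yaml.safe_load(f)

for name, node_cfg in cfg.items():
    if not node_cfg.get("enabled", False):
        continue
    device = node_cfg["gpu_device"]
    target = "cpu" if device < 0 else f"cuda:{device}"
    print(f"{name}: rank={node_cfg['rank']}, device={target}, "
          f"batch_size={node_cfg['batch_size']}")
```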

6.2 Startup script

```bash
#!/bin/bash
# scripts/launch_spatial_nodes.sh

# Launch the Where node
python -m evospikenet.spatial_processing.where_node \
  --config config/spatial_nodes.yaml \
  --rank 12 \
  --device cuda:0 &

# Launch the What node
python -m evospikenet.spatial_processing.what_node \
  --config config/spatial_nodes.yaml \
  --rank 13 \
  --device cuda:1 &

# Launch the Integration node
python -m evospikenet.spatial_processing.integration_node \
  --config config/spatial_nodes.yaml \
  --rank 14 \
  --device cuda:0 &

# Launch the Attention node
python -m evospikenet.spatial_processing.attention_node \
  --config config/spatial_nodes.yaml \
  --rank 15 \
  --device cpu &

wait
```

Next Steps:

- [ ] Add the node definitions to evospikenet/node_types.py
- [ ] Create the Protocol Buffer schema file
- [ ] Implement the Zenoh PubSub topic-management system
- [ ] Begin Phase 13.1 implementation (Q2 2026)