Distributed brain system: spatial processing node specifications
[!NOTE] For the latest implementation status, please refer to Functional Implementation Status (Remaining Functionality).
Last updated: February 19, 2026
Version: v2.1 (FastAPI/SDK integration reflected)
Author: Masahiro Aoki
Status: ✅ Implementation completed
Copyright: 2026 Moonlight Technologies Inc. All Rights Reserved.
📋 Implementation status
| Item | File | Number of lines | Status |
|---|---|---|---|
| Implementation code | spatial_processing.py | 3500+ | ✅ Done |
| Unit Test | test_distributed_brain_simulation.py | 5+ | ✅ Done |
| Integration Test | test_distributed_brain_simulation.py | 6+ | ✅ Done |
| External services/SDK | spatial_generation_service.py / sdk.py | - | ✅ /generate /health FastAPI + SDK wrapper |
1. Overview
Spatial perception and generation system implemented on EvoSpikeNet's distributed brain architecture. It consists of four dedicated node types that simulate the brain's occipital lobe (visual cortex V1-V5), superior parietal lobule, temporal cortex, and occipito-parietal junction. A FastAPI-based spatial generation service (/generate, /health) and an SDK wrapper are provided for external access.
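A minimal client sketch of how an external caller might reach the service; the base URL and the request/response field names ("prompt", the JSON body) are illustrative assumptions, not the service's confirmed schema:
# Hedged sketch: calling the spatial generation service over HTTP.
# The "prompt" field and the response layout are assumptions for illustration.
import requests

BASE_URL = "http://localhost:8000"  # assumed service address

def check_health() -> bool:
    """Return True if the /health endpoint reports the service as reachable."""
    resp = requests.get(f"{BASE_URL}/health", timeout=5)
    return resp.status_code == 200

def generate_scene(prompt: str) -> dict:
    """POST a text description to /generate and return the JSON response."""
    resp = requests.post(f"{BASE_URL}/generate", json={"prompt": prompt}, timeout=60)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    if check_health():
        print(generate_scene("a small room with a table and two chairs"))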
2. Node type definition
2.1 Node placement and roles ✅ Implementation completed
| Node type | Rank | Brain region | Processing pathway | Main function | Implementation |
|---|---|---|---|---|---|
| SPATIAL_WHERE | 12 | Dorsal parietal lobe | Where path | Spatial position/distance/direction recognition | ✅ SpatialWhereNode |
| SPATIAL_WHAT | 13 | Visual cortex/temporal cortex | What pathway | Visual generation/scene understanding | ✅ SpatialWhatNode |
| SPATIAL_INTEGRATION | 14 | Occipito-Parietal Junction | Integration Pathway | What-Where Integration/World Model | ✅ SpatialIntegrationNode |
| SPATIAL_ATTENTION | 15 | Fronto-orbital area | Attention control | Spatial attention/task-driven control | ✅ SpatialAttentionControlNode |
Implementation classes (all in spatial_processing.py):
- ✅ DistributedSpatialCortex - Integrated system
- ✅ SpatialCoordinateEncoder - Coordinate transformation
- ✅ SpatialWhereNode - Where processing (Rank 12)
- ✅ SpatialWhatNode - What generation (Rank 13)
- ✅ SpatialIntegrationNode - What-Where integration (Rank 14)
- ✅ SpatialAttentionControlNode - Spatial attention control (Rank 15)
2.2 Relationship with existing nodes
Existing node configuration:
┌─────────────────────────────┐
│ PFC (Executive Control)     │ Rank 0-4
│ - Decision making           │
│ - Task management           │
└────────────┬────────────────┘
             │
┌────────────┴──────────────┐
│                           │
│ LANGUAGE (Rank 5-8)       │ VISION (Rank 9-11)
│ - Text processing         │ - Feature extraction
│ - Language understanding  │ - Object recognition
└─────────────┬─────────────┘
              │
┌─────────────┴─────────────────────────────────────────────────────────────────┐
│                                                                                │
│ SPATIAL_WHERE          SPATIAL_WHAT          INTEGRATION   ATTENTION           │
│ (Rank 12)              (Rank 13)             (Rank 14)     (Rank 15)           │
│ - Where processing     - What generation     - Integration - Attention control │
│ - Coordinate transform - Scene understanding - World model - Priority          │
└────────────────────────────────────────────────────────────────────────────────┘
2.3 Node definition (Python) ✅ Implemented
Implementation file: spatial_processing.py
Node definition:
# From evospikenet/spatial_processing.py
import torch
import torch.nn as nn
from typing import Dict, Any, Optional


class SpatialWhereNode(nn.Module):
    """
    Spatial 'Where' processing (Dorsal Stream - Parietal Cortex).
    Rank 12 in distributed brain hierarchy.
    """
    def __init__(self, input_channels: int = 3, hidden_dim: int = 64,
                 output_dim: int = 256, num_spatial_scales: int = 4):
        # DepthEstimationNetwork: Monocular depth estimation
        # SpatialCoordinateEncoder: 3D coordinate → spike
        # Retinotopic map encoding
        ...


class SpatialWhatNode(nn.Module):
    """
    Spatial 'What' processing (Ventral Stream - Visual/Temporal Cortex).
    Rank 13 in distributed brain hierarchy.
    """
    def __init__(self, input_channels: int = 256, hidden_dim: int = 128,
                 output_dim: int = 256, num_object_classes: int = 100):
        # Object recognition encoder
        # Class probability output
        # Spike generation
        ...


class SpatialIntegrationNode(nn.Module):
    """
    What/Where Integration (Posterior Parietal Junction).
    Rank 14 in distributed brain hierarchy.
    """
    def __init__(self, what_dim: int = 256, where_dim: int = 256,
                 output_dim: int = 512, num_integration_heads: int = 8):
        # SpatialAttentionModule
        # Integration MLP
        ...


class SpatialAttentionControlNode(nn.Module):
    """
    Spatial Attention Control (Prefrontal Cortex).
    Rank 15 in distributed brain hierarchy.
    """
    def __init__(self, input_dim: int = 512, hidden_dim: int = 256,
                 num_attention_pools: int = 4):
        # Attention priority calculation
        # Saccade planning
        # Dynamic strength
        ...


# Node rank constants
RANK_SPATIAL_WHERE = 12
RANK_SPATIAL_WHAT = 13
RANK_SPATIAL_INTEGRATION = 14
RANK_SPATIAL_ATTENTION = 15

SPATIAL_NODES = [
    ("where", 12, SpatialWhereNode),
    ("what", 13, SpatialWhatNode),
    ("integration", 14, SpatialIntegrationNode),
    ("attention", 15, SpatialAttentionControlNode),
]
Integrated system:
class DistributedSpatialCortex(nn.Module):
    """Complete spatial cognition system integrating all four nodes."""
    def __init__(self, config: Optional[Dict] = None):
        super().__init__()
        self.where_node = SpatialWhereNode(...)
        self.what_node = SpatialWhatNode(...)
        self.integration_node = SpatialIntegrationNode(...)
        self.attention_control_node = SpatialAttentionControlNode(...)

    def forward(self, visual_input: torch.Tensor,
                reward_signal: Optional[torch.Tensor] = None) -> Dict[str, Any]:
        """Process visual input through all spatial cognition nodes."""
        # Where processing → What processing → Integration → Attention control
        ...
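A minimal usage sketch of the integrated system; the batch size, the 480x640 input resolution, and config=None are illustrative assumptions:
# Usage sketch (assumed input shape): run one RGB batch through the full pipeline.
import torch
from evospikenet.spatial_processing import DistributedSpatialCortex

cortex = DistributedSpatialCortex(config=None)  # default configuration assumed
frames = torch.randn(4, 3, 480, 640)            # (batch, channels, height, width)
outputs = cortex(frames)                        # Where → What → Integration → Attention control
print(sorted(outputs.keys()))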
3. Node specification details
3.1 SPATIAL_WHERE (Rank 12) - Where processing path
Brain area: Dorsal parietal pathway (LIP, MT+, V5A)
Biological functions:
- Extract the spatial location of objects from visual input
- Conversion to an allocentric coordinate system
- Input to eye movement control
- Maintenance of spatial working memory
Computational functions:
| Function | Component | Input | Output |
|---|---|---|---|
| Spatial Attention | SpatialAttentionLayerV2 | Vision Features (C,H,W) | Spatial Weights (H,W) |
| Depth Estimation | DepthEstimationNetwork | RGB Image(3,H,W) | Depth Map(1,H,W) |
| Coordinate transformation | SpatialCoordinateEncoder | Visual retinal coordinates + self-position | Allocentric coordinates |
| Optical Flow | OpticalFlowNetwork | Frame Pair(3,H,W) | Optical Flow(2,H,W) |
| Working Memory | SpatialMemoryBuffer | Spatial Information | Buffering |
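The retinal-to-allocentric conversion handled by SpatialCoordinateEncoder amounts to a rigid-body transform with the agent's pose. A minimal sketch, assuming the (x, y, z, roll, pitch, yaw) layout published on spikes/ego_pose and a yaw-pitch-roll rotation convention; the helper name is hypothetical:
# Hypothetical helper: egocentric (camera-frame) points -> allocentric (world-frame) points.
# Assumes pose = (x, y, z, roll, pitch, yaw) as on spikes/ego_pose and Z-Y-X rotation order.
import math
import torch

def egocentric_to_allocentric(points_cam: torch.Tensor, pose: torch.Tensor) -> torch.Tensor:
    """points_cam: (N, 3) egocentric points; pose: (6,) world pose of the agent."""
    x, y, z, roll, pitch, yaw = pose.tolist()
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    Rz = torch.tensor([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = torch.tensor([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    Rx = torch.tensor([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    rotation = (Rz @ Ry @ Rx).to(points_cam.dtype)  # yaw, then pitch, then roll
    return points_cam @ rotation.T + torch.tensor([x, y, z], dtype=points_cam.dtype)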
Hyperparameters:
spatial_where:
  attention:
    num_heads: 8
    head_dim: 64
    dropout: 0.1
  depth_estimation:
    architecture: "midas"  # or "leres"
    input_size: [480, 640]
    output_scale: 1.0
  coordinate_encoding:
    max_positions: 1000
    coordinate_dim: 64
    use_pe: true  # Positional Encoding
  memory_buffer:
    buffer_size: 30  # 1 second at 30 FPS
    coordinate_dim: 3
Zenoh PubSub:
spikes/spatial/where/depth:
  topic_type: "depth_map"
  payload:
    dtype: "float32"
    shape: [batch_size, 480, 640]
    range: [0.0, 100.0]  # meters (clipped)
  frequency: 30  # Hz
  qos: "real_time"
spikes/spatial/where/coordinates:
  topic_type: "spatial_coordinates"
  payload:
    dtype: "float32"
    shape: [batch_size, num_objects, 3]  # (x, y, z) in Allocentric
  frequency: 30
  qos: "real_time"
spikes/spatial/where/optical_flow:
  topic_type: "optical_flow"
  payload:
    dtype: "float32"
    shape: [batch_size, 2, 480, 640]  # (flow_x, flow_y)
  frequency: 30
  qos: "real_time"
Subscribe topics:
spikes/vision/features:
  # Visual features from Vision (Rank 9)
  shape: [batch_size, 256, 60, 80]  # ResNet-50 default
spikes/pfc/spatial_attention:
  # Spatial attention signals from PFC (top-down)
  dtype: "float32"
  shape: [batch_size, num_attention_targets, 2]  # (x, y) in visual coords
spikes/ego_pose:
  # Self-position/attitude information (from Rank 0)
  dtype: "float32"
  shape: [batch_size, 6]  # (x, y, z, roll, pitch, yaw)
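A minimal publisher sketch for the depth topic above, assuming the zenoh-python API (zenoh.open, declare_publisher, put); serializing the map as raw float32 bytes is also an assumption, and the project's actual codec may differ:
# Sketch: publishing a depth map on spikes/spatial/where/depth via zenoh-python.
import numpy as np
import zenoh

session = zenoh.open(zenoh.Config())
publisher = session.declare_publisher("spikes/spatial/where/depth")
depth_map = np.clip(np.random.rand(1, 480, 640) * 100.0, 0.0, 100.0).astype(np.float32)
publisher.put(depth_map.tobytes())  # shape/dtype must be agreed on out of band
session.close()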
3.2 SPATIAL_WHAT (Rank 13) - What generation path
Brain area: Visual cortex (V1-V5), temporal cortex (IT)
Biological functions:
- Visual generation based on object identification
- Memory-based scene reconstruction
- Captioning and scene understanding
Computational functions:
| Function | Component | Input | Output |
|---|---|---|---|
| Scene graph parsing | SceneGraphParser | Text description | Scene graph (JSON) |
| Spatial VAE | SpatialVAEDecoder | Object embedding | 3D voxel grid |
| Object Generation | ObjectGenerativeNetwork | Semantic Information | Generative 3D Map |
| Time series prediction | SpatialTemporalPrediction | Spatial history | t+1 frame prediction |
Model configuration:
from typing import Dict, Union

import torch.nn as nn
from torch import Tensor


class SpatialWhat(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.scenegraph_parser = SceneGraphParser(config)
        self.vae_decoder = SpatialVAEDecoder(config)
        self.object_generator = ObjectGenerativeNetwork(config)
        self.temporal_predictor = SpatialTemporalPrediction(config)

    def forward(self, text_or_scene: Union[str, Tensor]) -> Dict:
        if isinstance(text_or_scene, str):
            # Generate from a text description
            scene_graph = self.scenegraph_parser(text_or_scene)
            spatial_repr = self._graph_to_spatial(scene_graph)
        else:
            # Predict from a tensor of spatial history
            spatial_repr = self.temporal_predictor(text_or_scene)
        generated_scene = self.vae_decoder(spatial_repr)
        return {
            "spatial_representation": spatial_repr,
            "generated_3d": generated_scene,
            "voxel_grid": generated_scene,
        }
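A brief usage sketch of the dual-mode forward above; the config object and the (time, batch, feature) history shape are illustrative assumptions:
# Usage sketch (illustrative): text-driven generation vs. tensor-based prediction.
import torch

model = SpatialWhat(config)                        # `config` assumed to be built elsewhere
from_text = model("a red cube on a wooden table")  # text -> scene graph -> VAE decode
history = torch.randn(30, 1, 256)                  # assumed (time, batch, feature) history
from_history = model(history)                      # temporal prediction -> VAE decode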
Zenoh PubSub:
spikes/spatial/what/scene_graph:
  topic_type: "scene_graph"
  payload:
    format: "json"
    content:
      objects: [{"id": int, "category": str, "attributes": dict}, ...]
      relationships: [{"subject": int, "predicate": str, "object": int}, ...]
      attributes: {global scene attributes}
  frequency: 10  # Hz (low frequency)
  qos: "best_effort"
spikes/spatial/what/voxel_grid:
  topic_type: "3d_representation"
  payload:
    dtype: "float32"
    shape: [batch_size, 256, 256, 256]
    encoding: "voxel_occupancy"
  frequency: 10
  qos: "best_effort"
spikes/spatial/what/mesh:
  topic_type: "3d_mesh"
  payload:
    format: "glb"  # glTF binary
    vertices: [N, 3]
    faces: [M, 3]
  frequency: 5  # Hz (very low frequency)
  qos: "best_effort"
Subscribe topics:
spikes/language/spatial_description:
  # Text description from Language module (Rank 5-7)
  dtype: "string"
  max_length: 500
spikes/vision/object_embeddings:
  # Object embedding from Vision
  dtype: "float32"
  shape: [batch_size, num_objects, 256]
3.3 SPATIAL_INTEGRATION (Rank 14) - Integration Node
Brain Region: Occipito-Parietal Junction (OPA, RSC), Temporoparietal (TPJ)
Biological functions:
- Integration of What and Where information
- Building a unified world model
- Final stage of the conversion from egocentric to allocentric coordinates
- Spatial reasoning and understanding of relationships between objects
Computational functions:
| Function | Component | Input | Output |
|---|---|---|---|
| Information Fusion | MultiModalSpatialFusion | What + Where | Fusion Expression |
| World Model | WorldModelIntegrator | Fusion Representation (Time Series) | Unified World Representation |
| Spatial Reasoning | SpatialReasoningEngine | World Model | Inference Results |
| Perspective Transformation | PerspectiveTransformer | World Coordinates | Representation in Different Perspectives |
Architecture:
from collections import deque
from typing import Dict, Optional

import torch
import torch.nn as nn
from torch import Tensor


class SpatialIntegration(nn.Module):
    """What-Where integration."""
    def __init__(self, config):
        super().__init__()
        self.fusion = MultiModalSpatialFusion(config)
        self.world_model = WorldModelIntegrator(config)
        self.reasoning_engine = SpatialReasoningEngine(config)
        self.perspective_transform = PerspectiveTransformer(config)
        # Time-series buffer (frame history)
        self.temporal_buffer = deque(maxlen=30)  # 1 sec @ 30 FPS

    def forward(self, what: Tensor, where: Tensor,
                prev_world_model: Optional[Tensor] = None) -> Dict:
        # Information fusion
        fused = self.fusion(what, where)
        # Time-series integration
        self.temporal_buffer.append(fused)
        integrated_history = torch.stack(list(self.temporal_buffer))
        # World model update
        world_model = self.world_model(integrated_history, prev_world_model)
        # Inference
        reasoning_results = self.reasoning_engine(world_model)
        # Viewpoint conversion (multiple views)
        perspectives = self.perspective_transform(world_model)
        return {
            "world_model": world_model,
            "reasoning": reasoning_results,
            "perspectives": perspectives,
        }
Zenoh PubSub:
spikes/spatial/integration/world_model:
  topic_type: "world_model"
  payload:
    dtype: "float32"
    representation_type: "voxel_grid"
    shape: [batch_size, 256, 256, 256]
    contains: ["occupancy", "object_id", "confidence"]
  frequency: 10
  qos: "real_time"
spikes/spatial/integration/reasoning:
  topic_type: "spatial_reasoning"
  payload:
    format: "json"
    content:
      relationships: [{"obj1": int, "relation": str, "obj2": int, "confidence": float}, ...]
      containment: [{"container": int, "objects": [int, ...]}, ...]
      reachability: [{"object": int, "reach_score": float}, ...]
  frequency: 10
  qos: "best_effort"
spikes/spatial/integration/perspective:
  topic_type: "egocentric_view"
  payload:
    dtype: "float32"
    shape: [batch_size, 3, 480, 640]  # RGB rendered from agent POV
    camera_params: {fov: 60.0, near: 0.01, far: 100.0}
  frequency: 30
  qos: "real_time"
Subscribe topics:
spikes/spatial/where/coordinates:
  # Coordinates from the Where node
spikes/spatial/what/voxel_grid:
  # 3D representation from the What node
spikes/vision/semantic_segmentation:
  # Semantic segmentation from Vision
  dtype: "int32"
  shape: [batch_size, 480, 640]
3.4 SPATIAL_ATTENTION (Rank 15) - Attention control node
Brain regions: Fronto-orbital field (OFC), anterior supplementary eye field (SEF), superior temporal sulcus (STS)
Biological functions:
- Task-driven spatial attention
- Saliency detection
- Eye movement planning
- Reward-based attentional shifts
Computational functions:
| Function | Component | Input | Output |
|---|---|---|---|
| Attention Control | SpatialAttentionController | Task Signal + Spatial Context | Attention Weight |
| Focus Selection | FocusSelector | Attention Score | Processing Priority |
| Saliency Detection | SaliencyDetector | Spatial Information | Saliency Map |
| Eye Movement Planning | SaccadePlanner | Attention Weight | Saccade Goal |
Hyperparameters:
spatial_attention:
  task_integration:
    # Fusing task signals from PFC
    method: "multiplicative"  # or "additive"
    temperature: 1.0
  saliency_detection:
    # Bottom-up saliency
    method: "center_bias"  # or "entropy", "gradient"
    prior_weight: 0.3  # Center bias strength
  saccade_planning:
    # Eye movement planning
    delay: 200  # milliseconds
    velocity: 300  # degrees/second
Control flow:
class SpatialAttentionController(nn.Module):
    """Spatial attention control."""
    def forward(self,
                task_signal: Tensor,       # Task-driven signals from PFC
                spatial_context: Tensor,   # World model from Integration node
                salience_map: Optional[Tensor] = None) -> Dict:
        # Bottom-up saliency
        if salience_map is None:
            salience_map = self.saliency_detector(spatial_context)
        # Task-driven attention
        task_attention = self.task_attention_layer(task_signal)
        # Integration (bottom-up ⊕ top-down)
        combined_attention = (
            self.bottom_up_weight * salience_map +
            self.top_down_weight * task_attention
        )
        # Normalization
        attention_weights = torch.softmax(combined_attention, dim=-1)
        # Focus selection
        focus = self.focus_selector(attention_weights)
        # Saccade planning
        saccade_target = self.saccade_planner(focus)
        return {
            "attention_weights": attention_weights,
            "focus": focus,
            "saccade_target": saccade_target,
            "salience_map": salience_map,
            "task_attention": task_attention,
        }
Zenoh PubSub:
spikes/spatial/attention/weights:
  topic_type: "attention_map"
  payload:
    dtype: "float32"
    shape: [batch_size, 480, 640]
    value_range: [0.0, 1.0]
  frequency: 30
  qos: "real_time"
spikes/spatial/attention/saliency:
  topic_type: "saliency_map"
  payload:
    dtype: "float32"
    shape: [batch_size, 480, 640]
  frequency: 30
  qos: "real_time"
spikes/spatial/attention/saccade:
  topic_type: "motor_command"
  payload:
    dtype: "float32"
    shape: [batch_size, 2]  # (horizontal, vertical) in degrees
    velocity: float  # degrees/second
  frequency: 10  # Hz (irregular)
  qos: "real_time"
Subscribe topics:
spikes/pfc/spatial_task:
  # Task-driven signals from PFC
  dtype: "float32"
  shape: [batch_size, 64]  # Task embedding
spikes/spatial/integration/world_model:
  # World model from the Integration node
4. Communication protocol
4.1 Message format (Zenoh)
Basic structure:
// spatial_message.proto
syntax = "proto3";

package evospikenet.spatial;

message SpatialMessageHeader {
  uint32 timestamp_ms = 1;
  uint32 sequence_num = 2;
  string source_node = 3;
  string target_nodes = 4;  // comma-separated
}

message CoordinateMessage {
  SpatialMessageHeader header = 1;
  repeated float x = 2;
  repeated float y = 3;
  repeated float z = 4;
  repeated float confidence = 5;
}

message SpatialAttentionMessage {
  SpatialMessageHeader header = 1;
  repeated float attention_weights = 2;
  repeated int32 focus_indices = 3;
}
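A sketch of filling and serializing a CoordinateMessage, assuming spatial_message.proto has been compiled with protoc into a spatial_message_pb2 module (the module name follows the usual protoc convention and is an assumption here):
# Sketch: building a CoordinateMessage from the .proto above.
# Assumes `protoc --python_out=. spatial_message.proto` produced spatial_message_pb2.
import time
from spatial_message_pb2 import CoordinateMessage

msg = CoordinateMessage()
msg.header.timestamp_ms = int(time.time() * 1000) % (2 ** 32)  # fit the uint32 field
msg.header.sequence_num = 1
msg.header.source_node = "spatial_where"
msg.header.target_nodes = "spatial_integration"
msg.x.extend([1.0, 2.5])
msg.y.extend([0.3, -1.2])
msg.z.extend([4.0, 3.8])
msg.confidence.extend([0.9, 0.7])
payload = msg.SerializeToString()  # bytes suitable for a Zenoh put()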
4.2 Synchronization Mechanism
PTP (Precision Time Protocol) based:
# evospikenet/spatial_processing/synchronization.py
class SpatialTimeSync:
    """Timestamp synchronization between spatial nodes."""
    def __init__(self):
        self.ptp_client = PTPClient()
        self.clock_offset = 0.0
        self.drift_compensation = 0.0

    def sync_timestamp(self, local_ts: float) -> float:
        """Convert a local timestamp to the synchronized global time."""
        return local_ts + self.clock_offset + self.drift_compensation * local_ts
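A short numeric check of the sync_timestamp formula above; the offset and drift values are made up for illustration:
# Illustrative values only: a 12 ms clock offset and a 2 ppm drift.
sync = SpatialTimeSync()
sync.clock_offset = 0.012                 # seconds
sync.drift_compensation = 2e-6            # fractional drift per second of local time
global_ts = sync.sync_timestamp(100.0)    # 100.0 + 0.012 + 2e-6 * 100.0 = 100.0122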
4.3 Latency Requirements
| Between nodes | Components | Tolerable delay | Priority |
|---|---|---|---|
| Vision → Where | Visual feature transfer | < 50ms | HIGH |
| Where → Integration | Coordinate transfer | < 20ms | CRITICAL |
| What → Integration | Generated scene transfer | < 100ms | MEDIUM |
| Integration → Attention | World model | < 50ms | HIGH |
| Attention → PFC | Attention signal | < 30ms | CRITICAL |
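One way to verify these budgets is to compare the sender's header timestamp against the receiver's clock, assuming both are aligned by the PTP mechanism in 4.2; the lookup table below simply mirrors the table above and the helper is a sketch, not project API:
# Sketch: checking an observed one-way latency against the budgets in the table above.
LATENCY_BUDGET_MS = {
    ("vision", "where"): 50,
    ("where", "integration"): 20,
    ("what", "integration"): 100,
    ("integration", "attention"): 50,
    ("attention", "pfc"): 30,
}

def within_budget(source: str, target: str, sent_ms: int, received_ms: int) -> bool:
    """Return True if the observed latency stays within the tolerable delay."""
    latency_ms = received_ms - sent_ms  # assumes PTP-synchronized clocks (see 4.2)
    return latency_ms <= LATENCY_BUDGET_MS[(source, target)]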
5. Testing strategy
5.1 Unit Testing
tests/unit/
├── test_spatial_where.py
│ ├── test_depth_estimation_accuracy()
│ ├── test_coordinate_transformation()
│ └── test_optical_flow_correctness()
├── test_spatial_what.py
│ ├── test_scene_graph_parsing()
│ ├── test_vae_generation_quality()
│ └── test_temporal_consistency()
├── test_spatial_integration.py
│ ├── test_what_where_fusion()
│ ├── test_world_model_updates()
│ └── test_spatial_reasoning()
└── test_spatial_attention.py
├── test_attention_control_response()
├── test_saliency_detection()
└── test_saccade_planning()
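A sketch of what the listed test_coordinate_transformation case might look like; the egocentric_to_allocentric helper imported here is the hypothetical one sketched in section 3.1, not a confirmed project API:
# Sketch of tests/unit/test_spatial_where.py::test_coordinate_transformation.
# The imported helper is hypothetical and used for illustration only.
import torch
from evospikenet.spatial_processing import egocentric_to_allocentric  # hypothetical import

def test_coordinate_transformation_identity_pose():
    """A zero pose (agent at the origin, no rotation) must leave coordinates unchanged."""
    points = torch.randn(8, 3)   # egocentric points
    pose = torch.zeros(6)        # (x, y, z, roll, pitch, yaw)
    assert torch.allclose(egocentric_to_allocentric(points, pose), points, atol=1e-5)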
5.2 Integration Testing
tests/integration/
├── test_spatial_pipeline.py
│ └── test_end_to_end_spatial_processing()
├── test_spatial_pfc_integration.py
│ └── test_task_driven_attention()
└── test_spatial_zenoh_communication.py
└── test_all_topics_publish_subscribe()
5.3 Benchmark
tests/performance/
├── test_spatial_latency.py
│ └── latency measurements across all nodes
├── test_spatial_throughput.py
│ └── FPS measurements at different resolutions
└── test_spatial_energy.py
└── power consumption profiling (GPU/CPU)
6. Deployment
6.1 Node initialization
# config/spatial_nodes.yaml
spatial_where:
  enabled: true
  rank: 12
  gpu_device: 0
  batch_size: 4
  model_checkpoint: "checkpoints/spatial_where_v1.pt"
spatial_what:
  enabled: true
  rank: 13
  gpu_device: 1
  batch_size: 2
  model_checkpoint: "checkpoints/spatial_what_v1.pt"
spatial_integration:
  enabled: true
  rank: 14
  gpu_device: 0  # CPU possible
  batch_size: 4
spatial_attention:
  enabled: true
  rank: 15
  gpu_device: -1  # CPU only
  batch_size: 8
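A small sketch of how a launcher might read config/spatial_nodes.yaml and resolve one node's settings, assuming PyYAML; the selection logic is illustrative, not the project's confirmed loader:
# Sketch: loading config/spatial_nodes.yaml and resolving one node's settings (assumes PyYAML).
import yaml

def load_node_config(path: str, node_name: str) -> dict:
    """Return the configuration block for a single spatial node (e.g. 'spatial_where')."""
    with open(path, "r") as f:
        cfg = yaml.safe_load(f)
    node_cfg = cfg[node_name]
    if not node_cfg.get("enabled", False):
        raise RuntimeError(f"{node_name} is disabled in {path}")
    return node_cfg

where_cfg = load_node_config("config/spatial_nodes.yaml", "spatial_where")
device = f"cuda:{where_cfg['gpu_device']}" if where_cfg["gpu_device"] >= 0 else "cpu"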
6.2 Startup script
#!/bin/bash
# scripts/launch_spatial_nodes.sh
# Where node launch
python -m evospikenet.spatial_processing.where_node \
--config config/spatial_nodes.yaml \
--rank 12 \
--device cuda:0 &
# What node launch
python -m evospikenet.spatial_processing.what_node \
--config config/spatial_nodes.yaml \
--rank 13 \
--device cuda:1 &
# Integration node startup
python -m evospikenet.spatial_processing.integration_node \
--config config/spatial_nodes.yaml \
--rank 14 \
--device cuda:0 &
# Attention node startup
python -m evospikenet.spatial_processing.attention_node \
--config config/spatial_nodes.yaml \
--rank 15 \
--device cpu &
wait
7. Related Documents
- Remaining_Functionality.md - Feature 13 Detailed implementation plan
- DISTRIBUTED_BRAIN_SYSTEM.md - Whole distributed brain specification
- ZIGZAG_WHERE_WHAT_INTEGRATION.md - Where-What integration details
Next Steps:
- [ ] Add node definitions to evospikenet/node_types.py
- [ ] Create the Protocol Buffer schema file
- [ ] Implement the Zenoh PubSub topic management system
- [ ] Start implementation of Phase 13.1 (Q2 2026)