Distributed Brain System: Spatial Processing Node Specification

[!NOTE] For the latest implementation status, see the feature implementation status document (Remaining Functionality).

Last updated: February 19, 2026
Version: v2.1 (reflects FastAPI/SDK integration)

Author: Masahiro Aoki

Status: ✅ Implemented

Copyright: 2026 Moonlight Technologies Inc. All Rights Reserved.

📋 Implementation Status

Item	File	Lines	Status
Implementation code	`spatial_processing.py`	3500+	✅ Done
Unit tests	`test_distributed_brain_simulation.py`	5+	✅ Done
Integration tests	`test_distributed_brain_simulation.py`	6+	✅ Done
External service / SDK	`spatial_generation_service.py` / `sdk.py`	-	✅ `/generate` and `/health` FastAPI endpoints + SDK wrapper

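1. Overview

A spatial cognition and generation system implemented within the EvoSpikeNet distributed brain architecture. It consists of four types of dedicated nodes that simulate the occipital lobe (visual cortex V1-V5), the superior parietal lobule, the temporal cortex, and the occipito-parietal junction. For external use, it provides a FastAPI-based spatial generation service (/generate, /health) and an SDK wrapper.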

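2. Node Type Definitions

2.1 Node Placement and Roles ✅ Implemented

Node type	Rank	Brain region	Pathway	Main functions	Implementation
SPATIAL_WHERE	12	Dorsal parietal lobe	Where pathway	Spatial position, distance, and direction recognition	✅ SpatialWhereNode
SPATIAL_WHAT	13	Visual cortex / temporal cortex	What pathway	Visual generation, scene understanding	✅ SpatialWhatNode
SPATIAL_INTEGRATION	14	Occipito-parietal junction	Integration pathway	What-Where integration, world model	✅ SpatialIntegrationNode
SPATIAL_ATTENTION	15	Orbitofrontal cortex	Attention control	Spatial attention, task-driven control	✅ SpatialAttentionControlNode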

Implementation classes:
- ✅ spatial_processing.py - integrated system
- ✅ spatial_processing.py - coordinate transformation
- ✅ spatial_processing.py
- ✅ spatial_processing.py
- ✅ spatial_processing.py

2.2 Relationship to Existing Nodes

Existing node layout:
┌─────────────────────────────┐
│ PFC (Executive Control)     │ Rank 0-4
│ - Decision making           │
│ - Task management           │
└────────────┬────────────────┘
             │
┌────────────┴──────────────┐
│                           │
│ LANGUAGE (Rank 5-8)       │ VISION (Rank 9-11)
│ - Text processing         │ - Feature extraction
│ - Language understanding  │ - Object recognition
└─────────────┬─────────────┘
              │
┌─────────────┴─────────────────────────────────────────────────────────────────┐
│                                                                               │
│ SPATIAL_WHERE       SPATIAL_WHAT           INTEGRATION    ATTENTION           │
│ (Rank 12)           (Rank 13)              (Rank 14)      (Rank 15)           │
│ - Where processing  - What generation      - Integration  - Attention control │
│ - Coord. transform  - Scene understanding  - World model  - Prioritization    │
└───────────────────────────────────────────────────────────────────────────────┘

2.3 Node Definitions (Python) ✅ Implemented

Implementation file: spatial_processing.py

Node definitions:

# From evospikenet/spatial_processing.py

import torch
import torch.nn as nn
from typing import Dict, Any, Optional

class SpatialWhereNode(nn.Module):
    """
    Spatial 'Where' processing (Dorsal Stream - Parietal Cortex).
    Rank 12 in distributed brain hierarchy.
    """
    def __init__(self, input_channels: int = 3, hidden_dim: int = 64,
                 output_dim: int = 256, num_spatial_scales: int = 4):
        super().__init__()
        # DepthEstimationNetwork: monocular depth estimation
        # SpatialCoordinateEncoder: 3D coordinates -> spikes
        # Retinotopic map encoding

class SpatialWhatNode(nn.Module):
    """
    Spatial 'What' processing (Ventral Stream - Visual/Temporal Cortex).
    Rank 13 in distributed brain hierarchy.
    """
    def __init__(self, input_channels: int = 256, hidden_dim: int = 128,
                 output_dim: int = 256, num_object_classes: int = 100):
        super().__init__()
        # Object recognition encoder
        # Class probability output
        # Spike generation

class SpatialIntegrationNode(nn.Module):
    """
    What/Where Integration (Posterior Parietal Junction).
    Rank 14 in distributed brain hierarchy.
    """
    def __init__(self, what_dim: int = 256, where_dim: int = 256,
                 output_dim: int = 512, num_integration_heads: int = 8):
        super().__init__()
        # SpatialAttentionModule
        # Integration MLP

class SpatialAttentionControlNode(nn.Module):
    """
    Spatial Attention Control (Prefrontal Cortex).
    Rank 15 in distributed brain hierarchy.
    """
    def __init__(self, input_dim: int = 512, hidden_dim: int = 256,
                 num_attention_pools: int = 4):
        super().__init__()
        # Attention priority computation
        # Saccade planning
        # Modulation strength

# Node rank constants
RANK_SPATIAL_WHERE = 12
RANK_SPATIAL_WHAT = 13
RANK_SPATIAL_INTEGRATION = 14
RANK_SPATIAL_ATTENTION = 15

# (name, rank, class) for each spatial node
SPATIAL_NODES = [
    ("where", RANK_SPATIAL_WHERE, SpatialWhereNode),
    ("what", RANK_SPATIAL_WHAT, SpatialWhatNode),
    ("integration", RANK_SPATIAL_INTEGRATION, SpatialIntegrationNode),
    ("attention", RANK_SPATIAL_ATTENTION, SpatialAttentionControlNode),
]

Integrated system:

class DistributedSpatialCortex(nn.Module):
    """Complete spatial cognition system integrating all four nodes."""

    def __init__(self, config: Optional[Dict] = None):
        super().__init__()
        self.where_node = SpatialWhereNode(...)
        self.what_node = SpatialWhatNode(...)
        self.integration_node = SpatialIntegrationNode(...)
        self.attention_control_node = SpatialAttentionControlNode(...)

    def forward(self, visual_input: torch.Tensor,
                reward_signal: Optional[torch.Tensor] = None) -> Dict[str, Any]:
        """Process visual input through all spatial cognition nodes."""
        # Where processing -> What processing -> integration -> attention control
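The processing order noted in `forward` can be sketched with plain callables. This is only an illustration of the data flow: the stage functions are hypothetical stand-ins, and it assumes that both the Where and What stages receive the same visual input, which the excerpt above does not confirm.

```python
from typing import Any, Callable, Dict

Stage = Callable[..., Any]  # hypothetical stand-in for a node's forward pass


def run_spatial_pipeline(visual_input: Any,
                         where: Stage, what: Stage,
                         integrate: Stage, attend: Stage,
                         reward_signal: Any = None) -> Dict[str, Any]:
    """Run the stages in rank order: Where (12), What (13), Integration (14), Attention (15)."""
    where_out = where(visual_input)
    what_out = what(visual_input)
    integrated = integrate(what_out, where_out)
    attention = attend(integrated, reward_signal)
    return {"where": where_out, "what": what_out,
            "integrated": integrated, "attention": attention}


# Toy stages that only record which stage produced which value.
result = run_spatial_pipeline(
    "frame",
    where=lambda x: f"where({x})",
    what=lambda x: f"what({x})",
    integrate=lambda a, b: f"integrate({a}, {b})",
    attend=lambda s, r: f"attend({s}, reward={r})",
)
```

Passing the stages in as arguments keeps the ordering logic separate from the node implementations, so each node can be replaced by a stub in tests.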

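3. Node Specification Details

3.1 SPATIAL_WHERE (Rank 12) - Where Processing Pathway

Brain region: Dorsal parietal pathway (LIP, MT+, V5A)

Biological functions:
- Extracts the spatial position of objects from visual input
- Transforms positions into an allocentric (environment-centered) coordinate system
- Provides input to eye-movement control
- Maintains spatial working memory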

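Computational functions:

Function	Component	Input	Output
Spatial attention	SpatialAttentionLayerV2	Vision features (C,H,W)	Spatial weights (H,W)
Depth estimation	DepthEstimationNetwork	RGB image (3,H,W)	Depth map (1,H,W)
Coordinate transformation	SpatialCoordinateEncoder	Retinal coordinates + ego position	Allocentric coordinates
Optical flow	OpticalFlowNetwork	Frame pair (3,H,W)	Optical flow (2,H,W)
Working memory	SpatialMemoryBuffer	Spatial information	Buffered spatial information

Hyperparameters: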

spatial_where:
  attention:
    num_heads: 8
    head_dim: 64
    dropout: 0.1

  depth_estimation:
    architecture: "midas"  # or "leres"
    input_size: [480, 640]
    output_scale: 1.0

  coordinate_encoding:
    max_positions: 1000
    coordinate_dim: 64
    use_pe: true  # Positional Encoding

  memory_buffer:
    buffer_size: 30  # 1 second at 30 FPS
    coordinate_dim: 3
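The `memory_buffer` settings describe a fixed-length history of 30 entries, one second at 30 FPS. A minimal sketch of that behavior, assuming a simple ring buffer (the class name `SpatialMemoryBufferSketch` is hypothetical and the real `SpatialMemoryBuffer` may work differently):

```python
from collections import deque
from typing import Deque, List, Tuple

Coordinate = Tuple[float, float, float]  # (x, y, z), matching coordinate_dim: 3


class SpatialMemoryBufferSketch:
    """Hypothetical ring buffer holding the most recent coordinates."""

    def __init__(self, buffer_size: int = 30) -> None:
        # deque(maxlen=...) drops the oldest entry automatically once full.
        self._frames: Deque[Coordinate] = deque(maxlen=buffer_size)

    def push(self, coord: Coordinate) -> None:
        self._frames.append(coord)

    def history(self) -> List[Coordinate]:
        """Return the stored coordinates, oldest first."""
        return list(self._frames)


buf = SpatialMemoryBufferSketch(buffer_size=30)
for frame_idx in range(45):  # 1.5 seconds of frames at 30 FPS
    buf.push((float(frame_idx), 0.0, 0.0))
```

After 45 pushes only the last 30 frames remain, so the buffer always covers the most recent second regardless of how long the node has been running.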

Zenoh PubSub:

spikes/spatial/where/depth:
  topic_type: "depth_map"
  payload:
    dtype: "float32"
    shape: [batch_size, 480, 640]
    range: [0.0, 100.0]  # meters (clipped)
    frequency: 30  # Hz
    qos: "real_time"

spikes/spatial/where/coordinates:
  topic_type: "spatial_coordinates"
  payload:
    dtype: "float32"
    shape: [batch_size, num_objects, 3]  # (x, y, z) in Allocentric
    frequency: 30
    qos: "real_time"

spikes/spatial/where/optical_flow:
  topic_type: "optical_flow"
  payload:
    dtype: "float32"
    shape: [batch_size, 2, 480, 640]  # (flow_x, flow_y)
    frequency: 30
    qos: "real_time"
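The shapes and frequencies above imply a raw data rate per topic. The following sketch works it out for dense float32 payloads, assuming a batch size of 1 and ignoring any header, compression, or transport overhead (the helper names are illustrative):

```python
from math import prod
from typing import Sequence

BYTES_PER_FLOAT32 = 4


def payload_bytes(shape: Sequence[int], bytes_per_element: int = BYTES_PER_FLOAT32) -> int:
    """Raw size of one dense tensor payload, excluding headers and encoding overhead."""
    return prod(shape) * bytes_per_element


def bandwidth_mb_per_s(shape: Sequence[int], frequency_hz: float) -> float:
    """Raw data rate in megabytes (10^6 bytes) per second."""
    return payload_bytes(shape) * frequency_hz / 1e6


depth_bytes = payload_bytes([1, 480, 640])               # 1_228_800 bytes
flow_bytes = payload_bytes([1, 2, 480, 640])             # 2_457_600 bytes
depth_rate = bandwidth_mb_per_s([1, 480, 640], 30)       # about 36.9 MB/s
flow_rate = bandwidth_mb_per_s([1, 2, 480, 640], 30)     # about 73.7 MB/s
```

Under these assumptions the depth and optical-flow topics together need roughly 110 MB/s per batch element, which is worth checking against the transport before raising the batch size.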

Subscribe topics:

spikes/vision/features:
  # Visual features from Vision (Rank 9)
  shape: [batch_size, 256, 60, 80]  # ResNet-50 default

spikes/pfc/spatial_attention:
  # Spatial attention signal from PFC (top-down)
  dtype: "float32"
  shape: [batch_size, num_attention_targets, 2]  # (x, y) in visual coords

spikes/ego_pose:
  # Ego position and orientation (from Rank 0)
  dtype: "float32"
  shape: [batch_size, 6]  # (x, y, z, roll, pitch, yaw)
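The `ego_pose` vector is what allows an egocentric point to be expressed in allocentric coordinates. Below is a minimal sketch of that transform. It assumes angles in radians and a Z-Y-X (yaw, pitch, roll) rotation order; neither is stated in this specification, and `ego_to_allocentric` is an illustrative name rather than part of `SpatialCoordinateEncoder`.

```python
import math
from typing import Sequence, Tuple


def ego_to_allocentric(point: Sequence[float],
                       ego_pose: Sequence[float]) -> Tuple[float, float, float]:
    """Rotate an egocentric point by the agent's orientation, then translate by its position.

    ego_pose is (x, y, z, roll, pitch, yaw); the rotation is R = Rz(yaw) @ Ry(pitch) @ Rx(roll).
    """
    px, py, pz = point
    x, y, z, roll, pitch, yaw = ego_pose
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    # Rows of the combined rotation matrix.
    r = (
        (cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr),
        (sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr),
        (-sp, cp * sr, cp * cr),
    )
    return (
        r[0][0] * px + r[0][1] * py + r[0][2] * pz + x,
        r[1][0] * px + r[1][1] * py + r[1][2] * pz + y,
        r[2][0] * px + r[2][1] * py + r[2][2] * pz + z,
    )


# An object 1 m straight ahead of an agent at (10, 5, 0) that is turned 90 degrees to the left.
world_point = ego_to_allocentric((1.0, 0.0, 0.0), (10.0, 5.0, 0.0, 0.0, 0.0, math.pi / 2))
```

With the agent facing along the world Y axis, the point one meter ahead lands at approximately (10, 6, 0) in world coordinates.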

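3.2 SPATIAL_WHAT (Rank 13) - What Generation Pathway

Brain region: Visual cortex (V1-V5), temporal cortex (IT)

Biological functions:
- Visual generation based on object identification
- Memory-based scene reconstruction
- Captioning and scene understanding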

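Computational functions:

Function	Component	Input	Output
Scene graph parsing	SceneGraphParser	Text description	Scene graph (JSON)
Spatial VAE	SpatialVAEDecoder	Object embeddings	3D voxel grid
Object generation	ObjectGenerativeNetwork	Semantic information	Generated 3D map
Temporal prediction	SpatialTemporalPrediction	Spatial history	Predicted frame at t+1

Model structure: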

from typing import Dict, Union

import torch.nn as nn
from torch import Tensor

class SpatialWhat(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.scenegraph_parser = SceneGraphParser(config)
        self.vae_decoder = SpatialVAEDecoder(config)
        self.object_generator = ObjectGenerativeNetwork(config)
        self.temporal_predictor = SpatialTemporalPrediction(config)

    def forward(self, text_or_scene: Union[str, Tensor]) -> Dict:
        if isinstance(text_or_scene, str):
            # Generation from text
            scene_graph = self.scenegraph_parser(text_or_scene)
            spatial_repr = self._graph_to_spatial(scene_graph)
        else:
            # Prediction from a tensor
            spatial_repr = self.temporal_predictor(text_or_scene)

        # Decode the spatial representation into a 3D voxel grid
        generated_scene = self.vae_decoder(spatial_repr)
        return {
            "spatial_representation": spatial_repr,
            "generated_3d": generated_scene,
            "voxel_grid": generated_scene,
        }

Zenoh PubSub:

spikes/spatial/what/scene_graph:
  topic_type: "scene_graph"
  payload:
    format: "json"
    content:
      objects: [{"id": int, "category": str, "attributes": dict}, ...]
      relationships: [{"subject": int, "predicate": str, "object": int}, ...]
      attributes: {global scene attributes}
    frequency: 10  # Hz (low rate)
    qos: "best_effort"

spikes/spatial/what/voxel_grid:
  topic_type: "3d_representation"
  payload:
    dtype: "float32"
    shape: [batch_size, 256, 256, 256]
    encoding: "voxel_occupancy"
    frequency: 10
    qos: "best_effort"

spikes/spatial/what/mesh:
  topic_type: "3d_mesh"
  payload:
    format: "glb"  # glTF binary
    vertices: [N, 3]
    faces: [M, 3]
    frequency: 5  # Hz (very low rate)
    qos: "best_effort"
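The `scene_graph` payload links relationships to objects by integer id, so a consumer may want to check that every id resolves before using the graph. A small sketch with an illustrative payload follows; the categories, attribute keys, and the `validate_scene_graph` helper are made up for the example and are not part of the specification.

```python
import json
from typing import Any, Dict, List


def validate_scene_graph(graph: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the ids are internally consistent."""
    problems: List[str] = []
    ids = [obj["id"] for obj in graph.get("objects", [])]
    if len(ids) != len(set(ids)):
        problems.append("duplicate object id")
    known = set(ids)
    for rel in graph.get("relationships", []):
        for role in ("subject", "object"):
            if rel[role] not in known:
                problems.append(f"{role} {rel[role]} is not a known object id")
    return problems


example = {
    "objects": [
        {"id": 0, "category": "table", "attributes": {"color": "brown"}},
        {"id": 1, "category": "cup", "attributes": {"color": "white"}},
    ],
    "relationships": [{"subject": 1, "predicate": "on", "object": 0}],
    "attributes": {"lighting": "indoor"},
}

payload = json.dumps(example)        # JSON text as it might be published
round_tripped = json.loads(payload)  # what a subscriber would decode
```

Running `validate_scene_graph(round_tripped)` on this payload returns an empty list, while a relationship that points at a missing id is reported by role and id.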

Subscribe topics:

spikes/language/spatial_description:
  # Text description from the Language module (Rank 5-7)
  dtype: "string"
  max_length: 500

spikes/vision/object_embeddings:
  # Object embeddings from Vision
  dtype: "float32"
  shape: [batch_size, num_objects, 256]

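3.3 SPATIAL_INTEGRATION (Rank 14) - Integration Node

Brain region: Occipito-parietal junction (OPA, RSC), temporoparietal junction (TPJ)

Biological functions:
- Integrates What and Where information
- Builds a unified world model
- Performs the final stage of the egocentric-to-allocentric transformation
- Supports spatial reasoning and understanding of relationships between objects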

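Computational functions:

Function	Component	Input	Output
Information fusion	MultiModalSpatialFusion	What + Where	Fused representation
World model	WorldModelIntegrator	Fused representation (time series)	Unified world representation
Spatial reasoning	SpatialReasoningEngine	World model	Reasoning results
Perspective transformation	PerspectiveTransformer	World coordinates	Representation from a different viewpoint

Architecture: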

from collections import deque
from typing import Dict, Optional

import torch
import torch.nn as nn
from torch import Tensor

class SpatialIntegration(nn.Module):
    """What-Where integration."""

    def __init__(self, config):
        super().__init__()
        self.fusion = MultiModalSpatialFusion(config)
        self.world_model = WorldModelIntegrator(config)
        self.reasoning_engine = SpatialReasoningEngine(config)
        self.perspective_transform = PerspectiveTransformer(config)

        # Temporal buffer (frame history)
        self.temporal_buffer = deque(maxlen=30)  # 1 sec @ 30 FPS

    def forward(self, what: Tensor, where: Tensor,
                prev_world_model: Optional[Tensor] = None) -> Dict:
        # Information fusion
        fused = self.fusion(what, where)

        # Temporal integration: stack the buffered frames along a new leading axis
        self.temporal_buffer.append(fused)
        integrated_history = torch.stack(list(self.temporal_buffer))

        # World model update
        world_model = self.world_model(integrated_history, prev_world_model)

        # Reasoning
        reasoning_results = self.reasoning_engine(world_model)

        # Perspective transformation (multiple viewpoints)
        perspectives = self.perspective_transform(world_model)

        return {
            "world_model": world_model,
            "reasoning": reasoning_results,
            "perspectives": perspectives,
        }

Zenoh PubSub:

spikes/spatial/integration/world_model:
  topic_type: "world_model"
  payload:
    dtype: "float32"
    representation_type: "voxel_grid"
    shape: [batch_size, 256, 256, 256]
    contains: ["occupancy", "object_id", "confidence"]
    frequency: 10
    qos: "real_time"

spikes/spatial/integration/reasoning:
  topic_type: "spatial_reasoning"
  payload:
    format: "json"
    content:
      relationships: [{"obj1": int, "relation": str, "obj2": int, "confidence": float}, ...]
      containment: [{"container": int, "objects": [int, ...] }, ...]
      reachability: [{"object": int, "reach_score": float}, ...]
    frequency: 10
    qos: "best_effort"

spikes/spatial/integration/perspective:
  topic_type: "egocentric_view"
  payload:
    dtype: "float32"
    shape: [batch_size, 3, 480, 640]  # RGB rendered from agent POV
    camera_params: {fov: 60.0, near: 0.01, far: 100.0}
    frequency: 30
    qos: "real_time"
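The `camera_params` of the perspective topic are enough to define a pinhole projection. The sketch below assumes that `fov` is the horizontal field of view in degrees and that the camera frame has x to the right, y down, and z forward. Both are assumptions, and the function names are illustrative, not the `PerspectiveTransformer` API.

```python
import math
from typing import Optional, Sequence, Tuple


def focal_length_px(fov_deg: float, image_width: int) -> float:
    """Focal length in pixels for a pinhole camera with the given horizontal FOV."""
    return (image_width / 2.0) / math.tan(math.radians(fov_deg) / 2.0)


def project(point_cam: Sequence[float], fov_deg: float = 60.0,
            width: int = 640, height: int = 480,
            near: float = 0.01, far: float = 100.0) -> Optional[Tuple[float, float]]:
    """Project a camera-frame point to pixel coordinates (u, v).

    Returns None when the depth lies outside the [near, far] clipping range.
    """
    x, y, z = point_cam
    if not (near <= z <= far):
        return None
    f = focal_length_px(fov_deg, width)
    return (width / 2.0 + f * x / z, height / 2.0 + f * y / z)


center = project((0.0, 0.0, 1.0))          # a point on the optical axis
too_far = project((0.0, 0.0, 200.0))       # beyond the far plane
right_edge = project((math.tan(math.radians(30.0)), 0.0, 1.0))  # at half the FOV
```

A point on the optical axis maps to the image center, and a point at half the field of view maps to the right image border, which is a quick sanity check for whichever FOV convention is actually in use.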

Subscribe topics:

spikes/spatial/where/coordinates:
  # Coordinates from the Where node

spikes/spatial/what/voxel_grid:
  # 3D representation from the What node

spikes/vision/semantic_segmentation:
  # Semantic segmentation from Vision
  dtype: "int32"
  shape: [batch_size, 480, 640]

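3.4 SPATIAL_ATTENTION (Rank 15) - Attention Control Node

Brain region: Orbitofrontal cortex (OFC), supplementary eye field (SEF), superior temporal sulcus (STS)

Biological functions:
- Task-driven spatial attention
- Saliency detection
- Eye-movement planning
- Reward-based attention shifts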

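Computational functions:

Function	Component	Input	Output
Attention control	SpatialAttentionController	Task signal + spatial context	Attention weights
Focus selection	FocusSelector	Attention scores	Processing priority
Saliency detection	SaliencyDetector	Spatial information	Saliency map
Eye-movement planning	SaccadePlanner	Attention weights	Saccade target

Hyperparameters: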

spatial_attention:
  task_integration:
    # Fuses the task signal from PFC
    method: "multiplicative"  # or "additive"
    temperature: 1.0

  saliency_detection:
    # Bottom-up saliency
    method: "center_bias"  # or "entropy", "gradient"
    prior_weight: 0.3  # strength of the center bias

  saccade_planning:
    # Eye-movement planning
    delay: 200  # milliseconds
    velocity: 300  # degrees/second
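The `saccade_planning` values give a rough timing estimate for a planned eye movement. The sketch below assumes that the eye moves at constant velocity after the configured delay, which is a simplification and not necessarily how `SaccadePlanner` models it.

```python
from typing import Tuple


def saccade_timing_ms(amplitude_deg: float,
                      delay_ms: float = 200.0,
                      velocity_deg_per_s: float = 300.0) -> Tuple[float, float]:
    """Return (movement_ms, total_ms) for a saccade of the given amplitude.

    Assumes constant velocity; the sign of the amplitude only encodes direction.
    """
    movement_ms = abs(amplitude_deg) * 1000.0 / velocity_deg_per_s
    return movement_ms, delay_ms + movement_ms


movement_ms, total_ms = saccade_timing_ms(15.0)  # a 15 degree gaze shift
```

With the defaults above, a 15 degree shift takes 50 ms of movement and 250 ms in total, so the configured delay dominates for small saccades.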

Control flow:

from typing import Dict, Optional

import torch
import torch.nn as nn
from torch import Tensor

class SpatialAttentionController(nn.Module):
    """Spatial attention control."""

    def forward(self,
                task_signal: Tensor,  # task-driven signal from PFC
                spatial_context: Tensor,  # world model from the Integration node
                salience_map: Optional[Tensor] = None) -> Dict:

        # Bottom-up saliency
        if salience_map is None:
            salience_map = self.saliency_detector(spatial_context)

        # Task-driven attention
        task_attention = self.task_attention_layer(task_signal)

        # Combination (bottom-up ⊕ top-down)
        combined_attention = (
            self.bottom_up_weight * salience_map +
            self.top_down_weight * task_attention
        )

        # Normalization
        attention_weights = torch.softmax(combined_attention, dim=-1)

        # Focus selection
        focus = self.focus_selector(attention_weights)

        # Saccade planning
        saccade_target = self.saccade_planner(focus)

        return {
            "attention_weights": attention_weights,
            "focus": focus,
            "saccade_target": saccade_target,
            "salience_map": salience_map,
            "task_attention": task_attention,
        }
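The combination and normalization steps can be illustrated on a tiny map in plain Python. The weights of 0.5 are placeholder values (the specification does not give `bottom_up_weight` or `top_down_weight`), and this sketch normalizes over every position in the map, which is one possible reading of the normalization step; a softmax over the last axis alone would normalize each row of a 2-D map separately.

```python
import math
from typing import List

Grid = List[List[float]]


def combine_attention(salience: Grid, task_attention: Grid,
                      bottom_up_weight: float = 0.5,
                      top_down_weight: float = 0.5) -> Grid:
    """Weighted sum of two H x W maps followed by a softmax over all H*W positions."""
    width = len(salience[0])
    scores = [
        bottom_up_weight * s + top_down_weight * t
        for s_row, t_row in zip(salience, task_attention)
        for s, t in zip(s_row, t_row)
    ]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in scores]
    total = sum(exps)
    flat = [e / total for e in exps]
    # Restore the original H x W layout.
    return [flat[i:i + width] for i in range(0, len(flat), width)]


salience = [[0.0, 0.0], [0.0, 2.0]]
task = [[0.0, 0.0], [0.0, 2.0]]
weights = combine_attention(salience, task)
```

Because both maps agree on the bottom-right cell, that cell receives most of the attention mass, and the weights over the whole map sum to 1.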

Zenoh PubSub:

spikes/spatial/attention/weights:
  topic_type: "attention_map"
  payload:
    dtype: "float32"
    shape: [batch_size, 480, 640]
    value_range: [0.0, 1.0]
    frequency: 30
    qos: "real_time"

spikes/spatial/attention/saliency:
  topic_type: "saliency_map"
  payload:
    dtype: "float32"
    shape: [batch_size, 480, 640]
    frequency: 30
    qos: "real_time"

spikes/spatial/attention/saccade:
  topic_type: "motor_command"
  payload:
    dtype: "float32"
    shape: [batch_size, 2]  # (horizontal, vertical) in degrees
    velocity: float  # degrees/second
    frequency: 10  # Hz (irregular)
    qos: "real_time"

Subscribe topics:

spikes/pfc/spatial_task:
  # Task-driven signal from PFC
  dtype: "float32"
  shape: [batch_size, 64]  # task embedding

spikes/spatial/integration/world_model:
  # World model from the Integration node

4. Communication Protocol

4.1 Message Format (Zenoh)

Basic structure:

// spatial_message.proto
syntax = "proto3";

package evospikenet.spatial;

message SpatialMessageHeader {
    uint32 timestamp_ms = 1;
    uint32 sequence_num = 2;
    string source_node = 3;
    string target_nodes = 4;  // comma-separated
}

message CoordinateMessage {
    SpatialMessageHeader header = 1;
    repeated float x = 2;
    repeated float y = 3;
    repeated float z = 4;
    repeated float confidence = 5;
}

message SpatialAttentionMessage {
    SpatialMessageHeader header = 1;
    repeated float attention_weights = 2;
    repeated int32 focus_indices = 3;
}
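`CoordinateMessage` stores coordinates as parallel repeated fields, and `target_nodes` is a comma-separated string. The sketch below mirrors that layout with plain Python types instead of generated protobuf classes; the class, the helper functions, and the node names in the usage lines are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class CoordinateArrays:
    """Plain-Python mirror of the repeated fields in CoordinateMessage."""
    x: List[float] = field(default_factory=list)
    y: List[float] = field(default_factory=list)
    z: List[float] = field(default_factory=list)
    confidence: List[float] = field(default_factory=list)


def pack_coordinates(points: Sequence[Tuple[float, float, float]],
                     confidences: Sequence[float]) -> CoordinateArrays:
    """Split (x, y, z) tuples into parallel arrays; index i describes object i in every array."""
    if len(points) != len(confidences):
        raise ValueError("one confidence value is required per point")
    msg = CoordinateArrays()
    for (px, py, pz), conf in zip(points, confidences):
        msg.x.append(px)
        msg.y.append(py)
        msg.z.append(pz)
        msg.confidence.append(conf)
    return msg


def encode_targets(nodes: Sequence[str]) -> str:
    """Join node names for the comma-separated target_nodes field."""
    if any("," in name for name in nodes):
        raise ValueError("node names must not contain commas")
    return ",".join(nodes)


def decode_targets(target_nodes: str) -> List[str]:
    """Split target_nodes back into a list, ignoring empty entries."""
    return [name for name in target_nodes.split(",") if name]


packed = pack_coordinates([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)], [0.9, 0.8])
targets = encode_targets(["spatial_integration", "spatial_attention"])
```

The parallel-array layout only works if all four arrays keep the same length, which is why the sketch rejects mismatched inputs up front. Separately, a `uint32` millisecond field wraps after 2^32 ms, about 49.7 days, which is worth keeping in mind when comparing `timestamp_ms` values.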

4.2 Synchronization Mechanism

Based on PTP (Precision Time Protocol):

# evospikenet/spatial_processing/synchronization.py

class SpatialTimeSync:
    """Timestamp synchronization between spatial nodes."""

    def __init__(self):
        # PTPClient is assumed to be defined elsewhere in the project.
        self.ptp_client = PTPClient()
        self.clock_offset = 0.0
        self.drift_compensation = 0.0

    def sync_timestamp(self, local_ts: float) -> float:
        """Convert a local timestamp to synchronized global time."""
        return local_ts + self.clock_offset + self.drift_compensation * local_ts
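The expression in `sync_timestamp` is a linear clock model: global = (1 + drift) * local + offset. A standalone version with hypothetical values makes the arithmetic concrete; the offset, drift, and the choice of seconds as the unit are assumptions for illustration only.

```python
def sync_timestamp(local_ts: float, clock_offset: float, drift_compensation: float) -> float:
    """Same expression as SpatialTimeSync.sync_timestamp, written as a free function."""
    return local_ts + clock_offset + drift_compensation * local_ts


# Hypothetical values: the local clock is 2.5 ms behind and runs 10 ppm slow.
global_ts = sync_timestamp(local_ts=1000.0, clock_offset=0.0025, drift_compensation=10e-6)
drift_term = 10e-6 * 1000.0  # contribution of the drift term alone, in seconds
```

Because the drift term scales with `local_ts` itself, its size depends on the time origin: after 1000 s a 10 ppm drift contributes 10 ms, so `local_ts` presumably needs to be measured from the last synchronization point for the correction to stay small.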

4.3 Latency Requirements

Link	Component	Allowed latency	Priority
Vision → Where	Visual feature transfer	< 50 ms	HIGH
Where → Integration	Coordinate transfer	< 20 ms	CRITICAL
What → Integration	Generated scene transfer	< 100 ms	MEDIUM
Integration → Attention	World model	< 50 ms	HIGH
Attention → PFC	Attention signal	< 30 ms	CRITICAL
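The per-link budgets can be encoded as data and checked against measurements. This is a minimal sketch; the function names and the measured values in the usage lines are invented for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Link = Tuple[str, str]

# Budgets in milliseconds, copied from the latency table.
LATENCY_BUDGET_MS: Dict[Link, float] = {
    ("Vision", "Where"): 50.0,
    ("Where", "Integration"): 20.0,
    ("What", "Integration"): 100.0,
    ("Integration", "Attention"): 50.0,
    ("Attention", "PFC"): 30.0,
}


def violations(measured_ms: Dict[Link, float]) -> List[Link]:
    """Links whose measured latency is not strictly below the budget."""
    return [link for link, ms in measured_ms.items() if ms >= LATENCY_BUDGET_MS[link]]


def path_budget_ms(path: Sequence[str]) -> float:
    """Sum of the per-link budgets along a chain of nodes."""
    return sum(LATENCY_BUDGET_MS[(a, b)] for a, b in zip(path, path[1:]))


# Invented measurements: one link exceeds its budget.
measured = {("Vision", "Where"): 42.0, ("Where", "Integration"): 23.5}
late_links = violations(measured)
where_path_ms = path_budget_ms(["Vision", "Where", "Integration", "Attention", "PFC"])
```

Summing the budgets along Vision → Where → Integration → Attention → PFC gives an upper bound of 150 ms for that path, assuming each link just meets its individual limit.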

5. Test Strategy

5.1 Unit Tests

tests/unit/
├── test_spatial_where.py
│   ├── test_depth_estimation_accuracy()
│   ├── test_coordinate_transformation()
│   └── test_optical_flow_correctness()
├── test_spatial_what.py
│   ├── test_scene_graph_parsing()
│   ├── test_vae_generation_quality()
│   └── test_temporal_consistency()
├── test_spatial_integration.py
│   ├── test_what_where_fusion()
│   ├── test_world_model_updates()
│   └── test_spatial_reasoning()
└── test_spatial_attention.py
    ├── test_attention_control_response()
    ├── test_saliency_detection()
    └── test_saccade_planning()

5.2 Integration Tests

tests/integration/
├── test_spatial_pipeline.py
│   └── test_end_to_end_spatial_processing()
├── test_spatial_pfc_integration.py
│   └── test_task_driven_attention()
└── test_spatial_zenoh_communication.py
    └── test_all_topics_publish_subscribe()

5.3 Benchmarks

tests/performance/
├── test_spatial_latency.py
│   └── latency measurements across all nodes
├── test_spatial_throughput.py
│   └── FPS measurements at different resolutions
└── test_spatial_energy.py
    └── power consumption profiling (GPU/CPU)

6. Deployment

6.1 Node Initialization

# config/spatial_nodes.yaml

spatial_where:
  enabled: true
  rank: 12
  gpu_device: 0
  batch_size: 4
  model_checkpoint: "checkpoints/spatial_where_v1.pt"

spatial_what:
  enabled: true
  rank: 13
  gpu_device: 1
  batch_size: 2
  model_checkpoint: "checkpoints/spatial_what_v1.pt"

spatial_integration:
  enabled: true
  rank: 14
  gpu_device: 0  # can also run on CPU
  batch_size: 4

spatial_attention:
  enabled: true
  rank: 15
  gpu_device: -1  # CPU only
  batch_size: 8
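The `gpu_device` field uses -1 to mean CPU. One way a launcher might turn it into a device string is sketched below; `device_string` is a hypothetical helper, and the configuration is written as a dict so the example does not depend on a YAML parser.

```python
from typing import Any, Dict


def device_string(gpu_device: int) -> str:
    """Map the gpu_device field to a PyTorch-style device string (-1 means CPU)."""
    if gpu_device < 0:
        return "cpu"
    return f"cuda:{gpu_device}"


# Subset of config/spatial_nodes.yaml, transcribed by hand.
NODE_CONFIG: Dict[str, Dict[str, Any]] = {
    "spatial_where": {"enabled": True, "rank": 12, "gpu_device": 0},
    "spatial_what": {"enabled": True, "rank": 13, "gpu_device": 1},
    "spatial_integration": {"enabled": True, "rank": 14, "gpu_device": 0},
    "spatial_attention": {"enabled": True, "rank": 15, "gpu_device": -1},
}

# Device for each enabled node, keyed by rank.
devices_by_rank = {
    cfg["rank"]: device_string(cfg["gpu_device"])
    for cfg in NODE_CONFIG.values()
    if cfg.get("enabled", True)
}
```

The resulting mapping (12 and 14 on `cuda:0`, 13 on `cuda:1`, 15 on `cpu`) matches the `--device` arguments used in the launch script.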

6.2 Launch Script

#!/bin/bash
# scripts/launch_spatial_nodes.sh

# Start the Where node
python -m evospikenet.spatial_processing.where_node \
  --config config/spatial_nodes.yaml \
  --rank 12 \
  --device cuda:0 &

# Start the What node
python -m evospikenet.spatial_processing.what_node \
  --config config/spatial_nodes.yaml \
  --rank 13 \
  --device cuda:1 &

# Start the Integration node
python -m evospikenet.spatial_processing.integration_node \
  --config config/spatial_nodes.yaml \
  --rank 14 \
  --device cuda:0 &

# Start the Attention node
python -m evospikenet.spatial_processing.attention_node \
  --config config/spatial_nodes.yaml \
  --rank 15 \
  --device cpu &

wait

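7. Related Documents

Remaining_Functionality.md - Detailed implementation plan for Feature 13
DISTRIBUTED_BRAIN_SYSTEM.md - Overall distributed brain specification
ZIGZAG_WHERE_WHAT_INTEGRATION.md - Details of Where-What integration

Next Steps:
- [x] Add node definitions to evospikenet/node_types.py
- [x] Create the Protocol Buffer schema file
- [x] Implement the Zenoh PubSub topic management system
- [x] Complete the Phase 13.1 implementation (2026 Q2)

[!TIP] All major spatial processing features are implemented and available through the SDK.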