
DB-integrated federated learning: aggregation by LLM type

[!NOTE] For the latest implementation status, please refer to Functional Implementation Status (Remaining Functionality).

Implementation notes (artifacts): See docs/implementation/ARTIFACT_MANIFESTS.md for the artifact_manifest.json output by the training script and recommended CLI flags.

Overview

EvoSpikeNet's federated learning automatically selects the best aggregation method based on the type of LLM model stored in the database.

Why do we need aggregation by type?

Problem

Traditional FedAvg (Federated Averaging) treats all models equally. However, in EvoSpikeNet's distributed brain system:

  1. Different LLM types coexist: Text, MultiModal, Vision, Audio
  2. Different evaluation criteria: perplexity for text, object detection accuracy for vision
  3. Differences in characteristics between modalities: WER for audio, modality balance for multimodal

Solution

By applying optimized aggregation logic by LLM type:

  • ✅ Weighting uses evaluation metrics suited to each type
  • ✅ Type information is retrieved automatically from the DB
  • ✅ High-quality models are prioritized during aggregation

LLM type and DB integration

Database schema

class DataArtifact(Base):
    artifact_id = Column(UUID, primary_key=True)
    session_id = Column(UUID, ForeignKey("execution_sessions.session_id"))
    artifact_type = Column(String)  # 'model', 'log', etc.
    llm_type = Column(String)       # ★This is important!
    name = Column(String)
    data = Column(LargeBinary)
    file_path = Column(String)
    created_at = Column(DateTime)
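
The schema above can be queried directly to pick the newest model of a given type. The following self-contained sketch mirrors the schema with SQLite and simplified column types (String instead of UUID/LargeBinary) purely for illustration; `latest_model_artifact` is a hypothetical helper, not part of the shipped EvoSpikeNet API:

```python
# Illustrative, self-contained demo: query the newest 'model' artifact by llm_type.
import datetime
from sqlalchemy import Column, DateTime, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class DataArtifact(Base):
    __tablename__ = "data_artifacts"
    artifact_id = Column(String, primary_key=True)
    artifact_type = Column(String)   # 'model', 'log', etc.
    llm_type = Column(String)        # drives type-based aggregation
    name = Column(String)
    created_at = Column(DateTime)

def latest_model_artifact(session, llm_type):
    """Return the newest 'model' artifact for the given LLM type, or None."""
    stmt = (
        select(DataArtifact)
        .where(DataArtifact.artifact_type == "model",
               DataArtifact.llm_type == llm_type)
        .order_by(DataArtifact.created_at.desc())
        .limit(1)
    )
    return session.execute(stmt).scalar_one_or_none()

engine = create_engine("sqlite://")  # in-memory DB for the demo
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all([
        DataArtifact(artifact_id="a1", artifact_type="model",
                     llm_type="SpikingEvoTextLM", name="old",
                     created_at=datetime.datetime(2024, 1, 1)),
        DataArtifact(artifact_id="a2", artifact_type="model",
                     llm_type="SpikingEvoTextLM", name="new",
                     created_at=datetime.datetime(2024, 6, 1)),
    ])
    session.commit()
    newest = latest_model_artifact(session, "SpikingEvoTextLM")
```

In a real deployment the query would run against the PostgreSQL database referenced by DATABASE_URL rather than SQLite.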

LLM type values

  • SpikingEvoTextLM - Text-only LLM
  • SpikingEvoMultiModalLM - Multimodal LLM
  • SpikingEvoVisionEncoder - Visual encoder
  • SpikingEvoAudioEncoder - Audio encoder
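
Because these strings are compared verbatim during aggregation, keeping them in one place avoids typos. One possible way to do that (illustrative only; the `LLMType` enum is not part of the shipped API) is:

```python
from enum import Enum

class LLMType(str, Enum):
    """The llm_type string values stored in DataArtifact.llm_type."""
    TEXT = "SpikingEvoTextLM"
    MULTIMODAL = "SpikingEvoMultiModalLM"
    VISION = "SpikingEvoVisionEncoder"
    AUDIO = "SpikingEvoAudioEncoder"
```

A `str`-based enum compares equal to the raw strings, so it can be dropped into existing metric dictionaries without changes.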

Aggregation logic by type

1. SpikingEvoTextLM (Text LLM)

# Evaluation metrics
- perplexity: lower is better
- loss: lower is better

# Weight calculation
perplexity_score = 1.0 / (1.0 + log(perplexity))
loss_score = 1.0 / (1.0 + loss)
quality_score = (perplexity_score + loss_score) / 2

# Aggregation
adjusted_weight = num_examples × quality_score

Features: Prefers models with low perplexity and loss

2. SpikingEvoMultiModalLM (Multimodal LLM)

# Evaluation metrics
- text_accuracy: text accuracy
- image_accuracy: image accuracy
- audio_accuracy: audio accuracy

# Modality balance
modality_balance = min(text_acc, image_acc, audio_acc)
avg_accuracy = (text_acc + image_acc + audio_acc) / 3

# Weighting (harmonic mean emphasizes balance)
quality_score = 2 × modality_balance × avg_accuracy /
                (modality_balance + avg_accuracy)

# Aggregation
adjusted_weight = num_examples × quality_score

Features: Prioritizes models that perform equally well across all modalities

3. SpikingEvoVisionEncoder (visual encoder)

# Evaluation metrics
- object_detection_accuracy: object detection accuracy
- edge_detection_accuracy: edge detection accuracy
- image_classification_accuracy: image classification accuracy

# Weighting
vision_score = 0.4 × object_detection_acc +
               0.3 × edge_detection_acc +
               0.3 × image_classification_acc

# Aggregation
adjusted_weight = num_examples × (1.0 + vision_score)

Features: Prioritizes object detection, also considers edge detection and classification

4. SpikingEvoAudioEncoder (audio encoder)

# Evaluation metrics
- speech_recognition_accuracy: speech recognition accuracy
- word_error_rate (WER): lower is better
- noise_robustness: robustness to noise

# Weighting
wer_score = 1.0 - min(wer, 1.0)
audio_score = 0.5 × speech_recognition_acc +
              0.3 × wer_score +
              0.2 × noise_robustness

# Aggregation
adjusted_weight = num_examples × (1.0 + audio_score)

Features: Focuses on speech recognition accuracy and WER, also considers noise robustness
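
The four weighting rules above can be collected into a single dispatch function. The sketch below is a faithful transcription of the formulas in this section, but the function name and the plain-dict interface are illustrative assumptions, not the actual EvoSpikeNet API:

```python
import math

def compute_adjusted_weight(llm_type: str, num_examples: int, metrics: dict) -> float:
    """Per-type client weight, following the formulas documented above."""
    if llm_type == "SpikingEvoTextLM":
        perplexity_score = 1.0 / (1.0 + math.log(metrics["perplexity"]))
        loss_score = 1.0 / (1.0 + metrics["loss"])
        quality = (perplexity_score + loss_score) / 2
        return num_examples * quality
    if llm_type == "SpikingEvoMultiModalLM":
        accs = [metrics["text_accuracy"], metrics["image_accuracy"],
                metrics["audio_accuracy"]]
        balance = min(accs)          # weakest modality dominates
        avg = sum(accs) / 3
        quality = 2 * balance * avg / (balance + avg)  # harmonic mean
        return num_examples * quality
    if llm_type == "SpikingEvoVisionEncoder":
        score = (0.4 * metrics["object_detection_accuracy"]
                 + 0.3 * metrics["edge_detection_accuracy"]
                 + 0.3 * metrics["image_classification_accuracy"])
        return num_examples * (1.0 + score)
    if llm_type == "SpikingEvoAudioEncoder":
        wer_score = 1.0 - min(metrics["word_error_rate"], 1.0)
        score = (0.5 * metrics["speech_recognition_accuracy"]
                 + 0.3 * wer_score
                 + 0.2 * metrics["noise_robustness"])
        return num_examples * (1.0 + score)
    # Unknown type: fall back to plain FedAvg weighting by sample count
    return float(num_examples)
```

Note the fallback branch: a client that reports an unknown (or missing) llm_type is weighted by sample count alone, which matches standard FedAvg behavior.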

How to use

1. Starting the DB integrated server

# Setting environment variables
export DATABASE_URL="postgresql://user:password@localhost/evospikenet"

# Server startup (automatic LLM type detection)
python examples/run_fl_server_with_db.py \
    --strategy hybrid \
    --num-rounds 10 \
    --min-clients 3 \
    --use-type-based-aggregation

# Aggregate only specific LLM types
python examples/run_fl_server_with_db.py \
    --strategy hybrid \
    --target-llm-type SpikingEvoTextLM \
    --use-type-based-aggregation

2. Client-side implementation

import flwr as fl
try:
    from evospikenet.federated import EvoSpikeNetClient
except Exception:
    EvoSpikeNetClient = None

class MyClient(fl.client.NumPyClient):
    def fit(self, parameters, config):
        # training process
        # ...

        # Include the LLM type in metrics (★ important)
        metrics = {
            'llm_type': 'SpikingEvoTextLM',  # ★Specify type here
            'loss': 0.05,
            'perplexity': 12.5,
            'accuracy': 0.94
        }

        # num_examples is client implementation dependent
        num_examples = 100
        return parameters, num_examples, metrics

# Client startup (guarded)
fl.client.start_numpy_client(
    server_address="localhost:8080",
    client=MyClient()
)

3. Programmatic use

try:
    from evospikenet.federated_strategy import create_db_integrated_strategy
except Exception:
    create_db_integrated_strategy = None

if create_db_integrated_strategy is not None:
    strategy = create_db_integrated_strategy(
        db_session_factory=SessionLocal,
        strategy_type="hybrid",
        target_llm_type="SpikingEvoMultiModalLM",
        use_type_based_aggregation=True,
        min_fit_clients=3
    )

    # Start Flower server
    fl.server.start_server(
        server_address="0.0.0.0:8080",
        config=fl.server.ServerConfig(num_rounds=10),
        strategy=strategy
    )
else:
    print("create_db_integrated_strategy not available in this environment; ensure evospikenet.federated_strategy is installed.")

Getting models from the DB

Get the latest model

# Example: fetch latest models from DB (guarded)
try:
    from evospikenet.db import get_latest_models_by_type
except Exception:
    get_latest_models_by_type = None

if get_latest_models_by_type is not None:
    models = get_latest_models_by_type(llm_type='SpikingEvoTextLM')
    for model_info in models:
        print(f"Model: {model_info['artifact_id']}")
        print(f"  Type: {model_info['llm_type']}")
        print(f"  Created: {model_info['created_at']}")
        print(f"  Session: {model_info['session_id']}")
else:
    print("DB helper get_latest_models_by_type not available; inspect evospikenet.db for DB utilities.")

Loading the model

try:
    from evospikenet.federated_strategy import load_model_from_artifact
except Exception:
    load_model_from_artifact = None

if load_model_from_artifact is not None:
    artifact_id = "some-artifact-uuid"
    model = load_model_from_artifact(artifact_id)
    if model:
        print("Model loaded successfully:", getattr(model, 'artifact_id', None))
    else:
        print("Model not found or failed to load")
else:
    print("load_model_from_artifact not available; check evospikenet.federated_strategy.")

Case 2: Automatic detection of multimodal LLM

# Server startup (LLM type automatic detection)
python examples/run_fl_server_with_db.py \
    --strategy hybrid \
    --num-rounds 10

# Output example
[INFO] Round 1: Target LLM type = SpikingEvoMultiModalLM (auto-detected)
[DEBUG] MultiModal LLM client-1: balance=0.82, quality=0.875
[DEBUG] MultiModal LLM client-2: balance=0.91, quality=0.923
[INFO] Selected aggregator: client-2 (hybrid score: 0.945)

Case 3: Visual encoder aggregation

# server start
python examples/run_fl_server_with_db.py \
    --target-llm-type SpikingEvoVisionEncoder \
    --strategy performance_based \
    --num-rounds 8

# Output example
[INFO] Round 1: Target LLM type = SpikingEvoVisionEncoder
[DEBUG] Vision Encoder client-1: vision_score=0.789
[DEBUG] Vision Encoder client-2: vision_score=0.845
[INFO] Selected aggregator: client-2 (performance score: 0.912)

Disable aggregation by type

If you don't want to use aggregation by type and want to use standard FedAvg:

python examples/run_fl_server_with_db.py \
    --no-type-based-aggregation \
    --strategy pfc_centric

Best practices

1. Client side

# ✅ Recommended: Always include llm_type in metrics
metrics = {
    'llm_type': 'SpikingEvoTextLM',
    'loss': 0.05,
    'perplexity': 12.5,
    # Type-specific metrics
}

# ❌ Not recommended: auto-detection is not possible without llm_type
metrics = {
    'loss': 0.05,
    'accuracy': 0.94
}

2. Server side

# ✅ Recommended: Specify target_llm_type explicitly
strategy = create_db_integrated_strategy(
    db_session_factory=SessionLocal,
    target_llm_type='SpikingEvoTextLM',  # explicitly specified
    use_type_based_aggregation=True
)

# ⚠️ Note: Automatic detection is useful in mixed cases, but less predictable
strategy = create_db_integrated_strategy(
    db_session_factory=SessionLocal,
    target_llm_type=None,  # automatic detection
    use_type_based_aggregation=True
)

3. Metric naming conventions

| LLM Type | Required Metrics | Optional Metrics |
| --- | --- | --- |
| SpikingEvoTextLM | perplexity, loss | bleu_score, rouge_score |
| SpikingEvoMultiModalLM | text_accuracy, image_accuracy, audio_accuracy | cross_modal_coherence |
| SpikingEvoVisionEncoder | object_detection_accuracy | edge_detection_accuracy, accuracy |
| SpikingEvoAudioEncoder | speech_recognition_accuracy, word_error_rate | noise_robustness |
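
A small client-side check against this table can catch missing metrics before they reach the server. The required-key sets come directly from the table; the `missing_metrics` helper itself is an illustrative suggestion, not a shipped function:

```python
# Required metric keys per LLM type, taken from the naming conventions table.
REQUIRED_METRICS = {
    "SpikingEvoTextLM": {"perplexity", "loss"},
    "SpikingEvoMultiModalLM": {"text_accuracy", "image_accuracy", "audio_accuracy"},
    "SpikingEvoVisionEncoder": {"object_detection_accuracy"},
    "SpikingEvoAudioEncoder": {"speech_recognition_accuracy", "word_error_rate"},
}

def missing_metrics(metrics: dict) -> set:
    """Return the required metric keys absent from a client's metrics dict."""
    llm_type = metrics.get("llm_type")
    required = REQUIRED_METRICS.get(llm_type, set())
    return required - metrics.keys()
```

Calling this just before returning from `fit()` (and logging a warning when the result is non-empty) makes the "Metrics are not reflected" problem in the troubleshooting section much easier to diagnose.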

Troubleshooting

Q1: "No clients with target LLM type" error

Cause: No client of the specified LLM type is connected

Solution:

# Enable automatic detection by setting target_llm_type to None
python examples/run_fl_server_with_db.py --target-llm-type None

Q2: Metrics are not reflected

Cause: Metrics sent by the client lack type-specific information

Solution:

# Add type-specific metrics on the client side
metrics = {
    'llm_type': 'SpikingEvoTextLM',
    'perplexity': calculate_perplexity(),  # ★Add
    'loss': current_loss
}

Q3: DATABASE_URL error

Cause: The DATABASE_URL environment variable is not set

Solution:

export DATABASE_URL="postgresql://user:pass@localhost/evospikenet"
python examples/run_fl_server_with_db.py

Summary

DB-integrated federated learning provides:

  1. Automatic LLM type detection: type information is read from the DB
  2. Per-type optimization: aggregation logic suited to each LLM
  3. Preference for high-quality models: metric-based weighting
  4. Flexible configuration: both automatic detection and explicit specification are supported