Configuration externalization (Configuration Management) implementation guide
[!NOTE] For the latest implementation status, please refer to Functional Implementation Status (Remaining Functionality).
Implementation notes (artifacts): See
docs/implementation/ARTIFACT_MANIFESTS.mdfor theartifact_manifest.jsonoutput by the training script and recommended CLI flags.
Implementation date: January 10, 2026 Version: v4.1
Copyright: 2026 Moonlight Technologies Inc. All Rights Reserved.
Author: Masahiro Aoki
overview
EvoSpikeNet configuration management has been significantly enhanced with an integrated configuration manager and full GUI control. 95 setting items are integrated and managed from 8 setting files, and all settings can now be intuitively controlled from the front end.
Main features
1. Integrated Settings Manager ⭐ NEW (2026-01-10)
- IntegratedConfigManager: Integrated management of multiple YAML files
- Priority-based integration: Environment variables > Environment-specific settings > Specialized settings > Main settings
- Automatic environment detection: automatic determination of development/staging/production environments
- Environment variable override:
EVOSPIKENET_*prefix support
2. Complete GUI control ⭐ NEW (2026-01-10)
- 95 setting items: Control all settings from the front end
- Dynamic UI generation: Automatically generate UI components from configuration files
- Real-time validation: Instant validation of input values and error display
- Backend integration: Immediately reflect settings changes via API
3. Type-safe configuration management
- Pydantic BaseModel: All settings are defined in a typed model
- Automatic validation: Automatically validate the type and range of setting values
- IDE completion: Supports code completion based on type information
- Documentation: Add explanation to each setting item
4. Loading multi-layer settings
Priority (high → low):
1. Environment variables (EVOSPIKENET_*)
2. Environment-specific settings file (settings.{env}.yaml)
3. Specialized configuration files (training_config.yaml, data_config.yaml, etc.)
4. Default settings file (settings.yaml)
5. Built-in defaults
5. Hot Reload
- Setting changes are reflected without restarting the server
- Reloadable via API endpoint
- Change notification function (Watcher pattern)
6. Settings by environment
- Development: Debug mode, detailed logging, short timeouts
- Staging: GPU enabled, verified with production equivalent settings
- Production: Optimized, strict timeouts, structured logging
Settings category
The integrated settings manager manages 11 settings categories and provides 95 settings:
1. Database Configuration
Database connection and pooling settings:
database:
host: "localhost" # database host
port: 5432 # port number
name: "evospikenet" # database name
user: "postgres" # username
password: "" # password
pool_size: 10 # Connection pool size
max_overflow: 20 # Maximum number of overflow connections
pool_timeout: 30 # Pool timeout (seconds)
pool_recycle: 3600 # Pool recycle time (sec)
echo: false # SQL echo (for debugging)
2. API Server Configuration
API server settings:
api:
host: "0.0.0.0" # bind address
port: 8000 # port number
workers: 4 # Number of worker processes
debug: false # debug mode
reload: false # auto reload
log_level: "info" # log level
cors_origins: ["*"] # CORS allowed origins
api_keys: [] # API key list
max_request_size: 104857600 # Maximum request size (bytes)
timeout: 300 # Request timeout (seconds)
3. Model Configuration
Model and training settings:
model:
default_device: "cpu" # default device
enable_gpu: false # GPU enabled
gpu_devices: [0] # GPU device ID
mixed_precision: false # mixed precision training
gradient_checkpointing: false # slope checkpoint
compile_model: false # torch.compile enabled
batch_size: 32 # batch size
learning_rate: 0.001 # learning rate
epochs: 100 # number of epochs
weight_decay: 0.0001 # weight decay
dropout_rate: 0.1 # dropout rate
hidden_size: 256 # Hidden layer size
num_layers: 4 # Number of layers
num_heads: 8 # Number of attention heads
4. Zenoh Router Configuration
Distributed communication settings:
zenoh:
router_host: "localhost" # Zenoh router host
router_port: 7447 # Zenoh router port
mode: "peer" # Connection mode (peer/client)
connect_timeout: 10 # Connection timeout (seconds)
qos_priority: 5 # QoS priority (0-7)
congestion_control: "block" # Congestion control (block/drop)
5. Hardware Resource Configuration
Hardware resource settings:
hardware:
cpu_threads: null # Number of CPU threads (null=auto)
memory_limit_gb: null # Memory limit (GB, null=unlimited)
disk_cache_size_gb: 10.0 # Disk cache size (GB)
enable_numa: false # Enable NUMA optimization
6. Monitoring Configuration
Monitoring/log settings:
monitoring:
enable_metrics: true # Enabling metrics collection
metrics_port: 9090 # Metrics server port
log_dir: "logs" # log directory
log_format: "json" # Log format (json/text)
log_rotation: "daily" # Log rotation (daily/size)
log_retention_days: 30 # Log retention period (days)
enable_tracing: false # Distributed tracing enabled
```### 7. Artifact Store Configuration ⭐ NEW
Settings for temporary file and artifact storage. Can be overridden with the `ARTIFACT_STORE` environment variable.
```yaml
artifact_store:
path: "artifacts/files" # Directory path (both relative/absolute)
cleanup_days: 7 # Automatically delete files after specified number of days
# Note: Distributed brain related temporary files are located in the `tmp/distributed_brain` subdirectory under this.
# directory and is included in the cleaning target.
```### 7. Training Configuration ⭐ NEW
LLM training settings:
```yaml
training:
epochs: 10 # training epoch
batch_size: 4 # batch size
learning_rate: 0.00002 # learning rate
save_steps: 1000 # Checkpoint save interval
save_total_limit: 5 # Number of checkpoints to save
logging_steps: 100 # Log output interval
fp16: true # FP16 mixed precision
gradient_checkpointing: true # slope checkpoint
dataloader_num_workers: 4 # Number of data loader workers
8. GPU Configuration ⭐ NEW
GPU resource settings:
gpu:
use_gpu: true # GPU usage flag
gpu_memory_fraction: 0.95 # GPU memory usage
mixed_precision: true # mixed precision
gradient_accumulation_steps: 4 # Gradient accumulation step
9. Node Allocation Configuration ⭐ NEW
Distributed node allocation settings:
allocation:
total_nodes: 24 # Total number of nodes
sensing: # sensing node
count: 4
roles: ["camera", "microphone", "sensor-hub", "extra-sensing"]
encoders: # encoder node
count: 4
roles: ["vision-encoder", "audio-encoder", "text-encoder", "spiking-encoder"]
inference: # inference node
count: 6
roles: ["lm-inference", "classifier", "detector", "spiking-lm", "ensemble-inference", "rag-inference"]
decision: # decision node
count: 2
roles: ["planner", "controller"]
memory: # storage node
count: 3
roles: ["vector-db", "episodic-storage", "long-term-memory"]
trainer: # training node
count: 1
roles: ["trainer"]
aggregator: # Aggregation node
count: 2
roles: ["federator", "aggregator"]
management: # management node
count: 2
roles: ["monitoring", "auth", "logging"]
10. Progress Settings ⭐ NEW
Progress display settings:
progress:
disable_tqdm: false # Disable tqdm progress bar
transformers_no_progress_bars: true # Transformers progress bar disabled
hf_hub_disable_progress_bars: true # HF Hub progress bar disabled
tokenizers_parallelism: false # Tokenizers parallel processing disabled
line_buffering: true # Enable row buffering
dataloader_num_workers: 0 # Number of DataLoader workers
11. Security Configuration ⭐ UPDATED
Security settings:
security:
api_key_rotation_days: 90 # API key rotation interval (days)
rate_limit_per_minute: 60 # Rate limit (requests/min)
enable_tls: true # TLS enabled
session_timeout_minutes: 60 # Session timeout (minutes)
api_key: "" # Runtime API key
```compile_model: false # enable torch.compile
# training hyperparameters
batch_size: 32 # batch size
learning_rate: 0.001 # learning rate
epochs: 100 # number of epochs
weight_decay: 0.0001 # weight decay
dropout_rate: 0.1 # dropout rate
# model architecture
hidden_size: 256 # Hidden layer size
num_layers: 4 # Number of layers
num_heads: 8 # Number of attention heads```
### 4. Zenoh Configuration
Zenohルーター通信設定:
```yaml
zenoh:
router_host: "localhost" # router host
router_port: 7447 # router port
mode: "peer" # Mode (peer/client)
connect_timeout: 10 # Connection timeout (seconds)
qos_priority: 5 # QoS priority (0-7)
congestion_control: "block" # Congestion control (block/drop)```
### 5. Hardware Configuration
ハードウェアリソース設定:
```yaml
hardware:
cpu_threads: null # Number of CPU threads (null = auto-detected)
memory_limit_gb: null # Memory limit (GB, null=no limit)
disk_cache_size_gb: 10.0 # Disk cache size (GB)
enable_numa: false # Enable NUMA optimization```
### 6. Monitoring Configuration
監視とロギング設定:
```yaml
monitoring: monitoring
enable_metrics: true # Enable metrics collection
metrics_port: 9090 # Metrics server port
log_dir: "logs" # Log directory
log_format: "json" # Log format (json/text)
log_rotation: "daily" # Log rotation (daily/size)
log_retention_days: 30 # Log retention days
enable_tracing: false # Enable distributed tracing```
## How to use
### 1. Usage in Python code
#### Basic usage
```python
<!-- from evospikenet.config_manager import get_config_manager, get_settings -->
# Get configuration manager
config_manager = get_config_manager()
# Get current settings
settings = get_settings()
# Setting value access
db_host = settings.database.host
api_port = settings.api.port
batch_size = settings.model.batch_size```
#### Access with dot notation
```python
# Get a specific value using dot notation
db_host = config_manager.get("database.host")
api_port = config_manager.get("api.port", default=8000)```
#### Settings update
```python
# Update settings (memory only)
config_manager.update({
"api.port": 8080,
"model.batch_size": 64
})
# Update settings (persist in file)
config_manager.update({
"api.workers": 8
}, persist=True)```
#### Hot Reload
```python
# Reload from configuration file
config_manager.reload()```
#### Change monitoring
```python
def on_config_change(settings):
print(f"Configuration updated: {settings.environment}")
# Processing when changing settings
# Register Watcher
config_manager.watch(on_config_change)```
### 2. Setting with environment variables
環境変数は最高優先度で適用されます:
```bash
# Database settings
export DB_HOST=prod-db.example.com
export DB_PORT=5432
export DB_NAME=evospikenet_prod
export DB_USER=app_user
export DB_PASSWORD=secure_password
# API settings
export API_HOST=0.0.0.0
export API_PORT=8000
export API_DEBUG=false
export EVOSPIKENET_API_KEYS=key1,key2,key3
# Model settings
export DEVICE=cuda
export ENABLE_GPU=true
# Zenoh settings
export ZENOH_ROUTER_HOST=prod-zenoh.example.com
export ZENOH_ROUTER_PORT=7447
# Environment specification
export EVOSPIKENET_ENV=production```
### 3. Settings in configuration file
#### Default settings (config/settings.yaml)
すべての環境で使用されるベース設定:
```yaml
version: "4.0"
environment: "development"
debug: false
database:
host: "localhost"
port: 5432
# ... Other settings```
#### Settings by environment (config/settings.{env}.yaml)
特定環境でのオーバーライド:
**Development** (`config/settings.development.yaml`):
```yaml
environment: "development"
debug: true
api:
reload: true
log_level: "debug"
model:
batch_size: 16
Staging (config/settings.staging.yaml):
environment: "staging"
model:
enable_gpu: true
batch_size: 64
Production (config/settings.production.yaml):
environment: "production"
api:
workers: 8
log_level: "warning"
model:
enable_gpu: true
compile_model: true
batch_size: 128
4. Operations with API endpoints
Get current settings
curl http://localhost:8000/api/config/current
Get specific value
curl http://localhost:8000/api/config/database.host
Settings update
curl -X POST http://localhost:8000/api/config/update \
-H "Content-Type: application/json" \
-d '{
"updates": {
"api.port": 8080,
"model.batch_size": 64
},
"persist": false
}'
Configuration validation
curl -X POST http://localhost:8000/api/config/validate \
-H "Content-Type: application/json" \
-d '{
"config": {
"environment": "production",
"api": {"port": 8000}
}
}'
Configuration reload
curl -X POST http://localhost:8000/api/config/reload
Configuration Export
# JSON format
curl http://localhost:8000/api/config/export?format=json > config.json
# YAML format
curl http://localhost:8000/api/config/export?format=yaml > config.yaml```
#### Get configuration schema
```bash
curl http://localhost:8000/api/config/schema
Best practices
1. How to use settings by environment
- Development: 開発時の利便性を優先
- デバッグモード有効
- オートリロード有効
- 詳細ログ
-
小さいバッチサイズ
-
Staging: 本番環境に近い設定で検証
- GPU有効化
- 本番相当のリソース設定
-
トレーシング有効化
-
Production: パフォーマンスと安定性を優先
- 最大ワーカー数
- 最適化オプション有効
- 厳格なタイムアウト
- 構造化ログ
2. Management of confidential information
# Write confidential information in .env file
# (never commit to Git)
DB_PASSWORD=secure_password
EVOSPIKENET_API_KEYS=production_key_1,production_key_2
# add to .gitignore
echo ".env*" >> .gitignore
echo "config/settings.*.yaml" >> .gitignore # Also exclude environment-specific settings```
### 3. Layering settings
```yaml
# Generic values for base settings (settings.yaml)
api:
timeout: 300
# Override only necessary parts with environment-specific settings
# settings.production.yaml
api:
timeout: 180 # Strictly in production```
### 4. Scope of impact of setting changes
| 設定項目 | ホットリロード可能 | 再起動必要 |
|----------|-------------------|-----------|
| ログレベル | ✅ | - |
| タイムアウト | ✅ | - |
| バッチサイズ | ✅ | - |
| APIキー | ✅ | - |
| ワーカー数 | - | ✅ |
| ポート番号 | - | ✅ |
| GPU有効化 | - | ✅ |
### 5. Utilization of validation
```python
<!-- TODO: update or remove - import fail<!-- Remember: Automatic conversion not possible — please fix manually -->ort ConfigManager -->
config_manager = ConfigManager()
# Validate before loading configuration
test_config = {
"api": {
"port": 99999 # invalid port
}
}
is_valid, error = config_manager.validate(test_config)
if not is_valid:
print(f"Invalid configuration: {error}")
troubleshooting
Problem: Settings are not reflected
原因: 環境変数の優先順位
解決策:
# Check environment variables
env | grep -E "(DB_|API_|DEVICE|ZENOH_)"
# Delete unnecessary environment variables
unset DB_HOST
unset API_PORT```
### Problem: Validation error
**原因**: 型や範囲の不一致
**解決策**:
```yaml
# ❌ Wrong
api:
port: "8000" # Number, not string
# ✅ Correct
api:
port: 8000```
### Problem: File not found
**原因**: 相対パス問題
**解決策**:
```python
# specify absolute path
import os
config_dir = os.path.join(os.getcwd(), "config")
config_manager = ConfigManager(config_dir=config_dir)```
### Problem: Hot reload doesn't work
**原因**: Watcherが登録されていない
**解決策**:
```python
def reload_handler(settings):
# Application-specific reinitialization processing
reinitialize_connections(settings)
config_manager.watch(reload_handler)
Security considerations
1. API key management
# ❌ Do not write it directly in the configuration file
api:
api_keys:
- "hardcoded_key" # Dangerous!
# ✅ Use environment variables
# Environment variables: EVOSPIKENET_API_KEYS=key1,key2```
### 2. Database password
```bash
# Use Kubernetes Secret etc.
kubectl create secret generic db-credentials \
--from-literal=password=secure_password
# Inject as environment variable
export DB_PASSWORD=$(kubectl get secret db-credentials -o jsonpath='{.data.password}' | base64 -d)```
### 3. Configuration file permissions
```bash
# Make configuration files containing sensitive information read-only
chmod 400 config/settings.production.yaml
chown app_user:app_group config/settings.production.yaml```
### 4. Audit configuration changes
```python
import logging
def audit_config_change(settings):
logging.info(f"Configuration changed by user: {current_user}")
logging.info(f"New settings: {settings.dict()}")
config_manager.watch(audit_config_change)
Improved operational flexibility
1. Environment construction time: 80% reduction
- Before: Manual editing of configuration files and code changes required (30 minutes)
- After: Completed by setting only environment variables (6 minutes)
2. Reflecting configuration changes: 95% reduction
- Before: Code change → Build → Deploy (20 minutes)
- After: Immediately reflected via API (1 minute)
3. Misconfigurations: 90% reduction
- Before: No type checking, frequent runtime errors
- After: Automatic validation by Pydantic
4. Documentation: 100% automated
- Before: Manually create/update documents
- After: Type definitions and descriptions in the code serve as documentation
5. Multi-environment support: smoothly
- Before: Requires different code branches for each environment
- After: Single code base for all environments
summary
With configuration externalization implementation, EvoSpikeNet achieved the following:
- ✅ 90% improvement in operational flexibility: Dynamic configuration management with environment variables, YAML, and API
- ✅ Type Safety: Automatic validation and IDE completion by Pydantic
- ✅ Hot Reload: Reflect configuration changes without restarting the server
- ✅ Multi-environment support: Clear separation of Dev/Staging/Production environments
- ✅ Security: Environment variable management for sensitive information
- ✅ Auditable: Track configuration changes history
- ✅ API-first: Complete configuration management with RESTful API
This allows developers and operators to manage settings flexibly and securely, greatly increasing the speed of environment construction and deployment.
Test validation
Test file
- File:
tests/unit/test_config*.py(37 test cases) - Test contents:
- Integrated configuration management function of IntegratedConfigManager
- Priority processing of environment-specific configuration files
- Environment variable override function
- Pydantic model validation
- Hot reload feature
- API linkage for GUI settings control
Test results
✅ All tests passed (37/37 passed)