Edge implementation plan

Implementation status summary:
- Phase 1: Implemented (environment check, representative sample generation)
- Phase 2: Implemented (local edge server + client latency measurement)
- Phase 3: Partially implemented (TorchScript verification completed; TFLite/CoreML dependencies not installed, so those paths were not executed)
- Phase 4: Procedures written (on-device runbook generated; only the actual measurements remain)
- Phase 5: Implemented (automatic generation of the evaluation report)

Deliverables:
- Automatic execution pipeline: scripts/device/run_edge_phase_pipeline.py
- Implementation report: Docs/edge_phase1_5_implementation_report.md
- Latest local run results: bench_output/edge_phase_runs/20260409-130535/edge_phase1_5_report.md

Goal:
- Verify whether the EvoSpikeNet SDK can run in edge environments (Raspberry Pi / Android / iPhone) and then decide on the optimal deployment strategy (standalone SDK execution vs. model conversion).

Timeline (outline):

1) Preparation (0.5–1 day)
- Development PC: prepared virtual environment (Python 3.12)
- Required tools: onnx, onnx-tf, tensorflow, coremltools (needed for quantization verification)
- Prepare representative sample data (.npy files in rep_samples/)
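The representative-sample preparation can be sketched as below. The input shape `(1, 64)`, the sample count, and the random-noise inputs are illustrative assumptions; real calibration data should match the actual model's input signature and data distribution.

```python
# Sketch: populate rep_samples/ with .npy calibration inputs.
# Shape, count, and random data are placeholders, not the real pipeline.
from pathlib import Path

import numpy as np


def make_rep_samples(out_dir: str = "rep_samples", n: int = 200,
                     shape: tuple = (1, 64), seed: int = 0) -> int:
    """Write n float32 .npy files into out_dir and return the count."""
    rng = np.random.default_rng(seed)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i in range(n):
        sample = rng.standard_normal(shape).astype(np.float32)
        np.save(out / f"sample_{i:04d}.npy", sample)
    return n
```

In practice the loop body would draw from held-out task data rather than `standard_normal`, so the quantizer sees realistic activation ranges.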

2) Local PoC (1–2 days)
- Run the Edge SDK as a local server with scripts/device/edge_server.py
- Measure latency/throughput with mobile_client_sim.py
- Collect CPU/memory statistics with collect_bench_and_log.py
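The client-side delay measurement can be sketched as a small timing harness. This is not the actual mobile_client_sim.py implementation: `request_fn` stands in for one inference round-trip to the local edge server, and the warmup count and percentiles are assumed defaults.

```python
# Sketch: time a request callable, separating warmup (cold-start)
# from steady-state iterations, and report latency percentiles.
import statistics
import time


def measure_latency(request_fn, warmup: int = 5, iters: int = 100) -> dict:
    """Return latency statistics in milliseconds for request_fn."""
    for _ in range(warmup):              # cold-start calls, not recorded
        request_fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        request_fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
        "mean_ms": statistics.fmean(samples),
    }
```

Keeping warmup separate matches the checklist item below about measuring cold-start and steady-state independently.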

3) Conversion & quantization test (1–3 days)
- Test TorchScript/ONNX/TFLite/CoreML on smaller models
- Quantize using representative samples (android_convert_tflite.py --quantize --rep-dir rep_samples/)
- Automate post-conversion differential testing (inference outputs, accuracy, latency)
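For the quantization step, a representative-dataset generator in the shape `tf.lite.TFLiteConverter` expects might look as follows. The rep_samples/ layout and float32 dtype are assumptions, and the converter wiring is shown only as a comment since the TFLite dependencies are not installed in this environment.

```python
# Sketch: stream rep_samples/*.npy as TFLite calibration inputs.
# TFLiteConverter expects a callable yielding lists of input arrays.
from pathlib import Path

import numpy as np


def representative_dataset(rep_dir: str = "rep_samples"):
    """Yield one calibration input per .npy file in rep_dir."""
    for path in sorted(Path(rep_dir).glob("*.npy")):
        yield [np.load(path).astype(np.float32)]

# Hooking it up (requires tensorflow; shown for context only):
#   converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
#   converter.optimizations = [tf.lite.Optimize.DEFAULT]
#   converter.representative_dataset = representative_dataset
#   tflite_model = converter.convert()
```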

4) On-device verification (Raspberry Pi first, then mobile) (1–2 days/model)
- RPi: install dependencies, then bench with rpi_setup_and_bench.sh
- Android: embed the TFLite model and measure inference on device
- iPhone: integrate the CoreML model and measure energy with Instruments

5) Evaluation and decision (0.5 days)
- Key metrics: latency, throughput, average power consumption, implementation cost (CI/maintenance)
- Proposed thresholds:
  - If the SDK alone meets the target latency (e.g. <10 ms) and power budget, prioritize the SDK
  - If model conversion reduces the footprint and the accuracy degradation stays within a limited budget, use conversion
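The threshold proposal above can be expressed as a small decision rule. The 10 ms latency target comes from this plan; the 1% accuracy-drop budget is an illustrative assumption to be tuned per deployment target.

```python
# Sketch of the deployment decision rule. Thresholds are example values.
def choose_deployment(sdk_latency_ms: float,
                      converted_latency_ms: float,
                      accuracy_drop_pct: float,
                      latency_target_ms: float = 10.0,
                      max_accuracy_drop_pct: float = 1.0) -> str:
    """Return 'sdk', 'converted', or 're-evaluate'."""
    if sdk_latency_ms < latency_target_ms:
        return "sdk"            # SDK alone meets the target: prefer it
    if (converted_latency_ms < latency_target_ms
            and accuracy_drop_pct <= max_accuracy_drop_pct):
        return "converted"      # conversion clears latency, accuracy OK
    return "re-evaluate"        # neither path clears the bar
```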

Checklist (at runtime):
- [ ] Virtual environment and dependencies reproduce cleanly
- [ ] At least 200 representative samples (for quantization)
- [ ] Bench: measure cold-start and steady-state separately
- [ ] Power: measured simultaneously with an external power meter (or RAPL)
- [ ] After conversion: compare output L2 difference / task-level accuracy
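The last checklist item can be sketched as a relative-L2 comparison between outputs of the original and converted models on the same inputs. The default tolerance is an assumption to tune per model and task.

```python
# Sketch: post-conversion differential check via relative L2 difference.
import numpy as np


def relative_l2_diff(ref: np.ndarray, test: np.ndarray) -> float:
    """||ref - test||_2 / ||ref||_2 (falls back to ||test||_2 if ref is zero)."""
    denom = np.linalg.norm(ref)
    if denom == 0.0:
        return float(np.linalg.norm(test))
    return float(np.linalg.norm(ref - test) / denom)


def outputs_match(ref, test, tol: float = 1e-2) -> bool:
    """True when the converted output is within tol of the reference."""
    return relative_l2_diff(np.asarray(ref), np.asarray(test)) <= tol
```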

Automation scripts (recommended):
- scripts/device/rpi_setup_and_bench.sh — dependency installation + bench
- scripts/device/android_convert_tflite.py — conversion and quantization
- scripts/device/collect_bench_and_log.py — sample acquisition
- scripts/device/plot_bench.py — plot/summary generation

Risks and mitigations:
- Conversion failure (custom SNN layers): switch to the SDK-at-the-edge approach, implementing custom operations as TFLite custom ops if needed
- Accuracy loss from quantization: consider quantization-aware training (QAT) as a light fine-tune before relying on post-training quantization alone
- Device-dependent performance differences: create per-device profiles (RPi/Android/iOS) and separate the deployment strategies accordingly

Next actions (suggested):
1. Prepare a representative sample directory (.npy files in rep_samples/)
2. Specify one small real model (e.g. models/tinynet.pt)
3. Run the conversion and quantization and report the results (CSV/JSON/plots)

Note: Physical hardware is required for on-device power measurement. If measurement data is provided, the power-consumption analysis can also be performed here.