Edge implementation plan
Implementation status summary:
- Phase 1: Implemented (environment check, representative sample generation)
- Phase 2: Implemented (local edge server + client latency measurement)
- Phase 3: Partially implemented (TorchScript verification completed; TFLite/CoreML dependencies not installed, so those paths were not executed)
- Phase 4: Procedures created (on-device runbook generated; only the actual measurements remain)
- Phase 5: Implemented (automatic generation of the evaluation report)
Deliverables:
- Automatic execution pipeline: scripts/device/run_edge_phase_pipeline.py
- Implementation report: Docs/edge_phase1_5_implementation_report.md
- Latest local run results: bench_output/edge_phase_runs/20260409-130535/edge_phase1_5_report.md
Goal:
- Verify whether the EvoSpikeNet SDK can operate in edge environments (Raspberry Pi / Android / iPhone) and decide on the optimal deployment strategy (standalone SDK execution vs. model conversion).
Timeline (outline):
1) Preparation (0.5–1 day)
- Development PC: prepare a virtual environment (Python 3.12)
- Required tools: onnx, onnx-tf, tensorflow, coremltools (required for quantization verification)
- Prepare representative sample data (.npy in rep_samples/)
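To make step 1 concrete, here is a minimal sketch of populating `rep_samples/` with `.npy` files. The `(1, 64)` input shape and the random data are assumptions for illustration only; real calibration samples must come from the model's actual input distribution.

```python
# Sketch: fill rep_samples/ with representative inputs for quantization.
# ASSUMPTION: the (1, 64) shape is illustrative -- match the real model input.
import os
import numpy as np

def make_rep_samples(out_dir="rep_samples", n=200, shape=(1, 64), seed=0):
    rng = np.random.default_rng(seed)
    os.makedirs(out_dir, exist_ok=True)
    for i in range(n):
        # In practice, draw from the real input distribution, not noise:
        # quantization calibration quality depends on it.
        sample = rng.standard_normal(shape).astype(np.float32)
        np.save(os.path.join(out_dir, f"sample_{i:04d}.npy"), sample)
    return n

make_rep_samples(n=5)  # small smoke run
```

Note the checklist below asks for more than 200 samples; the default `n=200` would need to be raised accordingly.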
2) Local PoC (1–2 days)
- Use scripts/device/edge_server.py to run Edge SDK as a local server
- Measure latency/throughput with mobile_client_sim.py
- Get CPU/memory statistics with collect_bench_and_log.py
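The latency loop that `mobile_client_sim.py` is assumed to run can be sketched as follows; `infer` here is a stand-in for the actual HTTP round trip to `edge_server.py`, and the percentile reporting is an assumption about the output format, not the script's actual schema.

```python
# Sketch: time repeated inference calls and report latency percentiles.
# `infer` is a placeholder for the real request to edge_server.py.
import time
import statistics

def bench_latency(infer, n_warmup=10, n_iters=100):
    for _ in range(n_warmup):  # warm-up: keep cold-start out of steady-state stats
        infer()
    samples_ms = []
    for _ in range(n_iters):
        t0 = time.perf_counter()
        infer()
        samples_ms.append((time.perf_counter() - t0) * 1e3)
    samples_ms.sort()
    return {
        "p50_ms": samples_ms[len(samples_ms) // 2],
        "p95_ms": samples_ms[int(len(samples_ms) * 0.95)],
        "mean_ms": statistics.fmean(samples_ms),
        "throughput_rps": 1e3 / statistics.fmean(samples_ms),
    }

# Usage with a dummy 1 ms workload:
stats = bench_latency(lambda: time.sleep(0.001), n_iters=50)
```

Separating warm-up from measurement mirrors the checklist item below about measuring cold-start and steady-state separately.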
3) Conversion & Quantization Test (1–3 days)
- Test TorchScript/ONNX/TFLite/CoreML on smaller models
- Quantize using representative samples (android_convert_tflite.py --quantize --rep-dir rep_samples/)
- Automate post-conversion differential testing (inference output, accuracy, latency)
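The differential test in step 3 can be sketched as a relative-L2 comparison between the original model's outputs and the converted model's outputs on the representative samples. The `tol=1e-2` threshold is an illustrative assumption, not a value from the repo.

```python
# Sketch: post-conversion differential check via relative L2 error.
import numpy as np

def l2_diff(ref: np.ndarray, conv: np.ndarray) -> float:
    """Relative L2 difference between reference and converted outputs."""
    return float(np.linalg.norm(ref - conv) / (np.linalg.norm(ref) + 1e-12))

def check_outputs(pairs, tol=1e-2):
    """pairs: iterable of (reference_output, converted_output) arrays."""
    worst = max(l2_diff(r, c) for r, c in pairs)
    return worst, worst <= tol

# Usage: a small synthetic quantization-style perturbation passes the check.
ref = np.ones((4, 10), dtype=np.float32)
conv = ref + 1e-4
worst, ok = check_outputs([(ref, conv)])
```

An L2 check catches gross conversion bugs cheaply; task-level accuracy on a held-out set is still needed before a go/no-go decision, as the checklist below notes.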
4) Actual machine verification (Raspberry Pi first, then mobile) (1–2 days/model)
- RPi: Install dependencies → bench using rpi_setup_and_bench.sh
- Android: Embed TFLite to measure inference on device
- iPhone: Integrate CoreML and measure Energy with Instruments
5) Evaluation and decision making (0.5 days)
- Key metrics: latency, throughput, average power consumption, implementation cost (CI/maintenance)
- Threshold proposal:
  - If the SDK alone meets the target latency (e.g. <10 ms) and power budget → prioritize the SDK
  - If model conversion reduces footprint and the accuracy degradation stays within limits → use conversion
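The decision rule in step 5 can be sketched as a small function; the 10 ms latency target comes from the threshold proposal above, while the 1% accuracy-drop budget is an illustrative assumption.

```python
# Sketch of the step-5 decision rule. Thresholds are illustrative:
# latency_target_ms follows the <10 ms example above; max_accuracy_drop
# is an assumed budget, not a value from the plan.
def choose_strategy(sdk_latency_ms, conv_latency_ms, conv_accuracy_drop,
                    latency_target_ms=10.0, max_accuracy_drop=0.01):
    if sdk_latency_ms <= latency_target_ms:
        return "sdk"  # SDK alone meets the target: prefer it
    if (conv_latency_ms <= latency_target_ms
            and conv_accuracy_drop <= max_accuracy_drop):
        return "conversion"  # converted model is fast enough and accurate enough
    return "needs-review"  # neither option clears the thresholds
```

A real decision would also weigh power and maintenance cost; this sketch encodes only the latency/accuracy gate.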
Checklist (at runtime):
- [ ] Virtual environment and dependencies can be reproduced cleanly
- [ ] More than 200 representative samples (for quantization)
- [ ] Bench: measure cold-start and steady-state separately
- [ ] Power: measured simultaneously with an external power meter (or RAPL)
- [ ] After conversion: compare output L2 difference / task-level accuracy
Automation script (recommended):
- scripts/device/rpi_setup_and_bench.sh — Dependency installation + bench
- scripts/device/android_convert_tflite.py — Conversion and quantization
- scripts/device/collect_bench_and_log.py — Sample acquisition
- scripts/device/plot_bench.py — Plot/summary generation
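For the summary step, a minimal sketch of flattening per-run bench stats into one CSV might look like this. The field names are illustrative, not the actual schema of the repo scripts.

```python
# Sketch: write per-run bench stats to a CSV for plotting or spreadsheets.
# ASSUMPTION: field names are illustrative, not the repo scripts' schema.
import csv

def write_summary(rows, path="bench_summary.csv"):
    fields = ["device", "model", "p50_ms", "p95_ms", "rss_mb"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
    return path

# Usage with one hypothetical Raspberry Pi run:
write_summary([
    {"device": "rpi4", "model": "tinynet",
     "p50_ms": 7.9, "p95_ms": 12.3, "rss_mb": 88},
])
```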
Risks and mitigations:
- Conversion failure (custom SNN layers): switch to the SDK-at-the-edge approach and, if needed, implement the custom operations as TFLite custom ops
- Accuracy loss from quantization: if post-training quantization degrades accuracy, consider light quantization-aware fine-tuning (QAT)
- Device-dependent performance differences: create profiles for each of RPi/Android/iOS and separate the deployment strategies
Next action (suggestion):
1. Prepare the representative sample directory (.npy files in rep_samples/)
2. Specify one actual small model (e.g. models/tinynet.pt)
3. Run the conversion and quantization, then report the results (CSV/JSON/plots)
Note: Hardware is required for power measurement on the actual device. If measurement data is available, power consumption analysis can also be performed here.