Edge Phase 1-5 implementation report
> [!NOTE]
> For the latest implementation status, please refer to Functional Implementation Status (Remaining Functionality).
Last updated: 2026-04-09
Overview:
- Condensed Phases 1-5 of the Edge implementation plan into a locally re-executable form.
- The execution pipeline, representative-sample generation, conversion-difference validation, device runbook generation, and evaluation-report generation are already implemented.
- On-device measurement on Raspberry Pi / Android / iPhone has not been performed because the hardware is unavailable, but the execution procedure and deliverables are in place.
Implemented content
- Phase 1: Preparation
  - Added `scripts/device/generate_rep_samples.py` to enable automatic generation of `.npy` representative samples for quantization.
  - `scripts/device/run_edge_phase_pipeline.py` automatically checks for the required dependencies and saves the result as JSON.
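The representative-sample step can be sketched as follows. This is a hypothetical, minimal version of what `generate_rep_samples.py` might do (the function name, random-data source, and default shape here are illustrative assumptions, not the script's actual contents; a real calibration set would be drawn from production inputs):

```python
# Hypothetical sketch: write N float32 .npy files usable as a
# post-training-quantization calibration set. Random data stands in
# for real representative inputs.
from pathlib import Path

import numpy as np


def generate_rep_samples(out_dir: str, num_samples: int = 32,
                         shape: tuple = (1, 3, 224, 224),
                         seed: int = 0) -> list[Path]:
    """Write `num_samples` .npy samples into `out_dir` and return their paths."""
    rng = np.random.default_rng(seed)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(num_samples):
        sample = rng.standard_normal(shape).astype(np.float32)
        path = out / f"rep_{i:04d}.npy"
        np.save(path, sample)
        paths.append(path)
    return paths
```

Downstream converters (e.g. a TFLite representative-dataset callback) can then iterate over the saved files with `np.load`.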
- Phase 2: Local PoC
  - `scripts/device/run_edge_phase_pipeline.py` starts `edge_server.py` and automates latency measurement for 100 requests via `mobile_client_sim.py`.
  - In local execution, steady-state latency was about 1 ms; the maximum, including cold start, was about 4.6 s.
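The measurement loop can be sketched as below. This is an assumption about how `mobile_client_sim.py` summarizes its 100 requests, not its actual code; a local stand-in callable replaces the real HTTP request to `edge_server.py`:

```python
# Minimal sketch of a latency-measurement loop: time N calls and
# report mean / p50 / p95 / max in milliseconds.
import time
from statistics import mean, quantiles


def measure_latency(send_request, n: int = 100) -> dict:
    """Call `send_request` n times and summarize per-call latency in ms."""
    latencies_ms = []
    for _ in range(n):
        start = time.perf_counter()
        send_request()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    # quantiles(..., n=100) yields the 1st..99th percentiles.
    pct = quantiles(latencies_ms, n=100)
    return {
        "mean_ms": mean(latencies_ms),
        "p50_ms": pct[49],
        "p95_ms": pct[94],
        "max_ms": max(latencies_ms),
    }


if __name__ == "__main__":
    # Stand-in callable for a real inference request.
    print(measure_latency(lambda: time.sleep(0.001), n=100))
```

Reporting percentiles alongside the mean matters here: a single cold-start outlier (the ~4.6 s maximum) dominates the mean even when p50 and p95 are near 1 ms.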
- Phase 3: Conversion and quantization verification
  - Modified `scripts/device/convert_torchscript.py` to output eager / TorchScript / ONNX artifacts.
  - Fixed the broken execution flow in `scripts/device/ios_convert_coreml.py`.
  - Made `scripts/device/android_convert_tflite.py` compatible with eager / TorchScript artifacts.
  - Added `scripts/device/validate_converted_model.py` so that output differences between eager and TorchScript models can be compared by L2 distance.
  - In this local run, the TorchScript difference was 0.0 on average and 0.0 at maximum.
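The L2-distance comparison can be sketched as below. This is a hedged, NumPy-only illustration of the metric, not the actual `validate_converted_model.py` (which runs the two model variants itself); the function name and output keys are illustrative:

```python
# Sketch: per-sample L2 distance between the stacked outputs of the
# eager model and the converted (e.g. TorchScript) model.
import numpy as np


def l2_output_diff(eager_outputs: np.ndarray,
                   converted_outputs: np.ndarray) -> dict:
    """Return mean and max L2 distance over the batch dimension."""
    assert eager_outputs.shape == converted_outputs.shape
    diff = eager_outputs - converted_outputs
    # Flatten everything except the batch axis, then take the norm per sample.
    dists = np.linalg.norm(diff.reshape(diff.shape[0], -1), axis=1)
    return {"l2_mean": float(dists.mean()), "l2_max": float(dists.max())}
```

Bitwise-identical outputs give `l2_mean == l2_max == 0.0`, which is consistent with the figures reported for this local TorchScript run.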
- Phase 4: Preparation for on-device verification
  - `scripts/device/run_edge_phase_pipeline.py` automatically generates `phase4_device_runbook.md`, which fixes the execution procedure for Raspberry Pi / Android / iPhone.
  - Power measurement on actual devices is assumed to use an external USB power meter, Android Studio Profiler, and Xcode Instruments.
- Phase 5: Evaluation and decision making
  - Automatically generates a Markdown report from the local execution results and states the current recommendation.
  - The current judgment: "SDK-on-edge is the first choice for production models that include SNN-specific processing; converted deployment is also viable for simple dense models."
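The report-generation step can be sketched as below. This is an illustrative, stdlib-only rendering function under assumed field names (`mean_ms`, `l2_mean`, etc.); the pipeline's actual report template and JSON schema are not shown here:

```python
# Hypothetical sketch: render a small Markdown report from the
# collected latency and validation results. Field names are assumptions.
def render_report(latency: dict, validation: dict, recommendation: str) -> str:
    lines = [
        "# Edge Phase 1-5 report",
        "",
        "## Latency",
        f"- mean: {latency['mean_ms']:.2f} ms",
        f"- p95: {latency['p95_ms']:.2f} ms",
        "",
        "## Conversion difference",
        f"- TorchScript L2 mean: {validation['l2_mean']}",
        f"- TorchScript L2 max: {validation['l2_max']}",
        "",
        "## Recommendation",
        f"- {recommendation}",
    ]
    return "\n".join(lines)
```

Keeping the report a pure function of the result dictionaries makes the Phase 5 output reproducible from the saved JSON artifacts alone.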
Current execution results
- Run report: `bench_output/edge_phase_runs/20260409-130535/edge_phase1_5_report.md`
- Validation JSON: `bench_output/edge_phase_runs/20260409-130535/phase3_validation.json`
- Latency summary: `bench_output/edge_phase_runs/20260409-130535/phase2_latency_summary.json`
- Device runbook: `bench_output/edge_phase_runs/20260409-130535/phase4_device_runbook.md`
Main figures
- Dependencies:
  - Available: `torch`, `requests`, `psutil`, `fastapi`, `uvicorn`, `zenoh`
  - Not installed: `onnx`, `onnx_tf`, `tensorflow`, `coremltools`
- Latency:
  - Average: approx. 46.97 ms
  - p50: approx. 1.02 ms
  - p95: approx. 1.26 ms
  - Maximum: approx. 4592.98 ms
- Conversion difference:
  - TorchScript L2 average: 0.0
  - TorchScript L2 maximum: 0.0
Unfinished items
- TFLite real conversion
  - Reason: `onnx`, `onnx_tf`, and `tensorflow` are not installed.
- CoreML real conversion
  - Reason: `coremltools` is not installed.
- On-device power measurement
  - Reason: hardware and measurement instruments are not available in this environment.
Judgment
- For now, we prioritize the SDK-on-edge approach.
- However, if the production model is a dense architecture and both the post-conversion TFLite / CoreML output difference and the measured power consumption are within acceptable ranges, switching to converted deployment on Android / iPhone is worth considering.
Next implementation candidates
- Install `onnx`, `onnx-tf`, `tensorflow`, and `coremltools`, then fully execute Phase 3
- Run `phase4_device_runbook.md` on actual devices and collect power, thermal, and long-term stability data
- Create representative samples from the production model and rerun the same validation pipeline