Show HN: Dual YOLOv8n UAV Detection on RK3588S at 42 FPS Using NPU

(github.com)

26 points by alebal123bal 3 hours ago

4 comments

stefan_ 0 minutes ago
More slop again. The way to get more throughput is to bump batch size, not to try and "multithread" job submits to the NPU as if it's a CPU.
robinduckett 1 hour ago
Is there something special about yolov8 over later models (9-12)? It seems most of the research and working examples default to v8 despite it being 3 years old. Or just because it is what fits on this hardware?
- snovv_crash 54 minutes ago
  Newer versions aren't open source, or at least have murky licensing.
  - robinduckett 43 minutes ago
    Ahh that’ll do it. A shame really, the later models seem to be fairly good just from my idle testing as an enthusiast.
- alebal123bal 48 minutes ago
  [flagged]
alebal123bal 3 hours ago
I built this while trying to understand how much of the RK3588S vision pipeline could be kept off the CPU.

The main trick is not the YOLO model itself, but the pipeline structure: MIPI capture through the ISP, resize/color conversion through RGA, and YOLOv8n inference through all 3 NPU cores with one RKNN context per core. With a 3-thread inference pool the pipeline goes from ~31 FPS to the OS08A10 camera’s 46 FPS ceiling.

The memory footprint is also small: roughly 137–152 MB RSS for one 1080p stream, using a fixed preallocated buffer pool rather than per-frame allocations. Two streams are roughly 276–304 MB RSS.

The repo also has a multi-process side of the pipeline: detections are published over Unix-domain sockets to tracking, temporal features, a presence FSM, and an optional Qwen2.5-0.5B summary step. For the LLM step, the camera pipeline can temporarily blackout/resume so RKLLM gets the whole NPU.

I split the work into three repos:

- runtime dual-stream YOLOv8n RK3588S pipeline: https://github.com/alebal123bal/khadas_yolov8n_multithread
- train/export/INT8 RKNN conversion for YOLOv8/YOLOv5: https://github.com/alebal123bal/RKNN_TRAIN_YOLO
- Qwen on RK3588S, via RKLLM/NPU or llama.cpp/CPU: https://github.com/alebal123bal/RKLLM_LLAMA_QWEN

The demo class is UAV/drone, but this is meant as a general edge-inference pipeline example, not an operational/surveillance/defense system.
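For anyone trying to picture the structure described above (one inference context per NPU core, a 3-thread pool, and a fixed preallocated buffer pool instead of per-frame allocations), here is a minimal Python sketch. It is not taken from the repos and makes no RKNN calls: `StubContext`, `POOL_SIZE`, `FRAME_BYTES`, and `run_pipeline` are hypothetical names, and the "inference" is a placeholder that only records which worker handled a frame.

```python
import queue
import threading

NUM_CORES = 3                 # number of NPU cores mentioned in the post
POOL_SIZE = 6                 # hypothetical count of preallocated frame buffers
FRAME_BYTES = 640 * 640 * 3   # hypothetical model input size in bytes


class StubContext:
    """Stand-in for a per-core inference context (not the RKNN API)."""

    def __init__(self, core_id):
        self.core_id = core_id

    def infer(self, buf):
        # Placeholder for real inference: report which core saw the buffer.
        return self.core_id


def run_pipeline(num_frames):
    # Allocate every frame buffer once, up front; they are recycled afterwards.
    free_buffers = queue.Queue()
    for _ in range(POOL_SIZE):
        free_buffers.put(bytearray(FRAME_BYTES))

    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker(core_id):
        ctx = StubContext(core_id)  # one context owned by one thread
        while True:
            item = work.get()
            if item is None:        # sentinel: no more frames
                break
            seq, buf = item
            core = ctx.infer(buf)
            free_buffers.put(buf)   # return the buffer to the pool
            with results_lock:
                results.append((seq, core))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_CORES)]
    for t in threads:
        t.start()

    for seq in range(num_frames):
        buf = free_buffers.get()    # blocks while all buffers are in flight
        buf[0] = seq % 256          # placeholder for writing a captured frame
        work.put((seq, buf))

    for _ in threads:
        work.put(None)
    for t in threads:
        t.join()

    results.sort()                  # workers finish out of order; restore frame order
    return results
```

Calling `run_pipeline(20)` returns 20 `(frame_number, core_id)` pairs in frame order. The blocking `free_buffers.get()` is what caps memory: the producer stalls once all `POOL_SIZE` buffers are in flight rather than allocating more.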
ancientmoth 1 hour ago
[dead]