
[Feature Request] Integration of Qualcomm QNN for Snapdragon NPU Acceleration #637

@pizzownzore-gif

Description


First off, thanks for the work on SDAI; the flexibility of choosing backends is fantastic. I'm currently using SDAI for local generation, since it gives you control over not just local SD but hosted SD as well. However, I've noticed a significant performance gap when generating locally compared to apps like Local Dream, which use the Snapdragon NPU via the Qualcomm AI Stack (QNN). On a modern Snapdragon (8 Gen 2/3), Local Dream hits 5–10 second generations, whereas the current implementation in SDAI relies heavily on the CPU/GPU, which is both slower and more thermally demanding.

Suggested Implementation:
Add support for the QNN Execution Provider (Qualcomm AI Engine).

Target Hardware: Hexagon NPU (specifically for Snapdragon 8 Gen 1 and newer).

Reference: Similar to the implementation in xororz/local-dream or the Off Grid mobile client, leveraging optimized models for the Qualcomm AI Stack.
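For context, a minimal sketch of what "supporting the QNN Execution Provider" could look like if SDAI's local backend went through ONNX Runtime: the session is given a provider list with QNN (targeting the Hexagon HTP backend) first and CPU as a fallback. This assumes the `onnxruntime-qnn` build and a QNN-compatible model; the library name and model path are illustrative placeholders, not SDAI code.

```python
def build_providers(htp_lib: str = "libQnnHtp.so"):
    """Provider list for onnxruntime.InferenceSession: try the
    QNN Execution Provider (Hexagon NPU) first, fall back to CPU.

    `backend_path` is the QNN EP option selecting the Qualcomm
    backend library; `libQnnHtp.so` targets the HTP (NPU) backend.
    """
    return [
        ("QNNExecutionProvider", {"backend_path": htp_lib}),
        "CPUExecutionProvider",
    ]

# On-device usage (requires a Snapdragon device with onnxruntime-qnn;
# "unet.onnx" is a placeholder for a QNN-optimized model):
#
# import onnxruntime as ort
# session = ort.InferenceSession("unet.onnx", providers=build_providers())
```

Keeping CPU in the fallback chain means the same code path still works on devices without a supported NPU, which matters if SDAI wants one backend selector across hardware generations.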

Benefits:
Speed: Potential 5x–10x speedup over current CPU/GPU inference.

I'd love to see SDAI's local generation speed increase by unlocking the full power of the hardware we're already carrying.
