
[Feature Request] Integration of Qualcomm QNN for Snapdragon NPU Acceleration #637

@pizzownzore-gif

Description


First off, thanks for the work on SDAI; the flexibility of choosing backends is fantastic. I'm currently using SDAI for local generation, since it gives you control over not just local SD but hosted SD as well. However, I've noticed a significant performance gap when generating locally compared to apps like Local Dream, which use the Snapdragon NPU via the Qualcomm AI Stack (QNN). On a modern Snapdragon (8 Gen 2/3), Local Dream hits 5–10 second generations, whereas the current implementation in SDAI relies heavily on the CPU/GPU, which is both slower and more thermally demanding.

Suggested Implementation:
Add support for the QNN Execution Provider (Qualcomm AI Engine).

Target Hardware: Hexagon NPU (specifically for Snapdragon 8 Gen 1 and newer).

Reference: Similar to the implementation in xororz/local-dream or the Off Grid mobile client, leveraging optimized models for the Qualcomm AI Stack.
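For context, a minimal sketch of what "supporting the QNN Execution Provider" could look like if SDAI's local backend went through ONNX Runtime: the session is given a provider list with QNN (targeting the Hexagon HTP backend) first and CPU as a fallback. This assumes the `onnxruntime-qnn` build and a QNN-compatible model; the library name and model path are illustrative placeholders, not SDAI code.

```python
def build_providers(htp_lib: str = "libQnnHtp.so"):
    """Provider list for onnxruntime.InferenceSession: try the
    QNN Execution Provider (Hexagon NPU) first, fall back to CPU.

    `backend_path` is the QNN EP option selecting the Qualcomm
    backend library; `libQnnHtp.so` targets the HTP (NPU) backend.
    """
    return [
        ("QNNExecutionProvider", {"backend_path": htp_lib}),
        "CPUExecutionProvider",
    ]

# On-device usage (requires a Snapdragon device with onnxruntime-qnn;
# "unet.onnx" is a placeholder for a QNN-optimized model):
#
# import onnxruntime as ort
# session = ort.InferenceSession("unet.onnx", providers=build_providers())
```

Keeping CPU in the fallback chain means the same code path still works on devices without a supported NPU, which matters if SDAI wants one backend selector across hardware generations.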

Benefits:
Speed: Potential 5x–10x speedup over current CPU/GPU inference.

I'd love to see SDAI's local generation speed increase by unlocking the full power of the hardware we're already carrying.
