First off, thanks for the work on SDAI. The flexibility of choosing backends is fantastic.
I’m currently using SDAI for local generation, since it gives you control over both local and hosted SD. However, I’m noticing a significant performance gap when generating locally compared to apps like Local Dream, which use the Snapdragon NPU via the Qualcomm AI Stack (QNN). On a modern Snapdragon (8 Gen 2/3), Local Dream is hitting 5-10 second generations, whereas the current implementation in SDAI relies heavily on CPU/GPU, which is slower and thermally demanding.
Suggested Implementation:
Add support for the QNN Execution Provider (Qualcomm AI Engine).
Target Hardware: Hexagon NPU (specifically for Snapdragon 8 Gen 1 and newer).
Reference: Similar to the implementation in xororz/local-dream or the Off Grid mobile client, leveraging optimized models for the Qualcomm AI Stack.
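To make the suggestion a bit more concrete, here is a rough sketch of how backend selection with a fallback might look. It is written in Python only for brevity and is not SDAI's actual code. The function names, the SoC-model heuristic (treating "SM8450" and higher as Snapdragon 8 Gen 1 or newer), and the idea of reading the model string from the device are all my assumptions. The provider names and the `backend_path` option follow ONNX Runtime's QNN Execution Provider conventions, and the result is shaped like the `providers` list ONNX Runtime accepts.

```python
import re

# Provider names as used by ONNX Runtime.
QNN_PROVIDER = "QNNExecutionProvider"
CPU_PROVIDER = "CPUExecutionProvider"

# Assumption: SM8450 is the Snapdragon 8 Gen 1 part number, so any
# "SM" model number at or above it is treated as NPU-capable.
MIN_SUPPORTED_SOC = 8450


def supports_hexagon_npu(soc_model: str) -> bool:
    """Hypothetical heuristic: check the SoC model string (e.g. "SM8550")."""
    match = re.match(r"SM(\d{4})", soc_model.strip().upper())
    return bool(match) and int(match.group(1)) >= MIN_SUPPORTED_SOC


def choose_providers(soc_model: str) -> list:
    """Prefer the QNN HTP (NPU) backend when available, always keep a CPU fallback."""
    if supports_hexagon_npu(soc_model):
        # "libQnnHtp.so" selects the Hexagon (HTP) backend of the QNN EP.
        return [(QNN_PROVIDER, {"backend_path": "libQnnHtp.so"}), CPU_PROVIDER]
    return [CPU_PROVIDER]


if __name__ == "__main__":
    for model in ("SM8550", "SM7325"):
        print(model, choose_providers(model))
```

Keeping the CPU provider at the end of the list means unsupported devices, or models that fail to load on the NPU, would still generate as they do today.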
Benefits:
Speed: Potential 5x–10x speedup over current CPU/GPU inference.
I’d love to see SDAI’s local generation speed increase by unlocking the full power of the hardware we’re carrying.