<div align="center">
  <p>
    <a href="README_zh.md">简体中文</a> | <a href="README.md">English</a>
  </p>
  <p>🤝 Supported chipmakers</p>
  <picture>
    <source srcset="assets/chipmakers-dark.png" media="(prefers-color-scheme: dark)">
    <source srcset="assets/chipmakers.png" media="(prefers-color-scheme: light)">
    <img src="assets/chipmakers.png" alt="Supported chipmakers">
  </picture>
  <p>
    <a href="https://www.producthunt.com/products/nexasdk-for-mobile?embed=true&utm_source=badge-top-post-badge&utm_medium=badge&utm_campaign=badge-nexasdk-for-mobile" target="_blank" rel="noopener noreferrer">Product Hunt</a> |
    <a href="https://trendshift.io/repositories/12239" target="_blank" rel="noopener noreferrer">Trendshift</a>
  </p>
  <p>
    <a href="https://docs.nexa.ai">Documentation</a> |
    <a href="https://sdk.nexa.ai/wishlist">Wishlist</a> |
    <a href="https://x.com/nexa_ai">X</a> |
    <a href="https://discord.com/invite/nexa-ai">Discord</a> |
    <a href="https://join.slack.com/t/nexa-ai-community/shared_invite/zt-3837k9xpe-LEty0disTTUnTUQ4O3uuNw">Slack</a>
  </p>
</div>

# NexaSDK

NexaSDK lets you build the smartest and fastest on-device AI with minimal energy use. It is a high-performance local inference framework that runs the latest multimodal AI models on NPU, GPU, and CPU, across Android, Windows, Linux, macOS, and iOS devices, in a few lines of code.

NexaSDK supports the latest models weeks or months before anyone else, including Qwen3-VL, DeepSeek-OCR, Gemma3n (Vision), and more.

⭐ Star this repo to keep up with updates and new releases covering the latest on-device AI capabilities.

## 🏆 Recognized Milestones

## 🚀 Quick Start

| Platform | Links |
| --- | --- |
| 🖥️ CLI | Quick Start ｜ Docs |
| 🐍 Python | Quick Start ｜ Docs |
| 🤖 Android | Quick Start ｜ Docs |
| 🐳 Linux Docker | Quick Start ｜ Docs |
| 🍎 iOS | Quick Start ｜ Docs |

đŸ–Ĩī¸ CLI

Download:

| Windows | macOS | Linux |
| --- | --- | --- |
| arm64 (Qualcomm NPU) | arm64 (Apple Silicon) | arm64 |
| x64 (Intel/AMD NPU) | x64 | x64 |

Run your first model:

```bash
# Chat with Qwen3
nexa infer ggml-org/Qwen3-1.7B-GGUF

# Multimodal: drag images into the CLI
nexa infer NexaAI/Qwen3-VL-4B-Instruct-GGUF

# NPU (Windows arm64 with Snapdragon X Elite)
nexa infer NexaAI/OmniNeural-4B
```

- Models: LLM, Multimodal, ASR, OCR, Rerank, Object Detection, Image Generation, Embedding
- Formats: GGUF, MLX, NEXA
- NPU Models: Model Hub
- 📖 CLI Reference Docs

### 🐍 Python SDK

```bash
pip install nexaai
```

```python
from nexaai import LLM, GenerationConfig, ModelConfig, LlmChatMessage

llm = LLM.from_(model="NexaAI/Qwen3-0.6B-GGUF", config=ModelConfig())

conversation = [
    LlmChatMessage(role="user", content="Hello, tell me a joke")
]
prompt = llm.apply_chat_template(conversation)
for token in llm.generate_stream(prompt, GenerationConfig(max_tokens=100)):
    print(token, end="", flush=True)
```

- Models: LLM, Multimodal, ASR, OCR, Rerank, Object Detection, Image Generation, Embedding
- Formats: GGUF, MLX, NEXA
- NPU Models: Model Hub
- 📖 Python SDK Docs
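For multi-turn chat, each assistant reply needs to be appended to the conversation before building the next prompt. Below is a minimal sketch of that bookkeeping, using plain dicts and a stub token stream so it runs without the SDK installed; with the real SDK you would use `LlmChatMessage` objects and consume `llm.generate_stream` instead of `stub_stream`:

```python
from typing import Dict, Iterator, List


def stub_stream() -> Iterator[str]:
    # Stand-in for llm.generate_stream(prompt, config): yields tokens as they arrive.
    yield from ["Sure", ", here's", " a joke!"]


def chat_turn(conversation: List[Dict[str, str]], user_text: str) -> str:
    """Append the user message, consume the streamed reply, record it, and return it."""
    conversation.append({"role": "user", "content": user_text})
    pieces = []
    for token in stub_stream():  # real code: llm.generate_stream(prompt, config)
        print(token, end="", flush=True)  # same incremental display as the SDK loop
        pieces.append(token)
    print()
    reply = "".join(pieces)
    # Recording the reply keeps the chat template consistent on the next turn.
    conversation.append({"role": "assistant", "content": reply})
    return reply


history: List[Dict[str, str]] = []
chat_turn(history, "Hello, tell me a joke")
```

Keeping the full history means `apply_chat_template` can render the whole exchange on every turn, which is what chat-tuned models expect.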

### 🤖 Android SDK

Add to your `app/AndroidManifest.xml`:

```xml
<application android:extractNativeLibs="true">
```

Add to your `build.gradle.kts`:

```kotlin
dependencies {
    implementation("ai.nexa:core:0.0.19")
}
```

Initialize the SDK and run a model:

```kotlin
// Initialize SDK
NexaSdk.getInstance().init(this)

// Load and run model
VlmWrapper.builder()
    .vlmCreateInput(VlmCreateInput(
        model_name = "omni-neural",
        model_path = "/data/data/your.app/files/models/OmniNeural-4B/files-1-1.nexa",
        plugin_id = "npu",
        config = ModelConfig()
    ))
    .build()
    .onSuccess { vlm ->
        vlm.generateStreamFlow("Hello!", GenerationConfig()).collect { print(it) }
    }
```

- Requirements: Android minSdk 27, Qualcomm Snapdragon 8 Gen 4 chip
- Models: LLM, Multimodal, ASR, OCR, Rerank, Embedding
- NPU Models: Supported Models
- 📖 Android SDK Docs

đŸŗ Linux Docker

```bash
docker pull nexa4ai/nexasdk:latest

export NEXA_TOKEN="your_token_here"
docker run --rm -it --privileged \
  -e NEXA_TOKEN \
  nexa4ai/nexasdk:latest infer NexaAI/Granite-4.0-h-350M-NPU
```

### 🍎 iOS SDK

Download `NexaSdk.xcframework` and add it to your Xcode project.

```swift
import NexaSdk

// Example: Speech Recognition
let asr = try Asr(plugin: .ane)
try await asr.load(from: modelURL)

let result = try await asr.transcribe(options: .init(audioPath: "audio.wav"))
print(result.asrResult.transcript)
```

âš™ī¸ Features & Comparisons

| Features | NexaSDK | Ollama | llama.cpp | LM Studio |
| --- | --- | --- | --- | --- |
| NPU support | ✅ NPU-first | ❌ | ❌ | ❌ |
| Android/iOS SDK support | ✅ NPU/GPU/CPU support | ⚠️ | ⚠️ | ❌ |
| Linux support (Docker image) | ✅ | ✅ | ✅ | ❌ |
| Day-0 model support in GGUF, MLX, NEXA | ✅ | ❌ | ⚠️ | ❌ |
| Full multimodality support | ✅ Image, Audio, Text, Embedding, Rerank, ASR, TTS | ⚠️ | ⚠️ | ⚠️ |
| Cross-platform support | ✅ Desktop, Mobile (Android, iOS), Automotive, IoT (Linux) | ⚠️ | ⚠️ | ⚠️ |
| One line of code to run | ✅ | ✅ | ⚠️ | ✅ |
| OpenAI-compatible API + Function calling | ✅ | ✅ | ✅ | ✅ |

*Legend: ✅ Supported | ⚠️ Partial or limited support | ❌ Not supported*
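The OpenAI-compatible API row means a standard chat-completions request body works against a locally served model, including tool definitions for function calling. Below is a sketch of such a payload; the `get_weather` tool, its parameter schema, and the choice of model name are illustrative assumptions, not taken from this README:

```python
import json

# Request body in the OpenAI chat-completions format. The "tools" array
# declares a function the model may call; "get_weather" is hypothetical.
request_body = {
    "model": "NexaAI/Qwen3-0.6B-GGUF",
    "messages": [
        {"role": "user", "content": "What's the weather in Tokyo?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# Serialize for an HTTP POST to the server's chat-completions endpoint.
payload = json.dumps(request_body)
```

Because the wire format matches OpenAI's, existing OpenAI client libraries can be pointed at the local endpoint without changing application code.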

## 🙏 Acknowledgements

We would like to thank the following projects:

## 📄 License

NexaSDK uses a dual licensing model:

### CPU/GPU Components

Licensed under Apache License 2.0.

### NPU Components

## 🤝 Contact & Community Support

### Business Inquiries

For model launch partnerships, business inquiries, or any other questions, please schedule a call with us here.

### Community & Support

Want more model support, backend support, device support, or other features? We'd love to hear from you!

Feel free to submit an issue on our GitHub repository with your requests, suggestions, or feedback. Your input helps us prioritize what to build next.

Join our community:

## 🏆 Nexa × Qualcomm On-Device Bounty Program

Round 1: Build a working Android AI app that runs fully on-device on Qualcomm Hexagon NPU with NexaSDK.

Timeline (PT): Jan 15 → Feb 15

Prizes: $6,500 cash prize, Qualcomm official spotlight, flagship Snapdragon device, expert mentorship, and more

👉 Join & details: https://sdk.nexa.ai/bounty