Cactus Compute builds Cactus, an inference engine for on-device AI across smartphones, laptops, and edge hardware. The product runs LLMs, transcription, embeddings, and agent functions locally and falls back to cloud inference when needed. The site also highlights Needle, a 26M-parameter tool-calling model distilled from Gemini.
Deploy AI models on smartphones for real-time applications; Minimize latency in mobile AI interactions; Ensure user privacy with on-device processing; Support unreliable network environments with offline capabilities; Integrate AI seamlessly into mobile applications.
Cactus specializes in deploying AI models locally on smartphones through a unified cross-platform SDK. Its main product offerings include:
Unified Cross-Platform SDK: Lets developers deploy AI models on mobile devices across platforms such as iOS and Android.
Offline-Ready AI Model Deployment: The SDK deploys models that run without a constant internet connection, which suits unreliable network environments.
On-Device Inference: Processing data on the device keeps it off external servers, which protects user privacy and reduces latency.
Multimodal Support: The SDK supports language, vision, and speech models, letting developers build applications that span these modalities.
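The local-first behavior described above (run on the device, use the cloud only when needed) can be sketched in a few lines. This is a minimal illustration of the general pattern, not the Cactus SDK: `run_local_first`, `local_infer`, and `cloud_infer` are hypothetical names chosen for this example, and a real integration would call whatever the SDK actually exposes.

```python
from typing import Callable, Optional, Tuple

# Hypothetical type for any function that turns a prompt into a completion.
InferFn = Callable[[str], str]


def run_local_first(
    prompt: str,
    local_infer: InferFn,
    cloud_infer: Optional[InferFn] = None,
) -> Tuple[str, str]:
    """Try on-device inference first; use the cloud only if the local call fails.

    Returns the completion and a label saying which path produced it.
    """
    try:
        return local_infer(prompt), "local"
    except Exception as local_error:
        # With no fallback configured (for example, the device is offline),
        # surface the local failure instead of silently returning nothing.
        if cloud_infer is None:
            raise RuntimeError(
                "local inference failed and no cloud fallback is configured"
            ) from local_error
        return cloud_infer(prompt), "cloud"
```

Returning the path label alongside the result lets the calling application tell the user, or its own logs, whether a given request left the device.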
Overall, Cactus gives mobile developers an SDK for adding private, low-latency, offline-capable AI features to their applications.
Raised funding from Oxford Seed Fund and Google for Startups; Achieved over 2.1k stars on GitHub; Offers demo applications for hands-on evaluation.