

Open-Source SLMs

June 30, 2025 · 5 min read

Releasing Five Open-Weights Models

SuperNova 70B, Virtuoso-Large 72B, Caller 32B, GLM-4-32B-Base-32K, and Homunculus 12B

Julien Simon


Today, we're happy to announce the open-weights release of five language models: three enterprise-grade production models that have been powering customer workloads through our SaaS platform, and two cutting-edge research models. This release underscores Arcee AI's commitment to democratizing access to state-of-the-art AI technology through open-weight models, even for our most advanced commercial offerings.

At Arcee AI, we believe that the future of AI lies in transparency, accessibility, and community-driven innovation. By releasing these production-tested models as open weights, we're enabling developers, researchers, and enterprises to deploy, customize, and build upon our work without restrictions, whether for research, commercial applications, or further model development.

This release marks our transition to focusing primarily on the Arcee Foundation Model (AFM) family, our next-generation approach to building efficient, compliant, and deployable foundation models. With the first AFM model, AFM-4.5B-Preview, now released, we're confident in opening our previous generation of specialized models to the broader community.

Production Models 

Our production models have been battle-tested in real-world enterprise environments, delivering reliable performance across diverse use cases from customer service automation to complex reasoning tasks. These models were previously available exclusively through Arcee Conductor, our SaaS platform, and have now been released with full commercial licensing, enabling unrestricted deployment.

Arcee-SuperNova-v1

Arcee-SuperNova-v1 (70B) is a merged model built from multiple advanced training approaches. At its core is a distilled version of Llama-3.1-405B-Instruct, converted into Llama-3.1-70B-Instruct using our DistillKit library, which preserves instruction-following strengths while reducing size. Alongside this, another Llama-3.1-70B model was instruction-tuned using synthetic data from our EvolKit library, improving precision and adherence across diverse queries. A third version underwent Direct Preference Optimization (DPO) to better align with human feedback. The model targets general intelligence applications and mathematical reasoning, serving as an effective base for further RLHF training.
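The logit-distillation step described above can be illustrated with the standard temperature-scaled KL objective. This is a minimal numpy sketch of the generic technique, not DistillKit's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Temperature-scaled KL divergence between teacher and student distributions."""
    t = softmax(teacher_logits / temperature)
    s = softmax(student_logits / temperature)
    # KL(teacher || student), summed over the vocabulary, averaged over the batch;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    kl = (t * (np.log(t + 1e-12) - np.log(s + 1e-12))).sum(axis=-1)
    return (temperature ** 2) * kl.mean()

# Toy check: identical logits give a (near-)zero loss.
logits = np.random.randn(4, 32)
assert distillation_loss(logits, logits) < 1e-6
```

In practice the teacher here would be Llama-3.1-405B-Instruct and the student the 70B model, with the loss computed over teacher-generated sequences.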

Enterprise Applications: Mathematical problem solving, technical documentation generation, code review and generation, multi-turn customer support conversations. 

License: Llama 3.1 Community License (commercial use permitted for services with fewer than 700M monthly active users).

Caller

Caller is our specialized 32-billion-parameter model, optimized for tool use and API orchestration. Based on Qwen-2.5-32B and refined through extensive training on structured interaction patterns, Caller was developed with our proprietary training frameworks to excel at function calling, API integration, and workflow automation. The model demonstrates precision in parsing complex API specifications, generating appropriate function calls, and managing multi-step automation pipelines.
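As a concrete illustration of the kind of structured output a tool-use model produces, here is a minimal sketch of an OpenAI-style function-calling exchange. The model id and tool schema are illustrative assumptions, not part of this release:

```python
import json

# Hypothetical tool schema in the common OpenAI function-calling format.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Request body you would POST to an OpenAI-compatible endpoint serving Caller
# (model id is illustrative; check the model card for the actual repo name).
request_body = {
    "model": "arcee-ai/caller",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [weather_tool],
}

# A well-formed tool call of the kind the model is trained to emit in response.
example_tool_call = {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})}
print(example_tool_call["name"])  # → get_weather
```

The model's job is to choose the right tool and emit valid JSON arguments that match the declared schema, which the calling application then executes.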

Enterprise Applications: Customer support automation, CRM integration workflows, API gateway management, business process orchestration, and enterprise environments requiring reliable tool integration.

License: Apache 2.0 (unrestricted commercial use).

Virtuoso-Large

Virtuoso-Large is our versatile 72-billion-parameter model, engineered for precision and adaptability across diverse domain-specific applications. Based on Qwen-2.5-72B, the model was built using our comprehensive training pipeline, incorporating both DistillKit for knowledge transfer optimization and MergeKit for architectural refinements. This model excels in scenarios requiring deep domain expertise, complex multi-step reasoning, and high-fidelity content generation.
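For readers unfamiliar with MergeKit, a merge is driven by a small YAML recipe. The config below is an illustrative SLERP sketch in MergeKit's format, with hypothetical fine-tune names; it is not the actual recipe used to build Virtuoso-Large:

```yaml
# Illustrative mergekit SLERP config (hypothetical source models).
merge_method: slerp
base_model: example-org/qwen2.5-72b-sft
slices:
  - sources:
      - model: example-org/qwen2.5-72b-sft   # hypothetical instruction-tuned checkpoint
        layer_range: [0, 80]
      - model: example-org/qwen2.5-72b-dpo   # hypothetical preference-tuned checkpoint
        layer_range: [0, 80]
parameters:
  t: 0.5        # interpolation factor between the two checkpoints
dtype: bfloat16
```

A recipe like this is run with the `mergekit-yaml` CLI, which writes the merged checkpoint to a local directory.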

Enterprise Applications: Professional services automation, technical documentation generation, compliance report creation, and sophisticated analytical tasks requiring domain expertise. Demonstrated performance in legal document analysis, financial reporting automation, and engineering specification generation.

License: Apache 2.0 (unrestricted commercial use).

Research Models

Our research models push the boundaries of what's possible with specialized architectures and training methodologies. These models serve as testbeds for innovative approaches that inform our future foundation model development, offering the community unique insights into advanced training techniques and architectural innovations.

GLM-4-32B-Base-32K

GLM-4-32B-Base-32K is an enhanced version of THUDM's GLM-4-32B-Base-0414, specifically engineered to maintain robust performance over an extended 32,000-token context window (compared to the original's 8,192-token effective limit). This model emerged from our foundational research on context extension techniques originally developed for AFM-4.5B, where we pioneered methods for maintaining coherence and performance across dramatically extended sequence lengths. The architecture incorporates innovative attention scaling mechanisms and positional encoding strategies that prevent the typical degradation seen in long-context scenarios.
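To give a sense of the scale involved, a common family of context-extension techniques (position interpolation over rotary embeddings) rescales positions by the ratio of the new to original context length. The sketch below computes that ratio for the figures quoted above; the specific method used here is not disclosed in this post, so treat this as a generic illustration:

```python
# Generic position-interpolation arithmetic (illustrative only).
original_ctx = 8_192    # effective limit of GLM-4-32B-Base-0414
extended_ctx = 32_000   # target context window quoted above

scale = extended_ctx / original_ctx
print(scale)  # → 3.90625

def interpolated_position(m, scale):
    # With linear position interpolation, position m is mapped to m / scale,
    # keeping every position inside the originally trained range.
    return m / scale

assert interpolated_position(extended_ctx - 1, scale) < original_ctx
```

Keeping rescaled positions inside the trained range is what avoids the out-of-distribution positional encodings that cause the typical long-context degradation.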

Research Applications: Long-document analysis, extended conversation modeling, large codebase understanding, and multi-document reasoning tasks. Particularly valuable for research into context extension techniques and efficient long-sequence modeling. 

License: MIT (unrestricted commercial use).

Homunculus

Homunculus is a 12-billion parameter instruction model distilled from Qwen3-235B onto the Mistral-Nemo-Base-2407 backbone. The model preserves Qwen's distinctive two-mode interaction style—/think (deliberate chain-of-thought reasoning) and /nothink (concise direct answers)—while running efficiently on consumer hardware, demonstrating that sophisticated reasoning patterns can be preserved across significant parameter reductions. The name "Homunculus" reflects the model's nature as a miniaturized yet complete representation of its much larger teacher model, similar to the alchemical concept of a homunculus as a miniature, fully-formed human being created through artificial means.
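The two interaction modes are selected in the prompt itself. Here is a minimal sketch of building a chat request for each mode; the exact tag placement follows the Qwen-style convention described above, so check the model card for the authoritative template:

```python
def build_messages(user_prompt: str, think: bool = True) -> list[dict]:
    """Build a chat message list selecting Homunculus's reasoning mode.

    Assumes the Qwen-style convention of appending /think or /nothink
    to the user turn (verify against the model card before relying on it).
    """
    tag = "/think" if think else "/nothink"
    return [{"role": "user", "content": f"{user_prompt} {tag}"}]

msgs = build_messages("What is 17 * 23?", think=False)
print(msgs[0]["content"])  # → What is 17 * 23? /nothink
```

With `/think` the model emits an explicit chain-of-thought before its answer; with `/nothink` it answers directly, which is useful when latency matters more than visible reasoning.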

Research Applications: Knowledge distillation research, dual-mode reasoning studies, efficient inference optimization, and consumer-grade AI deployment scenarios.

License: Apache 2.0 (unrestricted commercial use).

Conclusion 

These five model releases represent years of research and development in enterprise-grade AI systems, now made freely available to accelerate innovation across the AI community. From production-tested commercial solutions to cutting-edge research architectures, these models demonstrate our commitment to advancing the field through open collaboration.

| Model | Description | Hugging Face | Together AI |
|---|---|---|---|
| Arcee-SuperNova-v1 | 70B flagship model with exceptional instruction-following | Link | - |
| Caller | 32B specialized model for tool use and API orchestration | Link | Available |
| Virtuoso-Large | 72B versatile model for precision domain applications | Link | Available |
| GLM-4-32B-Base-32K | 32B research model with extended 32K context window | Link | - |
| Homunculus | 12B distilled model preserving dual reasoning modes | Link | - |

We invite the community to explore these models, build upon our work, and contribute to the continued advancement of open-source AI. As we focus our efforts on the next generation of AFM models, these releases ensure that the knowledge and capabilities we've developed remain accessible to researchers, developers, and enterprises worldwide.

Ready to deploy these models in your environment? Visit our Hugging Face organization to get started, or explore deployment options through Together AI for hosted inference.
