

Open-Source SLMs

June 30, 2025 · 5 min read

Releasing Five Open-Weights Models

SuperNova 70B, Virtuoso-Large 72B, Caller 32B, GLM-4-32B-Base-32K, and Homunculus 12B

Julien Simon


Today, we're happy to announce the open-weights release of five language models: three enterprise-grade production models that have been powering customer workloads through our SaaS platform, and two cutting-edge research models. This release underscores Arcee AI's commitment to democratizing access to state-of-the-art AI technology through open-weight models, even for our most advanced commercial offerings.

At Arcee AI, we believe that the future of AI lies in transparency, accessibility, and community-driven innovation. By releasing these production-tested models as open weights, we're enabling developers, researchers, and enterprises to deploy, customize, and build upon our work without restrictions, whether for research, commercial applications, or further model development.

This release marks our transition to focusing primarily on the Arcee Foundation Model (AFM) family, our next-generation approach to building efficient, compliant, and deployable foundation models. With the first AFM model, AFM-4.5B-Preview, now released, we're confident in opening our previous generation of specialized models to the broader community.

Production Models 

Our production models have been battle-tested in real-world enterprise environments, delivering reliable performance across diverse use cases from customer service automation to complex reasoning tasks. These models were previously available exclusively through Arcee Conductor, our SaaS platform, and have now been released with full commercial licensing, enabling unrestricted deployment.

Arcee-SuperNova-v1

Arcee-SuperNova-v1 (70B) is a merged model built from multiple advanced training approaches. At its core is a distilled version of Llama-3.1-405B-Instruct, converted into Llama-3.1-70B-Instruct using our DistillKit library, which preserves instruction-following strengths while reducing size. Alongside this, another Llama-3.1-70B model was instruction-tuned using synthetic data from our EvolKit library, improving precision and adherence across diverse queries. A third version underwent Direct Preference Optimization (DPO) to better align with human feedback. The model targets general intelligence applications and mathematical reasoning, serving as an effective base for further RLHF training.
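The logit-distillation step described above can be illustrated with the standard temperature-scaled KL objective. This is a minimal numpy sketch of the generic technique, not DistillKit's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Temperature-scaled KL divergence between teacher and student distributions."""
    t = softmax(teacher_logits / temperature)
    s = softmax(student_logits / temperature)
    # KL(teacher || student), summed over the vocabulary, averaged over the batch;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    kl = (t * (np.log(t + 1e-12) - np.log(s + 1e-12))).sum(axis=-1)
    return (temperature ** 2) * kl.mean()

# Toy check: identical logits give a (near-)zero loss.
logits = np.random.randn(4, 32)
assert distillation_loss(logits, logits) < 1e-6
```

In practice the teacher here would be Llama-3.1-405B-Instruct and the student the 70B model, with the loss computed over teacher-generated sequences.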

Enterprise Applications: Mathematical problem solving, technical documentation generation, code review and generation, multi-turn customer support conversations. 

License: Llama 3.1 Community License (commercial use permitted for services with fewer than 700M monthly active users).

Caller

Caller is our specialized 32-billion-parameter model, optimized for tool use and API orchestration. Based on Qwen-2.5-32B and refined through extensive training on structured interaction patterns, Caller was developed with our proprietary training frameworks to excel at function calling, API integration, and workflow automation. The model demonstrates precision in parsing complex API specifications, generating appropriate function calls, and managing multi-step automation pipelines.
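As a concrete illustration of the kind of structured output a tool-use model produces, here is a minimal sketch of an OpenAI-style function-calling exchange. The model id and tool schema are illustrative assumptions, not part of this release:

```python
import json

# Hypothetical tool schema in the common OpenAI function-calling format.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Request body you would POST to an OpenAI-compatible endpoint serving Caller
# (model id is illustrative; check the model card for the actual repo name).
request_body = {
    "model": "arcee-ai/caller",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [weather_tool],
}

# A well-formed tool call of the kind the model is trained to emit in response.
example_tool_call = {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})}
print(example_tool_call["name"])  # → get_weather
```

The model's job is to choose the right tool and emit valid JSON arguments that match the declared schema, which the calling application then executes.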

Enterprise Applications: Customer support automation, CRM integration workflows, API gateway management, business process orchestration, and enterprise environments requiring reliable tool integration.

License: Apache 2.0 (unrestricted commercial use).

Virtuoso-Large

Virtuoso-Large is our versatile 72-billion-parameter model, engineered for precision and adaptability across diverse domain-specific applications. Based on Qwen-2.5-72B, the model was built using our comprehensive training pipeline, incorporating both DistillKit for knowledge transfer optimization and MergeKit for architectural refinements. This model excels in scenarios requiring deep domain expertise, complex multi-step reasoning, and high-fidelity content generation.
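For readers unfamiliar with MergeKit, a merge is driven by a small YAML recipe. The config below is an illustrative SLERP sketch in MergeKit's format, with hypothetical fine-tune names; it is not the actual recipe used to build Virtuoso-Large:

```yaml
# Illustrative mergekit SLERP config (hypothetical source models).
merge_method: slerp
base_model: example-org/qwen2.5-72b-sft
slices:
  - sources:
      - model: example-org/qwen2.5-72b-sft   # hypothetical instruction-tuned checkpoint
        layer_range: [0, 80]
      - model: example-org/qwen2.5-72b-dpo   # hypothetical preference-tuned checkpoint
        layer_range: [0, 80]
parameters:
  t: 0.5        # interpolation factor between the two checkpoints
dtype: bfloat16
```

A recipe like this is run with the `mergekit-yaml` CLI, which writes the merged checkpoint to a local directory.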

Enterprise Applications: Professional services automation, technical documentation generation, compliance report creation, and sophisticated analytical tasks requiring domain expertise. Demonstrated performance in legal document analysis, financial reporting automation, and engineering specification generation.

License: Apache 2.0 (unrestricted commercial use).

Research Models

Our research models push the boundaries of what's possible with specialized architectures and training methodologies. These models serve as testbeds for innovative approaches that inform our future foundation model development, offering the community unique insights into advanced training techniques and architectural innovations.

GLM-4-32B-Base-32K

GLM-4-32B-Base-32K is an enhanced version of THUDM's GLM-4-32B-Base-0414, specifically engineered to maintain robust performance over an extended 32,000-token context window (compared to the original's 8,192-token effective limit). This model emerged from our foundational research on context extension techniques originally developed for AFM-4.5B, where we pioneered methods for maintaining coherence and performance across dramatically extended sequence lengths. The architecture incorporates innovative attention scaling mechanisms and positional encoding strategies that prevent the typical degradation seen in long-context scenarios.
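To give a sense of the scale involved, a common family of context-extension techniques (position interpolation over rotary embeddings) rescales positions by the ratio of the new to original context length. The sketch below computes that ratio for the figures quoted above; the specific method used here is not disclosed in this post, so treat this as a generic illustration:

```python
# Generic position-interpolation arithmetic (illustrative only).
original_ctx = 8_192    # effective limit of GLM-4-32B-Base-0414
extended_ctx = 32_000   # target context window quoted above

scale = extended_ctx / original_ctx
print(scale)  # → 3.90625

def interpolated_position(m, scale):
    # With linear position interpolation, position m is mapped to m / scale,
    # keeping every position inside the originally trained range.
    return m / scale

assert interpolated_position(extended_ctx - 1, scale) < original_ctx
```

Keeping rescaled positions inside the trained range is what avoids the out-of-distribution positional encodings that cause the typical long-context degradation.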

Research Applications: Long-document analysis, extended conversation modeling, large codebase understanding, and multi-document reasoning tasks. Particularly valuable for research into context extension techniques and efficient long-sequence modeling. 

License: MIT (unrestricted commercial use).

Homunculus

Homunculus is a 12-billion parameter instruction model distilled from Qwen3-235B onto the Mistral-Nemo-Base-2407 backbone. The model preserves Qwen's distinctive two-mode interaction style—/think (deliberate chain-of-thought reasoning) and /nothink (concise direct answers)—while running efficiently on consumer hardware, demonstrating that sophisticated reasoning patterns can be preserved across significant parameter reductions. The name "Homunculus" reflects the model's nature as a miniaturized yet complete representation of its much larger teacher model, similar to the alchemical concept of a homunculus as a miniature, fully-formed human being created through artificial means.
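The two interaction modes are selected in the prompt itself. Here is a minimal sketch of building a chat request for each mode; the exact tag placement follows the Qwen-style convention described above, so check the model card for the authoritative template:

```python
def build_messages(user_prompt: str, think: bool = True) -> list[dict]:
    """Build a chat message list selecting Homunculus's reasoning mode.

    Assumes the Qwen-style convention of appending /think or /nothink
    to the user turn (verify against the model card before relying on it).
    """
    tag = "/think" if think else "/nothink"
    return [{"role": "user", "content": f"{user_prompt} {tag}"}]

msgs = build_messages("What is 17 * 23?", think=False)
print(msgs[0]["content"])  # → What is 17 * 23? /nothink
```

With `/think` the model emits an explicit chain-of-thought before its answer; with `/nothink` it answers directly, which is useful when latency matters more than visible reasoning.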

Research Applications: Knowledge distillation research, dual-mode reasoning studies, efficient inference optimization, and consumer-grade AI deployment scenarios.

License: Apache 2.0 (unrestricted commercial use).

Conclusion 

These five model releases represent years of research and development in enterprise-grade AI systems, now made freely available to accelerate innovation across the AI community. From production-tested commercial solutions to cutting-edge research architectures, these models demonstrate our commitment to advancing the field through open collaboration.

| Model | Description | Hugging Face | Together AI |
|---|---|---|---|
| Arcee-SuperNova-v1 | 70B flagship model with exceptional instruction-following | Link | - |
| Caller | 32B specialized model for tool use and API orchestration | Link | Available |
| Virtuoso-Large | 72B versatile model for precision domain applications | Link | Available |
| GLM-4-32B-Base-32K | 32B research model with extended 32K context window | Link | - |
| Homunculus | 12B distilled model preserving dual reasoning modes | Link | - |

We invite the community to explore these models, build upon our work, and contribute to the continued advancement of open-source AI. As we focus our efforts on the next generation of AFM models, these releases ensure that the knowledge and capabilities we've developed remain accessible to researchers, developers, and enterprises worldwide.

Ready to deploy these models in your environment? Visit our Hugging Face organization to get started, or explore deployment options through Together AI for hosted inference.
