Open-Source SLMs
Today, we’re excited to officially take AFM-4.5B out of preview and release the weights of both AFM-4.5B and AFM-4.5B-Base on Hugging Face. This marks a major milestone for our team at Arcee.ai as we open up access to a new, enterprise-grade language model designed for both flexibility and performance across a wide range of deployment environments.
AFM-4.5B is a 4.5 billion parameter instruction-tuned language model. From the start, our mission was to create a model that meets our customers’ needs for a reliable, adaptable, and cost-effective tool, one that can run efficiently on everything from modern GPUs to CPUs and edge devices. We know that organizations need models that are not just powerful, but also practical to deploy and easy to customize for downstream tasks.
The foundation of AFM-4.5B’s capabilities is the data it was trained on. The base model saw 8 trillion tokens—6.5 trillion tokens of general pretraining data, followed by 1.5 trillion tokens of mid-training data, with special emphasis on mathematical reasoning and code generation. After pretraining, we performed supervised fine-tuning using high-quality instruction datasets, then further refined the model with reinforcement learning using both verifiable rewards and human preference signals.
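“Verifiable rewards” here means reward signals that can be computed programmatically, for example by checking a math answer against a known solution, rather than coming from a learned reward model. As a purely illustrative sketch (this is not our training code, and the function below is hypothetical):

```python
import re

def math_reward(completion: str, ground_truth: str) -> float:
    """Toy verifiable reward: 1.0 if the final number in the completion
    matches the known answer, else 0.0. Production pipelines add answer
    normalization, symbolic equivalence checks, and format constraints."""
    match = re.search(r"(-?\d+(?:\.\d+)?)$", completion.strip())
    return 1.0 if match and match.group(1) == ground_truth.strip() else 0.0
```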
Data quality was our north star throughout this process. We partnered with DatologyAI, a leader in large-scale data curation, whose proprietary pipeline ensured that only the highest quality data made it into the training set. Their methods, ranging from model-based quality filtering to embedding-based curation and target distribution matching, made a real difference in the model’s performance and reliability.
AFM-4.5B uses a decoder-only transformer architecture, but we’ve made key modifications for efficiency and capability. Notable features include grouped query attention for faster inference and ReLU² activation functions (replacing SwiGLU), which improve sparsification without compromising performance.
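For readers unfamiliar with ReLU² (squared ReLU), the contrast with SwiGLU is easiest to see in code. The sketch below is illustrative, not AFM-4.5B’s actual implementation; module names and shapes are hypothetical. ReLU² zeroes out many activations exactly, which is what makes the network easier to sparsify, and it drops SwiGLU’s extra gate projection.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReLU2MLP(nn.Module):
    """Feed-forward block with the squared-ReLU (ReLU^2) activation.
    Many activations are exactly zero, which aids sparsification."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff, bias=False)
        self.down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.relu(self.up(x)) ** 2)

class SwiGLUMLP(nn.Module):
    """Conventional SwiGLU block, shown for comparison. Note the extra
    gate projection that ReLU^2 does away with."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.gate = nn.Linear(d_model, d_ff, bias=False)
        self.up = nn.Linear(d_model, d_ff, bias=False)
        self.down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))
```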
A major focus for us was making the model easy to customize. We paid particular attention to ensuring AFM-4.5B adapts smoothly with supervised fine-tuning (SFT), and we’ve put significant effort into making reinforcement learning workflows straightforward and effective. Whether you’re training for a new domain or adapting for a specialized use case, AFM-4.5B was built to make that process as simple as possible.
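As a rough sketch of what that customization path can look like with standard open-source tooling, the snippet below attaches a LoRA adapter using the peft library. The repo id and target module names are assumptions for illustration; check the model card for the canonical values.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Repo id assumed for illustration; see the model card for the canonical name.
model = AutoModelForCausalLM.from_pretrained("arcee-ai/AFM-4.5B")

# Target module names follow a common decoder-only convention and are
# assumptions here, not confirmed internals of AFM-4.5B.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights will train
```

From here, any standard SFT loop (for example, the Hugging Face Trainer) applies unchanged.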
*All models were evaluated under the same benchmark parameters using our in-house evaluation suite. Because the hybrid reasoning modes of Qwen3-4B and SmolLM3 make their benchmark scores noisier, the figures shown are their scores on this suite; on Qwen’s own benchmarks, MMLU is closer to 70 and GPQA-Diamond around 37, and SmolLM3 shows similar variation across benchmark suites.
The version we’re releasing today is tuned for general chat assistance, retrieval, and creative writing scenarios. It does not yet have a dedicated “reasoning” or “thinking” mode; those variants are coming in the weeks ahead, along with future releases focused on advanced math and code performance.
Our top priority has always been solving real customer problems: delivering a performant, efficient, and easily customizable model for practical, day-to-day deployment. Math and code performance is already strong, and we’ll keep improving in these areas while introducing agentic and reasoning-focused models soon.
During its preview phase, AFM-4.5B made an appearance on the Yupp.ai leaderboard—at one point tying for 2nd place with GPT-4.5 when filtered for 2–5 conversation turns. While this model wasn’t specifically optimized for human preference, we spent a considerable amount of time ensuring that the “vibe” of its responses is approachable, clear, and even enjoyable to chat with. We’re sincerely proud of how this model communicates for its size, and we hope you’ll enjoy using it as much as we enjoyed building it.
You can try out the model today: the weights for both AFM-4.5B and AFM-4.5B-Base are live on Hugging Face.
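Here’s a minimal quick-start with the Hugging Face transformers library (the repo id below is illustrative; see the model card for the canonical one):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/AFM-4.5B"  # assumed repo id; check the model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # device_map needs accelerate
)

messages = [{"role": "user", "content": "Explain grouped query attention in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```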
We’re also making a change to our original licensing plan. Instead of the previously planned CC-BY-NC license, AFM-4.5B is now released under the Arcee Model License. Here’s the key point:
If your company makes less than $1.75 million in annual revenue, you’re free to use the model for commercial purposes, as long as you’re not providing the weights to a company above that threshold. If your product or application using AFM-4.5B is sold to a larger company, that’s fine—as long as they don’t receive or run the weights directly.
We want as many developers, researchers, and builders as possible to benefit from AFM-4.5B. At the same time, this license ensures that we can continue to develop and support the model for the community.
Building AFM-4.5B has been a labor of love, and none of this would be possible without our partners, community, and users. We’re grateful for your support and feedback. We’re excited to see what you build with AFM-4.5B—and can’t wait to share what’s coming next.
If you have feedback or ideas, we’d love to hear from you.
The Arcee Team