Arcee AI | Blog

Is Running Language Models on CPU Really Viable?

Running language models on CPUs has been discussed for some time, but delivering accurate results with production-level performance remains unproven. So, is using CPUs for language models truly viable in production?

Open-Source SLMs • June 30, 2025

Releasing Five Open-Weights Models

SuperNova 70B, Virtuoso-Large 72B, Caller 32B, GLM-4-32B-Base-32K, and Homunculus 12B

Case Studies • June 24, 2025

Research Spotlight: 3 Learnings from 3 MergeKit Use Cases

Merging for pre-training, data privacy in healthcare, and language support

Research • June 23, 2025

Extending AFM-4.5B to 64k Context Length

From 4k to 64k context through aggressive experimentation, model merging, distillation, and a concerning amount of soup.

Company • June 18, 2025

Deep Dive: AFM-4.5B, the First Arcee Foundation Model

Built for performance, compliance, and affordability.

Company • June 18, 2025

Announcing Arcee Foundation Models

The first release, AFM-4.5B, is a 4.5-billion-parameter model that delivers excellent accuracy, strict compliance, and very high cost-efficiency.

Research • June 10, 2025

Breaking Down Model Vocabulary Barriers

A training-free method to transplant tokenizers in pre-trained language models

Partnerships • June 7, 2025

Building an AI Retail Assistant at the Edge with SLMs on Intel CPUs

Thanks to a chatbot interface powered by open-source small language models and real-time data analytics, store associates can interact naturally through voice or text.