Arcee partners with Hugging Face to make the Hub the exclusive home for all models, datasets, and agent traces.
Arcee AI has entered a multi-million-dollar strategic partnership with Hugging Face. Starting now, the Hub is the exclusive home for everything we build: every open model we release, every private model we train, every proprietary dataset we curate, and every agent trace we log. All of it lives there.
As we’ve built on the momentum and lessons from our first generation of Trinity models, we’ve kept coming back to the same question:
What enables a small team to consistently ship world-class work?
Running a modern AI lab means orchestrating a lot of moving pieces. You’re managing costs, training across clouds, spinning clusters up and down, iterating quickly, and trying to get models into the hands of developers and enterprises as fast as possible. For a team our size, with 14 people on research and 30 across the whole company, the infrastructure holding all of that together has to be invisible. The moment our team is thinking more about storage and cost than about model quality and design, we’ve already lost.
We needed a storage layer that every ML engineer on the team, and anyone who joins the team, already knows. One that abstracts away the complexity of multi-cloud without adding new complexity in its place. One that doesn’t lock us into any single compute provider, so we can train wherever capacity is cheapest and best.
Hugging Face Buckets solves all of that: storage priced per TB, with egress and CDN included, optimized for AI artifacts. Fast reads and writes for weights and datasets from anywhere. Because it sits outside any single cloud, we’re fully compute-agnostic: train on any provider, spin it down, and our models and data are right there waiting.
We’ve been building in the open on the Hugging Face Hub for more than two years: 200+ models, tons of datasets, and millions of downloads. The arcee-ai organization is already one of the most active American labs on the platform. It’s the home of our Trinity models, and the easiest way for users to quickly start working with fully permissive American models trained by a small, dedicated team here in San Francisco.
This community has been generous with us, and making Hugging Face the official home for everything felt like the natural next step.
Hugging Face is the platform 15 million ML engineers live on. It’s the UX they already understand. When we distribute through the Hub, we’re meeting them exactly where they are. For a team whose mission is to build the best open-weight models in the world and get them deployed everywhere, from the smallest edge devices to the largest and most diverse developer workloads and enterprise deployments, that ease of access is critical to delivering amazing AI transformations for our customers.
We’re not a big lab. We don’t have hundreds of engineers to throw at infrastructure problems. What we do have is a tight, ridiculously talented team that punches well above its weight because we’re ruthless about where we spend our energy.
This partnership removes an entire category of operational overhead. Our team wakes up thinking about training runs, data quality, model design, and most importantly, the product experiences we deliver.
We intend to spend every bit of focus we get back on the exciting and important work ahead.
This cements a relationship that’s been building for years, and there is a lot to be excited about: deeper integrations, co-developed releases with the amazing HF team, and a whole lot more.
Follow arcee-ai on the Hub: we’ve been busy working on our next generation of open models and, more importantly, on the way they’ll grow with you.