AI News·4 min read

Cohere Command A+: The First Apache 2.0 Open Model with Lossless Quantization

Cohere releases Command A+, a 218-billion-parameter open-source model under the Apache 2.0 license, with lossless quantization and native citations for enterprise AI.


Cohere Command A+ — What Makes It Different?

Cohere has released Command A+, a 218-billion-parameter language model built on a sparse Mixture-of-Experts (MoE) architecture, in which only a subset of the experts is activated for each token. Unlike most frontier models, which are locked behind proprietary APIs, Command A+ ships under the Apache 2.0 license, so enterprises can run, modify, and deploy it entirely on their own infrastructure.

Lossless Quantization — Why Does It Matter?

Most quantized AI models trade some accuracy for speed and a smaller memory footprint. Cohere claims Command A+ achieves lossless 4-bit quantization of both weights and activations (W4A4), meaning the compressed model retains the same quality as the full-precision version. If the claim holds up in independent testing, it would be a significant advance for teams running large models on limited hardware.
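To see why a "lossless" claim is notable, here is a minimal sketch of plain round-to-nearest 4-bit quantization. It is a generic textbook scheme, not Cohere's method, and the function names and sample weights are made up for illustration.

```python
def quantize_int4(values):
    """Symmetric round-to-nearest quantization to signed 4-bit codes in [-8, 7].

    Generic illustration only; this is NOT Cohere's quantization recipe.
    """
    max_abs = max(abs(v) for v in values)
    # One shared scale maps the largest magnitude onto the code 7.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int4(codes, scale):
    """Map 4-bit codes back to approximate floating-point values."""
    return [c * scale for c in codes]


# Hypothetical sample weights, chosen only for the demonstration.
weights = [0.70, -0.33, 0.12, 0.04, -0.61]
codes, scale = quantize_int4(weights)
restored = dequantize_int4(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("4-bit codes:", codes)
print(f"max round-trip error: {max_error:.4f}")
```

With only 16 representable levels, the round trip does not reproduce the original values exactly. A naive scheme like this one is lossy at the level of individual weights, which is why matching full-precision quality at 4 bits would be a meaningful result.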

Sovereign AI — Who Controls Your Data?

By open-sourcing under Apache 2.0, Cohere is betting on "sovereign AI": the idea that enterprises and governments should control their own AI infrastructure. With a self-hosted deployment, no data leaves your servers, there is no vendor lock-in, and there are no surprise API price hikes.

Native Citations — Building Trust in AI Outputs

Command A+ includes built-in citation capabilities that automatically ground its responses in verifiable sources, so a reader can trace a claim back to the material it came from. For industries such as legal, healthcare, and finance, where accuracy is non-negotiable, this makes model outputs far easier to audit.
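As a rough illustration of what span-level citations can look like once they reach an application, the sketch below attaches source markers to the cited parts of an answer. The `Citation` class, its field names, and the `doc_2` identifier are assumptions for this example; they are not Cohere's actual response schema.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """Hypothetical citation record; field names are illustrative only."""
    start: int         # character offset where the supported span begins
    end: int           # character offset just past the supported span
    source_ids: list   # identifiers of the documents backing the span


def render_with_citations(text, citations):
    """Insert [source] markers after each cited span.

    Spans are processed right to left so earlier offsets stay valid
    while markers are being inserted.
    """
    out = text
    for c in sorted(citations, key=lambda c: c.end, reverse=True):
        marker = "".join(f"[{s}]" for s in c.source_ids)
        out = out[:c.end] + marker + out[c.end:]
    return out


# Made-up answer and citation for demonstration.
answer = "The notice period is 30 days."
cites = [Citation(start=21, end=28, source_ids=["doc_2"])]
print(render_with_citations(answer, cites))
# prints: The notice period is 30 days[doc_2].
```

Keeping citations as offsets into the answer text, rather than baking markers into the text itself, lets a user interface decide how to display them (footnotes, hover cards, or inline tags).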

How Can You Use It?

The model weights are available on Hugging Face. Developers can download, fine-tune, and deploy Command A+ locally or on private cloud infrastructure. Cohere also merged with German AI startup Aleph Alpha, expanding its enterprise reach across Europe and North America.

Frequently Asked Questions

Q: Is Command A+ really free to use commercially? A: Yes. The Apache 2.0 license permits commercial use, modification, and distribution without licensing fees.

Q: How does lossless quantization work? A: Command A+ combines its sparse MoE architecture with 4-bit weight-and-activation quantization (W4A4). According to Cohere, this preserves model accuracy while dramatically reducing memory and compute requirements.

Q: What hardware do I need to run it? A: Thanks to the sparse architecture and quantization, Command A+ can run on significantly less hardware than a dense 218B model would typically require, which puts it within reach of mid-size enterprises.
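For a sense of scale, the back-of-the-envelope calculation below estimates how much memory the weights alone would occupy at different precisions. The 218-billion parameter count comes from the article; the helper name is made up, and the figures ignore the KV cache, activations, and any per-block quantization scales, so real deployments need more headroom.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Counts weights only: no KV cache, activations, or quantization metadata.
    """
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 218e9  # parameter count stated in the article

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(PARAMS, bits):.0f} GB")
# prints:
# 16-bit weights: ~436 GB
# 8-bit weights: ~218 GB
# 4-bit weights: ~109 GB
```

Note that in a typical MoE deployment all experts still have to be held in memory even though only some are active per token, so sparsity mainly reduces compute per token while quantization is what shrinks the memory footprint.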

