Saturday, September 20, 2025

Deep Cogito Launches 4 Open-Source AI Models: New Hybrid Reasoning System with Self-Learning Capabilities Challenges GPT-4


Deep Cogito’s Innovative Large Language Models

Deep Cogito, a relatively obscure San Francisco-based AI research startup founded by former Google employees, has launched four new large language models (LLMs) with an unusual goal: improving their own reasoning capabilities over time, without human intervention. The models, part of Cogito’s v2 family, range from 70 billion to 671 billion parameters and are available to AI developers and enterprises under a mix of limited and fully open licensing terms. They are:

– Cogito v2-70B (Dense)
– Cogito v2-109B (Mixture-of-experts)
– Cogito v2-405B (Dense)
– Cogito v2-671B (MoE)

Model Types and Their Applications

The Dense and Mixture-of-Experts (MoE) models cater to different requirements. The Dense 70B and 405B models activate all parameters during each forward pass, making them more predictable and easier to deploy across various hardware setups. They are particularly suitable for low-latency applications, fine-tuning, and environments with limited GPU capacity. In contrast, MoE models, such as the 109B and 671B versions, utilize a sparse routing mechanism that activates only a few specialized “expert” subnetworks at a time. This allows for significantly larger overall model sizes without a corresponding increase in computational costs.
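The sparse-routing idea behind the MoE variants can be sketched in a few lines. The toy NumPy function below is illustrative only (the dimensions, router, and expert layers are invented for the example, not Cogito’s actual architecture); it shows why compute per token stays roughly constant even as the total expert count, and therefore the parameter count, grows:

```python
import numpy as np

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route one token through only the top-k experts (sparse MoE sketch).

    x            : (d,) token representation
    experts      : list of (d, d) weight matrices, one per expert
    gate_weights : (num_experts, d) router matrix
    """
    logits = gate_weights @ x                    # router score per expert
    top = np.argsort(logits)[-top_k:]            # indices of the k best experts
    weights = np.exp(logits[top])
    gates = weights / weights.sum()              # softmax over the chosen experts
    # Only the selected experts run; the rest of the parameters sit idle,
    # so cost scales with top_k, not with the total number of experts.
    return sum(g * (experts[i] @ x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate = rng.standard_normal((n_experts, d))
y = moe_forward(rng.standard_normal(d), experts, gate)
```

A dense layer would multiply through all 16 expert matrices’ worth of parameters; here only 2 of them touch the token.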


Performance and Accessibility

The flagship model in Cogito v2, the 671B MoE, leverages its scale and routing efficiency to match or surpass leading open models on benchmarks while using significantly shorter reasoning chains. The models are available for download on Hugging Face, and enterprises can run them locally via Unsloth. For those unable to run inference on their own hardware, application programming interfaces (APIs) are available from Together AI, Baseten, and RunPod. Deep Cogito has also released a quantized 8-bit floating point (FP8) version of the 671B model, which shrinks each parameter from 16 bits to 8 bits. This lets users run the large model faster and more cheaply on more accessible hardware, typically retaining 95 to 99% of full-precision performance, though accuracy may dip slightly on tasks requiring fine-grained precision, such as certain mathematical or reasoning problems.
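The storage-versus-precision trade-off behind 8-bit quantization can be made concrete with a small sketch. Note the caveat: true FP8 uses an 8-bit floating-point format (e.g. e4m3), whereas this toy uses a symmetric integer grid, which is simpler but illustrates the same halving of storage and the bounded rounding error:

```python
import numpy as np

def quantize_8bit(w):
    """Symmetric 8-bit quantization: store weights as int8 plus one scale.

    Illustrative only: real FP8 is an 8-bit *floating-point* format,
    not the integer grid used here, but the trade-off is analogous.
    """
    scale = np.abs(w).max() / 127.0                 # one scale per tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).standard_normal(1000).astype(np.float32)
q, scale = quantize_8bit(w)
err = np.abs(w - dequantize(q, scale)).max()        # bounded by scale / 2
```

Each weight now costs 1 byte instead of 2 (FP16) or 4 (FP32), and the worst-case rounding error is half a quantization step, which is why benchmark scores usually drop only a little.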

Hybrid Reasoning Systems

All four Cogito v2 models are designed as hybrid reasoning systems. They can provide immediate responses to queries or take time to reflect internally before answering. Importantly, this reflection is not merely a runtime behavior; it is integrated into the training process itself. The models are trained to internalize their reasoning, meaning that the paths they take to arrive at answers—the cognitive steps, if you will—are encoded back into the models’ weights. Over time, they learn which lines of thought are significant and which are not. As noted in Deep Cogito’s blog post, the researchers aim to “disincentivize the model from ‘meandering’ to reach an answer, instead fostering a stronger intuition for the correct reasoning trajectory.”
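In practice, the switch between the two modes is exposed through the prompt. The sketch below uses the trigger phrase from Deep Cogito’s v1 model cards; treat its applicability to v2 as an assumption and check the model card for the exact convention:

```python
def build_messages(question: str, deep_thinking: bool = False) -> list:
    """Build a chat message list that toggles Cogito's reasoning mode.

    ASSUMPTION: 'Enable deep thinking subroutine.' is the trigger phrase
    documented for Cogito v1; v2 may use a different mechanism.
    """
    messages = []
    if deep_thinking:
        # Reflection mode: the model reasons internally before answering.
        messages.append({"role": "system",
                         "content": "Enable deep thinking subroutine."})
    # Standard mode: no system trigger, the model answers directly.
    messages.append({"role": "user", "content": question})
    return messages

fast = build_messages("What is 17 * 24?")
slow = build_messages("What is 17 * 24?", deep_thinking=True)
```

The same weights serve both modes; only the prompt decides whether the model spends tokens reflecting first.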

A Promising Future

Deep Cogito claims that this approach leads to faster, more efficient reasoning and an overall improvement in performance, even in “standard” mode. While many in the AI community are only now becoming aware of the company, Deep Cogito has been developing its technology for over a year. It emerged from stealth in April 2025 with a series of open-source models trained on Meta’s Llama 3.2, backed by a $13 million seed round led by Benchmark in November 2024; Benchmark’s Eric Vishria joined the company’s board as part of that round. As previously reported by VentureBeat, the smallest Cogito v1 models (3B and 8B) outperformed their Llama 3 counterparts across various benchmarks, sometimes by considerable margins.

Deep Cogito’s CEO and co-founder, Drishan Arora, who previously served as a lead LLM engineer at Google, articulated the company’s long-term vision of creating models that can reason and improve with each iteration, akin to how AlphaGo refined its strategy through self-play. The core method employed by Deep Cogito, called iterated distillation and amplification (IDA), replaces traditional hand-written prompts or static teachers with the model’s own evolving insights. With Cogito v2, the team has scaled this feedback loop significantly.
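Deep Cogito’s training pipeline is not public, but the shape of an IDA loop can be sketched: amplify (spend extra inference-time compute to get better answers than a single pass would), then distill (update the model so it produces those answers directly). The toy below stands in a square-root estimator for the LLM, purely to make the loop concrete; both callables are placeholders:

```python
def ida(model, problems, amplify, distill, rounds=5):
    """Skeleton of iterated distillation and amplification (IDA).

    amplify : use extra compute (search, longer reasoning) to produce
              answers better than the model's direct output.
    distill : fold those improved answers back into the model.
    Placeholders only; Deep Cogito's actual pipeline is not public.
    """
    for _ in range(rounds):
        targets = {p: amplify(model, p) for p in problems}  # slow, stronger answers
        model = distill(model, targets)                     # train to match them
    return model

# Toy stand-in: the "model" is a table of square-root guesses; amplification
# is one Newton step, distillation simply memorizes the improved answers.
problems = [2.0, 9.0]
model = {p: 1.0 for p in problems}
amplify = lambda m, p: 0.5 * (m[p] + p / m[p])  # Newton step toward sqrt(p)
distill = lambda m, targets: dict(targets)
model = ida(model, problems, amplify, distill)
```

Each round, the “student” starts from where the last round’s “teacher” left off, which is the self-play analogy Arora draws with AlphaGo.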
