Saturday, September 20, 2025

Deep Cogito Launches 4 Open-Source AI Models: New Hybrid Reasoning System with Self-Learning Capabilities Challenges GPT-4


Deep Cogito’s Innovative Large Language Models

Deep Cogito, a relatively obscure San Francisco-based AI research startup founded by former Google employees, has launched four new large language models (LLMs) with an unusual goal: improving their own reasoning capabilities over time, without human intervention. The models, part of Cogito’s v2 family, range from 70 billion to 671 billion parameters and are available to AI developers and enterprises under a mix of limited and fully open licensing terms. They are:

– Cogito v2-70B (Dense)
– Cogito v2-109B (Mixture-of-experts)
– Cogito v2-405B (Dense)
– Cogito v2-671B (MoE)

Model Types and Their Applications

The Dense and Mixture-of-Experts (MoE) models cater to different requirements. The Dense 70B and 405B models activate all parameters during each forward pass, making them more predictable and easier to deploy across various hardware setups. They are particularly suitable for low-latency applications, fine-tuning, and environments with limited GPU capacity. In contrast, MoE models, such as the 109B and 671B versions, utilize a sparse routing mechanism that activates only a few specialized “expert” subnetworks at a time. This allows for significantly larger overall model sizes without a corresponding increase in computational costs.
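The sparse-routing idea behind the MoE variants can be sketched in a few lines. The toy NumPy function below is illustrative only (the dimensions, router, and expert layers are invented for the example, not Cogito’s actual architecture); it shows why compute per token stays roughly constant even as the total expert count, and therefore the parameter count, grows:

```python
import numpy as np

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route one token through only the top-k experts (sparse MoE sketch).

    x            : (d,) token representation
    experts      : list of (d, d) weight matrices, one per expert
    gate_weights : (num_experts, d) router matrix
    """
    logits = gate_weights @ x                    # router score per expert
    top = np.argsort(logits)[-top_k:]            # indices of the k best experts
    weights = np.exp(logits[top])
    gates = weights / weights.sum()              # softmax over the chosen experts
    # Only the selected experts run; the rest of the parameters sit idle,
    # so cost scales with top_k, not with the total number of experts.
    return sum(g * (experts[i] @ x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate = rng.standard_normal((n_experts, d))
y = moe_forward(rng.standard_normal(d), experts, gate)
```

A dense layer would multiply through all 16 expert matrices’ worth of parameters; here only 2 of them touch the token.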


Performance and Accessibility

The flagship model in Cogito v2, the 671B MoE, leverages its scale and routing efficiency to match or surpass leading open models on benchmarks while using significantly shorter reasoning chains. The models are available for download on Hugging Face, and enterprises can run them locally via Unsloth. For those unable to run inference on their own hardware, application programming interfaces (APIs) are available from Together AI, Baseten, and RunPod. Deep Cogito has also released a quantized 8-bit floating point (FP8) version of the 671B model, which shrinks each parameter from 16 bits to 8 bits. This lets users run the large model faster and more cheaply on more accessible hardware, typically retaining 95 to 99% of full-precision performance, though accuracy may dip slightly on tasks requiring fine-grained precision, such as certain mathematical or reasoning problems.
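The storage-versus-precision trade-off behind 8-bit quantization can be made concrete with a small sketch. Note the caveat: true FP8 uses an 8-bit floating-point format (e.g. e4m3), whereas this toy uses a symmetric integer grid, which is simpler but illustrates the same halving of storage and the bounded rounding error:

```python
import numpy as np

def quantize_8bit(w):
    """Symmetric 8-bit quantization: store weights as int8 plus one scale.

    Illustrative only: real FP8 is an 8-bit *floating-point* format,
    not the integer grid used here, but the trade-off is analogous.
    """
    scale = np.abs(w).max() / 127.0                 # one scale per tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).standard_normal(1000).astype(np.float32)
q, scale = quantize_8bit(w)
err = np.abs(w - dequantize(q, scale)).max()        # bounded by scale / 2
```

Each weight now costs 1 byte instead of 2 (FP16) or 4 (FP32), and the worst-case rounding error is half a quantization step, which is why benchmark scores usually drop only a little.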

Hybrid Reasoning Systems

All four Cogito v2 models are designed as hybrid reasoning systems. They can provide immediate responses to queries or take time to reflect internally before answering. Importantly, this reflection is not merely a runtime behavior; it is integrated into the training process itself. The models are trained to internalize their reasoning, meaning that the paths they take to arrive at answers—the cognitive steps, if you will—are encoded back into the models’ weights. Over time, they learn which lines of thought are significant and which are not. As noted in Deep Cogito’s blog post, the researchers aim to “disincentivize the model from ‘meandering’ to reach an answer, instead fostering a stronger intuition for the correct reasoning trajectory.”
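In practice, the switch between the two modes is exposed through the prompt. The sketch below uses the trigger phrase from Deep Cogito’s v1 model cards; treat its applicability to v2 as an assumption and check the model card for the exact convention:

```python
def build_messages(question: str, deep_thinking: bool = False) -> list:
    """Build a chat message list that toggles Cogito's reasoning mode.

    ASSUMPTION: 'Enable deep thinking subroutine.' is the trigger phrase
    documented for Cogito v1; v2 may use a different mechanism.
    """
    messages = []
    if deep_thinking:
        # Reflection mode: the model reasons internally before answering.
        messages.append({"role": "system",
                         "content": "Enable deep thinking subroutine."})
    # Standard mode: no system trigger, the model answers directly.
    messages.append({"role": "user", "content": question})
    return messages

fast = build_messages("What is 17 * 24?")
slow = build_messages("What is 17 * 24?", deep_thinking=True)
```

The same weights serve both modes; only the prompt decides whether the model spends tokens reflecting first.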

A Promising Future

Deep Cogito claims that this approach leads to faster, more efficient reasoning and an overall improvement in performance, even in “standard” mode. While many in the AI community are only now becoming aware of the company, Deep Cogito has been developing its technology for over a year. It emerged from stealth in April 2025 with a series of open-source models trained on Meta’s Llama 3.2, backed by a $13 million seed round led by Benchmark in November 2024; Benchmark’s Eric Vishria joined the company’s board as part of that round. As previously reported by VentureBeat, the smallest Cogito v1 models (3B and 8B) outperformed their Llama 3 counterparts across various benchmarks, sometimes by considerable margins.

Deep Cogito’s CEO and co-founder, Drishan Arora, who previously served as a lead LLM engineer at Google, articulated the company’s long-term vision of creating models that can reason and improve with each iteration, akin to how AlphaGo refined its strategy through self-play. The core method employed by Deep Cogito, called iterated distillation and amplification (IDA), replaces traditional hand-written prompts or static teachers with the model’s own evolving insights. With Cogito v2, the team has scaled this feedback loop significantly.
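Deep Cogito’s training pipeline is not public, but the shape of an IDA loop can be sketched: amplify (spend extra inference-time compute to get better answers than a single pass would), then distill (update the model so it produces those answers directly). The toy below stands in a square-root estimator for the LLM, purely to make the loop concrete; both callables are placeholders:

```python
def ida(model, problems, amplify, distill, rounds=5):
    """Skeleton of iterated distillation and amplification (IDA).

    amplify : use extra compute (search, longer reasoning) to produce
              answers better than the model's direct output.
    distill : fold those improved answers back into the model.
    Placeholders only; Deep Cogito's actual pipeline is not public.
    """
    for _ in range(rounds):
        targets = {p: amplify(model, p) for p in problems}  # slow, stronger answers
        model = distill(model, targets)                     # train to match them
    return model

# Toy stand-in: the "model" is a table of square-root guesses; amplification
# is one Newton step, distillation simply memorizes the improved answers.
problems = [2.0, 9.0]
model = {p: 1.0 for p in problems}
amplify = lambda m, p: 0.5 * (m[p] + p / m[p])  # Newton step toward sqrt(p)
distill = lambda m, targets: dict(targets)
model = ida(model, problems, amplify, distill)
```

Each round, the “student” starts from where the last round’s “teacher” left off, which is the self-play analogy Arora draws with AlphaGo.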
