
ByteDance Challenges GPT-4 with 36B Parameter AI Model: New Open-Source LLM Handles 512K Tokens, Rivals Industry Leaders


TikTok and ByteDance’s Latest Developments

TikTok is making headlines once again, particularly after the White House’s involvement with the popular social media platform. Its parent company, ByteDance, has also made a surprising announcement: the company’s Seed Team of AI researchers has released Seed-OSS-36B on Hugging Face, the AI code-sharing site. The new line of open-source large language models (LLMs) is designed for advanced reasoning and developer-focused usability, and it offers a longer token context than many competing LLMs from U.S. tech firms, including industry leaders such as OpenAI and Anthropic.

Overview of Seed-OSS-36B Models

The Seed-OSS-36B collection introduces three primary variants:

1. Seed-OSS-36B-Base with synthetic data
2. Seed-OSS-36B-Base without synthetic data
3. Seed-OSS-36B-Instruct

By releasing both synthetic and non-synthetic versions of the Seed-OSS-36B-Base model, the Seed Team aims to balance practical performance with research flexibility. The synthetic-data variant, which is trained with additional instruction data, consistently achieves higher scores on standard benchmarks and is intended as a more robust general-purpose option. In contrast, the non-synthetic model omits these enhancements, providing a cleaner foundation that avoids potential biases or distortions from synthetic instruction data. This dual offering allows applied users to access improved results while enabling researchers to maintain a neutral baseline for studying post-training methods.

The Seed-OSS-36B-Instruct model is distinct in that it is post-trained with instruction data, focusing on task execution and instruction adherence rather than serving solely as a foundational model. All three models are released under the Apache-2.0 license, permitting free use, modification, and redistribution by researchers and developers in enterprises. This means they can be utilized for commercial applications, whether internal to a company or customer-facing, without incurring any licensing fees or application programming interface (API) costs from ByteDance.

The Growing Trend of Open Source Models

This release continues the trend of Chinese companies launching powerful open-source models, a push that OpenAI has sought to answer with its recent open-source gpt-oss duo. The Seed Team has positioned Seed-OSS for international applications, emphasizing its versatility in reasoning, agent-like task execution, and multilingual capabilities. Established in 2023, the Seed Team has focused on developing foundation models that cater to both research and practical use cases.

Technical Specifications and Features

The architecture of Seed-OSS-36B combines familiar design choices: causal language modeling, grouped query attention, SwiGLU activation, RMSNorm, and RoPE positional encoding. Each model has 36 billion parameters across 64 layers and a vocabulary of roughly 155,000 tokens. A standout feature is its native long-context capability, with a maximum length of 512,000 tokens, allowing it to process extensive documents and reasoning chains without performance degradation. That context window is roughly twice that of OpenAI’s new GPT-5 model family and corresponds to about 1,600 pages of text, similar in length to a Christian Bible.
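
Because the models are published on Hugging Face under Apache-2.0, they can in principle be loaded with the standard transformers API. The sketch below is illustrative only: the repository id and generation settings are assumptions based on the release described above, not official ByteDance documentation, so the model card remains the authoritative reference.

```python
# Minimal sketch: loading Seed-OSS-36B-Instruct with Hugging Face transformers.
# The repository id and generation settings are assumptions for illustration;
# check the model card on Hugging Face for the exact, supported usage.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ByteDance-Seed/Seed-OSS-36B-Instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # 36B parameters: expect roughly 72 GB of weights in bf16
    device_map="auto",            # shard across available GPUs
)

# Standard chat-style prompting; the 512K-token context window means very long
# documents can be placed directly in the prompt.
messages = [{"role": "user", "content": "Summarize the attached contract in five bullet points."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```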

Another notable feature is the introduction of a thinking budget, which enables developers to specify the extent of reasoning the model should perform before providing an answer. This concept has also been observed in other recent open-source models, such as Nvidia’s Nemotron-Nano-9B-v2, available on Hugging Face. This functionality allows teams to adjust performance based on task complexity and deployment efficiency requirements. Budgets are recommended in multiples of 512 tokens, with a budget of 0 providing a direct response mode.
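
To make the thinking-budget idea concrete, here is a minimal sketch of how such a control might be passed at generation time. The thinking_budget keyword and its handling by the chat template are assumptions drawn only from the description above (budgets in multiples of 512 tokens, 0 for a direct answer); consult the Seed-OSS model card for the actual interface.

```python
# Sketch of the "thinking budget" control described above (assumed keyword name).
# Budgets are recommended in multiples of 512 tokens; 0 requests a direct answer
# with no intermediate reasoning.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ByteDance-Seed/Seed-OSS-36B-Instruct")  # assumed repo id

def build_prompt(messages, thinking_budget=512):
    # Hypothetical: forward the budget to the chat template so the model knows
    # how many reasoning tokens it may spend before answering.
    return tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        thinking_budget=thinking_budget,  # assumed template argument
    )

# Usage: a cheap, direct response for a simple lookup ...
simple = build_prompt([{"role": "user", "content": "What is 17 * 24?"}], thinking_budget=0)
# ... versus a larger budget for a multi-step reasoning task.
hard = build_prompt([{"role": "user", "content": "Plan a three-stage data migration."}], thinking_budget=2048)
```

In practice, this kind of knob lets a team spend reasoning tokens only where the task warrants it, keeping latency and cost down on routine queries.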

Performance Benchmarks

Benchmarks released alongside Seed-OSS-36B position it among the leading large open-source models. The Instruct variant, in particular, achieves state-of-the-art results in several areas:

Math and reasoning: Seed-OSS-36B-Instruct scores 91.7 percent on AIME24 and 65 on BeyondAIME, both representing open-source “state-of-the-art” (SOTA).
Coding: On LiveCodeBench v6, the Instruct model records a score of 67.4, another SOTA achievement.
Long-context handling: On RULER at a 128K context length, it reaches 94.6, marking the highest open-source result reported.
Base model performance: The synthetic-data Base variant delivers a score of 65.

