SenseTime's SenseNova V6: China's Most Advanced Multimodal Model with the Lowest Cost in the Industry

Integrating AI into Everyday Life 

HONG KONG, April 12, 2025 /PRNewswire/ -- SenseTime launched its newly upgraded large model series, SenseNova V6, at its Tech Day event held in several locations, including Shanghai and Shenzhen. Leveraging advances in the training of multimodal long chain-of-thought (CoT), global memory, and reinforcement learning, the model delivers industry-leading multimodal reasoning capabilities while setting a new benchmark for cost efficiency.

The capabilities of the SenseNova V6 model have been greatly enhanced, with strong advantages in long CoT, reasoning, mathematical capabilities, and global memory. Its multimodal reasoning capabilities ranked first in China when benchmarked against GPT-o1, while its data analysis performance outpaced GPT-4o. It also combines high performance with cost efficiency. Its multimodal training efficiency is aligned with that of language models, providing the lowest training costs in the industry. Its reasoning costs are also the lowest in the industry. The new lightweight full-modal interactive model, SenseNova V6 Omni, delivers the most advanced multimodal interactive capabilities in China. It is China's first large model that supports in-depth analysis of 10-minute mid-to-long form videos, benchmarked against Gemini 2.5 Turbo to be among the strongest in its class.

Dr. Xu Li, Chairman of the Board and CEO of SenseTime, said, "AI's true purpose is found in our everyday lives. SenseNova V6 has pushed past the boundaries of multimodality, unlocking infinite possibilities in reasoning and intelligence."

Multimodal long-chain reasoning, reinforcement learning, and global memory: SenseNova V6 leads the way in enabling multimodal deep thinking

As a native Mixture of Experts (MoE)-based multimodal general foundation model with over 600 billion parameters, SenseNova V6 has achieved multiple technological breakthroughs. A single model is able to perform a range of tasks across text and multimodal domains, including:

  • Long CoT: Trained on over 200B of high-quality multimodal long CoT data, with the longest CoT reaching 64K tokens;
  • Mathematical Capabilities: Significantly outperformed GPT-4o in data analysis capabilities;
  • Reasoning Capabilities: Ranked first in China for multimodal deep reasoning, benchmarked against GPT-o1;
  • Global Memory: First in China to achieve long-form video understanding, supporting content of 10 minutes in length for comprehension and deep reasoning.

In leading benchmark evaluations of reasoning and multimodal capabilities, SenseNova V6 achieved state-of-the-art results across multiple metrics.

Key indicators: SenseNova V6 demonstrated strong overall performance in language tasks, on par with leading international models. It excelled in multimodal capabilities, with outstanding results in all aspects. Both its language reasoning and multimodal reasoning capabilities are benchmarked against leading international models such as GPT-4.5 and Gemini 2.0 Pro.

Strong reasoning capabilities: From SenseNova 5.5 to V6/V6 Reasoner, the SenseNova unified model demonstrated significant improvements.

Based on more than 200B of high-quality multimodal long CoT data, SenseTime leverages multi-agent collaboration to synthesize and verify long CoT. SenseNova V6 has developed exceptional multimodal reasoning capabilities, supporting multimodal long CoTs up to 64K tokens, enabling the model's long-term thinking capability.

In solving complex real-world problems, SenseNova V6 utilizes its robust hybrid image and text understanding and reasoning capabilities to help users with a range of tasks.

For complex document processing scenarios, SenseNova V6 is able to help users with difficult tasks through its strong multimodal reasoning capabilities. For example, in insurance claims processing, SenseNova V6 can assess whether the submitted commercial health insurance claims meet the requirements. It can detect issues such as unnecessary prescriptions and examinations, missing documents, or incomplete submissions.

Leveraging breakthroughs in multimodal reinforcement learning, SenseTime has developed a hybrid reinforcement learning framework for various image-text tasks, based on different difficulty levels and multi-reward models.

China's first model to break the 10-minute barrier in video understanding, achieving analysis of extended content within seconds

With its global memory capability, SenseNova V6 overcomes the limitations of traditional models that could only support short videos, and now supports full-framerate analysis of 10-minute videos.

With advanced comprehension capabilities, SenseNova V6 is also able to intelligently edit and extract video highlights, helping users to retain memorable moments.

SenseTime's proprietary technology aligns visual information (images), auditory information (speech and sounds), linguistic information (subtitles and spoken language), and temporal logic to form a multimodal unified sequential representation. Based on this framework, it applies fine-grained cascading compression and content-aware dynamic filtering to achieve high-ratio compression of long videos. A 10-minute video can be compressed into 16K tokens while retaining key semantics.
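As a rough back-of-the-envelope illustration of that figure (not SenseTime's method), the average token budget per second of video can be worked out as below. The release does not say whether "16K" means 16,384 or 16,000 tokens, so both readings are shown as assumptions, and the function name is hypothetical.

```python
def tokens_per_second(total_tokens: int, duration_seconds: float) -> float:
    """Average number of tokens available per second of video."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return total_tokens / duration_seconds


# The release states a 10-minute video fits into 16K tokens.
VIDEO_SECONDS = 10 * 60

# Assumption: "16K" read as 16 * 1024 tokens.
budget_binary = tokens_per_second(16 * 1024, VIDEO_SECONDS)

# Assumption: "16K" read as 16,000 tokens.
budget_decimal = tokens_per_second(16_000, VIDEO_SECONDS)

print(f"{budget_binary:.2f} tokens/s (16,384-token reading)")
print(f"{budget_decimal:.2f} tokens/s (16,000-token reading)")
```

Under either reading the budget is roughly 27 tokens per second of video, shared across the visual, audio, and subtitle streams.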

Human-like interaction: SenseNova V6 Omni launches with multi-industry deployment 

With the launch of SenseNova V6, SenseTime has upgraded its real-time interactive unified large model to SenseNova V6 Omni, with deep optimizations across scenarios, including role-playing, translation and reading, cultural tourism guiding, picture book narration, and mathematical explanation.

In translation and reading scenarios, SenseNova V6 Omni enables users to achieve precise spatial interactions with a simple finger gesture. The model also accurately understands the relationship between local and global information, providing a more intuitive and human-like interactive experience.

SenseNova V6 Omni features more human-like perceptual and expressive abilities, as well as emotional understanding. It has been deployed across multiple industries and scenarios, including embodied intelligence, becoming the first commercialized full-modality real-time interactive model in China.

Full-featured version of SenseChat launched, now available for preview

SenseTime has released a comprehensive update to SenseChat, along with a brand-new app built on the complete capabilities of SenseNova V6. Through a single access point, users can engage in seamless multimodal interactive streaming experiences across text, images, and video.

The SenseChat app is available for preview and SenseNova V6 is now available for trial via the SenseChat web platform at https://chat.sensetime.com/wb/chat.

RMB100 million in vouchers released to accelerate full-stack scenario implementation

SenseTime also announced a dedicated subsidy of RMB100 million, aimed at advancing emerging fields such as embodied intelligence and AIGC. Through targeted and multi-dimensional initiatives, SenseTime is delivering a one-stop solution designed for high efficiency, low cost, and end-to-end AI implementation, spanning expert consulting, model training, and reasoning validation.

- End -

About SenseTime

SenseTime is a leading AI software company focused on creating a better AI-empowered future through innovation. We are committed to advancing the state of the art in AI research, developing scalable and affordable AI software platforms that benefit businesses, people and society as a whole, while attracting and nurturing top talents to shape the future together.

With our roots in the academic world, we invest in our original and cutting-edge research that allows us to offer and continuously improve industry-leading AI capabilities in universal multimodal and multi-task models, covering key fields across perception intelligence, natural language processing, decision intelligence, AI-enabled content generation, as well as key capabilities in AI chips, sensors and computing infrastructure. Our proprietary AI infrastructure, SenseCore, integrates computing power, algorithms, and platforms, enabling us to build the "SenseNova" foundation model sets and R&D system that unlocks the ability to perform general AI tasks at low cost and with high efficiency. Our technologies are trusted by customers and partners in many industry verticals including Generative AI, Computer Vision and Smart Auto.

SenseTime has been actively involved in the development of national and international industry standards on data security, privacy protection, ethical and sustainable AI, working closely with multiple domestic and multilateral institutions on ethical and sustainable AI development. SenseTime was the only AI company in Asia to have its Code of Ethics for AI Sustainable Development selected by the United Nations as one of the key publication references in the United Nations Resource Guide on AI Strategies, which was published in June 2021.

SenseTime Group Inc. has successfully listed on the Main Board of the Stock Exchange of Hong Kong Limited (HKEX). We have offices in markets including Hong Kong, Shanghai, Beijing, Shenzhen, Chengdu, Hangzhou, Nanping, Qingdao, Xi'an, Macau, Kyoto, Tokyo, Singapore, Riyadh, Abu Dhabi, Dubai, Kuala Lumpur and South Korea, etc., as well as presence in Germany, Thailand, Indonesia and the Philippines. For more information, please visit SenseTime's official website or LinkedIn, X, Facebook and Youtube pages.

View original content to download multimedia: https://www.prnewswire.com/apac/news-releases/sensetimes-sensenova-v6-chinas-most-advanced-multimodal-model-with-the-lowest-cost-in-the-industry-302426998.html

SOURCE SenseTime