
Tencent Unveils New Industry Benchmarks to Promote Creative AI Testing

Illustration of Tencent’s Creative Benchmark Suite evaluating AI-generated creative content across text, image, and emotional metrics
Image credit: aitechsuite.com

Chinese tech giant Tencent has released a new benchmark platform for assessing creative AI models. The move reflects the growing weight placed on creativity in machine-generated output and signals Tencent’s ambition to lead the generative AI race alongside global heavyweights such as OpenAI, Google, and Meta.

The new benchmark, which Tencent’s AI Lab calls the Creative Benchmark Suite (CBS), is a standardized set of tasks for evaluating how well large language and multimodal models handle work that requires creativity, imagination, and originality, abilities once thought exclusive to humans.


Why a Creative Benchmark Is Needed

As generative AI continues to evolve, there is rising demand for models that can do more than basic reasoning or factual reporting. Today, AI is used to generate:

  • Poems
  • Advertising copy
  • Storylines
  • Visual artwork
  • Music

However, evaluating creativity remains one of AI’s most persistent challenges. Conventional benchmarks focus on logical accuracy and task completion, often failing to capture the intangible qualities of novelty and emotional resonance.

To bridge this gap, Tencent researchers spent several months developing a system for measuring AI’s creative capacities. Dubbed “Optimizing Creativity,” the CBS framework combines machine scoring with human judgment to provide a robust, multidimensional assessment of creativity, the company said in a statement.
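
As a rough illustration of how machine scoring and human judgment might be combined, the sketch below blends one automated score with the average of several human ratings. Tencent has not published its formula: the function name blend_scores, the 0-to-1 scale, and the human_weight parameter are all assumptions made for this example.

```python
from statistics import mean


def blend_scores(machine_score: float,
                 human_ratings: list[float],
                 human_weight: float = 0.5) -> float:
    """Blend one automated score with the mean of several human ratings.

    Assumption: every score is on a 0-1 scale. human_weight is a
    hypothetical knob, not a value published for CBS.
    """
    if not human_ratings:
        raise ValueError("at least one human rating is required")
    if not 0.0 <= human_weight <= 1.0:
        raise ValueError("human_weight must be between 0 and 1")
    human_score = mean(human_ratings)
    # Weighted average of the human and machine components.
    return human_weight * human_score + (1.0 - human_weight) * machine_score


if __name__ == "__main__":
    # One automated score plus two human raters, weighted equally.
    print(round(blend_scores(0.6, [0.8, 1.0]), 3))
```

Raising human_weight makes the result lean on the reviewers, which may suit subjective criteria; lowering it favors the cheaper and more repeatable automated score.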


How Tencent’s Benchmark Works

Tencent’s Creative Benchmark Suite (CBS) evaluates AI models across five key dimensions:

  1. Originality
    Evaluates how novel a response is compared to training data and typical outputs.
  2. Imaginative Coherence
    Assesses whether generated content maintains logical flow while exploring abstract or speculative themes.
  3. Emotional Resonance
    Measures the model’s ability to evoke emotional responses, using sentiment analysis and human reader feedback.
  4. Cultural and Contextual Appropriateness
    Tests whether the output is culturally sensitive and contextually accurate.
  5. Multimodal Creativity
    Evaluates how well text, images, or audio components align to produce a cohesive and creative result.
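
The five dimensions above can be pictured as a per-model score profile. The following is a minimal sketch, not Tencent’s implementation: the class name CreativityProfile, the 0-to-1 scale, and the weighted-average aggregation are assumptions chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class CreativityProfile:
    """Hypothetical per-model scores (0-1) for the five CBS dimensions."""
    originality: float
    imaginative_coherence: float
    emotional_resonance: float
    cultural_appropriateness: float
    multimodal_creativity: float

    def overall(self, weights: Optional[dict[str, float]] = None) -> float:
        """Weighted average across dimensions; equal weights by default."""
        names = [f.name for f in fields(self)]
        if weights is None:
            weights = {name: 1.0 for name in names}
        total_weight = sum(weights.get(name, 0.0) for name in names)
        if total_weight <= 0:
            raise ValueError("weights must sum to a positive number")
        weighted = sum(getattr(self, name) * weights.get(name, 0.0)
                       for name in names)
        return weighted / total_weight

    def weakest(self) -> str:
        """Name of the lowest-scoring dimension."""
        names = [f.name for f in fields(self)]
        return min(names, key=lambda name: getattr(self, name))


if __name__ == "__main__":
    profile = CreativityProfile(0.8, 0.6, 0.7, 0.9, 0.5)
    print(round(profile.overall(), 3), profile.weakest())
```

Keeping the dimensions separate, rather than reporting only a single number, lets a reader see where a model falls short, for example strong originality paired with weak multimodal alignment.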

The benchmark features over 1,000 task prompts spanning multiple domains, such as:

  • Writing short stories
  • Answering questions
  • Explaining concepts
  • Generating jokes
  • Designing product packaging
  • Creating storyboards for cartoons

Many of these tasks are open-ended, encouraging AI models to demonstrate freeform creativity.

Importantly, Tencent confirms that CBS supports multiple major languages, including Chinese and English, which makes it usable by developers worldwide.


Industry Implications and Global Response

Tencent’s move follows an international push toward more sophisticated and nuanced AI systems. While models like ChatGPT (OpenAI), Gemini (Google), Claude (Anthropic), and LLaMA (Meta) have demonstrated creative capacities, there is still no unified standard for comparing them.

By launching CBS, Tencent is positioning itself as:

  • A developer of AI algorithms, and
  • A standard setter for the global AI community

According to Dr. Elaine Zheng, professor of computational creativity at the University of Cambridge:

“Creativity is not just making something new, it’s making something new that’s meaningful and valuable. Tencent’s benchmark could become an invaluable resource for anyone developing or assessing next-gen AI systems.”

Early trials of CBS are already being conducted by developers across the United States, Europe, and Southeast Asia. Some analysts believe the academic and open-source communities may adopt CBS as a de facto standard in the near future.


A Boon for Content Industries

Beyond research labs, CBS has practical implications across content-heavy industries, such as:

  • Advertising
  • Entertainment
  • Publishing
  • E-commerce

Practical Use Cases:
  • Ad agencies can determine which AI model produces more imaginative or emotionally effective campaign ideas.
  • Game studios can enhance narrative design and dialogues using CBS metrics.
  • Educational platforms may use CBS to evaluate AI-generated essays and creative writing tasks.

With a presence spanning gaming (it owns Riot Games and holds a minority stake in Epic Games), social media (WeChat), and digital content, Tencent is well positioned to integrate CBS into its internal development tools and accelerate creative AI across its business ecosystem.


Responsible AI and Limitations

Despite praise, experts warn that any benchmark for creativity must navigate ethical and cultural complexities.

Key Concerns:
  • Subjectivity of Art
    What resonates with one audience may not resonate with another. Tencent addresses this by using diverse human evaluators and testing prompts across varied cultural backgrounds.
  • Benchmark Overfitting
    Models may be trained to perform well specifically on CBS tasks without truly being creative. Tencent has attempted to counter this by incorporating task variety and criteria that reward true originality.
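
One way to make the subjectivity concern concrete is to measure how much human evaluators disagree on each item. The sketch below is an assumption for illustration, not a documented part of CBS: it flags items whose ratings have a population standard deviation above a hypothetical threshold, max_spread, so they can be reviewed rather than averaged away.

```python
from statistics import pstdev


def flag_contested_items(ratings_by_item: dict[str, list[float]],
                         max_spread: float = 0.2) -> list[str]:
    """Return ids of items whose human ratings disagree strongly.

    Assumption: ratings are on a 0-1 scale. An item is flagged when the
    population standard deviation of its ratings exceeds max_spread.
    Items with fewer than two ratings cannot show disagreement and are
    skipped.
    """
    return sorted(
        item
        for item, ratings in ratings_by_item.items()
        if len(ratings) > 1 and pstdev(ratings) > max_spread
    )


if __name__ == "__main__":
    ratings = {
        "poem": [0.9, 0.2, 0.8, 0.1],   # evaluators split sharply
        "slogan": [0.7, 0.75, 0.8],     # evaluators broadly agree
    }
    print(flag_contested_items(ratings))
```

A high spread does not mean the output is poor. It may simply resonate with some audiences and not others, which is the situation a diverse evaluator pool is meant to surface.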

The company acknowledges that scoring creativity will always be an evolving challenge, but believes that CBS marks a foundational step.


The Road Ahead

The release of CBS doesn’t just redefine how creativity is measured—it challenges what we consider creative thinking in machines. As AI becomes more integrated into our lives and industries, such definitions will become even more vital.

Tencent’s roadmap includes:

  • Regular updates to the CBS suite
  • Community involvement for new task suggestions and feedback
  • Development of custom creative AI tools based on CBS evaluations

These tools could power future innovations in:

  • Digital art
  • Writing
  • Game design
  • Creative graphics

Conclusion

Tencent’s Creative Benchmark Suite (CBS) is a landmark innovation long awaited by the AI community. By introducing a structured and nuanced framework for evaluating AI-generated creative content, the company has:

  • Elevated the global conversation around generative AI, and
  • Laid the foundation for more responsible and imaginative development in the years ahead.

As generative AI continues to evolve, tools like CBS will be essential for distinguishing between automated repetition and authentic artificial creativity. In doing so, Tencent may have reshaped the world’s approach to testing—and trusting—creative AI.


Prabal Raverkar
I'm Prabal Raverkar, an AI enthusiast with strong expertise in artificial intelligence and mobile app development. I founded AI Latest Byte to share the latest updates, trends, and insights in AI and emerging tech. The goal is simple — to help users stay informed, inspired, and ahead in today’s fast-moving digital world.