Dec 1, 2025
The world’s largest library of Khmer-language television from Cambodian Broadcasting Service (CBS) will power more inclusive AI models and expand Protege’s uniquely diverse, six-continent audio-visual dataset.
Protege, a leading global supplier of training data for artificial intelligence, today announced a new partnership with Cambodian Broadcasting Service (CBS), adding Khmer-language television content to Protege’s growing catalogue sourced from high-quality media suppliers across six continents.
Through this partnership, CBS’s wide-ranging library of scripted dramas, competition formats, dance and cooking shows, talk shows, and other entertainment programming — all in Khmer — will be made available for responsible AI development. The collaboration is designed to ensure that Khmer language and Cambodian culture are more accurately reflected in the next generation of AI models.
“Khmer is a vibrant language with a deep cultural heritage, yet it remains largely invisible in today’s AI systems,” said Xenia Shevnina, Protege’s Head of EMEA Media Licensing. “Because we work with independent partners around the world, we’re uniquely positioned to change that. By bringing CBS’s exceptional Khmer-language catalogue into our dataset, we’re one step closer to representing the world as it is, not just the parts of it that are already over-represented online.”
Despite advances in large language models, many languages remain underrepresented in the data used to train advanced AI for commercial and enterprise applications. By connecting CBS’s content to Protege’s training-data infrastructure, the two companies aim to improve how AI models handle Khmer-language content, while also strengthening their representation and understanding of Cambodian stories, references, and cultural contexts.
“At CBS, we believe that Cambodian stories, culture, and language should be part of the global conversation, including in AI,” shared David Ulmer, Chief Executive Officer at the Cambodian Broadcasting Service. “We envision a world where future AI models have a full understanding of Cambodian culture, text, and speech. Partnering with Protege allows us to bring our library into frontier technologies in a way that respects our content and amplifies Khmer culture.”
The partnership with CBS deepens Protege’s commitment to aggregating and supplying diverse datasets that are representative of the world. Protege’s catalogue of high-quality content continues to expand to meet the needs of AI model builders.
About Cambodian Broadcasting Service (CBS)
Cambodian Broadcasting Service (CBS) is Cambodia’s leading broadcaster and media company, producing and distributing high-quality Khmer-language television and online content across genres, including scripted dramas, competition shows, dance and music formats, cooking programmes, and talk shows. CBS is committed to promoting Khmer language and culture at home and abroad and is now extending that mission into the AI era by bringing its content library to global technology partners. CBS is a member of the Royal Group of Companies.
About Protege
Protege is the trusted source for finding and sharing AI training data, enabling seamless and compliant data exchange. By empowering data holders and connecting them with AI developers, Protege supports the creation of thoughtful AI solutions. Protege’s scientific and strategic approach allows AI teams to quickly discover and license a wide array of curated datasets across industries, shortening the time needed to obtain AI-ready data for model development.
The Protege Media team curates, licenses, and prepares diverse catalogues of film, television, and other media so that AI models can learn from a broader spectrum of languages, cultures, and content. Protege’s vision is that the right training and evaluation data is used to reflect the entirety of the human experience, reducing bias and increasing representation.
To learn more about how your organization can unlock new revenue by ethically licensing your content for AI, fill out our partner information form or contact the Protege team at contact@withprotege.ai.

