HistAI and Protege Partner to Deliver One of the Largest Whole-Slide Pathology Datasets to AI Developers

Oct 29, 2025

HistAI, a cutting-edge pathology data provider, and Protege, the platform for AI training data, have partnered to bring HistAI’s comprehensive dataset of whole-slide pathology images (WSIs) to the Protege platform. By integrating HistAI’s curated pathology dataset into Protege’s secure and compliant data exchange platform, the partnership enables researchers and AI developers to license and access diverse, diagnostic-grade pathology data at scale. 

The HistAI dataset includes hundreds of thousands of WSIs accompanied by detailed case-based pathology reports. With both hematoxylin and eosin (H&E) and immunohistochemistry (IHC) stained slides spanning more than 400 unique stains, this offering delivers an unprecedented level of granularity and depth for AI training in digital pathology.

This offering is particularly valuable for training and validating AI models in oncology, rare disease detection, histological subtyping, and other applications requiring pixel-level image data aligned with clinical context.

HistAI’s collection spans a wide array of organs and tissue types, ensuring that models trained on this data reflect a diverse range of biological features and pathological conditions. In addition, case-level reports and rich metadata provide critical annotation support, making it easier for model developers to link image content to clinical diagnoses, prognostic indicators, and therapeutic implications.

“We’re thrilled to welcome HistAI to the Protege platform, as their combination of high-resolution whole-slide images and detailed case reports unlocks new possibilities for AI in pathology,” said Bobby Samuels, CEO and Co-Founder of Protege. “Developers building models for tasks like cancer classification, stain normalization, and subcellular feature detection will now have access to some of the most comprehensive, diverse data available. This partnership furthers our mission of connecting AI builders with the high-quality training data they need to deliver clinical impact.”

“Partnering with Protege allows us to bring HistAI’s pathology data to the broader AI ecosystem in a scalable, compliant, and mission-aligned way,” said Alex Pchelnikov, CEO of HistAI. “Digital pathology is poised to transform diagnostics, and we believe that by enabling easy access to this rich dataset, we’re equipping innovators to build smarter, more accurate, and more equitable tools for pathology workflows worldwide.”

About HistAI

HistAI curates one of the world’s most robust pathology datasets, featuring high-resolution whole-slide images and matched case-level reports. With over 400 immunohistochemistry stains and broad organ and tissue coverage, HistAI supports research and development across cancer diagnostics, rare disease identification, biomarker discovery, and beyond. Learn more at hist.ai

About Protege

Protege is the trusted source for finding and sharing AI training data, enabling seamless and compliant data exchange. By empowering data holders and connecting them with AI developers, Protege supports the creation of thoughtful AI solutions. Protege’s scientific and strategic approach allows AI teams to quickly discover and license a wide array of curated datasets across industries, shortening the time needed to obtain AI-ready data for model development. Learn more at withprotege.ai