MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification
The paper introduces MGPATH, a vision-language model for few-shot classification of whole slide images (WSIs) in pathology. It uses multi-granular prompt learning to adapt a large vision-language model when only a few annotated slides are available. MGPATH builds on the Prov-GigaPath model, which was pre-trained on an extensive dataset, and aims to carry that pre-training over into better generalization. The work targets a central difficulty of the domain: WSIs are gigapixel-sized, and their scale and complexity complicate both training and inference. Prompt learning is the mechanism by which the model adapts to the few-shot setting common in medical image analysis. The authors position the method as improving generalization despite limited annotations and very large images. More broadly, the work contributes to ongoing efforts to apply AI methods to pathology, with the potential to support more effective diagnostic tools.
