Scaling Self-Supervised and Cross-Modal Pretraining for Volumetric CT Transformers

arXiv cs.CV | Monday, November 24, 2025 at 5:00:00 AM
  • SPECTRE is a new foundation model for volumetric computed tomography (CT) built on a fully transformer-based architecture. It uses self-supervised and cross-modal pretraining to learn CT representations, addressing challenges such as extreme token scaling and weak clinical supervision.
  • The development of SPECTRE is significant as it demonstrates the potential for high-performing, generalizable CT representations trained exclusively on openly available datasets. This could enhance diagnostic capabilities in medical imaging.
  • SPECTRE arrives alongside other AI-driven medical imaging work, such as X-WIN and PoCGM, that also addresses limitations of traditional imaging techniques. Together these point to a broader trend toward improving image quality and diagnostic accuracy with AI frameworks.
— via World Pulse Now AI Editorial System

Continue Reading
ReBrain: Brain MRI Reconstruction from Sparse CT Slice via Retrieval-Augmented Diffusion
Positive | Artificial Intelligence
A new framework named ReBrain has been introduced for reconstructing brain MRI from sparse CT slices using a retrieval-augmented diffusion approach. This method utilizes a Brownian Bridge Diffusion Model to synthesize MRI slices and retrieves similar CT slices from a prior database to enhance reconstruction accuracy.
A statistical method for crack pre-detection in 3D concrete images
Positive | Artificial Intelligence
A new statistical framework for crack pre-localization in 3D concrete images has been introduced, addressing the challenges of effectively segmenting cracks in large-scale computed tomography (CT) images. This method utilizes a Hessian-based filter and geometric descriptors to identify regions likely to contain cracks, relying on minimal calibration data rather than extensive annotated datasets.
Automated Muscle and Fat Segmentation in Computed Tomography for Comprehensive Body Composition Analysis
Positive | Artificial Intelligence
A new publicly accessible model for automated segmentation of muscle and fat in computed tomography (CT) images has been introduced, enhancing body composition analysis. This model effectively segments skeletal muscle, subcutaneous adipose tissue, and visceral adipose tissue in axial CT images, addressing a significant gap in available tools for clinical applications.
CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation
Positive | Artificial Intelligence
CleverDistiller is a framework for self-supervised cross-modal knowledge distillation that transfers features from 2D vision foundation models to 3D LiDAR-based models. It uses a direct feature similarity loss and a multi-layer perceptron projection head to improve the learning of complex semantic dependencies in autonomous driving applications.