Two-Stage Vision Transformer for Image Restoration: Colorization Pretraining + Residual Upsampling

arXiv — cs.CV•Wednesday, December 3, 2025 at 5:00:00 AM

ViT-SR is a newly proposed method for Single Image Super-Resolution (SISR) that trains a Vision Transformer in two stages. The model is first pretrained on a self-supervised colorization task, then adapted to 4x super-resolution. On the DIV2K benchmark it reports an SSIM of 0.712 and a PSNR of 22.90 dB.
The results suggest that self-supervised pretraining can improve the visual representations a model uses for image restoration, and may inform further work in computer vision.
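
To put the reported PSNR in context, the sketch below computes PSNR from its standard definition, 10 * log10(MAX^2 / MSE), and inverts it to recover the root-mean-square pixel error that a given PSNR implies. The function names `psnr` and `rmse_from_psnr` are illustrative, and the 8-bit peak value of 255 is an assumption: the summary does not say which pixel range or color channel the authors evaluated on.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two flattened images.

    max_value is the peak pixel value (255 assumes 8-bit images).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def rmse_from_psnr(psnr_db: float, max_value: float = 255.0) -> float:
    """Invert the PSNR formula to get the root-mean-square pixel error."""
    return max_value / (10.0 ** (psnr_db / 20.0))


if __name__ == "__main__":
    # Toy example: every pixel is off by 10 gray levels.
    ref = [100.0, 120.0, 140.0, 160.0]
    est = [110.0, 130.0, 150.0, 170.0]
    print(f"PSNR of toy example: {psnr(ref, est):.2f} dB")
    print(f"RMSE implied by 22.90 dB: {rmse_from_psnr(22.90):.2f}")
```

Under the 8-bit assumption, a PSNR of 22.90 dB corresponds to an RMSE of roughly 18 gray levels out of 255. If the evaluation used a different range or only the luminance channel, the same formula applies with a different `max_value`.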

— via World Pulse Now AI Editorial System
