Purrturbed but Stable: Human-Cat Invariant Representations Across CNNs, ViTs and Self-Supervised ViTs
This study examines how differences in ocular anatomy between humans and cats shape visual representations, and introduces a benchmark for assessing cross-species representational alignment in a standardized way. Using this benchmark, the authors compare convolutional neural networks (CNNs), vision transformers (ViTs), and self-supervised ViTs, testing whether these architectures learn representations that remain invariant despite species-specific anatomical differences. The results bear on both biological vision research and the design of more adaptable computer vision systems, contributing to ongoing work at the intersection of neuroscience and artificial intelligence.
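The summary above does not specify which alignment metric the benchmark uses. A common choice for comparing representations across architectures (and, by extension, across species-specific inputs) is linear centered kernel alignment (CKA), which scores the similarity of two feature matrices computed on the same stimuli. The sketch below is a minimal, hypothetical illustration of that general technique, not the paper's actual method:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two feature matrices.

    X: (n_stimuli, d1) features from one model (e.g. a CNN).
    Y: (n_stimuli, d2) features from another (e.g. a self-supervised ViT),
       computed on the same stimuli in the same order.
    Returns a score in [0, 1]; 1 means identical up to rotation/scale.
    """
    # Center each feature dimension across stimuli.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Linear HSIC and normalizers, via Frobenius norms of cross/Gram products.
    hsic = np.linalg.norm(X.T @ Y, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)
```

Because CKA is invariant to orthogonal transformations and isotropic scaling of the feature space, it can compare models with different widths and architectures, which is what a cross-architecture (CNN vs. ViT) alignment benchmark requires.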
