Selective Sinkhorn Routing for Improved Sparse Mixture of Experts

arXiv cs.LG · Thursday, November 13, 2025 at 5:00:00 AM
Selective Sinkhorn Routing (SSR) is a proposed routing mechanism for Sparse Mixture-of-Experts (SMoE) models, an architecture valued for its scalability and computational efficiency. Existing SMoE models often depend on auxiliary losses and additional parameters, both of which complicate the architecture. SSR instead formulates token-to-expert assignment as an optimal transport problem, which keeps expert utilization balanced without any auxiliary balancing loss. The approach is reported to improve SMoE performance while keeping training simpler, because it limits how much routing relies on the computationally expensive Sinkhorn algorithm. The work supports SSR with both theoretical and empirical results.
— via World Pulse Now AI Editorial System
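
To make the optimal-transport view of routing concrete, here is a minimal NumPy sketch of generic Sinkhorn normalization applied to router logits. It is not the SSR algorithm itself, whose details this summary does not give. The function names (sinkhorn_plan, expert_load), the uniform token and expert marginals, and the epsilon and n_iters parameters are all assumptions chosen for illustration.

```python
import numpy as np


def sinkhorn_plan(logits, n_iters=50, epsilon=1.0):
    """Approximate an entropic optimal-transport plan from tokens to experts.

    Assumption (not from the source): every token carries equal mass and
    every expert should receive an equal share of the total mass.
    """
    n_tokens, n_experts = logits.shape
    # Gibbs kernel; subtracting the max keeps the exponentials from overflowing.
    kernel = np.exp((logits - logits.max()) / epsilon)
    row_mass = np.full(n_tokens, 1.0 / n_tokens)    # mass leaving each token
    col_mass = np.full(n_experts, 1.0 / n_experts)  # mass arriving at each expert
    u = np.ones(n_tokens)
    v = np.ones(n_experts)
    for _ in range(n_iters):
        u = row_mass / (kernel @ v)    # rescale rows to match token mass
        v = col_mass / (kernel.T @ u)  # rescale columns to match expert mass
    return u[:, None] * kernel * v[None, :]


def expert_load(scores):
    """Count how many tokens pick each expert under top-1 (argmax) routing."""
    return np.bincount(scores.argmax(axis=1), minlength=scores.shape[1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(16, 4))  # hypothetical router logits: 16 tokens, 4 experts
    print("load from raw logits:   ", expert_load(logits))
    print("load from Sinkhorn plan:", expert_load(sinkhorn_plan(logits)))
```

Each iteration costs two matrix-vector products over the full token-by-expert matrix, which suggests why a router that runs Sinkhorn at every layer and step can become expensive. The column constraint balances the mass each expert receives, so top-1 counts taken from the plan tend to be more even than counts from raw logits, though they are not guaranteed to be exactly equal.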
