AI/ML

Mixture of Experts (MoE)

Also called: MoE, Sparse MoE
What it is

A neural network architecture in which a learned router (gating network) sends each input token to a small subset of specialized 'expert' sub-networks instead of activating every parameter. Only a fraction of the total parameters is active per token (e.g., DeepSeek-V3 has 671B total parameters but only ~37B active per token), so per-token compute scales with the active parameter count rather than the total. This makes it possible to train much larger models at manageable compute cost. Used in production models such as Mixtral, Jamba, and DeepSeek-V3.
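As a rough illustration of the routing idea described above, here is a minimal sketch of a top-k MoE layer for a single token in plain Python. Everything in it is an assumption made for illustration: the names `make_expert` and `moe_forward`, the tiny dimensions, the random weights, and the use of a single linear map as an "expert". Renormalizing the gate weights with a softmax over only the selected logits is one common variant, not the method of any particular model.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    """Hypothetical expert: one random linear map from dim to dim."""
    w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

    return expert


def moe_forward(x, router_weights, experts, top_k=2):
    """Route one token vector x through its top_k experts."""
    # The router produces one logit per expert.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Keep only the top_k highest-scoring experts; the rest are never run.
    order = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = order[:top_k]
    # Gate weights: softmax over the selected logits only (an assumption).
    gates = softmax([logits[i] for i in chosen])
    # Output is the gate-weighted sum of the selected experts' outputs.
    out = [0.0] * len(x)
    for g, i in zip(gates, chosen):
        y = experts[i](x)
        out = [o + g * yi for o, yi in zip(out, y)]
    return out, chosen, gates


rng = random.Random(0)  # fixed seed so runs are repeatable
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2
experts = [make_expert(DIM, rng) for _ in range(NUM_EXPERTS)]
router_weights = [
    [rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)
]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen, gates = moe_forward(token, router_weights, experts, TOP_K)
print(f"experts run: {len(chosen)} of {NUM_EXPERTS}, indices {chosen}")
```

In this sketch only 2 of the 8 experts are evaluated for the token, so the expert compute is a quarter of what a dense layer of the same total size would need. The DeepSeek-V3 figures quoted above correspond to roughly 37/671, or about 5.5%, of parameters active per token.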


Related Terms