AI/ML
Mixture of Experts (MoE)
Also known as: MoE, Sparse MoE
📖 What it is
A neural network architecture that routes each input token to a small subset of specialized 'expert' sub-networks instead of activating all parameters. Only a fraction of the total parameters is active per token (e.g., DeepSeek-V3 has 671B total parameters but only ~37B active, roughly 5.5%), so per-token compute scales with the active parameters rather than the total. This makes it possible to train much larger models at a manageable compute cost. Used in production models such as Mixtral, Jamba, and DeepSeek-V3.
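The routing idea can be illustrated with a minimal sketch in plain Python. It assumes one common variant: a linear router scores every expert, the top-k experts are selected, and their outputs are mixed with softmax weights computed over the selected experts only. All names here (`moe_forward`, `router_w`, `make_expert`) and the toy sizes are hypothetical, and each expert is a single linear map for brevity; this is not the implementation used by any of the models named above.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def make_expert(dim, rng):
    # Hypothetical expert: one dim x dim linear map (assumption for brevity).
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]


def moe_forward(x, router_w, experts, top_k):
    """Route one token vector to its top_k experts and mix their outputs."""
    logits = matvec(router_w, x)  # one routing score per expert
    ranked = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = ranked[:top_k]
    gates = softmax([logits[i] for i in chosen])  # weights over chosen experts only
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = matvec(experts[idx], x)  # only the chosen experts are evaluated
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen


rng = random.Random(0)
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2  # toy sizes, chosen for illustration
router_w = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [make_expert(DIM, rng) for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_forward(token, router_w, experts, TOP_K)

# Share of expert parameters touched by this token in the toy layer.
active_fraction = TOP_K / NUM_EXPERTS
# Same ratio for the DeepSeek-V3 figures quoted above (37B active of 671B total).
deepseek_fraction = 37 / 671
```

In this toy layer each token evaluates 2 of 8 experts, so a quarter of the expert parameters are used per token; the DeepSeek-V3 figures correspond to a much sparser ratio of about 5.5%.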
Related Terms
Transformer (AI/ML)
The neural network architecture underlying modern LLMs, introduced in 'Attention Is All Yo…
State Space Model (Mamba) (AI/ML)
An alternative to the Transformer architecture that processes sequences with linear O(n) c…
LLM (Large Language Model) (AI/ML)
A neural network trained on vast text corpora to understand and generate human language. L…