AI/ML

Mixture of Experts (MoE)

Also known as: MoE, Sparse MoE

📖 What it is

A neural network architecture that routes each input to a small subset of specialized "expert" sub-networks instead of activating every parameter, so per-token compute scales with the active parameters rather than the total. Only a fraction of the total parameters is active per token (e.g., DeepSeek-V3 has 671B total parameters but only ~37B active). This lets a model grow its total parameter count well beyond what a dense model could afford at the same per-token compute cost. Used in production models such as Mixtral, Jamba, and DeepSeek-V3.
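The routing idea can be illustrated with a minimal sketch in plain Python. It assumes a linear router, top-k selection, and a softmax over only the selected experts' scores; real models differ in these details and add things like load balancing. All names here (`top_k_route`, `moe_layer`, `make_expert`) are hypothetical, and the toy "experts" are simple scaling functions, not real feed-forward networks.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Assumption: weights are a softmax over the selected logits only.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_layer(x, router, experts, k=2):
    """Run one token through a toy MoE layer; only k experts are evaluated."""
    # Linear router: one score per expert (dot product with the token).
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router]
    routes = top_k_route(logits, k)
    out = [0.0] * len(x)
    for idx, weight in routes:
        y = experts[idx](x)  # experts that were not selected never run
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, [idx for idx, _ in routes]


def make_expert(scale):
    """Hypothetical stand-in for an expert network: scales its input."""
    return lambda x: [scale * xi for xi in x]


random.seed(0)
NUM_EXPERTS, DIM = 8, 4
experts = [make_expert(s) for s in range(1, NUM_EXPERTS + 1)]
router = [[random.uniform(-1, 1) for _ in range(DIM)]
          for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_layer(token, router, experts, k=2)
print("experts used:", chosen, "of", NUM_EXPERTS)

# Active share of parameters for the DeepSeek-V3 figures quoted above.
active_fraction = 37 / 671
print(f"active fraction: {active_fraction:.1%}")
```

With 8 experts and k=2, only a quarter of the expert computation runs for each token, which is the same effect that the 671B total versus ~37B active figures describe at scale.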

Related Terms

Transformer | AI/ML

The neural network architecture underlying modern LLMs, introduced in 'Attention Is All Yo…

State Space Model (Mamba) | AI/ML

An alternative to the Transformer architecture that processes sequences with linear O(n) c…

LLM (Large Language Model) | AI/ML

A neural network trained on vast text corpora to understand and generate human language. L…
