Reasoning Model
A class of LLMs trained with reinforcement learning to generate a step-by-step internal chain of thought (CoT) before producing a final answer, which improves performance on complex math, coding, and logic tasks. Pioneered by OpenAI's o1 (September 2024) and followed by o3, DeepSeek-R1, and the extended thinking mode in Anthropic's Claude. Unlike standard LLMs, which answer directly, reasoning models produce an internal CoT of variable length, so the amount of compute spent at inference time can be controlled.
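The split between an internal CoT and a final answer can be illustrated with a minimal Python sketch. It assumes the raw model output wraps its reasoning in `<think>...</think>` tags ahead of the answer, a convention used by DeepSeek-R1; other reasoning models may hide or summarize the trace instead. The names `ParsedResponse`, `split_reasoning`, and `reasoning_length` are hypothetical helpers, the sample strings are made up for illustration, and the word count is only a rough proxy for inference-time compute.

```python
import re
from typing import NamedTuple


class ParsedResponse(NamedTuple):
    reasoning: str  # internal chain of thought (empty if none was found)
    answer: str     # final answer shown to the user


# Assumed convention: the reasoning trace is wrapped in <think>...</think>.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> ParsedResponse:
    """Separate the reasoning trace from the final answer."""
    match = _THINK_BLOCK.search(raw)
    if match is None:
        # No trace present: treat the whole output as a direct answer.
        return ParsedResponse(reasoning="", answer=raw.strip())
    reasoning = match.group(1).strip()
    answer = raw[match.end():].strip()
    return ParsedResponse(reasoning, answer)


def reasoning_length(parsed: ParsedResponse) -> int:
    """Count whitespace-separated words in the trace (a crude compute proxy)."""
    return len(parsed.reasoning.split())


# Illustrative outputs: a harder question yields a longer trace.
easy = "<think>2 + 2 is 4.</think>The answer is 4."
hard = (
    "<think>Let n = 17 * 24. 17 * 20 = 340. 17 * 4 = 68. "
    "340 + 68 = 408.</think>The answer is 408."
)

for raw in (easy, hard):
    parsed = split_reasoning(raw)
    print(reasoning_length(parsed), "reasoning words ->", parsed.answer)
```

Measuring the trace separately from the answer is what makes inference-time compute visible: the answer stays short while the reasoning grows with the difficulty of the question.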