AIExplainer

mixture of experts

A scheme to increase neural network efficiency by using only a subset of its parameters (known as an expert) to process a given input token or example.

A gating network (also called a router) sends each input token or example to the appropriate expert(s), so only the selected experts run for that input. For details, see either of the following papers:
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
- Mixture-of-Experts with Expert Choice Routing
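The routing step described above can be illustrated with a small sketch. This is a minimal, standard-library-only illustration and is not taken from either paper: the names `make_expert` and `moe_forward`, the single-linear-map experts, and the choice to renormalize a softmax over the k highest-scoring experts are all assumptions made for the example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    """Hypothetical expert: one dim x dim linear map with random weights."""
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def moe_forward(x, gate_weights, experts, k=2):
    """Route one token to its top-k experts and mix their outputs.

    gate_weights holds one row of gating weights per expert (an assumed,
    purely linear gating network).
    """
    # One gating logit per expert.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Indices of the k experts with the highest logits.
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    # Mixing weights: softmax restricted to the selected experts.
    probs = softmax([logits[i] for i in top])
    out = [0.0] * len(x)
    for p, i in zip(probs, top):
        y = experts[i](x)  # only the selected experts are evaluated
        out = [o + p * yi for o, yi in zip(out, y)]
    return out, top


rng = random.Random(0)
dim, num_experts = 4, 8
experts = [make_expert(dim, rng) for _ in range(num_experts)]
gate_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_forward(token, gate_weights, experts, k=2)
```

Here only 2 of the 8 experts are evaluated for the token, which is where the efficiency gain comes from: the total parameter count grows with the number of experts, while the per-token compute grows only with k.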

Practitioners use the term mixture of experts (often abbreviated MoE) when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.