What is an offline inference?
The process of a model generating a batch of predictions and then caching (saving) those predictions.
offline inference explained in plain English
The process of a model generating a batch of predictions and then caching (saving) those predictions. Apps can then access the inferred prediction from the cache rather than rerunning the model. For example, consider a model that generates local weather forecasts (predictions) once every four hours. After each model run, the system caches all the local weather forecasts. Weather apps retrieve the forecasts from the cache. Offline inference is also called static inference. Contrast with online inference. See Production ML systems: Static versus dynamic inference in Machine Learning Crash Course for more information.
Example
Practitioners refer to offline inference when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- A/B testing
A statistical way of comparing two (or more) techniques—the A and the B.
- ablation
A technique for evaluating the importance of a feature or component by temporarily removing it from a model.
- accuracy
The number of correct classification predictions divided by the total number of predictions.
- activation function
A function that enables neural networks to learn nonlinear (complex) relationships between features and the label.
- active learning
A training approach in which the algorithm chooses some of the data it learns from.
- adaptation
Synonym for tuning or fine-tuning.
- agglomerative clustering
See hierarchical clustering.
- anomaly detection
The process of identifying outliers.
- area under the PR curve
See PR AUC (Area under the PR Curve).
- area under the ROC curve
See AUC (Area under the ROC curve).