
What is offline inference?

Offline inference is the process in which a model generates a batch of predictions and then caches (saves) those predictions.

Offline inference explained in plain English

With offline inference, a model generates a batch of predictions ahead of time and the system caches them, so apps can read a prediction from the cache rather than rerunning the model. For example, consider a model that generates local weather forecasts (predictions) once every four hours. After each model run, the system caches all the local weather forecasts, and weather apps retrieve the forecasts from the cache. Offline inference is also called static inference. Contrast with online inference, in which predictions are generated on demand. For more information, see "Production ML systems: Static versus dynamic inference" in Machine Learning Crash Course.
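The weather example can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the names ForecastCache, refresh, lookup, and toy_weather_model are hypothetical, the "model" is a placeholder function, and the cache is an in-memory dictionary where a real system would more likely use a database or key-value store.

```python
from typing import Callable, Dict, Iterable


class ForecastCache:
    """Holds the predictions from the most recent batch run (hypothetical helper)."""

    def __init__(self, model: Callable[[str], str]) -> None:
        self._model = model
        self._predictions: Dict[str, str] = {}
        self.model_calls = 0  # counts how often the model actually runs

    def refresh(self, locations: Iterable[str]) -> None:
        """Batch job: run the model for every location and cache the results."""
        fresh: Dict[str, str] = {}
        for location in locations:
            fresh[location] = self._model(location)
            self.model_calls += 1
        self._predictions = fresh  # replace the old batch with the new one

    def lookup(self, location: str) -> str:
        """App-facing read: served from the cache, the model is not rerun."""
        return self._predictions[location]


def toy_weather_model(location: str) -> str:
    # Placeholder standing in for a real, expensive model.
    return f"Forecast for {location}: mild"


cache = ForecastCache(toy_weather_model)
cache.refresh(["Lisbon", "Oslo"])  # in practice scheduled, e.g. every four hours

# Both reads hit the cache; the model ran only during refresh().
print(cache.lookup("Oslo"))
print(cache.lookup("Oslo"))
```

In this sketch, looking up a location that was not part of the last batch raises a KeyError. That mirrors a real trade-off of offline inference: the system can only serve predictions it computed in advance.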

Example

Practitioners use the term offline inference when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.

