Creating ML Pipelines using xpresso.ai

Pipelines are sets of components (usually of type Job) that are executed in sequence.

A pipeline can be defined as part of creating a solution, and can then be built and deployed on either a Kubeflow or a Spark cluster.

Machine Learning (ML) pipelines are pipelines that provide the following additional features:

- Developers can create and run experiments on ML pipelines.
- Developers can pause, restart and terminate experiments on Kubeflow pipelines.
- Developers can terminate experiments on Spark pipelines (pausing and restarting experiments is not available for Spark pipelines at present).
- During an experiment run, pipeline components can report their status and relevant metrics back to the xpresso.ai Controller.
- Results of an experiment (i.e., the trained models) can be stored and versioned using the xpresso.ai Controller.
- Developers can compare the results of different experiments.
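
The difference in experiment control between the two deployment targets can be summarized in a small lookup table. The following is only an illustrative sketch: the names `Target`, `Action`, `SUPPORTED_ACTIONS` and `is_supported` are hypothetical and are not part of the xpresso.ai API; the sketch simply encodes the capabilities listed above.

```python
from enum import Enum


class Target(Enum):
    """Deployment targets for ML pipelines (hypothetical names)."""
    KUBEFLOW = "kubeflow"
    SPARK = "spark"


class Action(Enum):
    """Operations a developer may request on an experiment (hypothetical names)."""
    RUN = "run"
    PAUSE = "pause"
    RESTART = "restart"
    TERMINATE = "terminate"


# Capabilities as described in the feature list: Kubeflow experiments can be
# paused, restarted and terminated; Spark experiments can only be terminated.
SUPPORTED_ACTIONS = {
    Target.KUBEFLOW: {Action.RUN, Action.PAUSE, Action.RESTART, Action.TERMINATE},
    Target.SPARK: {Action.RUN, Action.TERMINATE},
}


def is_supported(target: Target, action: Action) -> bool:
    """Return True if the given action is available on the given target."""
    return action in SUPPORTED_ACTIONS[target]
```

For example, `is_supported(Target.SPARK, Action.PAUSE)` evaluates to `False`, while the same check for `Target.KUBEFLOW` evaluates to `True`.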

To use these features, developers must instrument their code appropriately and follow certain guidelines, which are described in these pages.

The process of creating ML pipelines differs between Kubeflow and Spark deployments.