Spark ML Pipeline Component Base Classes
There are three basic classes that developers need to know about.
1. XprPipeline Class
The XprPipeline class represents an xpresso.ai Spark ML pipeline. It provides the following methods:
Name | Description | Parameters |
---|---|---|
constructor | Initializes the pipeline | name (String): pipeline name; spark (SparkSession): Spark session within which to run the pipeline; run_id: Experiment Run ID for the pipeline; stages: pipeline stages (viz. components), each an Estimator or Transformer (see below) |
fit | Runs the pipeline | dataset: dataset object on which to run the pipeline |
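The call shape described in the table can be sketched as follows. The import path of the real XprPipeline is not given here, so the snippet defines a small stand-in, StandInXprPipeline, that mirrors only the documented constructor parameters and the fit method. The stand-in, the placeholder stage classes, the run ID value, and the behaviour of its fit (it merely lists the stages it was given) are all assumptions made for illustration, not the library's implementation.

```python
class StandInXprPipeline:
    """Hypothetical stand-in that mirrors only the documented interface."""

    def __init__(self, name, spark, run_id, stages):
        self.name = name            # pipeline name
        self.spark = spark          # a SparkSession in real use; None in this sketch
        self.run_id = run_id        # experiment run ID
        self.stages = list(stages)  # Estimators and Transformers

    def fit(self, dataset):
        # The real fit runs the pipeline on the dataset. This stand-in only
        # reports which stages it was given, in the order supplied.
        return [type(stage).__name__ for stage in self.stages]


class PlaceholderTransformer:
    """Stands in for a custom Transformer stage (hypothetical)."""


class PlaceholderEstimator:
    """Stands in for a custom Estimator stage (hypothetical)."""


pipeline = StandInXprPipeline(
    name="demo_pipeline",
    spark=None,        # pass a real SparkSession here
    run_id="run-001",  # hypothetical run ID
    stages=[PlaceholderTransformer(), PlaceholderEstimator()],
)
stages_seen = pipeline.fit(dataset=None)  # pass the dataset object here
```

With the real class, the same four constructor arguments and the single fit argument would be supplied; what fit returns is not specified in this section.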
2. AbstractSparkPipelineEstimator Class
The AbstractSparkPipelineEstimator class represents an Estimator in the pipeline. It is a thin wrapper around the more general AbstractPipelineComponent class. Estimators created by developers should extend both a Spark ML Estimator and AbstractSparkPipelineEstimator.
3. AbstractSparkPipelineTransformer Class
The AbstractSparkPipelineTransformer class represents a Transformer in the pipeline. It is a thin wrapper around the more general AbstractPipelineComponent class. Transformers created by developers should extend both a Spark ML Transformer and AbstractSparkPipelineTransformer.
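The dual-inheritance pattern described for Estimators and Transformers might look roughly like the sketch below. To keep it runnable without Spark or xpresso.ai installed, all four base classes are stand-ins defined in the snippet: the two Spark stand-ins imitate how pyspark.ml's Estimator.fit and Transformer.transform delegate to _fit and _transform, and the two xpresso.ai stand-ins assume a constructor that takes a component name. The constructor signature, the base-class order, and the MeanEstimator and CenteringTransformer examples are assumptions, not details taken from the library.

```python
class SparkEstimatorStandIn:
    """Placeholder for a Spark ML Estimator base class."""

    def fit(self, dataset):
        # pyspark.ml Estimators delegate fit() to _fit(); mirrored here.
        return self._fit(dataset)


class SparkTransformerStandIn:
    """Placeholder for a Spark ML Transformer base class."""

    def transform(self, dataset):
        # pyspark.ml Transformers delegate transform() to _transform().
        return self._transform(dataset)


class AbstractSparkPipelineEstimatorStandIn:
    """Placeholder for xpresso.ai's AbstractSparkPipelineEstimator."""

    def __init__(self, name):
        self.name = name  # assumed attribute, for illustration only


class AbstractSparkPipelineTransformerStandIn:
    """Placeholder for xpresso.ai's AbstractSparkPipelineTransformer."""

    def __init__(self, name):
        self.name = name  # assumed attribute, for illustration only


class MeanEstimator(SparkEstimatorStandIn,
                    AbstractSparkPipelineEstimatorStandIn):
    """Custom Estimator extending both bases; learns the mean of a list."""

    def _fit(self, dataset):
        return sum(dataset) / len(dataset)


class CenteringTransformer(SparkTransformerStandIn,
                           AbstractSparkPipelineTransformerStandIn):
    """Custom Transformer extending both bases; subtracts a fixed mean."""

    def __init__(self, name, mean):
        super().__init__(name)  # reaches the xpresso.ai stand-in via the MRO
        self.mean = mean

    def _transform(self, dataset):
        return [x - self.mean for x in dataset]


data = [1.0, 2.0, 3.0]  # a plain list stands in for a Spark DataFrame
mean = MeanEstimator("mean_estimator").fit(data)
centered = CenteringTransformer("centering", mean).transform(data)
```

In real code the stand-ins would be replaced by the corresponding pyspark.ml classes and the xpresso.ai base classes, and the dataset would be a Spark DataFrame rather than a list.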