Data Management

Solution Name: sample_project_data_management

This solution demonstrates a pipeline that fetches data from a file, explores it, and visualizes the results, all without writing a single line of code, by using components from the xpresso.ai Component Library.

The solution has the following components:

  • data_connection - fetches data from a file

  • data_exploration - explores the data

  • data_visualization - visualizes the results

How to use this solution

You will work on a clone of this solution. The steps to be followed are:

  1. Clone the solution.

  2. Note that there is no need to copy solution code, since all the components are from the xpresso.ai Component Library.

  3. Note that there is no need to build any of the components, since no coding is required.

  4. Before deploying the pipeline, you need to upload the parameters file and the data file to the shared (NFS) drive of the solution:

    1. Download /pipelines/data_con_exp_viz_pl/data_management_params.json from the NFS Drive of the original solution and upload it to the NFS Drive of the cloned solution, to the /pipelines/data_con_exp_viz_pl folder.

    2. Download /pipelines/data_con_exp_viz_pl/participant_data.csv from the NFS Drive of the original solution and upload it to the NFS Drive of the cloned solution, to the /pipelines/data_con_exp_viz_pl folder.
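Step 4 amounts to copying two files from the original solution's NFS drive to the same pipeline folder on the cloned solution's NFS drive. A minimal sketch, assuming (hypothetically) that both drives are mounted on your local machine; the mount points below are placeholders, not real xpresso.ai paths:

```python
import shutil
from pathlib import Path

# Hypothetical local mount points for the two NFS drives -- adjust to
# wherever your instance actually exposes them.
ORIGINAL_NFS = Path("/mnt/nfs/sample_project_data_management")
CLONED_NFS = Path("/mnt/nfs/my_cloned_solution")

# Folder and file names taken from the steps above.
PIPELINE_DIR = "pipelines/data_con_exp_viz_pl"
FILES = ["data_management_params.json", "participant_data.csv"]

def copy_pipeline_files(src_root: Path, dst_root: Path) -> list:
    """Copy the parameters and data files into the cloned solution's
    pipeline folder, creating the folder if necessary."""
    dst_dir = dst_root / PIPELINE_DIR
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in FILES:
        copied.append(Path(shutil.copy2(src_root / PIPELINE_DIR / name, dst_dir)))
    return copied
```

In practice you can equally well download the files through the xpresso.ai UI and upload them again, as the steps describe; the sketch only makes the source and destination paths explicit.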

  5. Deploy the pipeline of the cloned solution. Specify the following deployment parameters for the components:

    1. data_connection

      1. Advanced Settings (Custom Docker Image) - docker image specified in the component documentation, as per the instance you are working on

      2. Advanced Settings (Args) - as below

      Dynamic?    Name
      No          -component-name
      No          data_connection

    2. data_exploration

      1. Advanced Settings (Custom Docker Image) - docker image specified in the component documentation, as per the instance you are working on

      2. Advanced Settings (Args) - as below

      Dynamic?    Name
      No          -component-name
      No          data_exploration

    3. data_visualization

      1. Advanced Settings (Custom Docker Image) - docker image specified in the component documentation, as per the instance you are working on

      2. Advanced Settings (Args) - as below

      Dynamic?    Name
      No          -component-name
      No          data_visualization

Note that any other parameters required by any component of the pipeline will be taken from the parameters file specified when running an experiment on the deployed pipeline.
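The Args tables above pass each component a single `-component-name` flag; combined with the parameters file, that is enough for a component to locate its own settings. A sketch of how that resolution might work, assuming (hypothetically) that the parameters file keeps one top-level section per component; the actual file layout is defined by the Component Library, not by this sketch:

```python
import argparse
import json

def resolve_component_params(argv, params_path):
    """Parse the -component-name arg set in Advanced Settings (Args)
    and return that component's section of the parameters file."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-component-name", required=True)
    args = parser.parse_args(argv)
    with open(params_path) as f:
        params = json.load(f)
    # Assumption: one top-level JSON section per component name.
    return params[args.component_name]
```

This is why only the component name is supplied at deployment time: everything else can change per run simply by editing the parameters file, with no redeployment.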

  6. The pipeline has now been deployed, but it has not run. To run the pipeline, start an experiment using the deployed version of the pipeline. Specify the following parameters during the run:

  • Name of the pipeline - <name of the pipeline>

  • Version - latest deployed version

  • Run Name - any run name of your choice (do not use a name which you have already used)

  • Run Description - any description of your choice

  • parameters_filename - data_management_params.json (this file contains values for parameters required by components of the pipeline)
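For orientation, the parameters file groups the values each component needs. The sketch below is purely illustrative of that shape; every key in it is an assumption. Do not recreate the file by hand; use the data_management_params.json downloaded in step 4:

```python
import json

# Purely illustrative: the real data_management_params.json ships with
# the original solution and its keys are defined by the Component
# Library -- every key below is an assumption, shown only to convey
# the general shape (one section of values per component).
example_params = {
    "data_connection": {
        "input_file": "/pipelines/data_con_exp_viz_pl/participant_data.csv",
    },
    "data_exploration": {
        "output_folder": "/pipelines/data_con_exp_viz_pl/exploration_output",
    },
    "data_visualization": {
        "output_folder": "/pipelines/data_con_exp_viz_pl/visualization_output",
    },
}

print(json.dumps(example_params, indent=2))
```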

  7. To ensure the pipeline has run properly, view the run details. You should see the exploration and visualization results in the output folders specified in the parameters file.