Features and Benefits

How can xpresso.ai help me?

xpresso.ai has a number of features aimed at standardizing and accelerating delivery of Analytics solutions. Some of these are listed in Table 1, along with the intended benefits to the development team.

Area | Feature | Benefits

Solution Development

The xpresso.ai development workflow imposes best practices of software architecture and design

Modular, reusable, object-oriented code that follows standard design patterns; a standardized code structure that makes documentation and knowledge transfer to new team members easy.

Enables visual creation of solutions

The xpresso.ai Solution Builder lets developers assemble solutions visually, using a drag-and-drop interface.

Enforces best-of-breed software development practices

Code check-in, DevOps pipelines and containerization are an integral part of the xpresso.ai solution development workflow.

Integration with JupyterHub

Each developer gets access to a private workspace, as well as a shared solution workspace. Developers can code xpresso.ai components and pipelines in Jupyter notebooks and push them into the project code repository with a single button click in the customized notebook.

Component Library

The xpresso.ai Component Library provides ready-to-use components for a variety of tasks, saving hours of effort for developers. Special components provide support for popular ML libraries such as scikit-learn, XGBoost, LightGBM, and Keras.

Rich APIs

All activities in xpresso.ai can be performed through the UI as well as through a rich set of REST APIs, enabling automated scripting of every activity.

Data Ops

Data Connectivity libraries to enable data fetching from disparate sources

Developers can make use of libraries to fetch data from sources as diverse as databases, file systems, SFTP sites, S3 buckets, etc.
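The xpresso.ai connectivity API itself is not documented here, so the sketch below only illustrates the general pattern such libraries follow: a single fetch call that dispatches on the source's URI scheme. The function name `fetch_rows` and the supported schemes are hypothetical, not the real xpresso.ai interface.

```python
import os
import sqlite3
import tempfile
from pathlib import Path
from urllib.parse import urlparse

def fetch_rows(uri, query=None):
    """Fetch records from a source identified by its URI scheme.

    Only file:// and sqlite:// are shown here; connectors for SFTP
    sites, S3 buckets, etc. would plug into the same dispatch,
    keyed on the scheme.
    """
    parsed = urlparse(uri)
    if parsed.scheme == "file":
        return Path(parsed.path).read_text().splitlines()
    if parsed.scheme == "sqlite":
        with sqlite3.connect(parsed.path) as conn:
            return conn.execute(query).fetchall()
    raise ValueError(f"unsupported scheme: {parsed.scheme}")

# Demo: fetch from a local file source.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
Path(path).write_text("alice\nbob\n")
rows = fetch_rows("file://" + path)
os.remove(path)
```

The benefit of scheme-based dispatch is that calling code never changes when the data moves from, say, a local file to a database.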

Data Exploration libraries to explore and clean data, and provide guidelines on choice of independent variables for analytics solutions

Python libraries are available to categorize data attributes, perform univariate and bivariate analyses of structured data, perform various analyses of unstructured data (bag-of-words, etc.), clean data using standard techniques, and analyze independent variables with respect to a target variable. These enable data scientists to home in on the correct attributes for analysis. Most exploration functions are also supported on Spark clusters, enabling big-data exploration.

Data Visualization libraries to visualize exploration results

Visualize exploration results using standard graphing packages. Export visualization results into well-formatted PDF reports. Add customized exploration/visualization using Matplotlib. View the results of exploration and visualization (including custom visualizations) within the xpresso.ai UI.
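Since Matplotlib is the package named above, a custom visualization exported to PDF might look like the following minimal sketch (the chart data and file name are invented; Matplotlib must be installed):

```python
import os
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; no display required
import matplotlib.pyplot as plt

labels = ["A", "B", "C", "D", "E"]
frequencies = [5, 12, 7, 3, 9]

fig, ax = plt.subplots()
ax.bar(labels, frequencies)
ax.set_title("Category frequencies (univariate view)")
ax.set_xlabel("category")
ax.set_ylabel("count")
fig.savefig("exploration_report.pdf")  # export as a PDF report
plt.close(fig)

report_exists = os.path.exists("exploration_report.pdf")
```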

Data Versioning libraries to maintain data versions

Developers can manage data versions using a versioning system, similar to managing code versions in Bitbucket. Versions of data can be pushed, pulled, listed and compared.
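The real versioning client is not shown in this document; the toy content-addressed store below only illustrates the push/pull/list/compare workflow described above. All class and method names are hypothetical.

```python
import hashlib

class DataVersionStore:
    """Toy content-addressed store: push returns a version id,
    pull retrieves a version, and list/compare mirror the
    Bitbucket-like workflow described in the text."""

    def __init__(self):
        self._versions = {}   # version id -> payload
        self._order = []      # push order, oldest first

    def push(self, payload: bytes):
        vid = hashlib.sha1(payload).hexdigest()[:8]
        if vid not in self._versions:   # identical data -> same id
            self._versions[vid] = payload
            self._order.append(vid)
        return vid

    def pull(self, vid):
        return self._versions[vid]

    def list_versions(self):
        return list(self._order)

    def compare(self, a, b):
        return self._versions[a] == self._versions[b]

store = DataVersionStore()
v1 = store.push(b"id,label\n1,cat\n")
v2 = store.push(b"id,label\n1,cat\n2,dog\n")
```

Content addressing gives the git-like property that the same data always maps to the same version id, so pushes are idempotent.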

Data Repository

The data repository can be explored using the xpresso.ai UI. Developers can create new datasets, commit versions, and upload data to and download data from datasets.

Data sharing

NFS and HDFS drives, available out of the box, enable team members to share data for a solution.

Dev Ops

Automated DevOps is integral to the xpresso.ai solution development workflow. Code is checked out of a repository and run through a DevOps pipeline (Jenkins), ultimately producing Docker images for each component of the solution.

The build process is available out of the box and requires no code changes on the part of the developer; every solution therefore gets industry-standard DevOps best practices with minimal effort. Extensive details of each step of the process are available through integrated dashboards. Unit testing is part of the process, with unit test reports available after each build.

Components can be promoted from one environment to another

Once a component has been tested in a DEV environment, it can be promoted to a higher environment with the click of a button. Optionally, an approval workflow can be defined for the promotion process.

Model Ops

Support for experiments on Analytics pipelines

Data Scientists can run multiple experiments for their analytics projects on both Spark and Kubeflow clusters, with no change in the build and deployment process.

Complete control over experiments

Data Scientists can pause, restart and terminate experiments.

Experiment scheduling

Model training can be scheduled as required, to run either once or periodically.

Experiment monitoring

Data scientists can report KPIs and metrics from their experiments back to the xpresso.ai Controller, which displays them graphically in the UI in real time. Special support for popular ML libraries automates the reporting of library-specific KPIs and metrics.
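The Controller's reporting API is not documented here; the toy tracker below only sketches the pattern a training loop might follow — report a named KPI at each step, then read back the latest value for display. The class and method names are invented.

```python
from collections import defaultdict

class MetricTracker:
    """Collects (step, value) pairs per KPI, the way a training
    loop might report metrics back to a monitoring service."""

    def __init__(self):
        self.history = defaultdict(list)   # kpi name -> [(step, value)]

    def report(self, name, step, value):
        self.history[name].append((step, value))

    def latest(self, name):
        return self.history[name][-1][1]

tracker = MetricTracker()
# A training loop reports the loss after every epoch.
for epoch, loss in enumerate([0.9, 0.5, 0.3], start=1):
    tracker.report("loss", epoch, loss)
```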

Model Versioning

After an experiment has been run successfully, trained models are automatically stored in the xpresso.ai Model Repository for future use.

Model Repository

The Model Repository can be explored using the xpresso.ai UI. Data Scientists can browse different model versions and download them using the Model Explorer.

Experiment Tracking

All experiments are tracked end-to-end, with respect to the training data, pipeline version, and hyperparameters. Thus, experiments can be reproduced whenever required.
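One common way to make end-to-end tracking reproducible — not necessarily how xpresso.ai implements it — is to derive a stable fingerprint from everything that determines an experiment's outcome. The sketch below hashes the data version, pipeline version, and hyperparameters together; identical inputs always yield the identical id.

```python
import hashlib
import json

def experiment_fingerprint(data_version, pipeline_version, hyperparams):
    """Stable id over everything that determines an experiment's
    outcome; sort_keys makes dict ordering irrelevant."""
    payload = json.dumps(
        {"data": data_version, "pipeline": pipeline_version,
         "params": hyperparams},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()[:12]

# Same inputs (in any dict order) -> same fingerprint.
run_a = experiment_fingerprint("v3", "1.2.0", {"lr": 0.01, "depth": 6})
run_b = experiment_fingerprint("v3", "1.2.0", {"depth": 6, "lr": 0.01})
```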

Experiment Comparison

Experiments can be compared with each other, in terms of their performance, accuracy, etc. This enables selection of the best model to be deployed.
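In its simplest form, experiment comparison is a ranking over recorded metrics. The sketch below (with invented experiment records) shows how the best model can be selected by any metric, whether higher or lower is better:

```python
experiments = [
    {"id": "exp-01", "accuracy": 0.91, "latency_ms": 40},
    {"id": "exp-02", "accuracy": 0.94, "latency_ms": 55},
    {"id": "exp-03", "accuracy": 0.89, "latency_ms": 30},
]

def best_by(runs, metric, higher_is_better=True):
    """Pick the run that wins on one metric."""
    return max(runs,
               key=lambda r: r[metric] if higher_is_better else -r[metric])

best = best_by(experiments, "accuracy")                          # exp-02
fastest = best_by(experiments, "latency_ms", higher_is_better=False)
```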

Model Deployment

Selected models are deployed to a highly available environment (a Kubernetes cluster) at the click of a button. Docker images ensure deployment without software compatibility issues, while Kubernetes provides a high-availability environment with easy scale-out for high performance during inferencing.

A/B Testing

Multiple versions of trained models can be selected and automatically deployed for A/B testing at the click of a button.
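A standard way to route traffic in an A/B test — shown here as a generic sketch, not the xpresso.ai mechanism — is deterministic hashing of a request key, so the same user always hits the same model variant while traffic splits in the configured proportion:

```python
import hashlib

def assign_variant(user_id, variants=("model_a", "model_b"), split=0.5):
    """Deterministically route a user to a model variant."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 1000 / 1000   # roughly uniform in [0, 1)
    return variants[0] if bucket < split else variants[1]

# Over many users, the split converges to the configured proportion.
traffic = {"model_a": 0, "model_b": 0}
for i in range(1000):
    traffic[assign_variant(f"user-{i}")] += 1
```

Stickiness matters for A/B testing: a user who saw model A on one request must keep seeing model A, or per-variant metrics become meaningless.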

Solution Monitoring

Extensive Logging

Logging libraries enable developers to create and monitor production logs easily, using standard dashboards such as Kibana.
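Dashboards such as Kibana index structured log lines most easily; a common technique (illustrated here with Python's standard `logging` module, not the xpresso.ai logging library) is to emit one JSON object per line:

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line, easy for the ELK stack
    (Elasticsearch/Kibana) to index and filter."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

# Log to an in-memory stream for the demo; production code would
# log to stdout or a file shipped to Elasticsearch.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("pipeline.demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.warning("model input had %d missing values", 3)
logged = json.loads(stream.getvalue())
```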

Model Monitoring

Model monitoring dashboards enable data scientists to keep an eye on deployed models, watching for signs of model decay. Alerts can be set individually for each model, and monitored on the integrated dashboard.
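One widely used decay signal — shown here as a generic illustration, not necessarily the metric xpresso.ai computes — is the Population Stability Index (PSI), which compares the distribution of live inputs against the training distribution:

```python
from math import log

def psi(expected, actual):
    """Population Stability Index between two binned distributions
    (fractions summing to 1). Common rules of thumb: < 0.1 stable,
    0.1-0.25 drifting, > 0.25 significant shift."""
    return sum((a - e) * log(a / e) for e, a in zip(expected, actual))

train_dist = [0.25, 0.25, 0.25, 0.25]   # feature bins at training time
stable = [0.26, 0.24, 0.25, 0.25]       # live traffic, no drift
shifted = [0.10, 0.15, 0.25, 0.50]      # live traffic, clear drift
```

An alert threshold on PSI (say, 0.25 per feature) is the kind of per-model rule the monitoring dashboard described above would track.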

Security

Role Based Access Control

Instance administrators can define custom roles with granular access to xpresso.ai features. Users can be assigned one or more roles, ensuring a complete mapping to the organization's hierarchy.

Table 1: Benefits of using xpresso.ai IDDME