Chargemaster Analytics

Healthcare providers bill insurance companies using a charge description master (CDM), also called a chargemaster. The CDM is a comprehensive list of every billable procedure and item, each with associated codes for billing and tracking purposes. It covers everything from the services, supplies, prescription drugs, and diagnostic tests provided at the hospital to the fees (such as equipment fees and room charges) associated with those services. While a patient undergoes treatment at a hospital, each procedure is documented, and a code for records and claim submission is generated. These codes and documentation are translated via charge capture to CDM rates, and the resulting charges are used to bill the patient and create a claim for insurers. In this way, the CDM governs communication between hospitals and insurance companies about the cost of each procedure.

Providers can also use CDM data to summarize the services provided and to track service volume, costs, and revenue. For provider services linked to the CDM, any quarter-over-quarter (QoQ) increase in the amount billed must either stay under the cap agreed to by payers and providers or be supported by an attestation supplied by the provider. The efficiency of the healthcare industry therefore depends heavily on the efficiency of the CDMs it uses.

However, patients and insurers are on the receiving end of CDM prices, and healthcare stakeholders outside the hospital can rarely access them. Because of this, hospitals can mark up prices for insurance companies and private payers. One study showed that hospitals with 50+ beds had a charge-to-cost ratio of 4.32, meaning that they charged $432 for a service that cost them $100. Studies in 2013 from Johns Hopkins Carey Business School and Bloomberg School of Public Health show that, on average, hospitals charged over 20 times their own costs for CT scans and anesthesiology services.
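The charge-to-cost arithmetic above can be made concrete with a minimal sketch. The function names here are illustrative only and are not part of any billing system described in this article.

```python
def charge_to_cost_ratio(charge: float, cost: float) -> float:
    """Return how many dollars are charged per dollar of underlying cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return charge / cost


def charge_for_cost(cost: float, ratio: float) -> float:
    """Return the charge implied by a cost and a charge-to-cost ratio."""
    return round(cost * ratio, 2)


# A ratio of 4.32 applied to a $100 cost gives the $432 charge cited above.
print(charge_for_cost(100.0, 4.32))
print(charge_to_cost_ratio(432.0, 100.0))
```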

Consumers, however, are now more aware of pricing, and this has led to increased public disclosure of CDM rates. Consumers with high-deductible health plans, who bear more financial responsibility for their medical bills, are now shopping more frequently for cost-effective services.


The primary challenge is that the raw data hospitals provide is filled with symbols and acronyms that few people are familiar with. Even if the data were decoded, it would still be nearly impossible for an average person to use it to work out the individual procedures they would be billed for. A prospective patient does not know the range of tests, procedures, supplies, drugs, fees, and equipment that might be used during their stay.

Another major challenge is maintaining the accuracy of the CDM, because the Centers for Medicare and Medicaid Services (CMS) updates its rules each quarter. Hospitals change their own rules yearly, and payers also update rules to cater to service areas they want to track and bill differently. Keeping pace with quarterly updates, rather than the usual annual cycle, demands labor, investment, and foresight, and the challenge grows when incorrect updates leave the CDM incomplete.

Also, CMS may replace or split a single CPT code that was linked to a procedure and billed under a single line item. Only one of the resulting charges can be set up in the CDM system for that line item, so a lack of monitoring and timely updates can lead to endless comparisons to figure out the nature and extent of the changes the CDM needs. Similar confusion can arise if procedures that can be reported using multiple codes are not assigned correctly.
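One way to avoid those after-the-fact comparisons is to check the CDM against each code update as it arrives. The sketch below is a minimal illustration under stated assumptions: the `CdmLine` record, the `code_updates` mapping, and the placeholder codes (`OLD01`, `NEW01`, and so on) are all hypothetical and do not correspond to real CPT codes or to any CMS file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdmLine:
    charge_id: str    # hypothetical internal identifier of the line item
    description: str
    cpt_code: str


def find_stale_lines(cdm, code_updates):
    """Return (line, replacement_codes) for every CDM line whose code
    was replaced or split in the given update."""
    stale = []
    for line in cdm:
        replacements = code_updates.get(line.cpt_code)
        if replacements:
            stale.append((line, replacements))
    return stale


# Placeholder data: OLD01 was split into two codes, KEEP1 is unchanged.
cdm = [
    CdmLine("C-001", "Example procedure A", "OLD01"),
    CdmLine("C-002", "Example procedure B", "KEEP1"),
]
code_updates = {"OLD01": ["NEW01", "NEW02"]}

for line, replacements in find_stale_lines(cdm, code_updates):
    print(f"{line.charge_id}: {line.cpt_code} -> {', '.join(replacements)}")
```

A line mapped to more than one replacement code is the split case described above and would still need a reviewer to decide which charge the line item should carry.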

Another major challenge occurs when line-item descriptions do not match the CPT/HCPCS and revenue codes. This happens with missing or inaccurate modifiers for radiology, physical/occupational/speech therapy, and other procedures. It also happens when an unlisted HCPCS code is assigned although a specific code is available, when HCPCS codes are missing for separately paid drugs, when a deleted or non-billable code is assigned, or when a CPT code is assigned where an HCPCS code is necessary for Medicare billing.
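Several of these mismatches can be caught with simple rule checks over the CDM. The following is a hedged sketch, not a complete audit: the `ChargeLine` fields, the reference sets passed to `audit_line`, and every code shown are hypothetical placeholders, and real reference data would have to come from the current CPT/HCPCS code sets.

```python
from typing import NamedTuple, Optional


class ChargeLine(NamedTuple):
    charge_id: str
    description: str
    hcpcs_code: Optional[str]
    is_separately_paid_drug: bool


def audit_line(line, deleted_codes, unlisted_codes, specific_code_by_description):
    """Return a list of issue descriptions for one CDM line item."""
    issues = []
    if line.hcpcs_code is None:
        if line.is_separately_paid_drug:
            issues.append("missing HCPCS code for separately paid drug")
        return issues
    if line.hcpcs_code in deleted_codes:
        issues.append("deleted or non-billable code assigned")
    specific = specific_code_by_description.get(line.description)
    if line.hcpcs_code in unlisted_codes and specific is not None:
        issues.append(f"unlisted code assigned although {specific} is available")
    return issues


# Placeholder reference data and line items (not real codes).
deleted_codes = {"DEL01"}
unlisted_codes = {"UNL99"}
specific_code_by_description = {"Example therapy": "SPC01"}

lines = [
    ChargeLine("C-010", "Example drug", None, True),
    ChargeLine("C-011", "Example therapy", "UNL99", False),
    ChargeLine("C-012", "Example scan", "DEL01", False),
    ChargeLine("C-013", "Example visit", "OK001", False),
]
for line in lines:
    for issue in audit_line(line, deleted_codes, unlisted_codes,
                            specific_code_by_description):
        print(f"{line.charge_id}: {issue}")
```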

Solution Approach

Our chargemaster analytics solution verifies whether bills to payers are within the permissible limits laid down by the contract language and the CDM. The platform provides out-of-the-box development frameworks. When the project was started, the relevant environments were created automatically. Development images configured from pre-defined templates were installed on-premises or in a development VM within the infrastructure. This enabled authentication using LDAP and seamless project setup using Bitbucket, Jenkins, and Docker (ensuring build and deployment without software compatibility issues). The platform’s MLOps capabilities allow establishing efficient Alluxio- and Presto-based data connectivity and collecting data from diverse sources.

The platform leverages the latest ML and DL tools for preparing models. It includes Pachyderm-based data versioning and Kubernetes-, Kubeflow-, and Spark-based ML and DL. It also includes an Istio-based, service-mesh-enabled microservice architecture and ELK-based monitoring, which contribute to reduced latency. The platform’s data connectivity was leveraged to collect and analyze the extensive repository of claims data and providers’ historical data. We used advanced statistical and machine learning algorithms, in both the discrete and continuous domains, to analyze these and detect exceptional increases in billing amounts.
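The article does not name the specific algorithms used. As one simple illustration of detecting exceptional increases, the sketch below uses the modified z-score based on the median and the median absolute deviation (MAD). This is a swapped-in, generic technique rather than the method used in the project, and the 3.5 threshold is a common convention assumed here.

```python
from statistics import median


def modified_z_scores(values):
    """Modified z-scores using the median and median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [0.0 for _ in values]  # no spread, so nothing stands out
    return [0.6745 * (v - med) / mad for v in values]


def exceptional_increases(increases, threshold=3.5):
    """Return the indices whose increase is unusually high for the group."""
    scores = modified_z_scores(increases)
    return [i for i, z in enumerate(scores) if z > threshold]


# Hypothetical quarter-over-quarter increases for five codes (fractions).
increases = [0.02, 0.03, 0.01, 0.025, 0.40]
print(exceptional_increases(increases))
```

Because the median and MAD are themselves barely affected by extreme values, a single very large increase does not mask itself the way it can with a mean and standard deviation.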

We analyzed all claims data and calculated the quarter-over-quarter increase in billing for different revenue and CPT codes. The increases determined quantitatively were checked against the contract language to verify whether they were allowed. For cost increases outside permissible limits, we constructed confidence intervals at a high confidence level to support the results. The final outcomes of the negotiations were fed back into the algorithms to improve their accuracy.
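The quarter-over-quarter check described above can be sketched as follows. This is a minimal illustration under stated assumptions: claims are represented as `(provider, code, quarter, billed_amount)` tuples, the increase is measured on the average billed amount per claim line, a single `cap` applies to every code, and `attested` holds the provider and code pairs covered by an attestation. None of these details are confirmed by the project itself, and the data is made up.

```python
from collections import defaultdict
from statistics import mean


def average_billed(claims):
    """Average billed amount per (provider, code, quarter)."""
    amounts = defaultdict(list)
    for provider, code, quarter, amount in claims:
        amounts[(provider, code, quarter)].append(amount)
    return {key: mean(values) for key, values in amounts.items()}


def flag_over_cap(claims, prev_quarter, curr_quarter, cap, attested=frozenset()):
    """Return (provider, code, increase) where the QoQ increase exceeds
    the cap and no attestation covers it."""
    averages = average_billed(claims)
    pairs = sorted({(p, c) for p, c, _ in averages})
    flagged = []
    for provider, code in pairs:
        prev = averages.get((provider, code, prev_quarter))
        curr = averages.get((provider, code, curr_quarter))
        if not prev or curr is None:
            continue  # no baseline or no current-quarter billing to compare
        increase = (curr - prev) / prev
        if increase > cap and (provider, code) not in attested:
            flagged.append((provider, code, round(increase, 4)))
    return flagged


# Hypothetical claims with placeholder provider and code identifiers.
claims = [
    ("P1", "CODE_A", "2023Q1", 100.0),
    ("P1", "CODE_A", "2023Q1", 100.0),
    ("P1", "CODE_A", "2023Q2", 112.0),
    ("P1", "CODE_B", "2023Q1", 200.0),
    ("P1", "CODE_B", "2023Q2", 204.0),
    ("P2", "CODE_A", "2023Q1", 50.0),
    ("P2", "CODE_A", "2023Q2", 60.0),
]
print(flag_over_cap(claims, "2023Q1", "2023Q2", cap=0.05,
                    attested={("P2", "CODE_A")}))
```

In practice the cap would likely vary by contract and code, so a lookup keyed by provider and code would replace the single `cap` argument.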

The details collected were added as exploratory variables using the platform’s libraries and then analyzed. The attributes obtained were categorized (employing Pachyderm-based data versioning) and then put through univariate, bivariate, and Bag of Words analysis, for both structured and unstructured datasets, using xpresso Exploratory Data Analysis (Data and Statistical Analysis). Different datasets and their versions were controlled and stored in an xpresso Data Model (XDM)-enabled data store, which made it easy to store and retrieve datasets and files. This was achieved by using two features of the platform:

  1. Data Connectivity Marketplace libraries
  2. Data Versioning

Finally, we drew in-depth insights from the historical data on various providers’ billing behavior and detected over $50 million in potential billing increases beyond permissible limits. These insights provided much-needed clarity while negotiating payer-provider contracts.

How the platform can help healthcare organizations transform their journey to cognitive AI solutions

The platform is an AI/ML Application Lifecycle Management Platform. It enables complete lifecycle management of AI/ML solutions, addressing the AI transformation journey of enterprises on any cloud platform of choice. It offers functionality essential for building AI/ML solutions, primarily enabling data scientists to rapidly build predictive and prescriptive models. The platform provides a user-friendly interface to develop, deploy, and manage AI/ML solutions at scale. In addition, it supports the incorporation of these solutions into business processes, surrounding infrastructure, products, and applications.

Key benefits of the platform include:

  • Empowers data scientists to transform AI/ML research into solutions  
  • Improves the productivity of data scientists by enabling them to focus on the business problem, algorithm development, and rapid experimentation with models
  • Addresses the shortage of skilled data science resources with automated workflows, toolkits and frameworks  
  • Manages AI transformation journey costs without any wastage of R&D efforts  
  • Provides an enterprise-ready and secure environment for complete lifecycle management of AI/ML applications 
  • Enables at-scale deployment of enterprise AI/ML applications in on-premises, cloud (AWS, GCP, Azure), or hybrid environments

We can schedule a demo of the platform for anyone interested in learning more.
