Department of Health and Human Services

Migrating Complex SAS Processes to Databricks | CMS

Type: Case Study
Department: Division of Information Systems at Center for Medicaid and CHIP Services
Agency: Department of Health and Human Services

Background

Each month, states submit T-MSIS (Transformed Medicaid Statistical Information System) files to CMS that contain critical Medicaid and CHIP data on beneficiary eligibility, enrollment, service utilization, and many other program functions. CMS uses SAS jobs to apply ~3K business rules and ~1.5K data quality checks to each state submission.

Problem

The SAS jobs can process only one state at a time rather than running multiple states concurrently, so a full run of business rules and data quality checks across all 52 state files takes days. The SAS jobs are also difficult to maintain, troubleshoot, and update. They are not optimized for cloud use, and after many years of developer contributions they have become esoteric and complex.

Solution

Akira used the AWS cloud and modern, open-source/open-standard data technologies (Databricks & Python/Pandas/PySpark, Flake8) to migrate critical data processes from SAS to Databricks. Akira also created a library of reusable APIs for analytical use cases, wrote detailed documentation (Sphinx), and conducted knowledge transition to the Data Connect staff.
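As a rough illustration of the shift from one-state-at-a-time processing to concurrent processing, the sketch below expresses data quality rules as small reusable objects and applies them to several state submissions at once. It is not the project's code: the `Rule` class, the two example rules, the record fields (`beneficiary_id`, `enrollment_days`), and the sample data are all hypothetical, and standard-library threads stand in for the distribution that a PySpark job on Databricks would more likely provide.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

# Hypothetical record shape: a plain dict per beneficiary row.
# Real T-MSIS layouts and rules are not described in this case study.
Record = dict


@dataclass(frozen=True)
class Rule:
    """A named data quality check that returns True when a record passes."""
    name: str
    passes: Callable[[Record], bool]


# Two made-up rules, standing in for the thousands of real ones.
RULES = [
    Rule("beneficiary_id_present", lambda r: bool(r.get("beneficiary_id"))),
    Rule("enrollment_days_in_range", lambda r: 0 <= r.get("enrollment_days", -1) <= 31),
]


def check_state(records: list[Record]) -> dict[str, int]:
    """Count failures per rule for a single state's submission."""
    failures = {rule.name: 0 for rule in RULES}
    for record in records:
        for rule in RULES:
            if not rule.passes(record):
                failures[rule.name] += 1
    return failures


def check_all_states(submissions: dict[str, list[Record]]) -> dict[str, dict[str, int]]:
    """Submit every state's checks to a pool instead of running them serially."""
    with ThreadPoolExecutor() as pool:
        futures = {state: pool.submit(check_state, recs) for state, recs in submissions.items()}
        return {state: future.result() for state, future in futures.items()}


# Hypothetical sample submissions for two states.
submissions = {
    "AK": [{"beneficiary_id": "A1", "enrollment_days": 30}],
    "AL": [{"beneficiary_id": "", "enrollment_days": 45}],
}
results = check_all_states(submissions)
for state, failures in sorted(results.items()):
    print(state, failures)
```

Keeping each rule as data (a name plus a predicate) is one way new data quality routines could be added without touching the code that runs them; whether the migrated library is organized this way is an assumption.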

Outcomes

  • The data products Akira developed produce results faster and without reliance on proprietary constructs of the SAS language. They offer more scalability, reliability, concurrency, and documentation, and they make it easier to ingest and govern new data quality routines.
  • Akira enabled CMCS (the Center for Medicaid and CHIP Services) to share information back to the states in a timely manner, allowing quicker remediation of data submission problems.

Capabilities Shown

  • Data Engineering
  • Data Analytics
