Background Image
CASE STUDIES

Healthcare Services Company

Data warehouse upgrade to enable scalable analytics
Image - Healthcare Services Company

Overview

This Fortune 500 Healthcare Services Company engaged Improving to upgrade their underperforming data warehouse into a modern data platform enabling scalable analytics. Their new, more valuable data architecture provides a higher-quality data environment, resulting in faster queries and data updates.

Challenge

Our client relied on a SQL Server instance to support their business operations, essentially using custom data analyses through a web portal.

With 10s of millions of rows, thousands of lines of ETL code, and dozens of different search and aggregation patterns, the SQL Server was overloaded, which resulted in latency in live queries and delays in data updates. This complexity was also apparent in the data and code management for all the web and reporting systems.

Additionally, the passing of new healthcare regulations to provide better patient/provider financial protections accelerated the demand for our client’s services, and their team needed additional support.

Solution

Our client needed to scale their reporting solution to accommodate more business cases, more data sources, and use more real-time sources. Improving's data engineering team worked to accelerate the business and technical scale, improved the reporting structure, and designed a modern data warehouse solution in Snowflake.

Improving provided multiple resources to modernize its data platform on two fronts:

  • Migrate to a scalable modern data warehouse (Snowflake) to reduce the analytical load on the operational SQL Server

  • Accelerate SQL Server ELT development and warehouse performance to handle more use cases with higher complexity

 

Solution Part 1: Snowflake Migration

Improving recognized the scalability issues and complexities of using a transactional RDBMS for such complex reporting, especially since much of the reporting needed to be optimized using denormalization and pre-aggregations, which reduced the fidelity of the data used in reporting. Improving recommended Snowflake and brokered a conversation between our client, Snowflake, and internal architects.

Improving is rebuilding the data warehouse in Snowflake using modern data engineering tools and practices, with the following highlights:

  • Ingested data from multiple sources (Salesforce, CDC, Kafka, BDPaaS, Hive, SQL Server)

  • Integrated multiple technologies as part of the stack (Snowflake, Kafka Connect, Talend, Airflow, Kubernetes)

  • Extended API connectivity to use Snowflake, in addition to SQL Server

  • Implemented data patterns storage (raw → consumption layers), historical reporting (time travel)

 

Solution Part 2: SQL Server ETL/Warehouse

Improving onboarded quickly to own and develop SQL Server ETL improvements, expand the warehouse, and organize the development process. We identified that with the speed of development, a lot of technical debt was left behind. While adding business features, Improving implemented a modern development process including:

  • Helped improve the agile processes (story writing, acceptance testing, and story organization) to fit data analytics use cases better

  • Implemented source control and change management for the database, procedure logic, and ETL in Github

  • Migrated deployments to use automated CI/CD process in Azure DevOps instead of cumbersome manual deployments

  • Developed Python-based automated testing framework used for test-driven development of new database features

  • Refactored ETL and slow-running processes using clean coding principles

  • Implemented data cataloging microsite using markdown and Github pages for users to discover, search and understand database technical design

  • Trained team on data engineering best practices like test automation, CI/CD, and documentation

Business Benefits

Improving suggested various solutions, helped analyze the best options, recommended the best solutions for this specific situation, and then implemented those solutions. The result for our client is a more valuable data architecture and higher quality data environment.

The benefits of bringing skilled data engineers with experience in modern distributed data processing meant that our client could transition from maintaining existing SQL data to scaling human and system resources more effectively. The development improvements made it possible to scale from 1-2 data engineers to enable multiple resources to contribute in a scalable, quality-focused manner. Specifically:

  • Using Snowflake reduced data merges from over 6 hours to just minutes. Other queries went from 6-10 minutes down to just seconds.

  • Using data that was available in Kafka instead of using the batch-loaded data lake reduced data latency.

  • ETL Processes that took over 48 hours to complete were refactored (and tested) to run in 10 hours or less.

  • Failed ETL could be started from the last stage instead of having to start from the beginning.

  • Developers and users have understood the system and answered data questions quicker with improved structure, standards, and documentation.

  • Implemented database development standards improved code quality, consistency, and maintainability.

Why Improving

Most data engineering teams treat testing, documentation, and the agile process as secondary concerns. Improving treats them as primary concerns because they are the foundation of having a trustworthy data system. Clean coding and proper engineering standards increase the velocity of the team in the long term while also improving the reliability and predictability of data. Our consultants build production-ready systems and not just code that gets put into production.

Improving understands that we shouldn’t build complex systems that the client is unable to understand or own. We work very hard to prove they meet the acceptance criteria, i.e., that we built the right system. We’ve seen too many technologists at other clients build novel but very complex solutions, and the client couldn’t effectively evaluate if they got what they needed. We really care about delivering what the client wanted and needed because consultants should do more than “implement the requirements.”

When Improving builds out a data system, we solve problems with a holistic product view rather than focusing on one specific engineering problem.

Ready to Get Started?

We have been modernizing software systems and teams for Fortune 500s since 2006. Talk to our experts to get started on your digital transformation today.