
Data Warehouse Upgrade to Enable Scalable Analytics


The Customer

National Healthcare Services Company

The Project

Data Warehouse Upgrade to Enable Scalable Analytics

Overview

This Fortune 500 healthcare services company engaged Improving to upgrade its underperforming data warehouse into a modern data platform that enables scalable analytics. The new data architecture provides a higher-quality data environment, resulting in faster queries and faster data updates.

The Challenge

Our client relied on a SQL Server instance to support their business operations, primarily serving custom data analyses through a web portal. The SQL Server couldn’t handle the tens of millions of rows, thousands of lines of ETL code, and dozens of distinct search and aggregation patterns, which resulted in latency in live queries and delays in data updates. This complexity also carried over into data and code management across the web and reporting systems. Additionally, new healthcare regulations providing better patient and provider financial protections accelerated demand for our client’s services, and their team needed additional support.

The Solution

Our client needed to scale their reporting solution to accommodate more business cases, more data sources, and more real-time sources. Our data engineering team accelerated business and technical scaling, improved the reporting structure, and designed a modern data warehouse solution in Snowflake. Improving provided multiple resources to modernize the data platform on two fronts:

  • Migrate to a scalable modern data warehouse (Snowflake) to reduce the analytical load on the operational SQL Server

  • Accelerate SQL Server ELT development and warehouse performance to handle more use cases with higher complexity

Solution Part 1: Snowflake Migration

We recognized the scalability issues and complexities of using a transactional RDBMS for such complex reporting, especially since much of the reporting needed to be optimized using denormalization and preaggregations, which reduced the fidelity of the data used in reporting. Improving recommended Snowflake and brokered a conversation between our client, Snowflake, and internal architects.
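The fidelity loss from preaggregation can be seen in a small sketch. The table and column names below are illustrative, not the client’s actual schema: once rows are rolled up at load time, per-record questions can no longer be answered from the aggregate alone.

```python
# Illustrative claims data; names and values are invented for this example.
claims = [
    {"provider": "A", "amount": 100.0},
    {"provider": "A", "amount": 300.0},
    {"provider": "B", "amount": 50.0},
]

# Preaggregated reporting table: one row per provider, built at load time.
preagg = {}
for c in claims:
    entry = preagg.setdefault(c["provider"], {"total": 0.0, "count": 0})
    entry["total"] += c["amount"]
    entry["count"] += 1

# The aggregate answers "what is each provider's total?"...
assert preagg["A"]["total"] == 400.0
# ...but "how many of A's claims exceeded 200?" is unanswerable here:
# the per-claim detail was discarded when the rollup was built.
```

A scalable warehouse like Snowflake lets reporting run against the full-fidelity data instead, removing the need for such lossy rollups.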

Improving is rebuilding the data warehouse in Snowflake using modern data engineering tools and practices:

  • Ingested data from multiple sources (Salesforce, CDC, Kafka, BDPaaS, Hive, SQL Server)

  • Integrated multiple technologies as part of the stack (Snowflake, Kafka Connect, Talend, Airflow, Kubernetes)

  • Extended API connectivity to use Snowflake, in addition to SQL Server

  • Implemented layered data storage patterns (raw to consumption layers) and historical reporting (Snowflake Time Travel)
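The raw-to-consumption layering above can be sketched as a simple two-stage pipeline. This is a minimal illustration using plain Python structures in place of Snowflake tables; the field names and cleaning rules are assumptions, not the client’s actual logic.

```python
def to_staging(raw_rows):
    """Raw -> staging: drop malformed records, normalize types."""
    staged = []
    for row in raw_rows:
        if row.get("claim_id") is None:
            continue  # a real pipeline would quarantine these rows
        staged.append({
            "claim_id": str(row["claim_id"]),
            "amount": float(row.get("amount", 0)),
            "state": row.get("state", "UNKNOWN").upper(),
        })
    return staged

def to_consumption(staged_rows):
    """Staging -> consumption: aggregate into a reporting-friendly shape."""
    totals = {}
    for row in staged_rows:
        totals[row["state"]] = totals.get(row["state"], 0.0) + row["amount"]
    return totals

raw = [
    {"claim_id": 1, "amount": "120.50", "state": "tx"},
    {"claim_id": None, "amount": "99.00"},              # malformed record
    {"claim_id": 2, "amount": "80.00", "state": "TX"},
]
report = to_consumption(to_staging(raw))
```

Keeping the raw layer untouched and deriving the consumption layer from it is what allows full-fidelity reprocessing and, combined with Time Travel, historical reporting.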

Solution Part 2: SQL Server ETL/Warehouse

Improving onboarded quickly to own and develop SQL Server ETL improvements, expand the warehouse, and organize the development process. We found that the rapid pace of earlier development had left significant technical debt behind. While adding business features, Improving implemented a modern development process, including:

  • Improved Agile processes (story writing, acceptance testing, and story organization) to fit data analytics use cases better

  • Implemented source control and change management for the database, procedure logic, and ETL in GitHub

  • Migrated deployments to use automated CI/CD process in Azure DevOps instead of cumbersome manual deployments

  • Developed a Python-based automated testing framework used for test-driven development of new database features

  • Refactored ETL and slow-running processes using clean coding principles

  • Implemented a data cataloging microsite using Markdown and GitHub Pages for users to discover, search, and understand the database’s technical design

  • Trained team on data engineering best practices like test automation, CI/CD, and documentation
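A test-driven database workflow like the one described might look like the sketch below. This is an illustration, not the client’s framework: SQLite stands in for SQL Server, and the table, constraint, and test names are invented for the example.

```python
import sqlite3

def create_claims_table(conn):
    """Database feature under test: a table with a data-quality constraint."""
    conn.execute("""
        CREATE TABLE claims (
            claim_id INTEGER PRIMARY KEY,
            amount   REAL NOT NULL CHECK (amount >= 0)
        )
    """)

def test_rejects_negative_amounts():
    """TDD-style check: the schema must refuse invalid data."""
    conn = sqlite3.connect(":memory:")
    create_claims_table(conn)
    conn.execute("INSERT INTO claims VALUES (1, 100.0)")
    try:
        conn.execute("INSERT INTO claims VALUES (2, -5.0)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    assert rejected, "negative amounts should violate the CHECK constraint"

test_rejects_negative_amounts()
```

Writing the test first, then the schema change, gives each database feature an executable specification that CI/CD can run on every deployment.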

Business Benefit

Our client transitioned from merely maintaining existing SQL data to scaling human and system resources more effectively. The development improvements made it possible to grow from one or two data engineers to multiple contributors working in a scalable, quality-focused manner. Specifically:

  • Using Snowflake reduced data merges from over six hours to just minutes; other queries went down from 6–10 minutes to just seconds

  • Consuming data directly from Kafka instead of the batch-loaded data lake reduced data latency

  • ETL Processes that took over 48 hours to complete were refactored (and tested) to run in 10 hours or less

  • Failed ETL runs could be restarted from the last completed stage instead of from the beginning

  • Developers and users understand the system and answer data questions more quickly thanks to improved structure, standards, and documentation

  • Newly implemented database development standards improved code quality, consistency, and maintainability
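Restarting a failed ETL run from the last completed stage, as described above, typically relies on recording a checkpoint after each stage. The following is a minimal stand-alone sketch of that pattern; the stage names and checkpoint mechanism are illustrative assumptions.

```python
def run_pipeline(stages, checkpoint, log):
    """Run (name, fn) stages in order, skipping any already in `checkpoint`."""
    for name, fn in stages:
        if name in checkpoint:
            log.append(f"skip {name}")   # stage already completed earlier
            continue
        fn()
        checkpoint.add(name)             # persist this in a real pipeline
        log.append(f"ran {name}")

executed = []
stages = [
    ("extract", lambda: executed.append("extract")),
    ("transform", lambda: executed.append("transform")),
    ("load", lambda: executed.append("load")),
]

checkpoint, log = set(), []
# First run fails before "load": simulate by running only the first two stages.
run_pipeline(stages[:2], checkpoint, log)
# Rerun the full pipeline: completed stages are skipped, only "load" executes.
run_pipeline(stages, checkpoint, log)
```

In production the checkpoint would live in a durable store (a control table or orchestrator state) rather than an in-memory set, but the restart logic is the same.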

Why Improving?

We treat testing, documentation, and the Agile process as primary concerns because they are the foundation of a trustworthy data system. Clean coding and proper engineering standards increase a team’s velocity for the long term while improving the reliability and predictability of data. Our consultants built production-ready systems, not just code that happened to reach production.

When Improving builds out a data system, we solve problems with a holistic product view rather than focusing on one specific engineering problem. We build systems that meet the acceptance criteria and are not too complex for our clients to understand and own. We care about delivering the right solution because consultants should do more than “implement the requirements.”

Modern Data
Healthcare
Snowflake

Get Started

Learn more about how Improving can help you get started by contacting us today at sales@improving.com or by filling out a contact form through the link below.
