
Seamless Cloud Migration and Data Optimisation Using Databricks and Google Cloud Platform

  • Writer: Hollie Moran
  • Apr 16
  • 2 min read

Updated: May 24



The Challenge

The client needed to migrate its entire data warehouse to Databricks on Google Cloud Platform (GCP) within a fixed timeframe. The transition had to maintain operational continuity, meet stringent SLAs, and minimise downtime.

Key Issues:

  • A tight project timeline, with less than nine months to complete the migration.

  • Migration of 21 source systems and approximately 175 TB of historical data.

  • Inconsistent system documentation and unstructured legacy processes.

  • Dependency on new technologies (Databricks and GCP), which required niche expertise.

  • Access delays due to remote infrastructure setups and security protocols.

 

The Solution

Wiz Digital Services implemented a phased migration in four stages: discovery, design and development, QA and historical data migration, and deployment and monitoring.

Key Actions:

  1. Discovery Phase:

    • Ran discovery in parallel across all 21 source systems to analyse the “as-is” system architecture.

    • Categorised systems into priority levels (P1, P2, P3) to address critical components first.

    • Proactively identified access needs and reverse-engineered undocumented processes.

  2. Design and Development Phase:

    • Implemented a Proof of Concept (PoC) for one critical system to validate feasibility and effort estimates.

    • Standardised ETL processes across systems using Python, PySpark, and SQL.

    • Optimised code for performance, ensuring compliance with SLAs and cost-effectiveness.

  3. QA and Historical Data Migration:

    • Automated testing using the open-source framework Great Expectations to validate data integrity and minimise QA costs.

    • Staggered the historical data migration to ensure a smooth transition to the new platform without downtime.

  4. Deployment and Monitoring:

    • Designed automated workflows using Databricks and Google Cloud Composer to handle daily batch processes.

    • Enabled real-time monitoring and alerting for seamless operations.
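
The data-integrity validation in the QA phase can be illustrated with a minimal sketch. It is written in plain Python rather than against the Great Expectations API, and the `reconcile` function, the sample rows, and the `order_id` key are hypothetical; the expectation suites actually used on the project are not described in this case study. The idea is to compare a legacy extract with its migrated counterpart on row count, key quality, and an order-independent content checksum.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def _row_digest(row: dict) -> int:
    # Render the row with sorted column names so column order does not matter.
    canonical = "|".join(f"{k}={row[k]!r}" for k in sorted(row))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def table_checksum(rows: list) -> int:
    # Summing per-row digests makes the result independent of row order
    # while still changing when a row is duplicated or altered.
    return sum(_row_digest(r) for r in rows) % (1 << 64)


def reconcile(source: list, target: list, key: str) -> list:
    """Compare a legacy extract with its migrated copy (hypothetical helper)."""
    results = [
        CheckResult(
            "row_count",
            len(source) == len(target),
            f"source={len(source)} target={len(target)}",
        )
    ]

    keys = [row.get(key) for row in target]
    nulls = sum(k is None for k in keys)
    results.append(CheckResult("key_not_null", nulls == 0, f"{nulls} null keys"))

    non_null = [k for k in keys if k is not None]
    dupes = len(non_null) - len(set(non_null))
    results.append(CheckResult("key_unique", dupes == 0, f"{dupes} duplicate keys"))

    results.append(
        CheckResult(
            "content_checksum",
            table_checksum(source) == table_checksum(target),
            "order-independent digest sum",
        )
    )
    return results


# Hypothetical sample data: same rows, different order.
legacy = [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 25.5}]
migrated = [{"order_id": 2, "amount": 25.5}, {"order_id": 1, "amount": 10.0}]

report = reconcile(legacy, migrated, key="order_id")
all_passed = all(r.passed for r in report)
for r in report:
    print(f"{r.name}: {'PASS' if r.passed else 'FAIL'} ({r.detail})")
```

On a platform such as the one described here, checks of this kind would more likely run as declarative expectations over Spark tables than as Python loops over in-memory rows; the sketch only shows what is being asserted.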

 

The Results

The successful migration to Databricks on GCP delivered significant operational and performance improvements while adhering to the strict project timeline.

Key Outcomes:

  • Improved Performance: Daily batch jobs ran 42% faster, significantly reducing processing time.

  • Cost Optimisation: Leveraged GCP’s advanced features and eliminated reliance on costly legacy tools.

  • Seamless Migration: Successfully migrated all source systems and historical data without downtime.

  • Enhanced Monitoring: Automated alerts and reports ensured reliable operations with minimal manual intervention.

 

Key Highlights

  • Migrated 175 TB of data across 21 source systems, adhering to critical SLAs.

  • Used Photon-enabled Databricks clusters and adaptive query execution to optimise query performance.

  • Implemented Google Cloud services such as Google Kubernetes Engine (GKE), Cloud Run, and API Gateway for seamless integration and orchestration.

  • Proactively identified and resolved bottlenecks in legacy ETL pipelines, ensuring higher reliability.
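
As a rough illustration of the Photon and adaptive query execution settings mentioned above, the sketch below builds a cluster specification of the kind accepted by the Databricks Clusters API. The `runtime_engine` field and the `spark.sql.adaptive.*` keys are standard Databricks and Spark settings; the runtime version, node type, worker count, and the `build_cluster_spec` helper are placeholder assumptions, not the configuration used on this project.

```python
import json


def build_cluster_spec(name: str, num_workers: int, photon: bool = True) -> dict:
    """Return a cluster definition as a plain dict (hypothetical helper)."""
    return {
        "cluster_name": name,
        "spark_version": "13.3.x-scala2.12",  # placeholder runtime version
        "node_type_id": "n2-highmem-8",       # placeholder GCP machine type
        "num_workers": num_workers,
        # Selects the Photon engine instead of the standard Spark engine.
        "runtime_engine": "PHOTON" if photon else "STANDARD",
        "spark_conf": {
            # AQE is on by default in recent Spark 3.x releases; setting it
            # explicitly documents the intent.
            "spark.sql.adaptive.enabled": "true",
            # Merge small shuffle partitions after each stage.
            "spark.sql.adaptive.coalescePartitions.enabled": "true",
            # Split heavily skewed partitions in sort-merge joins.
            "spark.sql.adaptive.skewJoin.enabled": "true",
        },
    }


spec = build_cluster_spec("daily-batch", num_workers=4)
print(json.dumps(spec, indent=2))
```

Keeping the specification in code makes it easy to review and to reuse the same settings across the clusters that run the daily batch jobs.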

 

Why Choose Wiz Digital Services?

Our expertise in large-scale cloud migration projects ensures seamless transitions while optimising costs and performance. By leveraging advanced platforms like Databricks and Google Cloud, we deliver scalable, efficient, and secure solutions tailored to your business needs.

 

