Gagan Goyal

Data Engineer

Experienced, self-motivated, results-oriented Data Engineer with strong problem-solving, communication, and leadership skills.

About Me

I am a Data Engineer with nearly 6 years of experience working with modern technologies such as PySpark, ADF, Databricks, Delta Lake, Cosmos DB, Hadoop, SQL, and Python. I hold a B1/B2 US visa valid until 2028.

Noida, Uttar Pradesh

+91 8561920884

gagangoyal.cs@gmail.com

Experience

WNS Analytics

JUN 2023 - PRESENT

Store Profile and Behaviour

Created reports for store-profile and store-behaviour KPIs such as revenue, ACV, ABV, ABU, AUP, and customer counts.

  • Developed SQL scripts to implement business logic
  • Created Step Functions workflows to orchestrate the pipeline
  • Created Glue jobs to perform ETL operations
AWS Glue · Step Functions · Redshift · SQL · Python · Qlik
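As a hedged illustration of how such KPIs relate to raw transactions (the acronym expansions and field names below are my assumptions, not the project's actual SQL):

```python
# Illustrative sketch of store-behaviour KPIs; field names and the
# acronym readings (ACV/ABV/ABU/AUP) are hypothetical, not the project's spec.

def store_kpis(transactions):
    """Compute basic store KPIs from a list of transaction dicts.

    Each transaction is assumed to look like:
    {"customer_id": str, "basket_id": str, "units": int, "revenue": float}
    """
    revenue = sum(t["revenue"] for t in transactions)
    units = sum(t["units"] for t in transactions)
    baskets = {t["basket_id"] for t in transactions}
    customers = {t["customer_id"] for t in transactions}
    return {
        "revenue": revenue,
        "customers": len(customers),
        "ACV": revenue / len(customers),  # average customer value (assumed)
        "ABV": revenue / len(baskets),    # average basket value (assumed)
        "ABU": units / len(baskets),      # average basket units (assumed)
        "AUP": revenue / units,           # average unit price (assumed)
    }
```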

All Inclusive (Hospitality)

Built packages of hotel amenities that customers frequently buy together and identified relationships within encrypted JSON payloads.

  • Created All Inclusive packages of amenities that customers buy together
  • Enhanced the data model and added an additional layer of staging tables
  • Identified the links between tables and pulled out Stay Enhancements at the reservation level
  • Developed Python code to extract payloads, performing decryption, formatting, and JSON flattening
Google BigQuery · VS Code · Jupyter Notebook · Python
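A minimal sketch of the JSON-flattening step (the decryption logic and the real payload schema are project-specific and omitted; the sample payload below is hypothetical):

```python
import json

def flatten(obj, prefix=""):
    """Recursively flatten a nested JSON object into dot-separated keys."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{i}."))
    else:
        flat[prefix.rstrip(".")] = obj
    return flat

# Hypothetical payload for illustration only.
payload = json.loads('{"reservation": {"id": 7, "amenities": ["spa", "golf"]}}')
flatten(payload)
# {"reservation.id": 7, "reservation.amenities.0": "spa", "reservation.amenities.1": "golf"}
```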

Global Canvas – MDR Build

Migrated Datorama (Salesforce) reports to Power BI using Delta Lake in Databricks and implemented UI-level changes.

  • Migrated datasets from Datorama (Salesforce) to Azure Databricks
  • Analyzed over 22 datasets from Datorama, determining join conditions
  • Developed PySpark code to read and join datasets with varying schemas
  • Created workflow jobs and alert system for tracking failures
  • Built Power BI reports from scratch using Databricks gold tables
Azure Databricks · Delta Lake · Datorama · Azure DevOps · Power BI
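The schema-alignment idea behind joining datasets with varying schemas can be sketched in plain Python (in PySpark the analogous tool is `DataFrame.unionByName` with `allowMissingColumns=True`); the row data here is hypothetical:

```python
def align_and_union(*datasets):
    """Union lists of row dicts with differing schemas, filling gaps with None."""
    # Collect the superset of column names across all datasets.
    columns = sorted({col for rows in datasets for row in rows for col in row})
    # Re-emit every row with the full column set, defaulting missing values.
    return [{col: row.get(col) for col in columns}
            for rows in datasets for row in rows]
```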

Abacus Migration

Migrated Power BI reports from IBM Abacus to Delta Lake in Databricks and implemented UI-level changes.

  • Migrated IBM Abacus SQL scripts to Databricks
  • Created workflow jobs and alert system for tracking failures
  • Updated Power BI report connections and made UI adjustments
  • Developed PySpark code for file transfers between storage containers
Azure Databricks · Delta Lake · Azure DevOps · Power BI · ADLS Gen2

Concentrix

OCT 2021 - MAY 2023

Build Modern Data Warehouse

Transitioned from traditional data storage to a modern data warehouse.

  • Developed Azure Data Factory (ADF) pipelines for ETL tasks
  • Created an automated, generic pipeline driven by Procfwk framework parameters
  • Developed stored procedures for data transfer between layers
ADF · SSMS · Azure DevOps · Visual Studio
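The metadata-driven pattern behind a generic pipeline can be sketched as a small driver loop; the stage names and table names below are hypothetical, and in ADF with Procfwk the equivalent metadata lives in SQL tables rather than a Python list:

```python
# Hypothetical sketch of a metadata-driven pipeline driver.
# In ADF + Procfwk this metadata would live in SQL tables, not a Python list.

PIPELINE_METADATA = [
    {"stage": "extract",   "source": "erp.orders", "target": "stg.orders"},
    {"stage": "extract",   "source": "erp.items",  "target": "stg.items"},
    {"stage": "transform", "source": "stg.orders", "target": "dw.fact_orders"},
]

def run_pipeline(metadata, runner):
    """Execute each metadata entry in order via a stage-specific runner callback."""
    results = []
    for step in metadata:
        results.append(runner(step["stage"], step["source"], step["target"]))
    return results
```

The point of the pattern is that adding a new table to the warehouse means adding a metadata row, not writing a new pipeline.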

Epicor Migration

Migrated 8 databases from on-premises SQL Server to Azure SQL Server.

  • Developed ADF pipelines for data migration
  • Created automated pipeline for table structure and data transfer
  • Used ADLS Gen2 as intermediate storage
  • Developed Python script for data validation
ADF · Azure Databricks · Azure SQL DB · Azure DevOps
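A post-migration validation script typically compares row counts and per-row checksums between source and target extracts. A minimal sketch of that idea, with a hypothetical function name and row shape (the real script's connection and query logic are omitted):

```python
import hashlib

def validate_migration(source_rows, target_rows, key):
    """Compare two row sets by count and per-row checksum, keyed on `key`.

    Returns the row-count difference, the keys whose checksums disagree,
    and the keys present in the source but absent from the target.
    """
    def checksums(rows):
        return {
            row[key]: hashlib.md5(
                "|".join(str(row[c]) for c in sorted(row)).encode()
            ).hexdigest()
            for row in rows
        }

    src, tgt = checksums(source_rows), checksums(target_rows)
    return {
        "count_diff": len(source_rows) - len(target_rows),
        "mismatched_keys": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
        "missing_in_target": sorted(src.keys() - tgt.keys()),
    }
```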

Hadoop-Azure Migration

Migrated data from Hadoop to Azure environment, replicating existing tables and processes.

  • Migrated ETL pipeline stages to Databricks jobs
  • Converted Jython code into Spark
  • Implemented data insertion logic for Cosmos DB
  • Developed ADF pipelines for data transfer
  • Created validation scripts and transitioned Flume jobs
ADF · Azure Databricks · Delta Lake · Cosmos DB · ADLS Gen2 · Azure Blob Storage

Mindtree

JUN 2019 - OCT 2021

OSA (On Shelf Availability)

Addressed revenue loss due to out-of-stock products through data analysis and reporting.

  • Developed PySpark modules for data processing
  • Managed data ingestion and optimization
  • Set up Azure services and configured ADF components
  • Designed and developed Delta and SQL tables
  • Created validation scripts
Databricks · PySpark · ADF · Microsoft SQL Server · Logic Apps · Azure DevOps

Data Garage

Managed large volumes of data using Azure services.

  • Processed Avro files
  • Performed data wrangling and ingestion
  • Created and configured ADF components
  • Developed validation scripts
Databricks · PySpark · Python · Azure DW · ADF · Microsoft SQL Server

Skills

Cloud Services

Azure Databricks · Azure Data Factory · Azure SQL Database · Azure Storage Explorer · Azure Data Lake Storage · Azure Blob Storage · Cosmos DB · Azure DevOps · Google BigQuery · AWS Glue · Step Functions · Redshift · EventBridge

Programming Languages

PySpark · Python · SQL

Visualization Tools

Power BI · Tableau · Qlik

Databases

MS SQL Server · Cosmos DB · Google BigQuery · Redshift

ETL Tools

Azure Data Factory · AWS Glue · Step Functions

Frameworks & Tools

Procfwk · Delta Lake · Hadoop · Git · Azure DevOps · VS Code · Jupyter Notebook

Data Processing

Data Wrangling · Data Migration · ETL Pipeline Development · Data Validation · Data Modeling · Performance Tuning

Other Skills

Problem Solving · Team Leadership · Communication · Project Management · Documentation · Testing

Education

B.Tech (Computer Science)

JECRC Foundation, Jaipur

May 2019
