Udayan Sawant | Data Engineer

Hi, I'm Udayan Sawant.

A

Self-driven, quick starter, passionate programmer with a curious mind who enjoys solving a complex and challenging real-world problems.

About

Innovative Data Engineer | AWS Certified | Product Management Enthusiast

I'm Udayan Sawant, a passionate Data Engineer with over 5 years of hands-on experience revolutionizing data infrastructure and driving business success through insightful analytics.

💡 Innovative Solutions Architect: I thrive on designing and implementing cutting-edge solutions leveraging the latest technologies to optimize data processing efficiency. From developing COBOL Copybook parsers in Python to architecting modernized platforms with Apache Airflow and K8's clusters, I've consistently delivered transformative solutions driving significant cost savings and revenue uplifts.

🌐 Cloud & Big Data Expertise: With AWS certifications, I bring extensive expertise in leveraging cloud platforms to architect robust data lakes & scalable infrastructures. Proficient in Big Data frameworks including Hadoop, MapReduce, and Apache Spark, I effectively manage & analyze high-volume data to extract actionable insights.

🔧 ETL & Automation Maestro: I excel in streamlining processes and enhancing operational efficiency through automation. Whether developing automation scripts in Python or utilizing ETL tools like AWS Glue and Informatica PowerCenter, I simplify complex workflows & drive data-centric decision-making.

📊 Strategic Analytics & Visualization: Leveraging advanced statistical techniques & tools such as Databricks and Tableau, I've a track record of uncovering hidden trends, building predictive models, & presenting actionable insights through compelling data visualizations. Empowering organizations to make informed decisions & propel business growth.

🌟 Cross-Functional Leadership: A knack for leading cross-functional teams in designing & implementing real-time data pipelines, fostering collaboration and innovation to drive business-critical initiatives. Bridging the gap between technical complexities & business objectives, instrumental in delivering successful outcomes.

📚 Continuous Learning & Growth: Deeply passionate about staying abreast of emerging technologies and industry trends. Obtaining certifications in cloud computing and data science while pursuing ongoing professional development opportunities, committed to expanding my skill set & driving continuous improvement.

Let's Connect! Unlock the full potential of your data and drive actionable insights! Let's collaborate to propel your organization towards data-driven excellence. Reach out to explore how my expertise can innovate, optimize, and achieve success together!

Experience

J. P. Morgan

Data Engineer

Led the migration of FINRA's data collection infrastructure from a legacy XML-based relational database to Amazon DocumentDB, resulting in a modern, scalable, and efficient data management system.
Applied advanced optimization techniques and performance tuning strategies to improve query execution, data retrieval speed, and resource utilization in Amazon DocumentDB.
Leveraged DocumentDB’s features to enhance system performance and scalability, achieving a 50% reduction in development cycles and a 50% cost saving with AWS Graviton2 instances.
Developed and maintained robust data pipelines using AWS Glue and Amazon Kinesis for data ingestion, transformation, and streaming, ensuring reliable and timely processing of large data volumes while maintaining data integrity and consistency across multiple sources.
Established proactive monitoring and maintenance protocols utilizing AWS CloudWatch and custom tools, implementing real-time issue detection and mitigation strategies to ensure ongoing system health, data availability, and compliance with regulatory requirements.

July 2023 - Present | San Francisco, CA

First Republic Bank

Data Engineer

Designed and implemented an ETL strategy utilizing PySpark and Snowflake, reducing processing time by 70% and saving $3.2M annually.
Led the migration from on-premise data warehouses to Amazon Redshift, optimizing data management for a petabyte-scale environment, employing AWS Redshift for high-performance querying and data analytics.
Managed end-to-end data migration using AWS Schema Conversion Tool (SCT), converting and deploying Snowflake schemas to Amazon Redshift.
Leveraged Redshift’s MPP architecture and columnar storage for efficient data querying, reducing load times and implementing best practices for data integrity and cost-efficiency.
Developed SQL queries and utilized Redshift Spectrum for advanced data analysis and market trend insights., creating comprehensive reports for business decision-making and strategic planning.

November 2021 - July 2023 | San Francisco, CA

Netflix

Data Engineer

Developed Metaflow, a Python library designed to streamline data science workflows by integrating various layers of the data science stack, including modeling, deployment, versioning, orchestration, compute, and data management.
Defined and scheduled complex data workflows with Metaflow, managing up to 20,000 tasks and ensuring high availability and scalability.
Developed and tested workflows using Metaflow’s local scheduler for rapid prototyping and transitioned successful prototypes to production-grade schedulers like AWS Step Functions.
Built a forecasting model for Netflix subscription numbers using advanced time series analysis methods including ARIMA, ETS, and STL.
Conducted EDA to uncover trends, patterns, and correlations in the subscription data, utilizing visualization techniques to present insights into historical data and identify key features for model improvement.
Employed evaluation metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) for model assessment. Implemented cross-validation techniques to ensure model robustness and generalizability.

August 2020 - November 2021 | Los Gatos, CA

Pace University

Data Modeler - Graduate Research Assistant

Created a Python scraper for structuring data on 5M+ publications and used Natural Language Toolkit for abstract searches, evaluating performance using RMSE.
Provided ongoing support for application issues, resulting in a 47% increase in user satisfaction through bug fixes and feature improvements.

September 2019 - July 2020 | New York, NY

UNICEF

Data Analytics Associate

Led data-driven initiatives and worked closely with Senior Leaders across product, sales, and operations to guide product development, creating strategic decisions and launching new business development projects.
Automated historical data persistence in DynamoDB using Python, and analyzed market data to support strategic initiatives in the Asia Pacific and US regions.

May 2019 – August 2019 | New York, NY