Name: Parth Gawande

Experience: 2 Years

Address: New York, USA

Open to Full-Time Roles

Skills

SQL 95%
Python 92%
Data Visualization 90%
Statistical Analysis 92%
Machine Learning 85%
Data Engineering and ETL 90%
Generative AI 82%

About

About Me

I’m a data analyst with a Master’s degree in Information Technology and Analytics from Rochester Institute of Technology (GPA: 3.9). I specialize in transforming complex datasets into actionable insights through ETL automation, interactive dashboards, and machine learning models. My experience spans the transportation, telecom, and retail sectors, where I’ve applied tools like Python, SQL, Snowflake, Tableau, Power BI, Airflow, and AWS to solve business-critical challenges. I’m passionate about data storytelling, cloud analytics, and building intelligent systems that drive smarter, faster decisions.

  • Profile: Data Science & Analytics
  • Education: MS in IT & Analytics, RIT
  • Languages: Python, SQL, R
  • BI Tools: Tableau, Power BI, Looker Studio, QuickSight, Streamlit
  • Analytics: Snowflake, Excel, Google Sheets, dbt
  • Data Engineering: ETL Pipelines, Airflow, AWS Glue, Git, CI/CD
  • Cloud Platforms: AWS Redshift, BigQuery, S3, Lambda, SageMaker, Databricks
  • Strengths: KPI Design, Communication, Data Storytelling

GitHub

Resume

Resume

Seasoned Data Analyst with 2+ years of experience driving business strategies through data-driven insights. Proven expertise in data science, statistical analysis, machine learning algorithms, and project management.

Skills

Skills & Expertise

Structured skillset grouped by domain and proficiency, combining tools, platforms, and techniques across analytics, engineering, and cloud technologies.

Programming & Analytics

Python

SQL

R

Java

C++

PySpark

Business Intelligence & Visualization

Tableau

Power BI

Looker Studio

QuickSight

D3.js

Reporting & Data Tools

Excel

Google Sheets

dbt

Data Engineering & Automation

Apache Spark

Apache Kafka

Kubernetes

Docker

PostgreSQL

MongoDB

Cloud & Platforms

AWS

Google Cloud

Azure

Snowflake

Databricks

AWS Technologies

Amazon EC2

Amazon S3

AWS Lambda

Amazon RDS

Amazon Redshift

AWS Glue

Machine Learning & Data Science

Jupyter

TensorFlow

PyTorch

Scikit-learn

Keras

Kaggle

Web Technologies

HTML5

CSS3

JavaScript

React

Node.js

Version Control & Deployment

Git

CI/CD

Experience


Graduate Teaching Assistant

January 2024 - May 2025, Rochester Institute of Technology

Rochester Institute of Technology (RIT) is a top-ranked U.S. university known for its strong focus on technology, innovation, and experiential learning. It offers leading programs in data science, computing, and analytics with a strong industry connection.

  • Supported 150+ students in applied math, bridging theory with real-world analytics, including regression and data normalization.
  • Improved exam scores by 20% through data-driven review sessions and targeted concept workshops.
  • Conducted 200+ hours of tutoring using Excel, Desmos, and Python simulations to strengthen quantitative reasoning.
  • Updated course rubrics and assignments to reflect analytics use cases like forecasting and model fitting.
  • Introduced visual aids to simplify complex math topics such as curve fitting and scaling in real-world applications.
  • Built custom learning materials and engaged students across forums and office hours to boost STEM confidence.

Data Analyst

July 2022 - January 2023, StandardWings Technologies Pvt. Ltd.

StandardWings Technologies Pvt. Ltd., founded in 2014 and based in Nashik, India, is a digital solutions provider specializing in web and mobile app development, IoT, AI/ML systems, and SAP services. They serve diverse industries, including government, manufacturing, BFSI, healthcare, and logistics, delivering innovative and cost-effective technology solutions. Their expertise spans frontend and backend development, CMS, cross-platform mobile apps, and enterprise mobility solutions.

  • Automated 8+ ETL pipelines with AWS Glue & PySpark, processing 9,000+ weekly trip records from 150+ GPS/RFID-enabled vehicles.
  • Visualized fleet KPIs across 14 Tableau/QuickSight dashboards, cutting fuel cost variance by 15% and saving 30+ hours/month.
  • Built alerting workflows via AWS Lambda, SNS, and CloudWatch to flag route anomalies, reducing delays and improving field response by 22%.
  • Guided 5 interns on data engineering tasks, scaling output across 6 pipelines and 4 dashboards while eliminating 20+ manual hours/week.
  • Converted 20+ business needs into actionable metrics and dashboard features, improving stakeholder adoption by 40%.
  • Monitored metric reliability using Python health checks and CloudWatch, cutting reporting discrepancies by 31%.

Data Analyst Intern

May 2021 - May 2022, StandardWings Technologies Pvt. Ltd.

  • Engineered 5 ETL pipelines using AWS Glue & Airflow to unify 10 city departments into a central Snowflake data hub.
  • Mapped service gaps using GIS and Redshift Spectrum, helping city officials target delays and reduce incident response time by 5 minutes.
  • Automated 20+ reports with Python and SES, improving weekly communication between analytics and civic leadership.
  • Reduced pipeline runtimes by 4.5+ hours/day by enhancing Airflow DAG scheduling and implementing fault-tolerant logic.
  • Developed Tableau dashboards to monitor SLA performance and resolution trends across 16 urban wards.
  • Embedded SQL-based quality checks in pipelines, reducing null metrics and data issues by over 80%.



Education


MS in Information Technology and Analytics

January 2023 - May 2025, Rochester Institute of Technology

Grade: 3.9/4.0

Relevant Coursework: Data Science & Analytics, Database Design, Non-Relational Databases, Data Warehousing, Visual Analytics, Information Retrieval & Text Mining, Time Series Forecasting, Knowledge Discovery

B.Tech in Computer Science and Engineering

July 2018 - July 2022, MIT World Peace University

Grade: 9.27/10.00

Relevant Coursework: Data Structures & Algorithms, Design & Analysis of Algorithms, Theory of Computation, Database Management Systems, Operating Systems, Web Technologies, Cloud Computing, Software Engineering, Artificial Intelligence, Machine Learning, Data Analytics, Big Data & Hadoop, Data Mining, Applied Mathematics

Projects

Below are sample data analytics projects built with SQL, Python, Power BI, and machine learning.

HealthLens: Monitoring Healthcare Performance and Trends

HealthLens is a data-driven analytics project that uncovers operational inefficiencies in healthcare using Python-based EDA and interactive Power BI dashboards. By integrating multiple tables like billing, prescriptions, and diagnoses, it enables hospital administrators to track KPIs, spot anomalies, and make evidence-based decisions. Tools used include Pandas, Seaborn, Plotly, Power BI, and SQL.

CareAllocate: Healthcare Resource Allocation

CareAllocate models optimal distribution of limited healthcare resources (beds, staff, ventilators) across regions using linear programming and constraint optimization. Built with Python, Pandas, PuLP, and Streamlit, it helps policymakers or hospital admins make critical real-time decisions during pandemics or emergencies. Visual outputs support transparency and faster strategic planning.

ForecastPro: Scalable Sales Forecasting via MLOps

ForecastPro is a robust MLOps pipeline that automates sales forecasting using Prophet and Scikit-learn. It includes CI/CD, model registry, and experiment tracking via MLflow and DVC. Designed for scalability, the system forecasts product demand and supports better inventory planning. The stack includes Python, FastAPI, Docker, GitHub Actions, and Streamlit.

AcuMedica: Medical Performance Dashboard

AcuMedica is a Power BI–driven dashboard that visualizes key healthcare performance metrics such as patient outcomes, diagnosis frequency, and treatment trends. Using DAX, Power Query, and structured healthcare data, it provides stakeholders with actionable insights to improve clinical decision-making, resource planning, and operational efficiency in hospitals or clinics.

InfoSnip: Tailored News Summarization

InfoSnip is a tailored news summarization tool designed to tackle the growing problem of information overload by transforming extensive news articles into concise and easily digestible summaries. Leveraging advanced Natural Language Processing (NLP) models, InfoSnip efficiently condenses long news pieces, making it easier for users to stay informed without spending excessive time reading full-length articles.

Dyslexia Detection Using Machine Learning

This project contains a machine learning-based solution for detecting dyslexia using behavioural and cognitive data. Dyslexia Detection leverages advanced machine learning models to identify dyslexic patterns in individuals, aiding in early diagnosis and intervention. The project demonstrates how data-driven approaches can enhance diagnostic processes, potentially providing more accurate and faster results than traditional methods.

GeoEnvision: Satellite Image Enhancement and Topography Detection for Urban Planning

Deep learning has revolutionized the analysis and interpretation of satellite and aerial imagery, addressing unique challenges such as vast image sizes and a wide array of object classes. This repository provides an exhaustive overview of deep learning techniques specifically tailored for satellite and aerial image processing. It covers a range of architectures, models, and algorithms suited for key tasks like classification, segmentation, and object detection.

Black Gold Horizon: Projecting America's Oil Future

This repository is dedicated to the project "Black Gold Horizon: Projecting America's Oil Future", a comprehensive analysis and forecast of crude oil production in the United States, both at the national level and for specific regions. The project utilizes advanced time-series analysis and forecasting models in R to provide predictions of future production levels. The goal of the project is to aid policymakers, industry professionals, and researchers in making data-driven decisions in the energy sector.

YouTrendify: YouTube Video Insights Engine

YouTrendify is a machine learning-based tool designed to provide predictive insights into YouTube video metrics, with a focus on subscriber growth, video ranking, and revenue estimation. By leveraging regression models, feature engineering, and advanced hyperparameter tuning, YouTrendify allows YouTube content creators and analysts to make data-driven decisions that can improve content strategy and performance.

Analyzing GASTech Building Operations Data

This project, completed as part of the VAST Challenge 2016, focuses on analyzing operational data from the GASTech building. The data includes employee movements, HVAC sensor readings, and environmental parameters such as CO2 and Hazium gas levels. The goal of the project is to identify patterns, detect anomalies, and understand causal relationships between employee behavior and building conditions.

StyleSync: AI-Driven Personalized Outfit and Shopping Assistant

StyleSync is an AI-powered fashion compatibility system that analyzes user-uploaded clothing images and recommends visually and contextually compatible outfits. Built on the Maryland Polyvore dataset using a ResNet-50 and Multi-Layered Comparison Network (MCN) architecture, the system scores outfit compatibility and suggests matching items. It incorporates SerpAPI for real-time product recommendations with clickable links and includes a user feedback loop to personalize suggestions based on thumbs up/down ratings. The solution balances deep learning, API integration, and intuitive UI design to deliver a smart, personalized styling experience.

Data Analysis on Airbnb in New York City

This project involves the creation of a visually informative dashboard to analyze Airbnb listings in New York City. The goal is to provide insights into rental prices, availability, and other key metrics across the boroughs of NYC using data visualization techniques. The dashboard enables users to interact with the data, making it easier to derive meaningful patterns and trends in the rental market.

F1Track – Real-Time Performance Analytics Using Databricks

Built a Formula 1 data engineering project using Spark on Azure Databricks with a Delta Lake architecture. A Formula 1 season runs once a year and comprises roughly 20 races, each held over a weekend. Roughly 10 teams (constructors) participate in a season, and each team has two drivers. A qualifying session determines the order in which drivers start the race. During a race, each driver can make multiple pit stops to change tires or repair damage. Race results determine the driver and constructor standings: the driver at the top of the drivers' standings becomes the drivers' champion, and the team that tops the constructors' standings becomes the constructors' champion.

Churnlytics: Telecom Customer Retention Risk Modeling

This project focuses on analyzing customer churn within a telecommunications company using the Telco Customer Churn Dataset. The primary goal is to develop predictive models to assess customer churn and monthly charges. Several machine learning techniques, such as regression, classification, and clustering, were employed to extract insights and predict customer behavior.

Publications

Research Work

Authored technical publications on data analytics and machine learning, focusing on forecasting models, performance analysis, and real-world applications.

IEEE CONIT: Video-Based Human Activity Detection

Co-authored and published research at IEEE CONIT 2022 on human activity recognition using deep learning and computer vision. Built a 3D CNN with incremental learning to classify human actions (walking, jogging, running, boxing, waving, clapping) on the KTH dataset. Achieved 98.88% accuracy, outperforming prior methods while preventing catastrophic forgetting and enabling scalable real-time analysis.

A Survey Paper on Malware Detection Methods

This paper delves into the complexities of modern malware, which has advanced from simple, single-purpose software to sophisticated polymorphic variants, posing significant challenges to cybersecurity efforts. Traditional malware detection methods, reliant on signature-based classification, are increasingly ineffective against these evolving threats. Our research provides a comprehensive overview of contemporary malware detection strategies, emphasizing cutting-edge approaches such as artificial intelligence (AI), machine learning (ML) classification, deep learning, autoencoders, and IoT cloud environments.

Combination of Transfer Learning and Incremental Learning for Anomalous Activity Detection in Videos

Proposed a system combining transfer learning and incremental learning for human action recognition in videos. The model addresses challenges like catastrophic forgetting and retraining costs by learning new actions without losing prior knowledge. Evaluated on KTH and UCF101 datasets, it shows improved accuracy and efficiency over existing approaches.

Climate Change: Preliminary Analysis and Prediction Using Global Temperature Data

Published research in American Journal of Electronics & Communication (2022) on predictive climate analytics. Analyzed 60+ years of global temperature and emissions data (NASA, UN) using regression models to forecast long-term temperature trends. Delivered insights on correlations between CO2 emissions, deforestation, and temperature anomalies through advanced data wrangling and statistical modeling.

More projects on GitHub

I love to solve business problems & uncover hidden data stories


GitHub

Contact

Contact Me

Below are the details to reach out to me!