I am a data professional with a Master’s in Information Technology and Analytics from Rochester Institute of Technology (GPA: 3.9), combining strong academic expertise with industry experience to design, build, and optimize end-to-end data solutions. My work bridges data analytics and data engineering, allowing me to manage the full data lifecycle from ingestion and transformation to visualization and actionable insights.
Over the past 2+ years, I have delivered measurable business impact across transportation, telecom, logistics, retail, and public sector projects.
My expertise spans Python, SQL, and R for data manipulation and modeling; Snowflake, Redshift, BigQuery, and Databricks for scalable warehousing; and Airflow, AWS Glue, Terraform, and CI/CD for automation and infrastructure. On the analytics side, I create KPI-rich, interactive dashboards using Tableau, Power BI, QuickSight, and Looker Studio, enabling leaders to make confident, data-driven decisions.
I excel at aligning technical execution with strategic goals, ensuring data is accurate, insightful, and impactful, empowering organizations to act faster and smarter.
Results-driven Data Analyst & Engineer with 2+ years of experience delivering measurable business impact by designing scalable data solutions, applying advanced analytics and machine learning, and translating complex datasets into actionable strategies that optimize performance and drive growth.
Structured skillset grouped by domain and proficiency, combining tools, platforms, and techniques across analytics, engineering, and cloud technologies.
Python
SQL
R
Java
C++

PySpark
Tableau
Power BI

Looker Studio

QuickSight
D3.js
Excel
Google Sheets
dbt
Apache Spark
Apache Kafka
Kubernetes
Docker
PostgreSQL
MongoDB

AWS
Google Cloud
Azure
Snowflake
Databricks
Amazon EC2
Amazon S3
AWS Lambda
Amazon RDS
Amazon Redshift
AWS Glue
Jupyter
TensorFlow
PyTorch
Scikit-learn
Keras
Kaggle
HTML5
CSS3
JavaScript
React
Node.js
Git
CI/CD
Saayam For All
Saayam For All is a mission-driven nonprofit organization dedicated to empowering underserved communities by providing access to essential resources, education, and healthcare services. It leverages data-driven strategies and innovative technology to optimize program outcomes and maximize social impact across multiple regions.
Rochester Institute of Technology
Rochester Institute of Technology (RIT) is a top-ranked U.S. university known for its strong focus on technology, innovation, and experiential learning. It offers leading programs in data science, computing, and analytics with a strong industry connection.
StandardWings Technologies Pvt. Ltd.
StandardWings Technologies Pvt. Ltd., headquartered in Nashik, India, is a leading IT solutions provider specializing in web and mobile development, IoT, AI/ML systems, and SAP implementation. With over 10 years of experience, 500+ successful projects, and 400+ satisfied clients across government, logistics, healthcare, and fintech sectors, StandardWings delivers innovative, scalable, and user-centric solutions through agile methodologies, transparent communication, and a strong commitment to quality.
Rochester Institute of Technology
Grade: 3.9/4.0
Overview: Interdisciplinary program blending computing foundations with analytical rigor, equipping me to solve complex business problems through data-driven insights. Emphasis on advanced data technologies, statistical modeling, visualization, and decision science.
Key Coursework: Data Science & Analytics Foundations, Database Design & Implementation, Non-Relational Data Management (MongoDB, NoSQL), Visual Analytics (Tableau, Power BI), Information Retrieval & Text Mining, Time Series Analysis & Forecasting, Data Warehousing, Data-Driven Knowledge Discovery (ML-based analytics), Capstone in Information Technology & Analytics.
Highlights: Honed proficiency in Python, R, SQL, and modern data ecosystems; actively participated in tech seminars, analytics meetups, and interdisciplinary research events hosted by the Golisano College of Computing and Information Sciences, gaining exposure to emerging technologies and collaborative innovation.
MIT World Peace University
Grade: 9.27/10.00
Overview: Strong technical foundation in computing and analytics, combining engineering principles with practical problem-solving, software development, and algorithmic thinking. Program emphasized hands-on, implementation-first learning.
Key Coursework: Data Structures & Algorithms, Object-Oriented Programming (OOP), Operating Systems, Database Management Systems, Computer Networks, Machine Learning, Artificial Intelligence, Software Engineering & Agile Methodologies, Web Development & Cloud Infrastructure, Cybersecurity, Blockchain Fundamentals, Applied Mathematics.
Highlights: Consistently achieved first-class standing; engaged in coding contests, technical paper presentations, peer mentoring, and student development clubs, refining teamwork, public speaking, and leadership abilities while fostering a collaborative engineering mindset.
Below are sample data analytics projects built with SQL, Python, Power BI, and machine learning.
HealthLens is a data-driven analytics project that uncovers operational inefficiencies in healthcare using Python-based EDA and interactive Power BI dashboards. By integrating multiple tables like billing, prescriptions, and diagnoses, it enables hospital administrators to track KPIs, spot anomalies, and make evidence-based decisions. Tools used include Pandas, Seaborn, Plotly, Power BI, and SQL.
CareAllocate models optimal distribution of limited healthcare resources (beds, staff, ventilators) across regions using linear programming and constraint optimization. Built with Python, Pandas, PuLP, and Streamlit, it helps policymakers or hospital admins make critical real-time decisions during pandemics or emergencies. Visual outputs support transparency and faster strategic planning.
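The allocation problem described above can be illustrated with a minimal sketch. It assumes a single resource type (for example, ventilators), one shared supply limit, and a per-region demand cap. For that narrow case the linear program "maximize total benefit subject to supply and demand bounds" is solved exactly by serving regions in order of benefit per unit, so the sketch uses plain Python rather than PuLP. The function name `allocate`, the region names, and all numbers are hypothetical and are not taken from the project.

```python
from typing import Dict


def allocate(supply: float,
             demand: Dict[str, float],
             benefit: Dict[str, float]) -> Dict[str, float]:
    """Maximize sum(benefit[r] * x[r]) subject to
    sum(x) <= supply and 0 <= x[r] <= demand[r].

    With one shared constraint and unit-sized resources, filling regions
    in order of decreasing benefit gives the optimum of this linear program.
    """
    allocation = {region: 0.0 for region in demand}
    remaining = float(supply)
    # Serve the regions with the highest benefit per unit first.
    for region in sorted(demand, key=lambda r: benefit[r], reverse=True):
        if remaining <= 0 or benefit[region] <= 0:
            break  # nothing left, or no further allocation improves the objective
        granted = float(min(demand[region], remaining))
        allocation[region] = granted
        remaining -= granted
    return allocation


if __name__ == "__main__":
    # Hypothetical inputs: 100 units to share across three regions.
    demand = {"north": 40, "south": 70, "east": 30}
    benefit = {"north": 1.5, "south": 2.0, "east": 1.0}
    print(allocate(100, demand, benefit))
```

Once several resource types or coupled constraints (staff per bed, regional budgets) enter the model, the greedy shortcut no longer applies and a general solver such as the PuLP-based approach named above becomes the appropriate tool.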
ForecastPro is a robust MLOps pipeline that automates sales forecasting using Prophet and Scikit-learn. It includes CI/CD, model registry, and experiment tracking via MLflow and DVC. Designed for scalability, the system forecasts product demand and supports better inventory planning. The stack includes Python, FastAPI, Docker, GitHub Actions, and Streamlit.
AcuMedica is a Power BI–driven dashboard that visualizes key healthcare performance metrics such as patient outcomes, diagnosis frequency, and treatment trends. Using DAX, Power Query, and structured healthcare data, it provides stakeholders with actionable insights to improve clinical decision-making, resource planning, and operational efficiency in hospitals or clinics.
InfoSnip is a tailored news summarization tool designed to tackle the growing problem of information overload by transforming extensive news articles into concise and easily digestible summaries. Leveraging advanced Natural Language Processing (NLP) models, InfoSnip efficiently condenses long news pieces, making it easier for users to stay informed without spending excessive time reading full-length articles.
This project contains a machine learning-based solution for detecting dyslexia using behavioural and cognitive data. Dyslexia Detection leverages advanced machine learning models to identify dyslexic patterns in individuals, aiding in early diagnosis and intervention. The project demonstrates how data-driven approaches can enhance diagnostic processes, potentially providing more accurate and faster results than traditional methods.
Deep learning has revolutionized the analysis and interpretation of satellite and aerial imagery, addressing unique challenges such as vast image sizes and a wide array of object classes. This repository provides an exhaustive overview of deep learning techniques specifically tailored for satellite and aerial image processing. It covers a range of architectures, models, and algorithms suited for key tasks like classification, segmentation, and object detection.
This repository is dedicated to the project "Black Gold Horizon: Projecting America's Oil Future", a comprehensive analysis and forecast of crude oil production in the United States, both at the national level and for specific regions. The project utilizes advanced time-series analysis and forecasting models in R to provide predictions of future production levels. The goal of the project is to aid policymakers, industry professionals, and researchers in making data-driven decisions in the energy sector.
YouTrendify is a machine learning-based tool designed to provide predictive insights into YouTube video metrics, with a focus on subscriber growth, video ranking, and revenue estimation. By leveraging regression models, feature engineering, and advanced hyperparameter tuning, YouTrendify allows YouTube content creators and analysts to make data-driven decisions that can improve content strategy and performance.
This project, completed as part of the VAST Challenge 2016, focuses on analyzing operational data from the GASTech building. The data includes employee movements, HVAC sensor readings, and environmental parameters such as CO2 and Hazium gas levels. The goal of the project is to identify patterns, detect anomalies, and understand causal relationships between employee behavior and building conditions.
StyleSync is an AI-powered fashion compatibility system that analyzes user-uploaded clothing images and recommends visually and contextually compatible outfits. Built on the Maryland Polyvore dataset using a ResNet-50 and Multi-Layered Comparison Network (MCN) architecture, the system scores outfit compatibility and suggests matching items. It incorporates SerpAPI for real-time product recommendations with clickable links and includes a user feedback loop to personalize suggestions based on thumbs up/down ratings. The solution balances deep learning, API integration, and intuitive UI design to deliver a smart, personalized styling experience.
This project involves the creation of a visually informative dashboard to analyze Airbnb listings in New York City. The goal is to provide insights into rental prices, availability, and other key metrics across the boroughs of NYC using data visualization techniques. The dashboard enables users to interact with the data, making it easier to derive meaningful patterns and trends in the rental market.
Built a Formula 1 data engineering project using Spark on Azure Databricks with a Delta Lake architecture. A Formula 1 season runs once a year and consists of roughly 20 races, each held over a weekend. Roughly 10 teams (constructors) participate in a season, and each team has two drivers who take part in each race. A qualifying session before each race determines the starting order. Each driver can make multiple pit stops to change tires or repair a damaged car. Driver standings and constructor standings are decided based on the race results: the driver at the top of the driver standings becomes the drivers' champion, and the team that tops the constructor standings becomes the constructors' champion.
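The standings logic described above amounts to summing points per driver and per constructor. The sketch below shows that aggregation in plain Python rather than Spark, assuming each race result is already reduced to a `(driver, constructor, points)` tuple. The function name `standings`, the tuple layout, and the sample names are hypothetical, and ties are broken alphabetically as a simplification.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed record shape: (driver, constructor, points scored in one race)
RaceResult = Tuple[str, str, float]
Ranking = List[Tuple[str, float]]


def _rank(totals: Dict[str, float]) -> Ranking:
    """Sort by points (highest first), then by name for a stable order."""
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


def standings(results: Iterable[RaceResult]) -> Tuple[Ranking, Ranking]:
    """Return (driver standings, constructor standings) for a season."""
    driver_points: Dict[str, float] = defaultdict(float)
    constructor_points: Dict[str, float] = defaultdict(float)
    for driver, constructor, points in results:
        driver_points[driver] += points
        # Every point a driver scores also counts for the driver's team.
        constructor_points[constructor] += points
    return _rank(driver_points), _rank(constructor_points)


if __name__ == "__main__":
    # Hypothetical results from two races.
    season = [
        ("driver_a", "team_x", 25), ("driver_b", "team_x", 18),
        ("driver_c", "team_y", 15),
        ("driver_a", "team_x", 18), ("driver_c", "team_y", 25),
    ]
    drivers, constructors = standings(season)
    print("Drivers' champion:", drivers[0][0])
    print("Constructors' champion:", constructors[0][0])
```

In a Spark pipeline the same step would typically be a group-by on driver or constructor followed by a sum and an ordering, with the result written to a presentation-layer table.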
This project focuses on analyzing customer churn within a telecommunications company using the Telco Customer Churn Dataset. The primary goal is to develop predictive models to assess customer churn and monthly charges. Several machine learning techniques, such as regression, classification, and clustering, were employed to extract insights and predict customer behavior.
Authored technical publications on data analytics and machine learning, focusing on forecasting models, performance analysis, and real-world applications.
Co-authored and published research at IEEE CONIT 2022 on human activity recognition using deep learning and computer vision. Built a 3D CNN with incremental learning to classify human actions (walking, jogging, running, boxing, waving, clapping) on the KTH dataset. Achieved 98.88% accuracy, outperforming prior methods while preventing catastrophic forgetting and enabling scalable real-time analysis.
This paper delves into the complexities of modern malware, which has advanced from simple, single-purpose software to sophisticated polymorphic variants, posing significant challenges to cybersecurity efforts. Traditional malware detection methods, reliant on signature-based classification, are increasingly ineffective against these evolving threats. Our research provides a comprehensive overview of contemporary malware detection strategies, emphasizing cutting-edge approaches such as artificial intelligence (AI), machine learning (ML) classification, deep learning, autoencoders, and IoT cloud environments.
Proposed a system combining transfer learning and incremental learning for human action recognition in videos. The model addresses challenges like catastrophic forgetting and retraining costs by learning new actions without losing prior knowledge. Evaluated on KTH and UCF101 datasets, it shows improved accuracy and efficiency over existing approaches.
Published research in American Journal of Electronics & Communication (2022) on predictive climate analytics. Analyzed 60+ years of global temperature and emissions data (NASA, UN) using regression models to forecast long-term temperature trends. Delivered insights on correlations between CO2 emissions, deforestation, and temperature anomalies through advanced data wrangling and statistical modeling.
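As a rough illustration of regression-based trend forecasting, the sketch below fits an ordinary least squares line to yearly values and extrapolates it. It is a simplification under stated assumptions: the specific models used in the paper are not described here, the helper names `fit_line` and `predict` are hypothetical, and the sample numbers are made up for demonstration and are not NASA or UN data.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x and co-variation of x with y around their means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, x: float) -> float:
    """Evaluate the fitted line at x (for example, a future year)."""
    return slope * x + intercept


if __name__ == "__main__":
    # Made-up illustrative anomalies (degrees C) for four decades.
    years = [1990, 2000, 2010, 2020]
    anomalies = [0.40, 0.55, 0.70, 0.95]
    slope, intercept = fit_line(years, anomalies)
    print(f"trend per decade: {slope * 10:.3f}")
    print(f"extrapolated 2030 value: {predict(slope, intercept, 2030):.3f}")
```

A straight-line fit only captures a constant rate of change; relating temperature to several drivers such as emissions and deforestation would call for multiple regression or a time-series model.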
Below are the details to reach out to me!