Open to Data/ML roles & consulting

Hi, I'm Mayur Dalvi.

I’m a Data Scientist with 3.5+ years of professional experience building end-to-end data science products, from problem scoping and data contracts through feature pipelines, model training and validation, CI/CD, and monitored deployment. I’ve shipped forecasting, anomaly detection, and decision intelligence solutions used by business leaders.

I love tackling hard, ambiguous problems and turning them into usable, data-driven products. My goal is to help companies empower their users and pursue what’s next with confidence.

Python • SQL • Power BI (DAX) • Snowflake • Airflow • dbt • Spark • AWS • XGBoost • Prophet • Pandas • NumPy • Scikit-learn • TensorFlow • PyTorch • MLflow • Docker • Git • Linux / CLI • Tableau • Excel • Power Query • BigQuery • Redshift • Databricks • Azure • GCP • Looker Studio • Matplotlib • Seaborn • Plotly • Feature Engineering • EDA • Statistics • A/B Testing • Time Series • NLP • Computer Vision • Model Deployment • Monitoring • Data Modeling • ETL / ELT

What I like working on

End-to-end Data Science

Planning → data contracts → Airflow/dbt/Spark features → training/validation (scikit-learn • XGBoost • PyTorch) → packaging & CI/CD → monitored deploy with MLflow.

Forecasting, Anomaly & Causal

Hierarchical and time-series forecasting (Prophet, ARIMA), change-point and anomaly detection, uplift modeling, and causal inference (DoWhy), with backtesting and guardrails.

Decision Intelligence & BI

Executive dashboards in Power BI with curated DAX, drill-through, RLS, and a maintainable semantic layer aligned to business metrics.

LLM Systems & Agents

RAG architectures (chunking, embeddings, vector DBs), tool-use AI agents, eval harnesses, prompt safety, and observability for production quality.

Work Experience

City & County of Denver — Data Analyst

Dec 2024 – Present • Denver, CO
  • Automated audit workflows across 200+ locations with geospatial validations & policy guardrails.
  • Shipped forecasting pack (Prophet / XGBoost) with backtesting & cross-validation → accuracy +28%.
  • Integrated Snowflake Document AI pipelines (streams/tasks) → key-field accuracy ~95%.
Python • SQL • Snowflake • Power BI (DAX) • Airflow • dbt

Deque Systems — Data Analyst

Jan 2025 – May 2025 • Boulder, CO
  • Built learner analytics for 10k+ users; identified modules with +18% completion uplift.
  • Delivered executive dashboards & retention cohorts → supported +15% renewals QoQ.
Python • SQL • Power BI • Cohort Analysis

ICR Inc — Data Scientist

Aug 2024 – Dec 2024 • Boulder, CO
  • Ingested 10-year aviation + weather history; engineered features & trained XGBoost (ROC-AUC ≈ 0.80).
  • Established reproducible pipelines with drift & anomaly monitoring for reliability.
Feature Engineering • Model Eval • Time-series

NICE — Data Engineer

Jul 2021 – Jul 2023 • Pune, India
  • Designed ETL across 10TB+ call data; parallelized Spark jobs, cutting runtime from 12h to 8h.
  • Built Redshift analytics over billions of rows; storage & partitioning tuned for cost/perf.
Spark • Airflow • Redshift • Data Modeling

Selected Projects

Credit Card Fraud Detection

Python • XGBoost • Scikit-learn • Power BI

Production-style pipeline for imbalanced classification: cost-sensitive training, threshold tuning, PR-AUC monitoring, and analyst workflow integration. GitHub →
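As a rough illustration of the threshold-tuning idea mentioned above (not the project's actual code), the sketch below picks the score cutoff that minimizes total misclassification cost. The function name `best_threshold`, the cost values, and the toy scores and labels are all hypothetical.

```python
def best_threshold(scores, labels, cost_fp=1.0, cost_fn=10.0):
    """Return (threshold, total_cost) minimizing cost on the given data.

    A transaction is flagged as fraud when its score >= threshold.
    cost_fp and cost_fn are assumed, illustrative business costs.
    """
    # Every observed score is a candidate cutoff; +inf means "flag nothing".
    candidates = sorted(set(scores)) + [float("inf")]
    best = None
    for t in candidates:
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = cost_fp * fp + cost_fn * fn
        # Strict "<" keeps the lowest threshold among ties.
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Toy data: model scores and true labels (1 = fraud), made up for illustration.
toy_scores = [0.05, 0.10, 0.20, 0.35, 0.60, 0.90]
toy_labels = [0, 0, 0, 1, 0, 1]
threshold, cost = best_threshold(toy_scores, toy_labels)
```

Because a missed fraud is assumed here to cost ten times a false alarm, the chosen cutoff sits low enough to catch both fraud cases at the price of one false positive.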

Weather-Aware Flight Delay Modeling

Feature Engineering • Time-Series

Joined FAA/ASOS weather with flight histories; engineered lag features, evaluated with ROC-AUC across seasonal splits, and designed reproducible preprocessing. GitHub →

SageMaker Flight Price Prediction

AWS SageMaker • Jupyter • MLOps

End-to-end MLOps with experiment tracking, model registry, packaged inference, and deployment patterns for batch + near-real-time prediction. GitHub →

Movie Recommender System

Python • Similarity Search • TMDB

Content-based retrieval with vectorization + cosine similarity; fast prototyping, explainable results, and clean offline evaluation notebook. GitHub →
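A minimal sketch of content-based retrieval with cosine similarity, assuming simple bag-of-words vectors over text descriptions. The helper names and the toy catalog are hypothetical and are not taken from the project or from TMDB data.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Bag-of-words term counts for a lowercase, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(title, catalog, k=2):
    """Rank the other titles in the catalog by similarity to `title`."""
    query = vectorize(catalog[title])
    scores = [
        (other, cosine(query, vectorize(desc)))
        for other, desc in catalog.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


# Invented titles and descriptions, for illustration only.
TOY_CATALOG = {
    "Space Quest": "astronaut crew explores distant planet",
    "Star Drift": "astronaut stranded on distant planet alone",
    "Baking Duel": "pastry chefs compete in kitchen contest",
}
top = recommend("Space Quest", TOY_CATALOG, k=1)
```

The shared terms between two descriptions double as the explanation for a recommendation, which is one reason this style of model is easy to explain.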

IPL Win Probability Predictor

Sports Analytics • Logistic Regression

Real-time feature updates drive calibrated win probability; showcases feature selection, calibration, and intuitive model outputs. GitHub →

Healthcare KPI Dashboard

Power BI • DAX • RLS

Executive KPIs for Patients, LOS, and Cost/Stay, built on a semantic model with curated DAX measures and row-level security. GitHub →

A/B Testing — Digital Ads

Python • statsmodels

Statistically sound experimentation: Z-tests, lift & CIs, sanity checks, outlier handling, and reproducible workflow for marketing teams. GitHub →
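For readers unfamiliar with the terms above, here is a small sketch of a two-proportion Z-test with lift and a confidence interval, written with the Python standard library only. The function name and the conversion counts are hypothetical; the project itself lists statsmodels, which this sketch does not use.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Compare conversion rates of control (A) and variant (B)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error under the null hypothesis of equal rates.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided

    # Unpooled standard error for the interval on the difference.
    se_diff = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)

    return {
        "z": z,
        "p_value": p_value,
        "abs_lift": diff,
        "rel_lift": diff / p_a,
        "ci": ci,
    }


# Made-up counts: 200/10,000 conversions in control, 250/10,000 in variant.
result = two_proportion_ztest(200, 10_000, 250, 10_000)
```

Reporting the interval alongside the p-value shows how large the effect might plausibly be, not only whether it differs from zero.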

Olympics Data Analysis

Jupyter • EDA • Viz

Historical Olympics EDA with tidy joins, feature discovery, and clear visual storytelling for non-technical stakeholders. GitHub →

Exoplanets — Data Visualization

Python • Matplotlib • Seaborn

Clean visualization of exoplanet catalogs covering scales, outliers, and correlations with disciplined chart design. GitHub →

Hollywood Market Synopsis Viz

Pandas • Plotting

Market KPIs for revenue, genres, ROI; robust CSV cleaning and reusable visualization helpers. GitHub →

Profitability Dashboard (Tableau)

Tableau • Data Modeling

Cohort and margin drill-downs; optimized extracts and pragmatic fact/dim semantics for speed. GitHub →

AI-Powered Automated Data Analysis

Python • LLM • Viz

LLM-assisted EDA & charting for CSVs with guardrails; prompts → insights + visuals for faster exploration. GitHub →

SourceSense — AI Agent

Agents • Tools • Retrieval

RAG agent with tool use; strong observability, error-handled boundaries, and clean tracing for debugging. GitHub →

AI Agents Chatbot (LangChain)

Streamlit • FastAPI • LangChain

Multi-model agent (Groq + OpenAI) with optional web search; clean API boundary and helpful UI. GitHub →

Flight Analytics App

Streamlit • MySQL • Plotly

Query & visualize airline mix, busiest airports, and daily frequencies with portable DB helpers. GitHub →

SmartTube Summarizer

Video • Summarization

End-to-end pipeline: transcription, chunking, ranked summaries, and a clean “summary UX”. GitHub →

Watch-Wise Tool

Scraping • Parsing

Streaming discovery helper using resilient scraping + parsing; lightweight CLI/app patterns. GitHub →


Let’s work together

Email: mayurdalvi.5@gmail.com
Location: Denver, CO (Open to relocate)