Hello, I'm

Nayim Ali Khan

Data Scientist | Systems Thinker | First-Principles Builder

Probably overfitting this portfolio, but I hope it generalizes well anyway.

Bridging ML/DL, business, and analytics to craft scalable solutions that move the needle

About Me

Data Scientist with 4+ years of experience in machine learning, deep learning, causal inference and experimentation, stakeholder consulting, and solution delivery. I specialize in predictive modeling, statistically grounded data enrichment, and analytics-driven strategies that deliver measurable business outcomes. My approach is rooted in first-principles thinking: I deconstruct problems to their core and rebuild them into scalable, decision-enabling systems.

Work Experience

Senior Data Scientist

Definitive Healthcare

Aug 2024 - Present

  • Built GenAI pipeline extracting insights from 25K+ patient forum posts using LLM-driven thematic clustering
  • Led data enrichment initiative improving patient data fill rate by 30% using ML-based matching algorithms
  • Applied canopy clustering, similarity scoring, and XGBoost classification for precision data matching

Data Scientist

Definitive Healthcare

Jul 2023 - Jul 2024

  • Developed predictive models achieving a projected 60% market share increase for orphan drug targeting
  • Built XGBoost models in SageMaker with BIRCH clustering for high-probability patient identification
  • Integrated multi-modal features using TensorFlow CNNs with structured patient attributes

Data Science Analyst

Definitive Healthcare

Jul 2022 - Jun 2023

  • Designed physician segmentation strategy using clustering and network centrality analysis on Databricks
  • Built automated COVID-19 dashboards using SQL and Power BI, enabling biweekly stakeholder updates
  • Achieved a projected 12% market share increase and a 10% reduction in marketing costs through data-driven insights

Business Process Analyst

IBM

Jun 2021 - Apr 2022

  • Designed SQL-based solution in SAP BODS for inventory reconciliation across APAC data centers
  • Migrated 4.2B+ records from SAP ECC to S/4HANA ensuring zero data loss and audit traceability

Analytics Intern

Unacademy

Oct 2019 - Dec 2019

  • Proposed a hierarchical scheduling model using historical data to address mentor session inefficiencies
  • Transitioned to Analytics, utilizing SQL insights to enhance scheduling efficiency and user experience
  • Explored various channels to boost student engagement on the Unacademy platform

Featured Projects

Healthcare Data Enrichment Platform

ML-powered data enrichment system achieving 30% improvement in patient data fill rate using advanced clustering and XGBoost classification

Python XGBoost Dataiku

GenAI Patient Insights Engine

Built a comprehensive pipeline processing 25K+ patient forum posts with GPT-3.5-driven sentiment analysis and thematic clustering

GPT-3.5 Selenium NLP

SAP Data Migration System

Engineered enterprise-scale data migration solution transferring 4.2B+ records from SAP ECC to S/4HANA with zero data loss

SAP BODS SQL

Skills & Technologies

Languages & Frameworks

Python
SQL
PySpark
Scikit-learn
XGBoost
TensorFlow
Keras

Modeling & Algorithms

Classification & Regression
Clustering (K-Means, BIRCH)
Neural Networks
Time Series Forecasting

Analytical & Business

Cohort Design
Attribution Modeling
Hypothesis & A/B Testing
Commercial & Behavioral Analytics
Influence Network Modeling
Data Storytelling
Stakeholder Management
First-Principles Thinking

Platforms & Tools

Snowflake
Databricks
AWS (SageMaker, Glue, Redshift)
Dataiku
Power BI

Co-Curricular Activities

GenAI Patient Insights Pipeline

Built GenAI pipeline processing 25K+ patient forum posts with sentiment analysis

2024 Technical Project

Data Enrichment Initiative

Led cross-functional initiative achieving 30% increase in patient data fill rate

2024 Leadership

Academic Excellence Recognition

CBSE All India Topper in Engineering Design, demonstrating analytical excellence

2015 Achievement

Best Employee Recognition

Consistently recognized for exceptional performance in Q2 2023 and Q2 2024

2023-2024 Professional Excellence

Predictive Modeling for Healthcare

Built XGBoost models achieving a projected 60% market share increase

2023-2024 Technical Innovation

SAP Migration Leadership

Led migration of 4.2B+ records from SAP ECC to S/4HANA with zero data loss

2021-2022 Technical Leadership

Achievements & Scores

GRE

323 (Quant: 170/170)

Engineering Design

CBSE All India Topper

Best Employee

Q2 2023, Q2 2024

Engineering Degree

Electronics & Communication

Get In Touch

Reach out any time. I'd love to hear from you.

nayimalikhan@gmail.com
+91 7625030270