Hello, I'm

Nayim Ali Khan

Bridging analytics, data science, and business through first principles to create solutions that move the needle

Probably overfitting this portfolio, but hoping it generalizes well

About Me

Data Scientist with 4+ years of experience applying analytics, machine learning, deep learning, experimentation, and stakeholder engagement to solve healthcare and business challenges. My work spans predictive modeling, time series analysis, probabilistic data enrichment, market sizing, and targeted marketing strategies—driving earlier rare-disease diagnosis, mapping physician networks, identifying key influencers, and delivering measurable commercial impact for global life sciences clients.

Beyond technical depth, I bring a systems-oriented, first-principles mindset: breaking problems down to their essentials and reconstructing them into grounded frameworks that guide action. I have a proven track record of delivering solutions that align technical innovation with business objectives to create commercial and clinical impact.

Work Experience

Senior Data Scientist

Definitive Healthcare

Aug 2024 - Present

  • Led a data enrichment initiative integrating two vendor datasets on anonymized IDs, resulting in a 30% increase in data fill rate
  • Built a GenAI-powered pipeline analysing 25K+ patient forum posts (Reddit, Inspire) using GPT-3.5 to generate insights that informed the client's patient engagement strategy
  • Developed a mentoring framework for junior team members to track progress in technical and business skills

Data Scientist

Definitive Healthcare

Jul 2023 - Jul 2024

  • Developed a patient predictive model for a rare disease orphan drug client; 50% of the patients flagged with the highest predicted likelihood were diagnosed within six months, with a projected 60% market share increase
  • Designed a multimodal predictive model by integrating image-based features with structured patient attributes using TensorFlow CNNs, improving predictive accuracy for a new disease area
  • Parameterized and formalized the modeling methodology into an adaptive framework, enabling scalability across disease areas and serving as the basis for successful pitches that secured additional client projects

Data Science Analyst

Definitive Healthcare

Jul 2022 - Jun 2023

  • Designed a physician segmentation strategy to optimize resource allocation in marketing, achieving a projected 12% market share increase and 10% reduction in marketing costs
  • Built and automated COVID-19 dashboards providing biweekly updates to stakeholders on the US COVID-19 market
  • Identified a region of interest and proposed a territory-specific plan that improved market visibility and sales performance within two months

Business Process Analyst

IBM

Jun 2021 - Apr 2022

  • Automated a 6-month manual inventory reconciliation (product expiry) process, reducing manual intervention by 50%+
  • Contributed to migration of 4.2B+ records from legacy SAP ECC systems to SAP S/4HANA, ensuring zero data loss and full audit traceability

Featured Projects

Healthcare Data Enrichment Platform

ML-powered data enrichment system achieving a 30% improvement in patient data fill rate using advanced clustering and XGBoost classification

Python XGBoost Dataiku

GenAI Patient Insights Engine

Built a comprehensive pipeline processing 25K+ patient forum posts with GPT-3.5-driven sentiment analysis and thematic clustering

GPT-3.5 Selenium NLP

SAP Data Migration System

Engineered an enterprise-scale data migration solution transferring 4.2B+ records from SAP ECC to S/4HANA with zero data loss

SAP BODS SQL

Skills & Technologies

Languages & Frameworks

Python
SQL
PySpark
Pandas
NumPy
Scikit-learn
Statsmodels
TensorFlow
Keras
Matplotlib
Seaborn

Modeling & Algorithms

Classification & Regression
Tree-Based Ensembles (XGBoost, Random Forest)
Clustering & Segmentation (K-Means, BIRCH)
Neural Networks & CNNs
Time Series Forecasting
Causal & Uplift Modeling

Analytical & Business

Causal Inference
Attribution Modeling
Segmentation & Persona Development
Commercial & Behavioral Analytics
Network Influence & Centrality
Experimentation
KPI Design & Performance

Platforms & Tools

AWS (SageMaker, Glue, Redshift)
Databricks
Snowflake
Jupyter Notebooks
Dataiku
Power BI
Git/GitHub

Strategic Enablement

Strategic Problem Framing
First-Principles Systems Thinking
Analytical Decision-Making
Data Storytelling
Stakeholder Management
Cross-Functional Collaboration
Influence Without Authority
Mentoring & Team Enablement

Co-curricular & Internships

College Ambassador, KrazyBee (RVCE)

Promoted KrazyBee on campus; designed creative campaigns to drive student engagement and sign-ups.

Sep 2016 - Oct 2018

Football Team, RVCE

Vice-Captain and right-back for the RVCE Football Team. Led the team to its first-ever VTU (Zonal) Finals appearance.

Aug 2017 - May 2019

Research Intern, DARE (DRDO)

Contributed to signal processing research in a defense-grade R&D setting. Gained hands-on exposure to radar systems and military-grade simulation tools.

Jan 2019 - Jun 2019

Analytics Intern, Unacademy

Proposed and implemented a demand-based scheduling model that improved traffic and satisfaction for students and teachers.

Oct 2019 - Dec 2019

Volunteer, CUPA

Supported animal rescue, rehabilitation, and adoption initiatives, and raised awareness of responsible pet ownership.

Jan 2020 - Oct 2022

Achievements & Scores

GRE

323 (Quant: 170/170)

Engineering Design

CBSE All India Topper

Best Employee

Q2 2023, Q2 2024

Engineering Degree

Electronics & Communication

Get In Touch

Reach out any time. I'd love to hear from you.

nayimalikhan@gmail.com
+91 7625030270