Prathyush KR Lebaku

Data Scientist: Machine Learning | Analytics | Optimization | Forecasting

Texas, USA

prathyushreddy55@gmail.com

Summary

Data Scientist with 4 years of experience building search, ranking, personalization, and NLP/LLM-driven systems. Delivered measurable gains through hybrid retrieval, fine-tuned Transformers, relevance evaluation, and A/B-driven product improvements. Strong in metrics design, experimentation, and turning behavioral data into ML-powered features.

Experience

The Home Depot

TX

Data Scientist

Aug. 2024 - Present

  • Translated business problems into ML solutions by defining objectives, success metrics, and data requirements with cross-functional teams.
  • Built a hybrid retrieval and personalized ranking system using semantic embeddings, FAISS, BM25, fine-tuned Transformer models (Hugging Face, PyTorch), and an XGBoost ranker to improve search relevance, driving a 15% lift in CTR across the e-commerce platform.
  • Analyzed large-scale search logs, customer behavior data, and product metadata (SQL, Python, PySpark) to identify ranking gaps, relevance issues, and opportunities for personalization.
  • Enhanced query understanding with fine-tuned Transformer models (BERT/Sentence-BERT) for semantic rewriting, intent classification, and synonym expansion, improving long-tail search recall and user relevance.
  • Developed offline relevance evaluation frameworks (NDCG, Recall@K, MRR) and ran controlled A/B tests to measure ranking improvements, ensuring statistically sound validation of search and recommendations.
  • Developed an LLM-powered RAG pipeline by implementing document chunking, vector indexing, and retrieval workflows (Python, Vertex AI, Hugging Face, and BigQuery), enabling automated customer support responses and reducing ticket resolution time by 10%.
  • Built customer segmentation and demand forecasting models by developing XGBoost/LSTM time-series predictors and RFM/K-means customer clusters to identify high-value cohorts, enabling targeted campaigns that improved engagement and reduced marketing waste.
  • Implemented and maintained scalable ML pipelines using PySpark, Airflow, BigQuery, and Vertex AI, enabling automated model training, large-scale feature processing, batch and real-time inference, and production monitoring.
  • Partnered with the data engineering team to define data schemas, feature requirements, and quality checks for POS, CRM, and web analytics pipelines on AWS (Glue, Lambda, S3), ensuring reliable, model-ready datasets and timely refresh cycles.
  • Developed Tableau dashboards for search analytics, customer segmentation, and inventory forecasting using Snowflake/BigQuery data pipelines, improving self-service analytics for marketing and operations teams.

HCLTech

India

Product Data Scientist

Apr. 2021 - Jul. 2023

  • Defined north-star metrics and feature-level KPIs for interview analytics, user engagement, and payout workflows, enabling consistent measurement and faster decision-making across product and engineering teams.
  • Designed and executed A/B tests and quasi-experiments to evaluate scoring logic, funnel optimizations, and payout adjustments, turning statistical results into actionable product decisions within the same sprint.
  • Built dashboards and lightweight analytical data models using SQL and Tableau/Looker to support self-serve insights on user behavior, funnel conversion, and operational performance.
  • Defined tracking requirements and validated event instrumentation to improve data quality, coverage, and reliability for downstream analytics.
  • Prototyped lightweight ML models (logistic regression, decision trees, XGBoost) to improve matching, scoring, and operational workflows, providing quick baselines for product experimentation.
  • Evaluated NLP/chatbot-powered features by designing scoring rubrics, human review workflows, and quality assessments to measure accuracy, intent classification performance, and robustness.

Education

University of Houston

Texas, USA

Master of Science in Engineering Data Science

  • Worked as a Research Assistant under Prof. Lu Gao (Aug 2023 – Aug 2024)

Vellore Institute of Technology

Vellore, Tamil Nadu, India

Bachelor of Science in CSE with Specialization in Data Science

Skills

Programming Languages

  • Python
  • SQL

Machine Learning

  • Scikit-Learn
  • PyTorch
  • TensorFlow
  • Clustering (K-Means, DBSCAN)
  • Time Series
  • Forecasting

Natural Language Processing & LLM

  • Hugging Face Transformers (BERT, RoBERTa, SBERT, T5)
  • Text Classification
  • NER
  • Semantic Search
  • RAG
  • LLM Evaluation (HITL, rubrics)
  • LangChain
  • MCP

Experimentation & Analytics

  • A/B Testing
  • Quasi-Experiments
  • Causal Inference (basic)
  • Funnel Analytics
  • KPI/North-Star Metric Design
  • Behavioral Analytics

Big Data & Distributed Computing

  • PySpark
  • Spark SQL
  • Databricks
  • Hadoop Ecosystem

Data Visualization & BI Tools

  • Tableau
  • Power BI

Cloud Platforms

  • AWS (S3, Redshift, EC2, SageMaker)
  • GCP (BigQuery, Vertex AI)
  • Azure (basic)

Certifications

Microsoft Power BI Associate (PL-300)

Microsoft

Microsoft Azure Data Scientist Associate (DP-100)

Microsoft

AWS Cloud Practitioner

Amazon

Deep Learning by Andrew Ng

Contact

Email

prathyushreddy55@gmail.com