HCL Technologies

Senior Data Scientist

Role: Data Science

Level: Senior

Location: Ernakulam, India

Work: On-site

Type: Full-time

Posted: 1 month ago

About the role

Job Summary

As a Senior Data Scientist, you will focus on internal and external commercially facing analytics development projects, typically involving large, complex data sets. These projects typically bring together statisticians, computer scientists, software developers, engineers, product managers, and end users, working in concert with partners in Baker Hughes business units.

As a Senior Data Scientist, you will be responsible for:

Design, train, and fine-tune generative models (LLMs, transformers, diffusion) and traditional ML models for oil and gas use cases.
Build data pipelines for training and evaluation, manage the embedding of large text corpora, and integrate structured and unstructured data.
Design and implement LLM-powered applications using LangChain and LangGraph for workflow orchestration and agent-based systems.
Develop pipelines for prompt engineering, retrieval-augmented generation (RAG), and knowledge graph integration.
Optimize inference performance and scalability for large-scale deployments.
Translate complex findings for stakeholders and work with cross-functional teams (engineers, product managers).

To be successful in this role you should have:

Bachelor’s or master’s degree in Computer Science, Data Science, Machine Learning, or a related field
5+ years of experience in data science roles
3+ years of experience with streaming data platforms such as Apache Kafka, Apache Flink, Spark Streaming, or Kinesis
Deep understanding of LLM architectures, transformers, and vector databases
Proficiency in programming languages like Python, and in deep learning frameworks like TensorFlow, PyTorch, or Keras
Solid understanding of machine learning, deep learning, and statistical modeling concepts
Familiarity with RAG pipelines, knowledge graphs, and graph-based reasoning.
Proficiency in cloud platforms (AWS, Azure, GCP) and containerization (Docker, Kubernetes).
Experience with monitoring, logging, and observability solutions (using tools like Prometheus, Grafana, or Datadog) to track AI services in production
Experience with SQL and NoSQL databases, including PostgreSQL
Experience with GPU-accelerated compute environments and AI-specific tools like NVIDIA Triton, Kubeflow, or MLflow
Strong problem-solving skills
Ability to work effectively in agile, cross-functional teams
Strong written and verbal English communication skills (B2+)

Required skills

Machine learning

Statistics

Data analysis

About HCL Technologies

HCL Technologies

Ernakulam

Headquarters