HCL Technologies

Senior Data Scientist

Role: Data Science
Level: Senior
Location: Ernakulam, India
Work: On-site
Type: Full-time
Posted: 1 month ago

About the role

Job Summary

As a Senior Data Scientist, you will focus on internal and external commercially facing analytics development projects, typically involving large, complex data sets. You will typically work with statisticians, computer scientists, software developers, engineers, product managers, and end users, in concert with partners in Baker Hughes business units.

As a Senior Data Scientist, you will:

  • Design, train, and fine-tune generative models (LLMs, transformers, diffusion) and traditional ML models for oil and gas use cases

  • Build data pipelines for training and evaluation, manage embeddings for large text corpora, and integrate structured and unstructured data.

  • Design and implement LLM-powered applications using LangChain and LangGraph for workflow orchestration and agent-based systems.

  • Develop pipelines for prompt engineering, retrieval-augmented generation (RAG), and knowledge graph integration.

  • Optimize inference performance and scalability for large-scale deployments.

  • Translate complex findings for stakeholders and work with cross-functional teams (engineers, product managers).

To be successful in this role you should have:

  • Bachelor’s or master’s degree in Computer Science, Data Science, Machine Learning or related field

  • 5+ years of experience in data science roles

  • 3+ years of experience in streaming data platforms such as Apache Kafka, Apache Flink, Spark Streaming, Kinesis

  • Deep understanding of LLM architectures, transformers, and vector databases

  • Proficiency in programming languages like Python, and in deep learning frameworks like TensorFlow, PyTorch, or Keras

  • Solid understanding of machine learning, deep learning, and statistical modeling concepts

  • Familiarity with RAG pipelines, knowledge graphs, and graph-based reasoning.

  • Proficiency in cloud platforms (AWS, Azure, GCP) and containerization (Docker, Kubernetes).

  • Experience with monitoring, logging, and observability solutions (using tools like Prometheus, Grafana, or Datadog) to track AI services in production

  • Experience with SQL and NoSQL databases, including PostgreSQL

  • Experience with GPU-accelerated compute environments and AI-specific tools like NVIDIA Triton, Kubeflow, or MLflow

  • Strong problem-solving skills

  • Ability to work effectively in agile, cross-functional teams

  • Strong written and verbal English communication skills (B2+)

Required skills

Machine learning

Statistics

Data analysis

About HCL Technologies

Ernakulam

Headquarters