HCL Technologies

Senior Administrator - ELK - Elastic Search, Windows PowerShell

Role: Infrastructure
Level: Senior
Location: India
Work: On-site
Type: Full-time
Posted: 2 days ago

About the role

Job Summary

  • Senior Administrator
  • ELK - Elastic Search, Windows PowerShell

Developer role supporting the GOON and Octo Bus platforms under a unified operational model.

  • Primary skills: Grafana, Prometheus, programming languages (Python or Java)
  • Secondary skill: Kafka
  • Good to have: Kubernetes, Alertmanager

Key Responsibilities

  • Design, develop, and maintain observability platform components and integrations across Prometheus, Thanos, Grafana, OpenTelemetry, and streaming telemetry systems.
  • Contribute to architecture and technical design of scalable monitoring solutions running on Kubernetes, Docker, and cloud-native environments.
  • Implement standardized instrumentation using OpenTelemetry SDKs, collectors, exporters, and agents across services and infrastructure.
  • Build and optimize telemetry pipelines for metrics, logs, and traces using Prometheus, OTEL Collector, Kafka/streaming pipelines, and time-series backends.
  • Develop advanced PromQL queries, recording rules, and Alertmanager logic for complex monitoring scenarios.
  • Create reusable dashboards and visualization templates using Grafana (and Perses if applicable).
  • Automate deployments and configuration using Git, GitHub/GitLab, Jenkins, ArgoCD, Helm, and Infrastructure-as-Code practices.
  • Troubleshoot and optimize performance across collectors, exporters, storage backends, and query layers.
  • Support performance testing, load validation, and reliability analysis of observability components.
  • Collaborate with engineering and SRE teams to onboard services and improve telemetry coverage across platforms.
  • Document implementations, standards, and operational procedures.

Skill Requirements

  • Strong programming experience in Go, Python, or Java with focus on backend or platform engineering.
  • Hands-on expertise with Prometheus ecosystem (Prometheus, Alertmanager, exporters, Pushgateway) and PromQL.
  • Experience implementing OpenTelemetry instrumentation, collectors, processors, and pipelines.
  • Strong knowledge of Kubernetes, containers, Helm, and microservices architecture.
  • Experience with CI/CD tools such as Jenkins, GitHub Actions, GitLab CI, or ArgoCD.
  • Understanding of distributed systems, performance tuning, debugging, and profiling techniques.
  • Familiarity with streaming and messaging systems (e.g., Kafka or equivalent) and time-series databases.
  • Experience building or integrating REST/gRPC APIs.
  • Proficiency in Git workflows, scripting (Bash/Python), and automation frameworks.
  • Understanding of SNMP, exporters, and infrastructure/device telemetry collection.
  • Awareness of security, RBAC, secrets management, and compliance requirements in platform environments.

Required skills

Prometheus

Grafana

OpenTelemetry

Kafka

Kubernetes

Docker

Jenkins

ArgoCD
