
HCL Technologies
Senior Administrator - ELK - Elastic Search, Windows PowerShell
Role: Infrastructure
Level: Senior
Location: India
Work: On-site
Type: Full-time
Posted: 2 days ago
About the role
Job Summary
- Senior Administrator
- ELK - Elastic Search, Windows PowerShell
Developer role supporting the GOON and Octo Bus platforms under a unified operational model.
- Primary skills: Grafana, Prometheus, programming languages (Python or Java)
- Secondary skill: Kafka
- Good to have: Kubernetes, Alertmanager
Key Responsibilities
- Design, develop, and maintain observability platform components and integrations across Prometheus, Thanos, Grafana, OpenTelemetry, and streaming telemetry systems.
- Contribute to architecture and technical design of scalable monitoring solutions running on Kubernetes, Docker, and cloud-native environments.
- Implement standardized instrumentation using OpenTelemetry SDKs, collectors, exporters, and agents across services and infrastructure.
- Build and optimize telemetry pipelines for metrics, logs, and traces using Prometheus, the OpenTelemetry Collector, Kafka/streaming pipelines, and time-series backends.
- Develop advanced PromQL queries, recording rules, and Alertmanager logic for complex monitoring scenarios.
- Create reusable dashboards and visualization templates using Grafana (and Perses if applicable).
- Automate deployments and configuration using Git, GitHub/GitLab, Jenkins, ArgoCD, Helm, and Infrastructure-as-Code practices.
- Troubleshoot and optimize performance across collectors, exporters, storage backends, and query layers.
- Support performance testing, load validation, and reliability analysis of observability components.
- Collaborate with engineering and SRE teams to onboard services and improve telemetry coverage across platforms.
- Document implementations, standards, and operational procedures.
Skill Requirements
- Strong programming experience in Go, Python, or Java, with a focus on backend or platform engineering.
- Hands-on expertise with the Prometheus ecosystem (Prometheus, Alertmanager, exporters, Pushgateway) and PromQL.
- Experience implementing OpenTelemetry instrumentation, collectors, processors, and pipelines.
- Strong knowledge of Kubernetes, containers, Helm, and microservices architecture.
- Experience with CI/CD tools such as Jenkins, GitHub Actions, GitLab CI, or ArgoCD.
- Understanding of distributed systems, performance tuning, debugging, and profiling techniques.
- Familiarity with streaming and messaging systems (e.g., Kafka or equivalent) and time-series databases.
- Experience building or integrating REST/gRPC APIs.
- Proficiency in Git workflows, scripting (Bash/Python), and automation frameworks.
- Understanding of SNMP, exporters, and infrastructure/device telemetry collection.
- Awareness of security, RBAC, secrets management, and compliance requirements in platform environments.
Required skills
Prometheus
Grafana
OpenTelemetry
Kafka
Kubernetes
Docker
Jenkins
ArgoCD