HCL Technologies

Subject Matter Expert (Support & Ops)

Role: Operations
Level: Senior
Location: Sholinganallur, India
Work: On-site
Type: Full-time
Posted: 3 days ago
About the role

Job Summary

Key Skills & Requirements

  • Strong hands-on experience in Grafana administration, including dashboard development, alert configuration, notification policies, RBAC, user management, and data source integration.

  • Expertise in Grafana plugin installation, configuration, troubleshooting, upgrades, and performance optimization across enterprise-scale monitoring environments.

  • Experience designing and maintaining observability solutions using Grafana, Grafana Alloy, and OpenTelemetry.

  • Hands-on experience with Grafana Alloy configuration, telemetry collection pipelines, log/metric forwarding, relabeling, filtering, and performance tuning.

  • Strong knowledge of BindPlane administration, including collector deployment, gateway configuration, telemetry routing, load balancing, high availability, and troubleshooting.

  • Experience configuring and optimizing telemetry ingestion pipelines from on-premises and cloud-based infrastructure into centralized observability platforms.

  • Good understanding of Google Cloud Platform (GCP) services, with hands-on experience in GKE cluster administration, workload deployment, pod management, scaling, and troubleshooting.

  • Experience using Google Cloud Monitoring tools such as Metrics Explorer, Logs Explorer, dashboards, alerting policies, and observability best practices.

  • Strong Kubernetes administration skills, including Deployments, Services, ingress controllers, DaemonSets, StatefulSets, namespaces, resource management, and cluster troubleshooting.

  • Experience managing and monitoring Azure Kubernetes Service (AKS) environments and implementing observability solutions for containerized workloads.

  • Knowledge of Azure cloud services, networking concepts, identity management, and infrastructure monitoring.

  • Hands-on experience with Ansible for infrastructure automation, configuration management, deployment automation, and operational tasks.

  • Strong scripting and automation skills using Python and shell scripting for monitoring, API integrations, and operational efficiency improvements.

  • Experience integrating monitoring platforms with ServiceNow, REST APIs, webhook-based alerting, SQL, and third-party enterprise applications.

  • Strong understanding of Linux system administration, troubleshooting, process management, networking fundamentals, and performance analysis.

  • Ability to perform root cause analysis, capacity planning, performance optimization, and reliability improvements for large-scale monitoring platforms.

  • Experience supporting enterprise observability environments with thousands of monitored servers, applications, and cloud-native workloads.

  • Excellent analytical, troubleshooting, documentation, and stakeholder communication skills.

Cloud & Container Technologies

  • Google Cloud Platform (GCP)/Google Kubernetes Engine (GKE)

  • Kubernetes Administration

  • Azure Cloud/Azure Kubernetes Service (AKS)

Monitoring & Observability

  • Grafana

  • Grafana Alloy

  • OpenTelemetry

  • BindPlane

  • Cloud Monitoring

  • Log Management Solutions

  • Prometheus

Automation & Development

  • Python

  • Shell Scripting (Bash)

  • Ansible

  • REST APIs

  • Git/GitHub

Benefits and perks

Learning Budget

About HCL Technologies

Headquarters: Sholinganallur