HCL Technologies

Track Manager - Monitoring Tools, Event Monitoring

Role: Infrastructure
Level: Senior
Location: Gautam Buddha Nagar, India
Work: On-site
Type: Full-time
Posted: 2 days ago

About the role

Job Summary

Senior Network Monitoring Architect (AKIPS Specialist)

Job Overview

The AKIPS platform is expected to provide end-to-end network visibility with near real-time operational insight across routers, switches, firewalls, WAN links, and critical infrastructure devices. The solution should support 60-second polling for key performance indicators, 15-second reachability checks where needed, and scalable monitoring of large interface volumes without creating unnecessary network overhead. It should ingest and correlate SNMP, ping, syslog, SNMP traps, and flow telemetry such as NetFlow to enable faster fault detection and isolation.

We also expect strong dashboarding and reporting capabilities for operations teams, service owners, and leadership, including device health, interface utilization, packet drops, errors, latency trends, flap history, and top talkers. The platform should support custom alert thresholds, event filtering, suppression of noise, historical trend retention, and the ability to segment views by business service, site, device group, or region.

Additionally, the tool should enable integration through APIs or scripts for automation, ticketing, and downstream analytics, while supporting secure monitoring practices such as SNMPv3, backup and restore, and upgrade lifecycle management.

Key Responsibilities

  • Architecture & Strategy: Define the target state architecture for network visibility across hybrid cloud, data center, and campus edge environments using AKIPS.
  • Scalability Management: Engineer the server capacity, thick-provisioned VM infrastructure, or cloud-based instances (such as AKIPS on AWS) to support up to 60-second polling intervals across 1M+ interfaces.
  • Automation & Site Scripting: Develop and maintain automated infrastructure discovery and AKIPS Site Scripting features using Perl/API Integrations to extend default MIB/CLI parsing functionalities.
  • Dashboard & Reporting Design: Architect role-based operational dashboards and granular event threshold frameworks (Ping, SNMPv3, Syslog, and Traps).
  • Vendor & MIB Management: Compile and maintain custom multi-vendor MIB files to track complex telemetry (e.g., optical power trends, hardware vitals, and switch port mappings).
  • Lifecycle & Security: Manage software upgrades, backup/restore routines, OS hardening (FreeBSD underlying platform), and secure credential handling via SNMPv3 (SHA/AES).

Required Technical Skills

  • Core NMS Expertise: Advanced architecture level experience with AKIPS Network Monitoring Software. Additional experience with complementary platforms (e.g., Zabbix, Nagios, or Kentik) is a plus.
  • Network Protocols: Expert knowledge of SNMP (v2c/v3), Syslog telemetry, NetFlow/sFlow, CDP/LLDP, and deep knowledge of MIB structure mapping.
  • Enterprise Infrastructure: Strong understanding of multi-vendor routing and switching hardware (Cisco, Arista, Juniper) across SD-WAN and Data Center layers.
  • Scripting & APIs: Proficiency in Perl (for native AKIPS site scripting) or Python/Bash for parsing AKIPS API endpoints.
  • Systems Administration: Solid experience managing the underlying server layer (FreeBSD/Linux architectures), VM configurations (VMware thick-provisioned storage optimizations), or public cloud hosting deployments.

Qualifications & Experience

  • Experience: 8+ years in Network Engineering or Architecture roles, with at least 3+ years specifically focused on engineering enterprise Network Management Systems (NMS).
  • Education: Bachelor’s degree in Computer Science, Network Engineering, Information Technology, or equivalent practical experience.
  • Certifications (Preferred): CCIE/CCNP (Enterprise/Data Center), systems-level certifications.

Key Perf

Key Responsibilities

  1. Lead integrated event monitoring using enterprise tools such as IBM Netcool, Splunk, and SolarWinds, ensuring proactive identification and resolution of critical incidents across client environments.

  2. Optimize batch job monitoring processes with tools like Control-M and Autosys, implementing advanced scheduling and alerting strategies to minimize downtime and SLA breaches.

  3. Drive continuous improvement initiatives in monitoring workflows and escalation procedures, leveraging ITIL frameworks and automation platforms to enhance operational efficiency.

  4. Guide and mentor the monitoring team in best practices for event correlation, incident triage, and root cause analysis using platforms such as ServiceNow and BMC Remedy.

  5. Collaborate with stakeholders to align monitoring solutions with evolving client requirements, delivering tailored dashboards and reporting via tools like Grafana and Kibana.

  6. Innovate monitoring processes by evaluating and integrating emerging technologies, ensuring the command center remains at the forefront of operational excellence.

  7. Ensure compliance with security and governance standards in all monitoring and event management activities, utilizing SIEM solutions where appropriate.

Skill Requirements

  1. Excellent command of ITIL-based incident, problem, and change management within a network or technical operations center environment.

  2. Advanced proficiency in automation scripting (Python, PowerShell, Shell) for monitoring optimization and workflow automation.

  3. Excellent ability to design, implement, and optimize dashboards and reporting using Grafana, Kibana, or similar tools.

  4. Strong expertise in root cause analysis, event correlation, and escalation management using ServiceNow or BMC Remedy.

  5. Excellent leadership and mentoring skills for guiding technical teams in high-pressure operational settings.

  6. Advanced proficiency in aligning monitoring solutions with business objectives and client SLAs.

Other Requirements

  1. ITIL Expert or Intermediate Certification (optional but valuable)

  2. Certified in IBM Netcool, Splunk, or equivalent monitoring platforms (optional but valuable)

  3. Control-M or Autosys certification (optional but valuable)

Required skills

network monitoring

AKIPS

SNMP

NetFlow

dashboarding

alerting

automation

About HCL Technologies

Headquarters: Gautam Buddha Nagar