
Senior Administrator - Monitoring Tools, Event Monitoring
About the role
Job Summary
Lead SAP operational monitoring and observability initiatives to ensure 24x7 system availability, proactive issue detection, and improved service reliability across SAP landscapes (ECC, S/4, HANA, PI/PO, BO, interfaces).
E2.2 – SAP Operational Monitoring Lead (JD)

1. Job Summary
Lead SAP operational monitoring and observability initiatives to ensure 24x7 system availability, proactive issue detection, and improved service reliability across SAP landscapes (ECC, S/4, HANA, PI/PO, BO, interfaces).
Drive centralized monitoring governance, alert optimization, and incident prevention through tools such as SAP Sol Man, Moogsoft, and AIOps platforms.

2. Key Responsibilities

A. Monitoring Operations & Governance
• Own end-to-end SAP monitoring (Availability, Performance, Interface, Batch jobs)
• Ensure proactive alerting and incident detection across SAP landscapes
• Define and maintain monitoring SOPs, runbooks, and escalation matrix
• Ensure adherence to SLAs, OLAs, and operational KPIs
Internal reference: Monitoring stabilization and alignment across tools like Sol Man & Moogsoft has been a key operational focus.

B. Alert Management & Optimization
• Analyze alerts to identify noise, redundancies, and false positives
• Drive alert rationalization, correlation, and threshold tuning
• Implement event correlation & AIOps-driven improvements
• Reduce alert volume and improve MTTR through automation
Internal insight: SAP alert volumes are high (~16k+/year) with optimization opportunities via automation and correlation.

C. Incident & Problem Management
• Lead monitoring-driven incident triage and ensure quick resolution
• Identify recurring patterns and support Root Cause Analysis (RCA)
• Work with L2/L3 teams to drive permanent fixes and elimination of repeat incidents
• Support Major Incident Management (MIM) bridge calls

D. Platform & Tool Management
• Manage SAP monitoring tools:
o SAP Solution Manager (Sol Man)
o Moogsoft / AIOps tools
o Interface monitoring tools
• Maintain dashboards for:
o Real-time health monitoring
o Performance metrics
o Availability reporting
Internal practice includes centralized dashboards and monitoring metric expansion.

E. Automation & Continuous Improvement
• Drive monitoring automation initiatives (auto-remediation, self-healing)
• Implement:
o Predictive alerting
o Automated first response actions
• Improve operational maturity from reactive → proactive → predictive monitoring
Internal transformation roadmap includes AIOps and autonomous operations maturity.

F. Stakeholder & Leadership Reporting
• Provide regular operational insights to leadership
• Highlight:
o Risks
o Trends
o Improvement opportunities
• Coordinate with:
o SAP Basis
o Infra teams
o Application teams

G. Team Leadership (E2.2 scope)
• Act as shift/track lead for monitoring operations
• Guide L1/L2 teams on monitoring best practices
• Drive knowledge transfer and skill improvement

3. Required Ski
Key Responsibilities
• Own end-to-end SAP monitoring (Availability, Performance, Interface, Batch jobs)
• Ensure proactive alerting and incident detection across SAP landscapes
• Define and maintain monitoring SOPs, runbooks, and escalation matrix
• Ensure adherence to SLAs, OLAs, and operational KPIs
Skill Requirements
• SAP monitoring tools (Sol Man / CCMS / Focused Run)
• SAP Basis fundamentals (HANA, ECC, S/4)
• Monitoring tools (Slack / Moogsoft / Dynatrace / Splunk – preferred)
• Incident & problem management tools (ServiceNow)
Other Requirements
• Incident management & RCA
• Alert tuning & noise reduction
• SLA/KPI tracking & reporting
• Automation mindset (AIOps preferred)
Benefits and perks
• Learning Budget
Required skills
Systems administration
Troubleshooting
Service operations
About HCL Technologies