Infra Team Manager
About the role
We are looking for a Senior DevOps / Platform Engineer to own the end-to-end cloud infrastructure, reliability, security, and CI/CD pipelines behind our platform products. This is a high-ownership, hands-on role: you will build and operate production-grade infrastructure that supports high availability, regulatory compliance, scalability, and cost efficiency, and you will act as the platform backbone for fast-moving product and engineering teams.
Key Responsibilities
Infrastructure & Platform
- Design, deploy, and operate cloud-native infrastructure on AWS or Azure
- Own Kubernetes-based microservices infrastructure across environments (prod, staging, dev)
- Define infra standards, environments, networking, and release guardrails
- Manage databases, caches, queues, and supporting infra from a platform perspective
Reliability & Availability
- Define and enforce SLOs / SLAs for critical user and payment flows
- Build for high availability, auto-scaling, and fault tolerance
- Implement zero-downtime deployments and safe rollout strategies
- Own incident response, postmortems (RCA), and preventive action plans
- Plan capacity and scaling for traffic spikes and seasonal peaks
CI/CD & Release Engineering
- Build and maintain CI/CD pipelines for backend, frontend, and mobile applications
- Standardize deployment pipelines with rollback, approvals, and auditability
- Enable faster, safer releases without compromising reliability
Observability & Monitoring
- Own the observability stack across services and infrastructure
- Implement monitoring for:
  - Latency, throughput, and error rates
  - Infrastructure health
  - Payment and checkout flows
- Define alerting strategies aligned with business impact, not noise
Security & Compliance
- Own platform-level security posture and compliance readiness
- Implement and manage:
  - Secrets management and key rotation
  - IAM, RBAC, and least-privilege access
  - Network security, TLS, WAFs, and audit logging
- Partner with security/compliance stakeholders during audits and reviews
- Ensure infrastructure is audit-ready (logs, access trails, change history)
Cost & Efficiency
- Monitor and optimize cloud costs across environments
- Design infra with cost-efficiency in mind without sacrificing reliability
- Provide visibility into infra usage and cost drivers
Required Experience
- 4+ years of experience in DevOps / SRE / Platform Engineering
- Hands-on experience running production-grade Kubernetes workloads
- Experience supporting high-availability, high-throughput systems
- Strong understanding of microservices infrastructure
- Experience working with fintech, payments, or transaction-heavy systems
- Proven ownership of uptime, reliability, and incident management
Expected Tech Exposure
- Cloud: AWS or Azure
- Containers: Docker, Kubernetes
- CI/CD: GitHub Actions, Jenkins, Fastlane
- Infrastructure as Code: Terraform / ARM / Pulumi
- Databases & Caches: PostgreSQL, Redis
- Messaging: Kafka / Event Hubs
- Security: Vaults, SSL/TLS, key rotation, network policies
- Observability: Prometheus, Grafana, Azure Monitor, Sentry
Good-to-Have
- Experience with UPI, wallets, payment gateways, or banking systems
- Exposure to multi-cloud or hybrid environments
- Familiarity with compliance standards (PCI-DSS, SOC-style controls, audit workflows)
- Experience defining SLOs, error budgets, and release safety mechanisms
You’ll Be Successful If
- You think in systems, reliability, and failure scenarios
- You proactively prevent outages, not just react to them
- You enjoy owning infrastructure end-to-end, not just tooling
- You build platforms that enable product teams to move fast and safely
- You’re comfortable being accountable for uptime, security, and scale
Benefits and perks
- Home Office Setup
- Learning Budget
- Wellness Programs
- Performance Bonus
- Free Meals
- Sabbatical Leave
- Parental Leave
- Healthcare
- Paid Time Off
- Retirement Plan
Required skills
Data engineering
Data pipelines
About Krafton
Headquarters: Bengaluru