Senior Cloud Site Reliability Engineer

SkyHive

Software Engineering
Mumbai, Maharashtra, India
Posted on Apr 1, 2026
We're looking for a Senior Cloud Site Reliability Engineer

This role is office-based at our Mumbai office.

Job Title: Senior Cloud Site Reliability Engineer (SRE)

Role Overview

We are looking for a Senior Cloud Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our cloud-based platforms and services. The role focuses on building highly resilient systems, improving operational efficiency through automation, and driving best practices across monitoring, incident management, and infrastructure reliability.
The ideal candidate will have strong experience in cloud platforms, automation, observability, and production operations, with a mindset focused on reducing toil and improving system reliability at scale.

In this role you will…

  • Design, implement, and maintain highly available and scalable cloud infrastructure across production and non-production environments.
  • Improve system reliability, performance, and uptime by implementing SRE best practices.
  • Build and maintain automation for infrastructure provisioning, deployments, and operational tasks.
  • Develop and maintain observability frameworks including monitoring, alerting, logging, and tracing.
  • Lead incident management, troubleshooting, and root cause analysis (RCA) for production issues.
  • Implement and maintain SLIs, SLOs, and error budgets to measure service reliability.
  • Collaborate with engineering, DevOps, security, and platform teams to improve system architecture and resilience.
  • Drive capacity planning, performance optimization, and disaster recovery strategies.
  • Reduce operational overhead by implementing self-healing systems and automation.
  • Participate in on-call rotations and critical incident response.

You’ve got what it takes if you have…

  • 6–10 years of experience in Site Reliability Engineering, Cloud Engineering, or DevOps roles.
  • Strong hands-on experience with AWS, Azure, or GCP cloud platforms.
  • Experience with containerization and orchestration technologies such as Docker and Kubernetes.
  • Strong scripting or programming skills in Python, Go, Bash, or similar languages.
  • Experience with Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or Pulumi.
  • Strong experience with CI/CD pipelines and deployment automation.
  • Hands-on experience with observability tools such as Prometheus, Grafana, ELK, Splunk, or New Relic.
  • Strong troubleshooting skills across distributed systems, networking, and cloud infrastructure.
  • Experience with incident management, post-incident reviews, and reliability improvements.

Preferred Qualifications:

  • Experience managing large-scale production environments.
  • Knowledge of security best practices in cloud environments.
  • Experience implementing chaos engineering, resilience testing, or reliability frameworks.
  • Familiarity with service mesh, microservices architectures, and API-driven systems.
  • Experience working in high-availability SaaS environments.

Key Traits:

  • Strong ownership mindset and problem-solving skills.
  • Ability to work in high-pressure production environments.
  • Passion for automation and eliminating manual operational work.
  • Strong collaboration and communication skills across engineering teams.

Our Culture:

Spark Greatness. Shatter Boundaries. Share Success. Are you ready? Because here, right now, is where the future of work is happening. Where curious disruptors and change innovators like you are helping communities and customers enable everyone, anywhere, to learn, grow, and advance. To be better tomorrow than they are today.

Who We Are:

Cornerstone powers the potential of organizations and their people to thrive in a changing world. Cornerstone Galaxy, the complete AI-powered workforce agility platform, meets organizations where they are. With Galaxy, organizations can identify skills gaps and development opportunities, retain and engage top talent, and provide multimodal learning experiences to meet the diverse needs of the modern workforce. More than 7,000 organizations and 100 million+ users in 180+ countries and in nearly 50 languages use Cornerstone Galaxy to build high-performing, future-ready organizations and people today.

Check us out on LinkedIn, Comparably, Glassdoor, and Facebook!