Site Reliability Engineering Manager
1 day ago
We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing modern SRE capabilities that improve system reliability, scalability, and efficiency across our IT ecosystem. This role requires deep technical expertise, hands‑on problem‑solving skills, and the ability to influence cross‑functional teams.

Key Responsibilities

Observability & Monitoring
Define and implement observability frameworks across logs, metrics, traces, and events.
Establish SLOs, SLIs, and error budgets in collaboration with product and engineering teams.
Drive proactive incident detection and root cause analysis.

Performance Engineering
Lead performance benchmarking, load/stress testing, and scalability assessments of applications and infrastructure.
Build performance models and capacity planning strategies for critical business systems.
Partner with development teams to identify performance bottlenecks and optimize application/infrastructure efficiency.

Reliability Engineering
Design and implement automation for incident response, disaster recovery, and self‑healing systems.
Lead Chaos Engineering and resilience testing initiatives.
Drive reliability reviews, postmortems, and a blameless RCA culture.
Ensure best practices for fault tolerance, availability, and resilience are embedded in system design.

Define AIOps strategy and deploy ML/AI‑driven observability and incident response capabilities.
Leverage anomaly detection, event correlation, and predictive analytics for proactive IT operations.
Integrate AIOps platforms with ITSM tools for intelligent ticketing, alert suppression, and automated remediation.

Act as a thought leader in SRE practices, mentoring engineers and influencing leadership decisions.
Partner with development, infrastructure, and business teams to embed SRE principles across the enterprise.
Drive a continuous improvement culture for availability, scalability, and operational excellence.

Required Qualifications
10+ years of experience in IT Operations, Reliability Engineering, or Performance Engineering.
Deep expertise in observability and monitoring platforms (Prometheus, Grafana, Splunk, Datadog, Dynatrace, ELK, AppDynamics, etc.).
Strong background in performance testing tools (JMeter, LoadRunner, Gatling, k6, etc.) and capacity planning.
Hands‑on experience in cloud platforms (AWS, Azure, GCP) and containerized environments (Kubernetes, Docker, OpenShift).
Experience with AIOps platforms (Moogsoft, BigPanda, Dynatrace Davis AI, ServiceNow AIOps, etc.) and ML‑driven IT operations.
Strong understanding of distributed systems, networking, CI/CD, and DevOps practices.

Preferred Qualifications
Prior experience leading enterprise‑wide SRE/Observability transformations.
Knowledge of Chaos Engineering platforms (Gremlin, Chaos Mesh, Litmus).
Exposure to ITSM/ITIL processes and modern incident management practices.
Strong communication skills with the ability to influence CxO‑level stakeholders.
Certifications: Google SRE, AWS DevOps Engineer, Azure SRE Expert, Dynatrace/Datadog certifications (preferred).

Mandaluyong, National Capital Region, Philippines
-
Site Reliability Engineer
1 day ago
Manila, Philippines Tata Consultancy Services Full time
Job Description: Site Reliability Engineering (SRE) SME. Position Overview: We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for...
-
Site Reliability Engineer
12 hours ago
Manila, National Capital Region, Philippines HGS Offshore Staffing Solutions Full time ₱2,000,000 - ₱2,500,000 per year
Senior Site Reliability Engineer. Position Overview: We are seeking an experienced Senior AWS Site Reliability Engineer to join our cross-functional cloud platform team. Working alongside a diverse group of DevOps and Site Reliability Engineers, you will combine deep technical expertise in AWS cloud infrastructure with strong leadership capabilities in incident...
-
Site Reliability Engineer
3 days ago
Manila, Philippines Russell Tobin Full time
We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing...
-
Site Reliability Engineering Manager
4 days ago
Manila, National Capital Region, Philippines Russell Tobin Full time $60,000 - $120,000 per year
We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing modern SRE capabilities that improve system reliability, scalability, and efficiency across our...
-
Site Reliability Engineer
6 days ago
Manila, National Capital Region, Philippines Russell Tobin Full time ₱120,000 - ₱180,000 per year
We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing modern SRE capabilities that improve system reliability, scalability, and efficiency across our...
-
Engineer, Site Reliability
5 days ago
Southern Manila District, Philippines Royal Caribbean International Full time
Position Summary: The Site Reliability Engineer (Senior SRE) reports to the SRE Manager in support of the Royal Caribbean website by utilizing application and user performance data to guide informed decision-making. The SRE uses performance metrics from various sources and tools to support tasks such as initial triage of critical production...
-
Senior Site Reliability Engineer
1 week ago
Manila, National Capital Region, Philippines CDOps Tech Full time ₱2,000,000 - ₱2,500,000 per year
About the Opportunity: We are seeking a seasoned and passionate Senior Site Reliability Engineer for a high-impact contract engagement with one of our key clients, a leader in the marketing-tech sector. This is not just a typical SRE role; you will be the foundational expert responsible for spearheading the adoption of SRE culture and practices within the...
-
Cloud Site Reliability Engineer
2 weeks ago
Manila, Philippines Tyler Technologies Full time
Responsibilities: Implement tooling to monitor AWS EKS-based systems focusing on performance, reliability, and scalability. Ensure that architecture and deployment models are sufficient to support SLA commitments and are well prepared for future problems of scale....
-
Senior Site Reliability Engineer
2 weeks ago
Eastern Manila District, Philippines CC.Talent Full time
Senior Site Reliability Engineer (SRE) to join our global infrastructure team. You will be a guardian of our production environment, responsible for its health, performance, and scalability. Your mission is to apply software engineering principles to solve operational problems, automate everything, and ensure our...
-
Cloud Site Reliability Engineer
5 days ago
Manila, Philippines Tyler Technologies, Inc. Full time
Responsibilities: Implement tooling to monitor AWS EKS-based systems focusing on performance, reliability, and scalability. Ensure that architecture and deployment models are sufficient to support SLA commitments and are well prepared for future problems of scale. Leverage cloud...