Site Reliability Engineer
3 hours ago
We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing modern SRE capabilities that improve system reliability, scalability, and efficiency across our IT ecosystem. This role requires deep technical expertise, hands-on problem-solving skills, and the ability to influence cross-functional teams.
Key Responsibilities
- Observability & Monitoring
  - Define and implement observability frameworks across logs, metrics, traces, and events.
  - Architect monitoring platforms (e.g., Prometheus, Grafana, ELK, Splunk, Datadog, Dynatrace, New Relic) to deliver actionable insights.
  - Establish SLOs, SLIs, and error budgets in collaboration with product and engineering teams.
  - Drive proactive incident detection and root cause analysis.
- Performance Engineering
  - Lead performance benchmarking, load/stress testing, and scalability assessments of applications and infrastructure.
  - Build performance models and capacity planning strategies for critical business systems.
  - Partner with development teams to identify performance bottlenecks and optimize application/infrastructure efficiency.
- Reliability Engineering
  - Design and implement automation for incident response, disaster recovery, and self-healing systems.
  - Lead Chaos Engineering and resilience testing initiatives.
  - Drive reliability reviews, postmortems, and a blameless RCA culture.
  - Ensure best practices for fault tolerance, availability, and resilience are embedded in system design.
- AIOps & Intelligent Automation
  - Define AIOps strategy and deploy ML/AI-driven observability and incident response capabilities.
  - Leverage anomaly detection, event correlation, and predictive analytics for proactive IT operations.
  - Integrate AIOps platforms with ITSM tools for intelligent ticketing, alert suppression, and automated remediation.
- Leadership & Evangelism
  - Act as a thought leader in SRE practices, mentoring engineers and influencing leadership decisions.
  - Partner with development, infrastructure, and business teams to embed SRE principles across the enterprise.
  - Drive a culture of continuous improvement in availability, scalability, and operational excellence.
Required Qualifications
- 10+ years of experience in IT Operations, Reliability Engineering, or Performance Engineering.
- Deep expertise in observability and monitoring platforms (Prometheus, Grafana, Splunk, Datadog, Dynatrace, ELK, AppDynamics, etc.).
- Strong background in performance testing tools (JMeter, LoadRunner, Gatling, k6, etc.) and capacity planning.
- Hands-on experience in cloud platforms (AWS, Azure, GCP) and containerized environments (Kubernetes, Docker, OpenShift).
- Expertise in automation frameworks (Terraform, Ansible, Python, Go, Shell scripting).
- Experience with AIOps platforms (Moogsoft, BigPanda, Dynatrace Davis AI, ServiceNow AIOps, etc.) and ML-driven IT operations.
- Strong understanding of distributed systems, networking, CI/CD, and DevOps practices.
Preferred Qualifications
- Prior experience leading enterprise-wide SRE/Observability transformations.
- Knowledge of Chaos Engineering platforms (Gremlin, Chaos Mesh, Litmus).
- Exposure to ITSM/ITIL processes and modern incident management practices.
- Strong communication skills with the ability to influence CxO-level stakeholders.
- Certifications: Google SRE, AWS DevOps Engineer, Azure SRE Expert, Dynatrace/Datadog certifications.
Key Competencies
- Strategic and analytical thinker with a problem-solving mindset.
- Strong leadership, mentorship, and stakeholder engagement skills.
- Passionate about automation, scalability, and resilience engineering.
- Ability to balance reliability with velocity in fast-paced environments.
-
Senior Site Reliability Engineer
1 week ago
Manila, National Capital Region, Philippines | CDOps Tech | Full time | ₱2,000,000 - ₱2,500,000 per year
About the Opportunity
We are seeking a seasoned and passionate Senior Site Reliability Engineer for a high-impact contract engagement with one of our key clients, a leader in the marketing-tech sector. This is not just a typical SRE role; you will be the foundational expert responsible for spearheading the adoption of SRE culture and practices within the...
-
Site Reliability Engineering Manager
3 hours ago
Manila, National Capital Region, Philippines | Russell Tobin | Full time | $60,000 - $120,000 per year
We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing modern SRE capabilities that improve system reliability, scalability, and efficiency across our...
-
Site Reliability Engineer
2 weeks ago
Manila, National Capital Region, Philippines | Aumtrend | Full time | ₱1,200,000 - ₱2,400,000 per year
Role: Site Reliability Engineer - IBM MQ
Company: One of the global clients
Location: BGC - Manila
Work Setup: Hybrid, 2 days onsite/week
Schedule: Day shift
Permanent position and direct hiring by the client
Required Technical Skill Set: Hiring for two levels: L3 (Senior) and L2.5 (Mid-senior)
Technical requirements: Core focus: IBM MQ and Kafka administration...
-
Site Reliability Engineer
1 week ago
Manila, National Capital Region, Philippines | Acquire Intelligence | Full time | ₱1,500,000 - ₱2,500,000 per year
We're an award-winning global outsourcer providing contact center and back office services on behalf of our global clients. Come work at a place where innovation and teamwork come together to support the most exciting missions in the world. Acquire Intelligence exists to help businesses unlock smarter ways of working. We believe that by combining the best of...
-
Senior Site Reliability Engineer, Arlington
3 hours ago
Manila, National Capital Region, Philippines | Onebrief | Full time | ₱800,000 - ₱1,240,000 per year
About Onebrief
Onebrief is collaboration and AI-powered workflow software designed specifically for military staffs. By transforming this work, Onebrief makes the staff as a whole superhuman, meaning faster, smarter, and more efficient. We take ownership, seek excellence, and play to win with the seriousness and camaraderie of an Olympic team. Onebrief...
-
Senior Site Reliability Engineer
1 week ago
Manila, National Capital Region, Philippines | Broadridge | Full time | ₱1,200,000 - ₱2,400,000 per year
At Broadridge, we've built a culture where the highest goal is to empower others to accomplish more. If you're passionate about developing your career, while helping others along the way, come join the Broadridge team.
Role Overview
We are seeking a dynamic Senior Site Reliability Engineer (SRE) to lead the design, implementation, and operational support of...
-
Site Reliability Engineer
4 hours ago
Manila, National Capital Region, Philippines | Nasdaq | Full time | ₱1,200,000 - ₱2,400,000 per year
Why Nasdaq
When you work at Nasdaq, you're working for more open and transparent markets so that more people can access opportunities. Connections can be made, jobs can be created, and communities can thrive. We want all our employees to have access to opportunity, too. That means planning for career growth, ensuring you have the tools you need, and promoting...
-
Reliability Manager
4 hours ago
Manila, National Capital Region, Philippines | DexCom | Full time | ₱1,500,000 - ₱2,500,000 per year
The Company
Dexcom Corporation (NASDAQ DXCM) is a pioneer and global leader in continuous glucose monitoring (CGM). Dexcom began as a small company with a big dream: To forever change how diabetes is managed. To unlock information and insights that drive better health outcomes. Here we are 25 years later, having pioneered an industry. And we're just getting...
-
Reliability Manager
3 hours ago
Manila, National Capital Region, Philippines | Dexcom | Full time | ₱300,000 - ₱1,200,000 per year
The Company
Dexcom Corporation (NASDAQ DXCM) is a pioneer and global leader in continuous glucose monitoring (CGM). Dexcom began as a small company with a big dream: To forever change how diabetes is managed. To unlock information and insights that drive better health outcomes. Here we are 25 years later, having pioneered an industry. And we're just getting...
-
SRE (Site Reliability Engineer)
5 hours ago
Manila, National Capital Region, Philippines | GCash | Full time | ₱900,000 - ₱1,200,000 per year
Do you want to take the first step in making Filipinos' lives better every day? Here in GCash we want to stay at the forefront of the FinTech industry by creating innovative, meaningful, and convenient financial solutions for the nation. G ka ba? Join the G Nation today.
Who You'll Be Working With
If you have a strong background in IT, computer science, or...