Site Reliability Lead

2 weeks ago


Pasig, Philippines White Cloak Technologies Full time

Job Description

  • Lead, mentor, and manage a team of Site Reliability Engineers, ensuring coverage across shifts and on-call rotations.
  • Define team goals, KPIs, and performance metrics aligned with service reliability and business continuity.
  • Conduct regular coaching, performance reviews, and skills development planning.
  • Oversee workload distribution, escalation protocols, and incident ownership across the team.
  • Champion a culture of documentation, knowledge sharing, and operational discipline.
Key Responsibilities
  • Own the architecture and lifecycle of monitoring, alerting, and logging systems.
  • Ensure early detection, triage, and escalation of service degradation based on SLAs.
  • Lead major incident response, root cause analysis (RCA), and postmortem documentation.
  • Review and approve SOPs, runbooks, and playbooks created by the team.
  • Analyze incident trends and drive systemic fixes to reduce recurrence and improve MTTR.
  • Work closely with DevOps, Infrastructure, QA, and Development teams to improve deployment readiness and system resilience.
  • Represent the SRE function in planning meetings, audits, and compliance reviews.
  • Collaborate with ITSM teams to align incident, problem, and change management processes.
Skills and Competencies
  • Proven leadership experience in managing technical operations or SRE teams.
  • Strong command of ITSM platforms (e.g., ServiceNow, Jira Service Management).
  • Deep understanding of monitoring tools (e.g., Prometheus, Grafana, ELK, Datadog).
  • Familiarity with ITIL principles and regulatory frameworks (e.g., BSP, PDIC, ISO 27001).
  • Expertise in incident response, escalation protocols, and RCA methodologies.
  • Excellent communication and stakeholder management skills.
  • Ability to synthesize operational data into actionable insights and team strategies.
Qualifications and Experience
  • Bachelor's degree in Computer Science, Information Technology, Electronics Engineering, or equivalent.
  • 5+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure roles.
  • 2+ years in a leadership or team management capacity.
  • Hands-on experience with cloud platforms (AWS, GCP, Azure).
  • Knowledge of scripting (Python, Bash) and Linux systems.
  • Experience in fintech, banking, or SaaS environments with high-availability SLAs.

  • Pasig, National Capital Region, Philippines Seven Seven Global Services Inc Full time $90,000 - $120,000 per year

    Job Description: As part of the Site Reliability Engineering team within the Reference Data Engineering group, you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to runtime problems. In this environment, you'll take the lead on relevant projects, supported by an organization that...


  • Pasig, National Capital Region, Philippines Seven Seven Full time ₱1,200,000 - ₱2,400,000 per year

    Handle service monitoring, incident response, and drive technical support efficiency. Responsible for managing and maintaining network monitoring tools, systems, and processes that ensure the availability, scalability, and performance of our production environments. Responsible for incident handling, service monitoring, and technical support efficiency. Closely...


  • Pasig, National Capital Region, Philippines Modulus Labs Full time $100,000 - $150,000 per year

    Role Overview: As a Senior Site Reliability Engineer, you will be a key player in ensuring the reliability, security, and scalability of our cloud-native payment systems. You will work closely with engineering teams to build resilient infrastructure, automate operations, and maintain high availability while adhering to stringent compliance requirements. Key...


  • Pasig, Philippines Buscojobs Full time

    Sr Site Reliability Engineer (Project based). Location: 1226 Makati City, National Capital Region | iScale Solutions. Posted 16 days ago. Job Description: This is a remote position. Core Expertise: SRE Foundations & Practices: Deep understanding of SRE principles (SLIs, SLOs, error budgets, toil reduction, reliability vs. velocity trade-offs). Proven experience...


  • Pasig, National Capital Region, Philippines Asia Select, Inc. (ASI) Full time ₱600,000 - ₱1,200,000 per year

    Responsibilities: Ensure CLIENTS's multiple systems are operating at peak efficiency, performance and uptime. Assist in providing root cause analysis of complex faults in a large distributed system, and work with multiple teams to see the issue through to resolution and improvements. Participate in ongoing technology refresh initiatives or special projects as...


  • Pasig, Philippines Asia Select, Inc. (ASI) Full time

    Responsibilities: Ensure CLIENTS’s multiple systems are operating at peak efficiency, performance and uptime. Assist in providing root cause analysis of complex faults in a large distributed system, and work with multiple teams to see the issue through to resolution and improvements. Participate in ongoing technology refresh initiatives or special...


  • Pasig, National Capital Region, Philippines Seven Seven Global Services, Inc. Full time ₱1,500,000 - ₱3,000,000 per year

    About the Role: We are looking for a Site Reliability Engineer who will play a key role in ensuring the availability, scalability, and performance of our production environments. You will handle service monitoring, incident response, and technical support efficiency, working closely with developers, DevOps, infrastructure teams, and various stakeholders to...


  • Pasig, National Capital Region, Philippines TGServices, Inc. Full time ₱900,000 - ₱1,200,000 per year

    JOB SUMMARY: Perform trade tests, training, and orientation to ensure personnel effectiveness, and conduct site audits to ensure quality of service. Willing to render duty 6 days a week; rotating shifts with a co-supervisor (Corporate office). JOB DESCRIPTION: Conduct pre-deployment orientation of newly hired employees on policies, procedures, and quality...




  • Pasig, National Capital Region, Philippines IQVIA Full time

    Job Responsibilities: Serve as Single Point of Contact (SPOC) in assigned studies for investigative sites, RSU Team Lead, Clinical Operations, Feasibility, Site Identification, Project Leadership and GICS. Ensure adherence to standard operating procedures (SOPs), work instructions (WIs), quality of designated deliverables and to project timelines. Where...