Head of Site Reliability Engineering

18 minutes ago


Taguig, National Capital Region, Philippines Robert Walters Full time

A Head of Site Reliability Engineering role has opened at a global IT consulting company.

A leading technology organisation is seeking a Head of Site Reliability Engineering to shape and drive the reliability strategy for its mission-critical IoT platform. This is a rare opportunity to build a high-impact SRE function from the ground up, with full ownership over production services running on AWS and the chance to influence global best practices. You will lead and nurture a remote team, championing an inclusive and psychologically safe culture that values learning, continuous improvement, and blameless retrospectives. The role offers protected time for innovation, flexible working arrangements within Australia, and the ability to make a tangible difference in a high-growth environment where your expertise will keep thousands of connected devices operating seamlessly around the clock.

  • Take charge of building and scaling a collaborative SRE team, fostering professional development and psychological safety while driving operational excellence across all production systems.
  • Enjoy remote-first flexibility, protected improvement time, and the opportunity to set the blueprint for global reliability engineering practices in a rapidly expanding tech company.
  • Lead automation initiatives, incident response strategies, and security compliance efforts while partnering with cross-functional teams to deliver resilient, scalable solutions that support business growth.

What You'll Do
As Head of Site Reliability Engineering, you will be entrusted with shaping the future of reliability engineering within a fast-growing technology organisation. Your day-to-day will involve reviewing operational dashboards for SLO compliance, triaging alerts, diagnosing complex system performance issues, coaching junior engineers through incident command scenarios, attending cross-functional design reviews to advise on reliability concerns, and facilitating blameless post-mortems.

Weekly activities include rotating as primary on-call engineer with seamless handovers, leading stand-ups and one-to-ones focused on career progression frameworks, presenting reliability metrics to senior leadership, delivering automation OKRs such as self-service Kafka topic provisioning, conducting security audits of IAM roles used by CI/CD pipelines, hosting internal knowledge-sharing sessions on reliability principles, refining on-call schedules to account for public holidays, defining quarterly reliability OKRs and capacity plans, driving major upgrades with zero customer impact, leading hiring pipelines from candidate screening through technical interviews to offers, and participating in occasional company off-sites.

Success in this role means maintaining SLO compliance above target thresholds; reducing manual operational overhead through relentless automation; leading effective incident response with reduced mean time to resolution; implementing robust monitoring that provides early warning of issues; sustaining high-quality response during critical incidents without burnout; driving security initiatives forward; fostering collaboration between SRE and development teams; and enabling faster deployment cycles through shared reliability practices.

  • Build and scale a site reliability engineering team of 3-6 engineers by setting clear goals, supporting career development, conducting regular one-to-ones, and managing annual performance reviews.
  • Recruit, onboard, and mentor new engineers, and ensure operational knowledge is documented so the team stays current on troubleshooting procedures.
  • Maintain an inclusive culture centred on learning and continuous improvement by promoting psychological safety and equitable workloads through sustainable on-call rotations.
  • Define, monitor, and enforce service level objectives (SLOs) and error budgets across all production environments to ensure reliability meets customer expectations.
  • Continuously analyse error budget burn rates to guide deployment decisions and capacity planning while championing data-driven reliability throughout engineering and product teams.
  • Architect and implement Infrastructure-as-Code solutions using Pulumi/TypeScript for AWS resources such as EKS, MSK, SingleStore, MongoDB, and S3, ensuring robust automation pipelines are in place.
  • Lead large-scale migration or modernisation projects including Kubernetes upgrades and multi-AZ resilience initiatives to enhance platform stability.
  • Eliminate manual toil by identifying repetitive tasks exceeding two engineer-days per quarter as candidates for automation.
  • Serve as escalation point and incident commander during critical events; ensure post-mortems are published promptly with actionable follow-up tracked to closure.
  • Enforce least-privilege IAM policies, champion DevSecOps practices, contribute to SOC 2 & ISO 27001 evidence collection, oversee vulnerability management pipelines, and maintain secrets hygiene.

What You Bring
Your extensive background in operating large-scale production systems equips you perfectly for the Head of Site Reliability Engineering position. You bring deep technical proficiency across cloud platforms—especially AWS—and have led teams through complex migrations or modernisation projects involving Kubernetes upgrades or multi-AZ resilience. Your hands-on approach to Infrastructure-as-Code ensures robust automation pipelines are implemented efficiently using Pulumi/TypeScript or Terraform. You excel at incident command during high-pressure situations thanks to your clear communication style and commitment to blameless retrospectives. Your familiarity with advanced observability tools enables you to provide actionable insights into system optimisation while your expertise in CI/CD pipeline management supports safe deployment cycles. Security is second nature: you enforce least-privilege IAM policies rigorously while contributing meaningfully to compliance initiatives such as SOC 2 & ISO 27001 certification. With a bachelor's degree in computer science or engineering—or equivalent practical experience—you possess both the technical acumen and interpersonal skills needed to foster collaboration between SREs and development teams. Your commitment to continuous improvement drives measurable reductions in operational burden through automation-first thinking.

  • Minimum ten years' experience operating production systems at scale including at least three years in an SRE or DevOps capacity where you have demonstrated hands-on expertise.
  • At least two years' people or technical leadership experience involving mentoring engineers or line management responsibilities within high-performing teams.
  • Proven proficiency with AWS EKS (Elastic Kubernetes Service), MSK (Managed Streaming for Apache Kafka), and large-scale databases such as SingleStore, PostgreSQL, or MongoDB in cloud environments.
  • Demonstrated incident commander experience with strong communication skills under pressure during critical events requiring rapid decision-making.
  • Hands-on Infrastructure-as-Code experience using Pulumi/TypeScript or Terraform for automating cloud resource provisioning and management.
  • Familiarity with high-volume data pipelines handling at least 10k messages per second alongside exposure to IoT workloads requiring robust scalability.
  • Expert-level TypeScript skills including services development for AWS Lambda functions integrated into Pulumi tooling workflows.
  • Deep understanding of AWS networking concepts, including container networking (CNI), TLS encryption, and HTTP/DNS routing mechanisms essential for secure operations.
  • Advanced observability toolset knowledge: Prometheus/Grafana/Loki/PagerDuty/AWS CloudWatch for monitoring distributed systems effectively.
  • CI/CD pipeline management using GitLab or GitHub Actions, coupled with automated testing and release strategies such as blue/green deployments or canary rollouts for safe releases.
  • Security best practices encompassing IAM policy enforcement, KMS, secrets management, and compliance frameworks such as SOC 2 and ISO 27001.
  • Bachelor's degree in Computer Science/Engineering or equivalent practical experience gained through progressive responsibility roles.

What Sets This Company Apart
This organisation stands out by offering remote-first flexibility within Australia, so you can work from anywhere while enjoying protected improvement time. The company places a premium on professional growth, giving you founding influence over its global SRE practice, and encourages participation in complex reliability initiatives that have real-world impact across thousands of connected devices. Supported by inclusive leadership committed to psychological safety and continuous learning, including regular off-sites for knowledge sharing, you'll be part of a supportive network where your contributions are valued both technically and interpersonally. The culture is built around collaboration between cross-functional teams, so everyone shares responsibility for platform resilience while benefiting from generous training opportunities. By joining this organisation, you become part of a community dedicated to technological excellence and to nurturing talent through open communication that prioritises empathy, trust, cooperation, and shared success. Benefits include flexible working opportunities, generous pension contributions, training opportunities, and a team network for under-represented groups.

What's Next
If you're ready to shape the future of site reliability engineering while enjoying unparalleled flexibility and professional growth, apply now. Your next big challenge awaits.

Apply today by clicking on the link provided—don't miss your chance to join an organisation where your expertise will make a lasting impact.

Due to the high volume of applications we are experiencing, our team will only be in touch with you if your application is shortlisted.