Site Reliability Engineer

1 week ago


Bonifacio Global, Metro Manila, Philippines Acquire Intelligence Full time

We're an award-winning global outsourcer providing contact center and back office services on behalf of our global clients. Come work at a place where innovation and teamwork come together to support the most exciting missions in the world.

Acquire Intelligence exists to help businesses unlock smarter ways of working. We believe that by combining the best of people, process, and automation, companies can grow faster and operate with greater confidence. Our purpose is to remove complexity, improve performance, and drive intelligent transformation for organizations around the world.

As an Acquire Intelligence employee, your role is vital in achieving and exceeding individual and team targets that support company objectives, while building and maintaining stakeholder relationships. You're also responsible for complying with and enforcing procedures aligned with our information security policies.

As a values-led organization, we expect all our team members to exemplify our four values: Curious and Clever, Entrepreneurial Energy, Fast with Intent, and Laugh and Learn.

A SNAPSHOT OF YOUR ROLE

Responsibilities of the Site Reliability Engineer include, but are not limited to:

Service Level Management & Reliability

·       Define, monitor, and enforce Service Level Objectives (SLOs) and error budgets across all production systems

·       Track error budget burn rates and make data-driven decisions to halt risky deployments when thresholds are exceeded

·       Implement comprehensive monitoring and alerting strategies using Prometheus, Grafana, and PagerDuty

·       Establish and maintain reliability standards that support business-critical uptime requirements

Infrastructure Automation & Management

·       Design and implement Infrastructure as Code (IaC) solutions using Pulumi with TypeScript

·       Manage and optimize AWS services and data platforms including EKS (Elastic Kubernetes Service), MSK (Managed Streaming for Apache Kafka), SingleStore, MongoDB, and S3

·       Automate operational processes to eliminate toil, targeting any task that consumes more than 2 engineer-days per quarter

Incident Response & Post-Mortem Leadership

·       Serve as incident commander during production outages and service degradations

·       Lead comprehensive post-mortem processes within 48 hours of incidents

·       Drive "never-again" corrective actions to completion, ensuring systemic improvements

·       Maintain and improve incident response procedures and runbooks

Security & Compliance

·       Implement and enforce least-privilege IAM policies across all AWS resources

·       Manage security patch pipelines and vulnerability remediation processes

·       Support compliance initiatives including SOC2 and ISO 27001 certification requirements

·       Ensure security best practices are embedded in all infrastructure and operational procedures

On-Call & Operational Excellence

·       Participate in a follow-the-sun on-call rotation, with a one-week primary/secondary commitment every five weeks

·       Provide 24×7 support coverage across AU/NZ, EU/ZA, and MX time zones

·       Maintain operational runbooks and knowledge transfer documentation

·       Continuously improve on-call experience and reduce alert fatigue

A BIT ABOUT YOU

Experience

·       Minimum 3 years of hands-on experience running AWS production systems at scale

·       Proven expertise with AWS EKS (Elastic Kubernetes Service) or similar and MSK (Managed Streaming for Apache Kafka) in production environments, as well as database performance diagnostics (MySQL, Postgres, MongoDB) on multi-TB databases

·       Strong background in Infrastructure as Code, preferably with Pulumi using TypeScript or equivalent Terraform experience

·       Demonstrated experience participating in incident management (ideally as an incident commander with a track record of leading post-mortem processes)

·       Experience with high-volume data processing systems, ideally IoT telemetry or streaming pipelines processing ≥50k messages per second

·       Background in implementing and maintaining observability solutions using Prometheus, Grafana, PagerDuty, or similar tools

·       Experience with CI/CD pipeline management and deployment automation using GitLab or similar platforms

·       Exposure to hypervisors (VMware, Hyper-V), the Microsoft Server stack, SAN/NAS, L2/L3 networking, firewalls (Palo Alto), and switching (Aruba, Juniper) is considered advantageous

Technical Skills & Qualifications

·       Bachelor's degree in computer science, engineering, or related technical field, or equivalent practical experience

·       Expert-level proficiency in TypeScript for production systems, including services, AWS Lambda functions, and operational tooling

·       Deep understanding of the AWS services ecosystem, with particular expertise in container orchestration, messaging systems, and content delivery

·       Strong networking fundamentals including TCP/IP, DNS, TLS, HTTP protocols, and container networking (CNI)

·       Proficiency with monitoring and observability tools including Prometheus, Grafana, and incident management platforms

·       Experience with Infrastructure as Code tools, particularly Pulumi with TypeScript for comprehensive AWS resource management

·       Understanding of security best practices including least-privilege access, IAM policy management, and compliance frameworks

WHAT WE VALUE
  • Curious and Clever – Smart questions spark smart solutions
  • Entrepreneurial Energy – Think like an owner. Solve like a founder
  • Fast with Intent – We move fast and deliver real results
  • Laugh and Learn – We don't take ourselves too seriously, just our results

What Are You Waiting For?

Apply now and help turn data into action with Acquire Intelligence

Join the A-Team and experience the A-Life


