Head of Site Reliability Engineering
1 week ago
We're an award-winning global outsourcer providing contact center and back office services on behalf of our global clients. Come work at a place where innovation and teamwork come together to support the most exciting missions in the world.

Acquire Intelligence exists to help businesses unlock smarter ways of working. We believe that by combining the best of people, process, and automation, companies can grow faster and operate with greater confidence. Our purpose is to remove complexity, improve performance, and drive intelligent transformation for organizations around the world.

As an Acquire Intelligence employee, your role is vital in achieving and exceeding individual and team targets that support company objectives, while building and maintaining stakeholder relationships. You're also responsible for complying with and enforcing procedures aligned with our information security policies.

As a values-led organization, we expect all our team members to exemplify our four values: Curious and Clever, Entrepreneurial Energy, Fast with Intent, and Laugh and Learn.

A SNAPSHOT OF YOUR ROLE

Leadership & People Management
· Build an SRE team of initially 3-6 engineers: goal setting, career development, regular 1:1s, and annual performance reviews.
· Ensure operational system knowledge is captured and that the team is kept "fresh" on operating and troubleshooting procedures.
· Recruit, onboard, and mentor new engineers; scale the team to meet business growth.
· Maintain an inclusive, psychologically safe culture centered on learning and continuous improvement.
· Own, and participate in, the on-call roster for the team, ensuring equitable rotations and sustainable workloads.

Service Level Management & Reliability
· Define, monitor, and enforce SLOs and error budgets across all production systems.
· Continuously analyse error-budget burn to halt risky deployments and guide capacity decisions.
· Champion a data-driven reliability mindset throughout engineering and product teams.

Infrastructure Automation & Management
· Architect and implement Infrastructure-as-Code in Pulumi/TypeScript for AWS resources (EKS, MSK, SingleStore, MongoDB, S3, etc.).
· Lead large-scale migration or modernization projects (e.g., Kubernetes upgrades, multi-AZ resilience).
· Eliminate toil: any manual task that takes more than 2 engineer-days per quarter, or that is frequently repeated, becomes an automation candidate.

Incident Response & Post-Mortem Leadership
· Participate in the on-call monitoring and response roster.
· Serve as escalation point and incident commander.
· Ensure post-mortems are published within 48 hours, with actionable "never again" tasks tracked to closure.
· Improve runbooks and game-day exercises; train engineers on incident command principles.

Security & Compliance
· Enforce least-privilege IAM policies and champion DevSecOps practices.
· Contribute to SOC 2 & ISO 27001 evidence collection and continuous control monitoring.
· Oversee security patch pipelines, vulnerability management, and secrets hygiene.

Operational Excellence & Continuous Improvement
· Own reliability KPIs (MTTR, change failure rate, mean time between failures).
· Lead quarterly reliability reviews and drive the reliability roadmap.
· Partner with Product on capacity forecasts and cost-optimization initiatives.

A BIT ABOUT YOU

Minimum Experience
· 10+ years operating production systems at scale, including 3+ years in an SRE/DevOps capacity.
· 2+ years of people or technical leadership: mentoring, performance coaching, or line management.
· Proven expertise with AWS EKS, MSK, and large-scale databases (SingleStore, PostgreSQL, MongoDB).
· Demonstrated incident commander experience with strong communication under pressure.
· Hands-on Infrastructure-as-Code with Pulumi/TypeScript or Terraform.
· Familiarity with high-volume data pipelines (≥10k msgs/sec) and IoT workloads.

Technical Proficiency
· Expert-level TypeScript (services, AWS Lambda, Pulumi tooling).
· Deep understanding of AWS networking, container networking (CNI), TLS, HTTP, DNS.
· Advanced observability: Prometheus, Grafana, Loki, PagerDuty, AWS CloudWatch.
· CI/CD (GitLab or GitHub Actions), automated testing & rollout strategies (blue/green, canary).
· Security best practices: IAM, KMS, secrets management, compliance frameworks.

Education
· Bachelor's in Computer Science, Engineering, or equivalent practical experience.

WHAT WE VALUE
· Curious and Clever – Smart questions spark smart solutions.
· Entrepreneurial Energy – Think like an owner. Solve like a founder.
· Fast with Intent – We move fast and deliver real results.
· Laugh and Learn – We don't take ourselves too seriously, just our results.

What Are You Waiting For?
Apply now and help turn data into action with Acquire Intelligence. Join the A-Team and experience the A-Life.
-
Site Reliability Engineering
1 week ago
Taguig, National Capital Region, Philippines Tata Consultancy Services Full time ₱120,000 - ₱180,000 per year
Job Description: Site Reliability Engineering (SRE) SME
Position Overview
We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing modern SRE capabilities...
-
Site Reliability Engineer
1 week ago
Taguig, National Capital Region, Philippines Socium - Teams Done Differently Full time ₱900,000 - ₱1,200,000 per year
Job Title: Site Reliability Engineering (SRE) Subject Matter Expert (SME)
Overview
We're looking for an experienced SRE Subject Matter Expert (SME) to lead our reliability, performance, and automation initiatives. This role will design and drive best-in-class observability, performance engineering, AIOps, and reliability practices to ensure our systems are stable,...
-
Site Reliability Engineer
1 week ago
Taguig, National Capital Region, Philippines weSource Management Consultancy Firm Full time ₱1,200,000 - ₱1,860,000 per year
We are urgently hiring for: Site Reliability Engineers
Hybrid, BGC
Up to 155K gross monthly
The Role
- Implement and maintain observability platforms such as Datadog
- Proactively monitor production and other environments to ensure stability, availability, security, and integrity
- Collaborate with cross-functional teams to ensure the reliability,...
-
Site Reliability Engineer
1 week ago
Taguig, National Capital Region, Philippines Philtech Full time ₱1,200,000 - ₱2,400,000 per year
About the Role
We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) with a strong focus on front-end application performance and reliability. In this role, you will ensure the scalability, availability, and responsiveness of our web and mobile user-facing platforms. You will collaborate closely with engineering, product, and design...
-
Site Reliability Engineering
1 week ago
Taguig, National Capital Region, Philippines Tata Consultancy Services Full time ₱2,000,000 - ₱2,500,000 per year
Required Qualifications
10+ years of experience in IT Operations, Reliability Engineering, or Performance Engineering.
Deep expertise in observability and monitoring platforms (Prometheus, Grafana, Splunk, Datadog, Dynatrace, ELK, AppDynamics, etc.).
Strong background in performance testing tools (JMeter, LoadRunner, Gatling, k6, etc.) and capacity...
-
Site Reliability Engineer
1 week ago
Taguig, National Capital Region, Philippines weSource Management Consultancy Firm Full time ₱100,000 - ₱180,000 per year
We are looking for a Senior Site Reliability Engineer for a client in BGC
Salary: up to 180k
Set up: Hybrid
Job responsibilities:
Our SRE/DevOps Engineering team combines software and systems engineering to ensure that our production systems are always performing optimally and efficiently. SRE/DevOps Engineers are responsible for understanding how our systems interact...
-
Principal Networks Site Reliability Engineer
1 week ago
Taguig, National Capital Region, Philippines Cloud Bridge Full time ₱3,300,000 per year
Principal Networks Site Reliability Engineer
Up to 3.3 million per annum
3 days per week in the Manila office
My client is looking for a Site Reliability Engineer (SRE) to join their team. This position demands a strategic individual who can collaborate with cross-functional teams to implement cutting-edge best practices, drive process automation, and elevate...
-
Service Reliability Engineer
1 week ago
Taguig, National Capital Region, Philippines YONDU INC. Full time ₱900,000 - ₱1,200,000 per year
About the role: As a Service Reliability Engineer at YONDU INC., you will be responsible for ensuring the smooth and reliable operation of the company's critical IT systems and infrastructure. This full-time position is based in Taguig City, Metro Manila, and is a key role in supporting the company's overall business objectives.
What you'll be...
-
Senior Site Reliability Engineer
1 week ago
Taguig, National Capital Region, Philippines weSource Management Consultancy Firm Full time ₱160,000 - ₱200,000 per year
We are looking for a Senior Site Reliability Engineer for a client in BGC
Salary: up to 200k
Set up: Hybrid
Job responsibilities:
Our DevOps Engineering team combines software and systems engineering to ensure that our production systems are always performing optimally and efficiently. DevOps Engineers are responsible for understanding how our systems interact...
-
Site Reliability Engineer
1 week ago
Taguig, National Capital Region, Philippines NASDAQ Full time ₱1,200,000 - ₱2,400,000 per year
Why Nasdaq
When you work at Nasdaq, you're working for more open and transparent markets so that more people can access opportunities. Connections can be made, jobs can be created, and communities can thrive. We want all our employees to have access to opportunity, too. That means planning for career growth, ensuring you have the tools you need, and promoting...