
Site Reliability Engineer
3 weeks ago
Overview
Work setup: Hybrid (open to 2x a week in the office)
Work schedule: 10AM to 6PM Manila time
Employment type: Permanent
Location: Makati City, Metro Manila
Pay range: Php 60,000 to Php 81,000
We value transparency and encourage applicants comfortable with this range to apply.
Discover a world of endless possibilities with Cambridge University Press & Assessment, a distinguished global academic publisher and assessment organization proudly affiliated with the prestigious University of Cambridge.
We are recruiting for a Site Reliability Engineer to be part of our Education Technology Team. As a Site Reliability Engineer (AI Operations), you'll be pioneering operational excellence for AI systems that are transforming how millions learn worldwide.
Why Cambridge?
Cambridge University Press & Assessment is a world-renowned not-for-profit academic publisher and assessment organisation, proudly part of the prestigious University of Cambridge. With a legacy rooted in over 800 years of educational excellence, we are dedicated to unlocking the potential of learners and educators across the globe.
Joining Cambridge's second largest global office in the Philippines, which has operated for over 22 years and has 1,300+ colleagues, means becoming part of an institution renowned worldwide. We have been recognised as a Great Place to Work for three consecutive years, reflecting our inclusive culture, strong sense of purpose, and commitment to the professional growth and well-being of our people. At Cambridge, we don't just publish books or deliver tests; we empower progress, inspire curiosity, and champion the pursuit of knowledge.
What can you get from Cambridge?
At Cambridge, you'll become part of a vibrant and forward-thinking community that transcends tradition, fostering a culture of continuous growth and personal development. We provide the right environment for you to thrive, supporting your professional journey and empowering you to reach your highest potential. That is why our pay philosophy is tied to your skills and competencies, ensuring that your compensation aligns with the value you bring to the role you are applying for.
The organization offers a wide range of benefits and opportunities including:
- Regular Employment on Day 1
- HMO Coverage and Life Insurance on Day 1
- Paid Annual Leaves (Vacation, Well-being, Flexible, Holiday, and Volunteering leaves)
- Vesting/Retirement package
- Opportunities for career growth and development
- Access to well-being programs
- Flexible schedule, hybrid work arrangement and work-life balance
- Opportunity to collaborate with colleagues from diverse branches that will expand your horizons and enrich your understanding of different cultures
You'll be joining our Education Technology Platform Operations team at a pivotal moment as we embrace AI to enhance learning outcomes globally. Working alongside passionate technologists, you'll help us transform how we deploy and operate AI services - from large language models to intelligent automation platforms - ensuring they're reliable, cost-effective, and ethically sound.
In this role, you'll bridge the gap between cutting-edge AI innovation and production excellence. You'll establish the operational frameworks that allow us to deploy AI responsibly in education, always keeping learner safety and data protection at the forefront.
- Drive innovation in AI operations by implementing observability solutions for LLM deployments, workflow automation platforms (e.g. n8n), and AI services across AWS Bedrock and Azure OpenAI
- Make a real difference by establishing governance frameworks that ensure our AI services are ethical, compliant, and safe for educational use
- Transform our approach to cost optimisation for AI workloads through intelligent caching, model selection, and resource allocation strategies
- Collaborate with teams to operationalise AI features, sharing your expertise to help developers build production-ready, scalable AI solutions
- Keep learning about emerging AI operational tools like Portkey and LiteLLM, bringing new approaches to improve reliability and efficiency
- Strengthen our impact by implementing sustainable AI practices that consider the environmental footprint of compute-intensive workloads
Please review the attached job description for further details on the role.
What makes you the ideal candidate for this role?
- Education & Experience: 3–5 years in Site Reliability Engineering or related roles, with proven application of operational excellence in emerging technologies. Degree or equivalent experience in Computer Science, Engineering, or related field.
- Cloud & Infrastructure: Strong experience with cloud platforms, particularly AWS, including Infrastructure as Code (Terraform, CDK, CloudFormation) and cloud-native services.
- Automation & Delivery: Skilled in delivering change through automation with strong scripting abilities (Python, Bash, etc.) and hands-on experience with CI/CD pipelines (GitHub Actions, Jenkins, Bitbucket Pipelines).
- Monitoring & Reliability: Practical experience with monitoring and observability systems (Datadog, New Relic, Grafana, ELK/EFK stack) to ensure performance, availability, and incident response in distributed systems.
- API & Distributed Systems: Knowledge of API management, rate limiting, scalability, and the complexities of distributed architectures, particularly for AI-related workloads.
- AI & Emerging Tech: Familiarity with Large Language Models, cloud AI services, or workflow automation tools. Willingness to learn and apply new approaches to maximize impact in education technology.
- Ways of Working: Enthusiastic about exploring possibilities with AI while maintaining operational rigor. Collaborative, curious, and aligned with the vision of using technology to unlock potential in learners worldwide.
This is more than a technical role - it's an opportunity to define how AI operates in educational technology, ensuring it's deployed responsibly and effectively. You'll be at the forefront of establishing best practices that could influence how the entire education sector approaches AI operations.