Senior Site Reliability Engineer

4 weeks ago


Makati, Philippines Royal Caribbean Group Full time

The Senior Site Reliability Engineer (Senior SRE) reports to the SRE Manager and supports the Royal Caribbean website, using application and user performance data to guide decision-making. The Senior SRE will use application and user performance metrics collected from various sources and tools to support the initial triage of critical production incidents, bug analysis, implementation of site reliability engineering best practices, infrastructure optimization, and collaboration between internal teams and external service providers, among other operational initiatives.

Essential Duties and Responsibilities:

At a high level, responsibilities for this role will include:

  • Product Health: Responsible for the incident management, application performance, configuration management, and operational readiness of the products under their ownership. Partners and collaborates closely with stakeholders from the various teams within IT to ensure that performance, configuration, and monitoring tools meet the needs of those products.
  • Incident Management: Is responsible for a team prepared to react quickly to production incidents, with the goal of restoring systems and applications to normal service operation as quickly as possible and minimizing the impact on guest/crew experience or business operations, so that the best possible service levels and availability are maintained. Reviews ticket analysis and approves closure of tickets/incidents. Understands the architecture of the Royal Caribbean website and escalates incidents as needed to the appropriate team for further triage. Synthesizes and communicates incident details to the production team and stakeholders, including executive-level stakeholders. Documents incidents, performs postmortems, and creates next steps as needed. Reviews postmortem/RCA documents and follows up.
  • Application Performance Management (APM): Ensures the proactive monitoring and management of the performance and availability of the software applications within the products under their ownership. Strives to detect and diagnose complex application performance problems to maintain the expected level of service. Provides insight into application performance metrics (errors, exceptions, baseline violations, etc.) to identify the technical impact of bugs and enhancements. Understands key performance metrics (traffic volumes, booking volumes, response times, etc.) to identify the business value of bug fixes and enhancements. Builds the case for prioritizing bug and enhancement tickets using these metrics. Creates reports on the performance of new deployment builds so that product teams can ensure build quality.
  • Configuration Management: Leads the team(s) in implementing and maintaining technology standards and practices across product definition and product configuration. Adjusts health thresholds and other monitoring settings based on historical performance. Creates and maintains performance dashboards used by support and product teams. Maintains the alerting, communication, and documentation toolchain to ensure it is up to date and efficient.
  • Change Control Governance: Ensures all production changes required by the product teams are carried out in a planned and authorized manner, within established change control policies and procedures, and that all changes are thoroughly tested and validated from the monitoring perspective.
  • Production Operations Readiness: Ensures all product implementations go through an operational readiness review. Establishes and maintains clear communication channels (e.g., Slack, Teams) with the scrum and marketing teams. Ensures all team members are informed about relevant updates and changes that may affect the website.

Qualifications:

  • 6-10 years of experience in Site Reliability Engineering (SRE), DevOps, QA, or a related IT operations role.
  • Bachelor’s degree in Computer Science, Information Technology, Computer Engineering, or other relevant advanced degree preferred.

Knowledge and Skills:

Technical Expertise:

  • Proficiency in cloud platforms such as AWS, including AWS Elastic Beanstalk.
  • Understanding of API design principles: REST, SOAP, GraphQL.
  • Advanced knowledge of monitoring and logging tools (AppDynamics, DataDog, Splunk, New Relic, etc.).
  • Extensive experience with Adobe AEM Cloud is preferred to enhance system performance and reliability.

AI & Automation Expertise:

  • Working knowledge of scripting languages (Python, Bash, PowerShell) applied to automate alert routing, incident response, and infrastructure tasks, combined with a proactive mindset to explore and adopt new automation approaches.
  • Hands-on exposure to AIOps platforms for enhancing anomaly detection, root cause analysis, and incident management, demonstrating a passion for staying ahead of industry trends.
  • Solid understanding of AI/ML and Generative AI techniques aimed at reducing alert noise, predicting incidents, and developing automation workflows, with active interest in piloting innovative solutions.
  • Familiarity with autonomous AI agents (agentic AI) or intelligent automation systems within operational environments, coupled with enthusiasm to experiment with emerging AI-driven tools in SRE.

Problem-Solving Skills:

  • Strong analytical and troubleshooting skills to diagnose and resolve complex production issues swiftly.
  • Ability to develop and implement effective incident response plans.

Communication and Collaboration:

  • Excellent written and verbal communication skills for effective interaction with cross-functional teams and documentation.
  • Ability to collaborate with Development, QA, IT, and external managed service providers to ensure seamless operations.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Travel Arrangements
