Engineer, Site Reliability

2 weeks ago


Pasay, National Capital Region, Philippines Royal Caribbean Group Full time

Position Summary
The Site Reliability Engineer (Senior SRE) will report to the SRE Manager in support of the Royal Caribbean website by utilizing application and user performance data to guide informed decision-making. The SRE will use application and user performance metrics collected from various sources and tools to support tasks such as initial triage of critical production incidents, bug analysis, implementation of best practices in site reliability engineering, infrastructure optimization, and seamless collaboration between internal teams and external service providers, among other operational initiatives.

The ideal candidate will have a deep understanding of, and a proven track record in, an IT support role. The ideal candidate will also keep an eye on the rapidly evolving technology landscape and implement proactive, preventative measures that avoid technical incidents.

S/he must be able to work with multiple product and project teams simultaneously, thrive in a fast-paced and dynamic environment and connect unexpected threads across disparate teams.

Essential Duties And Responsibilities
At a high level, responsibilities for this role will include:

  • Product Health. Responsible for the Incident Management, Application Performance, Configuration Management, and Operational Readiness of the products within her/his ownership. Partners and collaborates closely with stakeholders from the various teams within IT to ensure that performance, configuration, and monitoring tools meet the needs of her/his products.
  • Incident Management. Responsible for the initial response, triage, and communication of key (customer-impacting) production incidents that occur on the site, with the goal of restoring systems and applications to normal service operation as quickly as possible and minimizing the impact on guest/crew experience and business operations, so that the best possible service levels and availability are maintained. Analyzes the impact of incidents on the site and determines root cause by reviewing performance data, including end-user experience, application metrics, and infrastructure metrics. Supports product team initiatives and releases. Synthesizes and communicates incident details to the production team and stakeholders, including executive-level stakeholders. Documents incidents, performs postmortems, and creates next steps as needed.
  • Application Performance Management (APM). Ensures the proactive monitoring and management of performance and availability of the software applications within the products s/he is responsible for. Strives to detect and diagnose complex application performance problems to maintain an expected level of service. Provides insight into application performance metrics (errors, exceptions, baseline violations, etc.) to identify technical impacts of bugs and enhancements. Understands key performance metrics (traffic volumes, booking volumes, response times, etc.) to identify business value of bug fixes and enhancements.
  • Configuration Management. Understands the high-level view of website operations in order to identify performance trends across business processes. Performs daily governance of application monitoring software.
  • Change Control Governance. Ensures that all production changes required by the product teams are carried out in a planned and authorized manner, within established change control policies and procedures, and that all changes are thoroughly tested and validated from the monitoring perspective.
  • Production Operations Readiness. Ensures that all product implementations go through an operational readiness review. Establishes and maintains clear communication channels (e.g., Slack, Teams) with the scrum and marketing teams. Ensures that all team members are informed about relevant updates and changes that may affect the website.

Qualifications

  • 3-6 years in Site Reliability Engineering (SRE), DevOps, QA, or a related IT operations role.
  • Bachelor's degree in Computer Science, Information Technology, Computer Engineering, or other relevant advanced degree preferred.

Knowledge And Skills

  • Technical Expertise:
  • Proficiency in cloud platforms such as AWS, including AWS Elastic Beanstalk.
  • Understanding of API design principles: REST, SOAP, GraphQL.
  • Advanced knowledge of monitoring and logging tools (AppDynamics, Datadog, Splunk, New Relic, etc.).
  • Familiarity with Adobe AEM Cloud (preferred) to help improve system performance and reliability.

  • AI & Automation Expertise:

  • Working knowledge of scripting languages (Python, Bash, PowerShell) applied to automate alert routing, incident response, and infrastructure tasks, combined with a proactive mindset to explore and adopt new automation approaches.

  • Hands-on exposure to AIOps platforms for enhancing anomaly detection, root cause analysis, and incident management, demonstrating a passion for staying ahead of industry trends.
  • Solid understanding of AI/ML and Generative AI techniques aimed at reducing alert noise, predicting incidents, and developing automation workflows, with active interest in piloting innovative solutions.
  • Familiarity with autonomous AI agents (agentic AI) or intelligent automation systems within operational environments, coupled with enthusiasm to experiment with emerging AI-driven tools in SRE.

  • Problem-Solving Skills:

  • Strong analytical and troubleshooting skills to diagnose and resolve complex production issues swiftly.
  • Ability to develop and implement effective incident response plans.
  • Communication and Collaboration:
  • Excellent written and verbal communication skills for effective interaction with cross-functional teams and documentation.
  • Ability to collaborate with Development, QA, IT, and external managed service providers to ensure seamless operations.

Work Environment

  • The SRE may be required to participate in an on-call rotation to handle urgent incidents and ensure 24x7 system reliability.
  • On-call duties may include evenings, weekends, and holidays as needed.
