Senior AWS Site Reliability Engineer

5 days ago


Manila, Philippines Flexisource IT Full time

We are seeking an experienced Senior AWS Site Reliability Engineer to join our cross-functional cloud platform team. Working alongside a diverse group of DevOps and Site Reliability Engineers, you will combine deep technical expertise in AWS cloud infrastructure with strong leadership capabilities in incident response and system reliability. In this role, you will lead incident response and maintain, optimize, and scale our cloud infrastructure while ensuring exceptional system reliability and performance.

Responsibilities
  • Lead incident response from initial detection through real-time mitigation, root cause analysis, post-mortem documentation (using incident.io), and implementation of lessons learned, with a focus on continuous improvement
  • Develop and execute comprehensive incident response strategies to minimise downtime and business impact
  • Participate in a 24/7 on-call rotation to ensure continuous system availability
  • Implement and maintain comprehensive observability solutions using CloudWatch, Datadog, or similar monitoring platforms
  • Maintain, improve, and optimise AWS infrastructure using Terraform while ensuring scalability, reliability, and cost efficiency
  • Continuously assess and enhance AWS infrastructure to optimise performance and cost-effectiveness
  • Monitor and optimise serverless technologies including AWS Lambda and API Gateway for peak performance and cost efficiency
  • Monitor and maintain ECS Fargate deployments for containerised applications, ensuring optimal resource utilization
  • Collect and analyse metrics to identify resource consumption, abnormal behavior, and potential performance bottlenecks
  • Configure and manage alerting, dashboards, and automated monitoring across distributed systems
  • Foster improved collaboration between development and operations teams by implementing SRE practices
Required Qualifications
  • Previous experience in a DevOps or SRE role
  • Exceptional written and verbal communication skills
  • Proven experience in incident response and 24/7 on-call responsibilities
  • Expert-level knowledge of Infrastructure as Code, primarily Terraform (demonstrated experience with other IaC tools will be highly regarded)
  • Expert-level knowledge of AWS compute infrastructure
  • Proficiency in automation tools and scripting languages
  • Strong understanding of monitoring, metrics collection, and performance analysis
  • Expert knowledge of observability and monitoring platforms such as Datadog, New Relic, Prometheus, or similar tools
  • Experience with log aggregation, APM (Application Performance Monitoring), and distributed tracing
  • Excellent collaboration abilities and capacity to work effectively in cross-functional teams
  • Strong analytical and problem-solving skills
  • Demonstrated ability to work autonomously and take ownership
Preferred Qualifications
  • Experience with incident.io (highly desirable).
  • Background in payments and PCI compliance environments (highly desirable).
  • AWS certifications.
  • Experience with container orchestration and microservices architecture.
  • Knowledge of security best practices in cloud environments.
Work Details
  • Schedule: Monday to Friday, 6:00am-3:00pm or 7:00am-4:00pm (PH Time), depending on business needs
  • Location: Makati | Work from Home Until Further Notice

  • Manila, National Capital Region, Philippines HGS Offshore Staffing Solutions Full time ₱2,000,000 - ₱2,500,000 per year

    SENIOR SITE RELIABILITY ENGINEER. POSITION OVERVIEW: We are seeking an experienced Senior AWS Site Reliability Engineer to join our cross-functional cloud platform team. Working alongside a diverse group of DevOps and Site Reliability Engineers, you will combine deep technical expertise in AWS cloud infrastructure with strong leadership capabilities in incident...


  • Manila, National Capital Region, Philippines Tyler Technologies Full time

    Cloud Site Reliability Engineer at Tyler Technologies. Responsibilities: Implement tooling to monitor AWS EKS-based systems focusing on performance, reliability, and scalability. Ensure that architecture and deployment models are sufficient to support SLA commitments and are well prepared for future problems of scale. Leverage...


  • Manila, Philippines Tyler Technologies Full time

    Cloud Site Reliability Engineer at Tyler Technologies. Responsibilities: Implement tooling to monitor AWS EKS-based systems focusing on performance, reliability, and scalability. Ensure that architecture and deployment models are sufficient to support SLA commitments and are well prepared for future problems of scale....


  • Metro Manila, Philippines Buscojobs Full time

    Site Reliability Engineer. Posted today. Responsibilities: Develop, maintain, and optimize SAP landscapes on GCP for our clients, ensuring optimal performance, reliability, and efficiency. Utilize industry-leading...


  • Eastern Manila District, Philippines CC.Talent Full time

    Senior Site Reliability Engineer (SRE) Senior Site Reliability Engineer (SRE) to join our global infrastructure team. You will be a guardian of our production environment, responsible for its health, performance, and scalability. Your mission is to apply software engineering principles to solve operational problems, automate everything, and ensure our...


  • Manila, National Capital Region, Philippines Broadridge Full time

    Overview: Senior Site Reliability Engineer (Hybrid-Flexible Options) – Broadridge. We are looking for a seasoned Site Reliability Engineer to design, implement, and maintain scalable, secure, and high-performing infrastructure solutions across a full-stack environment. This role requires deep collaboration with cross-functional teams to drive automation,...


  • Southern Manila District, Philippines Royal Caribbean International Full time

    Overview Position Summary: The Site Reliability Engineer (Senior SRE) reports to the SRE Manager in support of the Royal Caribbean website by utilizing application and user performance data to guide informed decision-making. The SRE uses performance metrics from various sources and tools to support tasks such as initial triage of critical production...


  • Metro Manila, Philippines ABC Worldwide (AKA BRIP Careers Worldwide) Full time

    Overview: Our client, a global Business Process Outsourcing (BPO) business, is looking for Site Reliability Engineers (SRE) to support their global payment technology company that provides platforms to consumers, businesses and organizations to make electronic payments. The successful candidate will be responsible for ensuring site reliability & performance,...


  • Metro Manila, Philippines Satori Full time

    Senior Site Reliability Engineer (Hybrid) Our client, a multinational leader in fleet performance management, is establishing its operations in the Philippines and is currently hiring members for the pioneer team. Job Summary: You will be part of an autonomous team, responsible for maintaining and developing the Client’s global SaaS platforms. Your efforts...


  • Manila, National Capital Region, Philippines Broadridge Financial Solutions Full time

    Senior Site Reliability Engineer (Hybrid). Locations: Manila - 6805 Ayala Ave. Time type: Full time. Posted on: Posted Today. Job requisition ID: JR1075784. At Broadridge, we've built a culture where the highest goal is to empower others to accomplish more. If you're passionate about developing your...