Senior Site Reliability Engineer

19 hours ago


Mandaluyong City, National Capital Region, Philippines
The Dairy Farm Company, Limited - ROHQ
Full time
$90,000 - $120,000 per year

As a Site Reliability Engineer (SRE) at DFI Retail Group, you will be the bridge between development and operations, ensuring our systems are designed, implemented, and maintained for maximum reliability, scalability, and performance. You will leverage your software engineering expertise to automate operations, optimize system performance, and develop solutions that prevent recurring issues. Your work will be essential in guaranteeing a seamless experience for our users by maintaining the high availability and efficiency of our services.

Is this your next challenge in Site Reliability Engineering?

Responsibilities:

  • Design and Implement Solutions for Reliability and Scalability: Develop and implement highly scalable and available system architectures to meet growing user demands without compromising performance.
  • Automate Operations: Design, build, and integrate software tools to automate operational processes, including system monitoring, incident response, and deployment procedures.
  • Optimize System Performance: Proactively monitor system performance, identify bottlenecks, and implement optimization strategies to ensure efficient resource utilization and service delivery.
  • Implement and Manage Monitoring and Observability: Establish comprehensive service metrics and implement robust monitoring systems (including, but not limited to, New Relic, Azure Monitor, and Google Cloud Monitoring) to track, analyze, and report on system reliability, performance, and efficiency. Utilize observability tools to gain deeper insights into system behavior and identify potential issues proactively.
  • Incident Response and Resolution: Develop and implement strategies for rapid incident detection and response. Troubleshoot and resolve complex system issues, minimizing downtime and mitigating service disruptions.
  • Capacity Planning and Performance Tuning: Conduct capacity planning analyses to anticipate future resource needs and ensure system scalability. Proactively tune system performance to optimize resource utilization and maintain service level agreements (SLAs).
  • Collaboration with Development Teams: Work closely with software development teams to integrate reliability considerations throughout the software development lifecycle. Participate in code reviews, design discussions, and post-incident reviews to enhance system reliability and prevent recurring issues.
  • Drive Continuous Improvement: Continuously evaluate existing processes and tools, identifying areas for improvement and automation. Research and implement new technologies and best practices to enhance system reliability and operational efficiency.
  • Documentation and Knowledge Sharing: Create and maintain comprehensive documentation for systems, processes, and incident responses. Actively share knowledge and best practices with the team and organization.
  • Administer Atlassian Product Suite: Manage and maintain the Atlassian product suite, including Jira, Confluence, and Bitbucket, ensuring seamless operation and integration with existing workflows. Provide user support and training as needed.

Do you have experience as a Site Reliability Engineer?

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience.
  • Proven experience (at least 5 years) as an SRE, DevOps Engineer, or in a similar role, demonstrating a strong understanding of software engineering principles and IT operations.
  • Hands-on experience in the administration of the Atlassian product suite (Jira, Confluence and Bitbucket).
  • In-depth knowledge of cloud platforms such as AWS, Azure, or GCP, including services related to compute, storage, networking, and databases.
  • Proficiency in scripting languages like Python or PowerShell and experience with automation tools such as Terraform or Ansible.
  • Familiarity with monitoring and logging systems (Prometheus, Zabbix, Grafana, ELK, Azure Monitor, Google Cloud Monitoring).
  • Hands-on experience with containerization technologies like Docker and container orchestration tools like Kubernetes.
  • Strong understanding of networking concepts and protocols.
  • Experience with CI/CD pipelines and tools for continuous integration, continuous delivery, and infrastructure automation.
  • Solid understanding of security best practices for cloud environments.
  • Strong analytical and problem-solving skills, with the ability to identify root causes and implement effective solutions.
  • Excellent communication and collaboration skills, with the ability to work effectively within a team and communicate technical details to both technical and non-technical audiences.

If you have the right skills and experience, this is an opportunity to build your career with Pan Asia's leading retailer.

DFI Retail Group is an equal opportunity employer and is responsible for ensuring that all personal information collected from each candidate presented to DFI Retail Group is used for recruitment purposes only and that the personal data is kept and handled confidentially. We will retain the applications of candidates not selected for a period of no more than 24 months. The data collection process is in accordance with all applicable laws and compliant with the Code of Practice on Human Resource Management.

To find out more about Our Businesses and Our People, please visit our website:

Issued by The Dairy Farm Company, Limited


