Senior Site Reliability Engineer
4 days ago
As a Site Reliability Engineer (SRE) at DFI Retail Group, you will be the bridge between development and operations, ensuring our systems are designed, implemented, and maintained for maximum reliability, scalability, and performance. You will leverage your software engineering expertise to automate operations, optimize system performance, and develop solutions that prevent recurring issues. Your work will be essential in guaranteeing a seamless experience for our users by maintaining the high availability and efficiency of our services.
Is this your next challenge in Site Reliability Engineering?
Responsibilities:
- Design and Implement Solutions for Reliability and Scalability: Develop and implement highly scalable and available system architectures to meet growing user demands without compromising performance.
- Automate Operations: Design, build, and integrate software tools to automate operational processes, including system monitoring, incident response, and deployment procedures.
- Optimize System Performance: Proactively monitor system performance, identify bottlenecks, and implement optimization strategies to ensure efficient resource utilization and service delivery.
- Implement and Manage Monitoring and Observability: Establish comprehensive service metrics and implement robust monitoring systems to track, analyze, and report on system reliability, performance, and efficiency, including but not limited to New Relic, Azure Monitor, and Google Cloud Monitoring. Utilize observability tools to gain deeper insights into system behavior and identify potential issues proactively.
- Incident Response and Resolution: Develop and implement strategies for rapid incident detection and response. Troubleshoot and resolve complex system issues, minimizing downtime and mitigating service disruptions.
- Capacity Planning and Performance Tuning: Conduct capacity planning analyses to anticipate future resource needs and ensure system scalability. Proactively tune system performance to optimize resource utilization and maintain service level agreements (SLAs).
- Collaboration with Development Teams: Work closely with software development teams to integrate reliability considerations throughout the software development lifecycle. Participate in code reviews, design discussions, and post-incident reviews to enhance system reliability and prevent recurring issues.
- Drive Continuous Improvement: Continuously evaluate existing processes and tools, identifying areas for improvement and automation. Research and implement new technologies and best practices to enhance system reliability and operational efficiency.
- Documentation and Knowledge Sharing: Create and maintain comprehensive documentation for systems, processes, and incident responses. Actively share knowledge and best practices with the team and organization.
- Administer Atlassian Product Suite: Manage and maintain the Atlassian product suite, including Jira, Confluence, and Bitbucket, ensuring seamless operation and integration with existing workflows. Provide user support and training as needed.
Do you have experience as a Site Reliability Engineer?
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience.
- Proven experience (at least 5 years) as an SRE, DevOps Engineer, or in a similar role, demonstrating a strong understanding of software engineering principles and IT operations.
- Hands-on experience in the administration of the Atlassian product suite (Jira, Confluence and Bitbucket).
- In-depth knowledge of cloud platforms such as AWS, Azure, or GCP, including services related to compute, storage, networking, and databases.
- Proficiency in scripting languages like Python or PowerShell and experience with automation tools such as Terraform or Ansible.
- Familiarity with monitoring and logging systems (Prometheus, Zabbix, Grafana, ELK, Azure Monitor, Google Cloud Monitoring).
- Hands-on experience with containerization technologies like Docker and container orchestration tools like Kubernetes.
- Strong understanding of networking concepts and protocols.
- Experience with CI/CD pipelines and tools for continuous integration, continuous delivery, and infrastructure automation.
- Solid understanding of security best practices for cloud environments.
- Strong analytical and problem-solving skills, with the ability to identify root causes and implement effective solutions.
- Excellent communication and collaboration skills, with the ability to work effectively within a team and communicate technical details to both technical and non-technical audiences.
If you have the right skills and experience, this is an opportunity to build your career with Pan Asia's leading retailer.
DFI Retail Group is an equal opportunity employer and is responsible for ensuring that all personal information collected from each candidate presented to DFI Retail Group is used for recruitment purposes only and that the personal data is kept and handled confidentially. We will retain the applications of candidates not selected for a period of no more than 24 months. The data collection process is in accordance with all applicable laws and compliant with the Code of Practice on Human Resource Management.
To find out more about Our Businesses and Our People, please visit our website:
Issued by The Dairy Farm Company, Limited
-
Site Reliability Engineering Specialist
4 days ago
Makati City, National Capital Region, Philippines Electronic Transfer and Advance Processing Inc. Full time ₱1,500,000 - ₱2,500,000 per year
Job Description: We are seeking a Senior Site Reliability Engineer (SRE) to lead the design, deployment, and management of highly available and scalable AWS cloud infrastructure. This role will focus on building automation solutions, optimizing system performance, and strengthening the reliability and security of cloud services. As a senior member of the team,...
-
Site Reliability Engineer
1 week ago
Quezon City, National Capital Region, Philippines Comrise Full time ₱900,000 - ₱1,200,000 per year
We are seeking a Site Reliability Engineer (Cloud) to join our growing technology team. In this role, you will be responsible for maintaining and enhancing the reliability, performance, and scalability of our cloud infrastructure. You'll apply software engineering principles to operations tasks, helping ensure the continuous availability and resilience of...
-
Site Reliability Engineer
4 days ago
Makati City, National Capital Region, Philippines Cambridge University Press & Assessment | Manila Full time ₱720,000 - ₱972,000 per year
NOTE: When you click the apply button, you will be redirected to Cambridge University Press & Assessment's website, where you will be required to create a profile and upload a copy of your CV to complete your application. Work setup: Hybrid (open to 2x a week in the office). Work schedule: 10AM to 6PM Manila time. Employment type: Permanent. Location: Makati City,...
-
Senior Site Reliability Engineer
4 days ago
Makati City, National Capital Region, Philippines iScale Solutions, Inc. Full time ₱2,000,000 - ₱2,500,000 per year
Preferred Qualifications: Hands-on experience migrating applications to SRE operating models in multi-team/multi-application settings. Certification(s): Google Cloud Professional DevOps Engineer, Kubernetes CKA/CKS, or equivalent. Core Expertise: SRE Foundations & Practices: Deep understanding of SRE principles (SLIs, SLOs, error budgets, toil reduction,...
-
Senior Site Reliability Engineer
4 days ago
Makati City, National Capital Region, Philippines Broadridge Full time $104,000 - $130,878 per year
At Broadridge, we've built a culture where the highest goal is to empower others to accomplish more. If you're passionate about developing your career, while helping others along the way, come join the Broadridge team. Role Overview: We are looking for a seasoned Site Reliability Engineer to design, implement, and maintain scalable, secure, and high-performing...
-
Site Reliability Engineer
2 weeks ago
Makati City, National Capital Region, Philippines Descartes Systems Group Full time ₱30,000 - ₱60,000 per year
Descartes Unites the People and Technology that Move the World. The need for efficient, secure, and agile supply chains and logistics operations has become ever more critical and complex. By combining innovative technology, powerful trade intelligence and the reach of our network, Descartes helps get goods, information, transportation assets, and people where...
-
Senior Site Reliability Engineer
2 days ago
Makati City, National Capital Region, Philippines Broadridge Full time ₱80,000 - ₱120,000 per year
At Broadridge, we've built a culture where the highest goal is to empower others to accomplish more. If you're passionate about developing your career, while helping others along the way, come join the Broadridge team. Role Overview: We are looking for a seasoned Site Reliability Engineer to design, implement, and maintain scalable, secure, and high-performing...
-
Cloud & Site Reliability Associate
4 days ago
Makati City, National Capital Region, Philippines Electronic Transfer and Advance Processing Inc. Full time ₱600,000 - ₱800,000 per year
Job Description: The Junior Site Reliability Engineer (SRE) supports the stability, scalability, and performance of the organization's cloud infrastructure. This role focuses on assisting with automation, monitoring, and incident response while gaining hands-on experience with AWS services. It's an excellent opportunity for those eager to build expertise in...
-
Senior Site Reliability Engineer
4 days ago
Makati City, National Capital Region, Philippines eTap Inc. Full time ₱2,000,000 - ₱2,500,000 per year
Join eTap Inc. as a Senior Site Reliability Engineer. About eTap Inc.: Founded in 2015, eTap Inc. is a pioneering Financial Technology company in the Philippines, specializing in Custom Self-Service...
-
Site Reliability Engineer
4 days ago
Makati City, National Capital Region, Philippines Broadridge Full time ₱1,800,000 - ₱2,500,000 per year
At Broadridge, we've built a culture where the highest goal is to empower others to accomplish more. If you're passionate about developing your career, while helping others along the way, come join the Broadridge team. Role Overview: We are seeking a Site Reliability Engineer (Cloud) to lead the design, implementation, and operational support of our...