Senior Site Reliability Engineer
1 week ago
Itiviti
Negotiable
On-site - Manila | 3-5 Yrs Exp | Bachelor | Full-time
Job Description
Role Overview
We are seeking a dynamic Senior Site Reliability Engineer (SRE) to lead the design, implementation, and operational support of our hybrid environments, spanning on-premises, private cloud, and public cloud platforms. This role will be pivotal in setting the foundation and strategy for our SRE practices while driving their implementation across the organization. The ideal candidate will combine technical expertise with leadership skills to guide our team on the SRE journey and ensure our environments are scalable, reliable, and secure.
Responsibilities
- SRE Best Practices Implementation: Lead the rollout of SRE best practices, including error budgeting, service level objectives (SLOs), service level indicators (SLIs), and monitoring and alerting systems.
- Automation and Efficiency: Develop and implement automation tools and processes to improve the reliability, scalability, and efficiency of our systems and services.
- Incident Management: Respond to system outages and emergencies, participate in incident calls, and provide root cause analysis to prevent future occurrences.
- Capacity Planning and System Design: Ensure infrastructure can handle increasing traffic and workloads through proactive capacity planning and system design.
- Collaboration: Work with cross-functional teams, including application development, architecture, DevOps, quality engineering, and vendor teams, to align on solutions and operational standards.
- Observability and Monitoring: Implement and optimize observability tools, such as Datadog, Splunk, and CloudWatch, to provide actionable insights into system performance and health.
- Technical Leadership: Lead technical design sessions, set expectations for onshore and offshore SRE team members, and mentor junior associates.
- Operational Governance: Manage vulnerabilities, end-of-life issues, and non-functional requirements (NFRs) within products and platforms.
- Strategic Contributions: Define technical standards for infrastructure, automation, operational processes, and tooling to align with the organization's long-term vision.
Your Profile
- Technical Expertise:
  - Advanced knowledge of cloud platforms (AWS, Azure, private cloud) and on-premises environments.
  - Hands-on experience with automation tools like Terraform, Ansible, Chef, Puppet, and Jenkins.
  - Proficiency in scripting languages such as Python, Shell, and PowerShell.
  - Experience with containerization technologies (Docker, Kubernetes) and middleware (databases, web servers, MQ, Kafka).
  - Strong background in Linux and/or Windows systems administration and networking fundamentals.
- SRE Practices:
  - Demonstrated experience implementing SLOs, SLIs, and observability tools.
  - Knowledge of error budgeting and incident management processes.
  - Proven ability to troubleshoot complex technical issues and perform root cause analysis.
- Soft Skills:
  - Ability to work independently and proactively.
  - Capable of engaging with global teams across different time zones.
  - Strong leadership skills to empower and inspire others.
  - Excellent written and verbal communication skills.
  - Collaborative mindset with the ability to mentor and inspire team members.
  - Ability to prioritize and adapt in a fast-paced environment.
Preferred Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 7 to 10+ years of experience in site reliability engineering or related roles.
- Familiarity with microservices architecture and modern design patterns.
- Practical experience with tools like Datadog, CloudWatch, CloudTrail, and Splunk.
- Proven track record in leading teams and driving organizational change related to SRE.