Principal Service Reliability Engineer
5 days ago
Purpose of the role
The Principal Site Reliability Engineer is responsible for ensuring the reliability, performance, and scalability of our mission-critical platforms. In this role, you will safeguard operational excellence in the assigned product, influence reliability strategies, play an integral part in production incident response, and help improve operational metrics.
The role collaborates closely with other teams, such as Development and Production Support, to make sure target SLOs are met, making adjustments where needed and designing and developing code to help meet those targets. The role is also expected to work on toil reduction projects, handle capacity planning and tuning activities, revisit existing SOPs, and design and develop code for performance improvements. This is a hybrid position that requires you to be in the local office 2-3 days a week.
In this role you'll:
- Define and track Service Level Indicators (SLIs), Objectives (SLOs), and Error Budgets in partnership with engineering and product leads
- Collaborate with Operations and Development teams to drive service reliability, availability, and scalability
- Drive and participate in toil reduction projects to minimize, if not eliminate, recurring manual activities performed by the team
- Establish a feedback loop with development teams so they have visibility into how stable and reliable their services are in client environments
- Drive production incident response and lead root cause analysis and continuous improvement
- Design and develop operational improvement items with development teams, working closely with them to prioritize these improvements
- Provide input on process improvements to Change, Release, and Incident Management
- Create and implement support playbooks that team members can use during emergency response to production issues
About the ideal candidate
- Knowledgeable and experienced in using Azure resources such as VMs, Storage, Network, Functions, Logic Apps, App Services, and AKS
- Strong technical expertise in Azure DevOps, developing in Git, and working with GitOps repos and build/release pipelines
- Hands-on experience developing Azure PowerShell scripts, Azure Runbooks, or other infrastructure automation tools
- Experienced with monitoring and logging tools (Grafana, Dynatrace, Splunk)
- Proven ability to adapt to emerging cloud technologies and industry-leading DevOps tools such as Terraform, Docker containers, and Kubernetes
- Knowledgeable in cloud implementation of Navitaire products across different cloud infrastructure models
- Understands production environments and processes and how they can be further optimized through Azure features and other cloud technologies and services
- Proven ability to drive problem-solving efforts through effective issue analysis
- Able to lead efforts to implement infrastructure changes that increase environment stability and support scalability
- Able to drive collaboration with different Navitaire teams in enforcing environment standards and policies
- Works effectively in a team environment and contributes to building the capabilities of team members
- Proficient in C#
- Proven ability to work in a dynamic, fast-paced and multi-cultural environment
- Willing to work on shifting schedules
Diversity & Inclusion
Amadeus aspires to be a leader in Diversity, Equity and Inclusion in the tech industry, enabling every employee to reach their full potential by fostering a culture of belonging and fair treatment, attracting the best talent from all backgrounds, and serving as a role model for an inclusive employee experience.
Amadeus is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to gender, race, ethnicity, sexual orientation, age, beliefs, disability or any other characteristics protected by law.
Taguig, National Capital Region, Philippines; Amadeus; Full time