Site Reliability Engineer
1 day ago
Job Description:
• Handle service monitoring and incident response, and drive technical support efficiency.
• Responsible for managing and maintaining network monitoring tools, systems, and
processes that ensure the availability, scalability, and performance of our production
environments.
• Responsible for incident handling, service monitoring, and technical support efficiency.
• Work closely with developers, DevOps, infrastructure teams, and other stakeholders
to achieve proactive incident prevention, issue resolution, and incident documentation.
Key Responsibilities:
• Ensure that all tickets are updated and handled based on set KPIs and SLAs.
• Manage monitoring, alerting, and logging tools to ensure system health and service
uptime.
• Ensure early detection, triage, and escalation of service degradation based on the defined
service level agreement.
• Trigger L2 ticket handling and on-call rotations for critical incidents.
• Execute triage, diagnosis, and resolution of incidents required for L3 escalations, involving both
internal and third-party support teams.
• Support major incident response, contribute to root cause analysis (RCA), and help
document postmortems.
• Track, analyze, and act on incident trends and recurring technical issues.
• Use data from ticketing systems (Jira, ServiceNow, etc.) to improve team responsiveness
and resolution quality.
• Update and maintain SOPs, runbooks, and knowledge base articles including the
documentation of known issues, fixes, and playbooks to improve mean time to resolution.
• Collaborate with development and QA teams to improve deployment readiness and
reliability.
• Participate in technical competency mapping to ensure coverage and reduce unnecessary
escalations.
Skills and Competencies:
• Hands-on experience with ITSM platforms (e.g., ServiceNow, Jira Service Management).
• Familiarity with ITIL principles and ITSM process areas (incident, problem, request,
change, asset, and service catalog management).
• Basic knowledge of IT infrastructure components (networks, servers, applications) and
how they support IT services.
• Experience in monitoring system performance and escalating outages or performance
degradation.
• Ability to troubleshoot and document IT issues effectively for escalation and closure.
• Strong attention to detail in documentation, ticket updates, and asset records.
• Familiarity with regulatory and compliance frameworks (e.g., BSP, PDIC, ISO 27001,
COBIT) is a plus.
• Clear written and verbal communication skills for ticket handling and team collaboration.
• Proactive, detail-oriented, and able to manage multiple tasks in a structured IT operations
environment.
Qualifications and Experience:
• Bachelor's degree in Electronics Engineering, Information Technology, Computer
Science, Management Information Systems, or equivalent.
• 2–5 years of experience in Site Reliability Engineering, DevOps, or Infrastructure roles.
• Minimum of 3 years' experience in Site Reliability Engineering, DevOps, or Infrastructure roles is required.
• Hands-on experience with monitoring tools (e.g., Prometheus, Grafana, ELK, or
Datadog).
• Familiarity with incident response and troubleshooting in production systems.
• Experience with at least one cloud platform (AWS, GCP, or Azure).
• Knowledgeable in scripting (e.g., Python, Bash) and Linux systems.
• Exposure to ITIL-based processes, especially Incident and Problem Management.
• Experience working in fintech, banking, or SaaS with high-availability SLAs.
• Familiarity with DevOps practices, CI/CD pipelines, and cloud-based monitoring tools.
• Experience with automation platforms.
• Knowledge of BSP regulatory frameworks, policies, and guidelines.
-
Site Reliability Engineer
4 weeks ago
Manila, Philippines Tata Consultancy Services Full time Human Resources Executive at Tata Consultancy Services Job Description: Site Reliability Engineering (SRE) SME Position Overview We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for...
-
Site Reliability Engineer 14N25
3 weeks ago
Metro Manila, Philippines TALENTMATE Full time Job Description As a Site Reliability Engineer (SRE) 14N25, you will be integral in transforming and maintaining reliable systems while working across diverse engineering, operations, and support teams. Your primary focus will be ensuring the uptime, performance, and resilience of crucial online platforms and services. By employing both software engineering...
-
Site Reliability Engineer
4 weeks ago
Manila, Philippines Russell Tobin Full time Senior Associate - Talent Acquisition - Corporate Strategy Hiring | Specialized in APAC We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing...
-
Site Reliability Engineer
4 weeks ago
Metro Manila, Philippines Michael Page Full time Join a growing team. Enjoy market-aligned salaries & benefits. About Our Client The hiring company is a large organization in the healthcare industry, focused on delivering innovative solutions to improve patient care and operational efficiency. The company is committed to leveraging cutting-edge technology to support its services. Job Description Oversee...
-
Site Reliability Engineer
3 weeks ago
Metro Manila, Philippines QualityKiosk Technologies Full time Uniting Talent with Opportunity | Talent Acquisition | Strategic Hiring | Global Recruitment | SAAS GTM & Tech Hiring | MarTech | FinTech Experience: 6 to 10 years Location: Makati About QualityKiosk Technologies QualityKiosk Technologies is one of the world’s largest independent Quality Engineering (QE) providers and digital transformation enablers,...
-
Site Reliability Engineer
1 day ago
Manila, National Capital Region, Philippines Russell Tobin Full time ₱120,000 - ₱180,000 per year We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing modern SRE capabilities that improve system reliability, scalability, and efficiency across our...
-
Site Reliability Engineering Manager
4 weeks ago
Manila, Philippines Russell Tobin Full time Senior Associate - Talent Acquisition - Corporate Strategy Hiring | Specialized in APAC We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing...
-
Senior Site Reliability Engineer
2 weeks ago
Manila, National Capital Region, Philippines CDOps Tech Full time ₱2,000,000 - ₱2,500,000 per year About the Opportunity We are seeking a seasoned and passionate Senior Site Reliability Engineer for a high-impact contract engagement with one of our key clients, a leader in the marketing-tech sector. This is not just a typical SRE role; you will be the foundational expert responsible for spearheading the adoption of SRE culture and practices within the...
-
Site Reliability Engineering Manager
1 day ago
Manila, National Capital Region, Philippines Russell Tobin Full time $60,000 - $120,000 per year We are seeking a highly skilled Site Reliability Engineering (SRE) Subject Matter Expert (SME) to lead and advance our observability, performance engineering, reliability, and AIOps practices. The SME will be responsible for designing, implementing, and evangelizing modern SRE capabilities that improve system reliability, scalability, and efficiency across our...
-
Site Reliability Engineer
2 weeks ago
Manila, National Capital Region, Philippines Aumtrend Full time ₱1,200,000 - ₱2,400,000 per year Role: Site Reliability Engineer - IBM MQ. Company: One of the global client. Location: BGC - Manila. Work Setup: Hybrid - 2 days onsite/week. Schedule: Day shift. Permanent position & Direct Hiring by the client. Required Technical Skill Set: Hiring for two levels: L3 (Senior) and L2.5 (Mid-senior). Technical requirements: Core focus: IBM MQ and Kafka administration...