Site Reliability Engineer
5 days ago
REQUIREMENTS:
· Bachelor's degree in Information Technology, Computer Science, Engineering, or any related course.
· At least 5 years of working experience in SRE or Application Maintenance and Support.
· Knowledge of the following technologies:
o Cloud Platforms: Microsoft Azure, Google Cloud Platform (GCP)
o Operating Systems:
· Experience with Unix/Linux and Windows environments
· Knowledge of Unix shell commands and scripting
o Database/Data Warehouse: MongoDB, Azure Cosmos DB, SQL, BigQuery
o Query Languages: KQL (Kusto Query Language), LogQL (Log Query Language), PromQL (Prometheus Query Language)
o Monitoring & Observability: Grafana, AppDynamics
o Containerization & Orchestration: Azure Kubernetes Service (AKS), Helm
o Messaging Systems: Kafka
o CI/CD Tools: GitHub Actions (GHA), general CI/CD pipelines
o Scripting & Programming Languages: Python, Go, Groovy, Java, Ruby
o Data & Analytics Platforms: Databricks
o Workload Automation: Stonebranch
o ITSM Tools: ServiceNow
o Infrastructure as Code & Automation: Helm, GitOps practices
o Incident Management & Root Cause Analysis
o Experience using AI for anomaly detection, predictive analysis, log and metric analysis, and process improvement
o Experience with Agile frameworks
o Good to have: experience integrating AI solutions into processes or systems
· Strong analytical and problem-solving abilities
· Excellent communication and collaboration skills
· Ability to work under pressure and manage multiple priorities
· Proactive mindset with a focus on continuous improvement
· Customer-centric approach to reliability and performance
DUTIES AND RESPONSIBILITIES:
· Deliver comprehensive support services, including incident and problem resolution, handling operational and service requests, managing application lifecycle, and facilitating support transitions.
· Ensure high availability and performance of eCommerce fulfillment applications and services.
· Design and implement monitoring, alerting, and observability solutions using Grafana.
· Continuously monitor system health, performance, and availability.
· Manage and optimize cloud infrastructure on Microsoft Azure and GCP.
· Develop and maintain CI/CD pipelines using GitHub Actions and other tools.
· Automate operational tasks using Python, Go, Groovy, Java, and Ruby.
· Deploy and manage containerized applications using AKS and Helm.
· Collaborate with development and operations teams to improve system reliability and scalability.
· Perform root cause analysis and post-incident reviews to prevent recurrence.
· Coordinate with external vendors to address application issues.
· Participate in on-call rotations and respond to production incidents.
· Work rotating shifts to provide 24x7 support.
· Continuously improve deployment strategies, rollback mechanisms, and failover processes.
· Advocate for SRE best practices across engineering teams.
· Mentor, coach and train junior engineers.
· Lead incident resolution calls with urgency and clarity, driving cross-functional collaboration with offshore and onshore counterparts to restore services swiftly.
· Participate in developing and executing strategies and automations aimed at team excellence and continuous improvement.
-
Site Reliability Engineer
5 days ago
Taguig, National Capital Region, Philippines | Socium - Teams Done Differently | Full time | ₱900,000 - ₱1,200,000 per year
Job Title: Site Reliability Engineering (SRE) Subject Matter Expert (SME)
Overview: We're looking for an experienced SRE Subject Matter Expert (SME) to lead our reliability, performance, and automation initiatives. This role will design and drive best-in-class observability, performance engineering, AIOps, and reliability practices to ensure our systems are stable,...
-
Site Reliability Engineer
5 days ago
Taguig, National Capital Region, Philippines | weSource Management Consultancy Firm | Full time | ₱600,000 - ₱1,800,000 per year
We are urgently hiring for: Site Reliability Engineers
Hybrid BGC
Up to 155K Gross Monthly
The Role:
· Implement and maintain Observability platforms such as Datadog
· Proactive monitoring of production and other environments to ensure stability, availability, security and integrity
· Collaborate with cross-functional teams to ensure the reliability,...
-
Site Reliability Engineer
2 weeks ago
Taguig, National Capital Region, Philippines | Tata Consultancy Services | Full time | ₱900,000 - ₱1,200,000 per year
Role: EIT MQ L3
About the Role: We are seeking a skilled and motivated Site Reliability Engineer (SRE) with expertise in supporting and managing MQ and Kafka systems. The ideal candidate will have a strong background in Unix systems administration, experience with Kubernetes (preferred), and a passion for maintaining high availability, performance, and...
-
Site Reliability Engineer
2 weeks ago
Taguig, National Capital Region, Philippines | Philtech | Full time | ₱1,200,000 - ₱2,400,000 per year
About the Role: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) with a strong focus on front-end application performance and reliability. In this role, you will ensure the scalability, availability, and responsiveness of our web and mobile user-facing platforms. You will collaborate closely with engineering, product, and design...
-
Site Reliability Engineer
5 days ago
Taguig, National Capital Region, Philippines | weSource Management Consultancy Firm | Full time | ₱1,440,000 - ₱2,160,000 per year
We are looking for a Senior Site Reliability Engineer for a client in BGC
Salary: up to 180k
Set up: Hybrid
Job responsibilities: Our SRE/DevOps Engineering team combines software and systems engineering to ensure that our production systems are always performing optimally and efficiently. SRE/DevOps Engineers are responsible for understanding how our systems interact...
-
Site Reliability Engineering
2 weeks ago
Taguig, National Capital Region, Philippines | Tata Consultancy Services | Full time | ₱2,000,000 - ₱2,500,000 per year
Required Qualifications:
· 10+ years of experience in IT Operations, Reliability Engineering, or Performance Engineering.
· Deep expertise in observability and monitoring platforms (Prometheus, Grafana, Splunk, Datadog, Dynatrace, ELK, AppDynamics, etc.).
· Strong background in performance testing tools (JMeter, LoadRunner, Gatling, k6, etc.) and capacity...
-
Site Reliability Engineer
1 day ago
Taguig, National Capital Region, Philippines | Samsung Electronics Philippines Corporation | Full time | ₱100,000 - ₱120,000 per year
Disclaimer: Samsung has a strict policy on trade secrets. In applying to Samsung and progressing through the recruitment process, you must not disclose any trade secrets of your current or previous employer.
Job Summary: We're looking for an experienced Site Reliability Engineer to develop, implement, optimize and maintain our platform. You will be responsible...
-
Principal Networks Site Reliability Engineer
5 days ago
Taguig, National Capital Region, Philippines | Cloud Bridge | Full time | ₱3,300,000 per year
Principal Networks Site Reliability Engineer
Up to 3.3 million per annum
3 days per week in Manila office
My client is looking for a Site Reliability Engineer (SRE) to join their team. This position demands a strategic individual who can collaborate with cross-functional teams to implement cutting-edge best practices, drive process automation, and elevate...
-
Senior Site Reliability Engineer
5 days ago
Taguig, National Capital Region, Philippines | weSource Management Consultancy Firm | Full time | ₱120,000 - ₱200,000 per year
We are looking for a Senior Site Reliability Engineer for a client in BGC
Salary: up to 200k
Set up: Hybrid
Job responsibilities: Our DevOps Engineering team combines software and systems engineering in order to ensure that our production systems are always performing optimally and efficiently. DevOps Engineers are responsible for understanding how our systems interact...
-
Service Reliability Engineer
2 weeks ago
Taguig, National Capital Region, Philippines | YONDU INC. | Full time | ₱900,000 - ₱1,200,000 per year
About the role: As a Service Reliability Engineer at YONDU INC., you will be responsible for ensuring the smooth and reliable operation of the company's critical IT systems and infrastructure. This full-time position is based in Taguig City, Metro Manila and is a key role in supporting the company's overall business objectives.
What you'll be...