
System Reliability Specialist
2 days ago
As a seasoned Site Reliability Engineer, you will be responsible for safeguarding our production environment's health, performance, and scalability. This entails applying software engineering principles to resolve operational challenges, automating processes, and ensuring our platform exceeds the reliability expectations of our customers.
You will work with a talented team of engineers across different time zones, making your mark on a platform that handles millions of transactions. This role requires a deep passion for eliminating toil, a proactive approach to system stability, and excellent communication skills to thrive in a remote-first environment.
Responsibilities
- Infrastructure Design & Automation: Utilize Infrastructure as Code (IaC) principles to design, build, and maintain our core infrastructure. You will play a key role in evolving our Continuous Integration/Continuous Deployment (CI/CD) pipelines to ensure safe, rapid, and reliable releases.
- Reliability & Scalability Enhancements: Proactively identify and address performance bottlenecks, single points of failure, and scalability limits. You will define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to maintain and improve platform health.
- Observability Champion: Implement and manage comprehensive monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK Stack) to provide deep insights into system behavior and ensure rapid incident detection.
- Incident Management Leadership: Participate in our on-call rotation, acting as a key player in incident response and resolution. You will lead blameless post-mortems to identify root causes and implement preventative measures.
- Collaboration & Empowerment: Work closely with software engineering teams to foster a culture of reliability. You will provide guidance on building resilient services, implementing best practices for observability, and improving the developer experience.
- Security Foundation: Implement and maintain security best practices across our cloud infrastructure, ensuring our platform is robust and compliant.
Required Skills & Qualifications
- 5+ years of hands-on experience with a major cloud provider, preferably AWS (EC2, S3, RDS, VPC, IAM, etc.).
- Deep proficiency with tools like Terraform or CloudFormation to manage infrastructure declaratively.
- Strong experience with Docker and container orchestration systems like Kubernetes (EKS) or ECS.
- Proven ability to build, optimize, and manage CI/CD pipelines using tools like GitLab CI, Jenkins, or CircleCI.
- Hands-on experience with modern monitoring and logging tools (e.g., Prometheus, Grafana, Loki, Alertmanager, ELK Stack).
- Proficiency in at least one programming language, such as Go, Python, or Bash, for automation and tooling.
- Excellent written and verbal communication skills, with a proven ability to work effectively and asynchronously in a distributed team environment.
Preferred Qualifications
- Experience in the payments or FinTech industry.
- Familiarity with service mesh technologies like Istio or Linkerd.
- Experience with database administration (e.g., PostgreSQL, MySQL).
- Knowledge of networking, security principles, and compliance standards (e.g., PCI DSS).
-
Infrastructure Reliability Specialist
1 week ago
Makati City, National Capital Region, Philippines beBee Reliability Full time $90,000 - $124,000
Reliability Engineer Position: This role focuses on supporting and enhancing the critical components of our real-time trading infrastructure. You will work alongside production experts across global regions to ensure the availability, performance, and resilience of these high-throughput platforms. Key Responsibilities: Ensure the availability and health...
-
System Reliability Specialist
14 hours ago
Makati City, National Capital Region, Philippines beBeeReliability Full time ₱4,500,000 - ₱6,000,000
Expertise in System Reliability: We seek a seasoned System Reliability Specialist to join our global team. In this pivotal role, you will be responsible for ensuring the dependability and performance of our real-time trading infrastructure. The successful candidate will work collaboratively with production experts across various regions to guarantee seamless...
-
Reliable Systems Engineer
2 days ago
Marikina City, National Capital Region, Philippines beBeeSystemReliability Full time $120,000 - $200,000
Job Description: As a proactive systems reliability specialist, you will be responsible for maintaining the dependability, performance, and scalability of our systems. You thrive in environments where you can automate and improve existing processes, identifying system bottlenecks and implementing best practices to prevent downtime and ensure high...
-
Reliability Systems Specialist
3 days ago
Makati City, National Capital Region, Philippines beBeeSecurity Full time ₱680,000 - ₱1,020,000
Job Overview: The OT Security Engineer is responsible for ensuring the reliability and security of Shell's assets worldwide. This role is part of the SEAM organization, which integrates Safety, Environment & Asset Management activities to support Shell's business. Key Responsibilities: Ensure compliance with relevant industrial and Shell policies & standards by...
-
Reliability Engineer for Middleware Systems
2 weeks ago
Makati City, National Capital Region, Philippines beBeeInfrastructure Full time ₱900,000 - ₱1,200,000
Job Overview: A senior infrastructure reliability engineer is required to ensure the reliability, performance, and scalability of middleware systems and cloud-based infrastructure. The role involves hands-on technical work focused on ensuring system uptime, performance, and security.
-
Key Blockchain System Reliability Specialist
2 days ago
Marikina City, National Capital Region, Philippines beBeeWeb3 Full time $120,000 - $200,000
System Reliability Engineer Role: Join a dynamic team to fill the role of System Reliability Engineer. This challenging position demands expertise in Web3 infrastructures, with a focus on blockchain network management, complex issue resolution, and proactive system monitoring. The ideal candidate will possess a deep understanding of Linux/Unix system...
-
Reliable Systems Expert
1 day ago
Makati City, National Capital Region, Philippines beBeeEngineer Full time $120,000 - $150,000
Sr Site Reliability Engineer Job Description: We are seeking a highly skilled Sr Site Reliability Engineer to drive the adoption of SRE principles across our teams and applications.
-
Reliability Systems Engineer
1 day ago
Makati City, National Capital Region, Philippines beBeeReliability Full time $100,000 - $120,000
A career as a Site Reliability Engineer is an exciting opportunity to deploy and run cutting-edge technologies such as OpenStack, Kubernetes, and open source applications. Key Responsibilities: Identify and address system incidents, monitor application performance, anticipate potential issues, and enable product refinement to achieve high-quality standards in...
-
Reliability Architect
2 weeks ago
Quezon City, National Capital Region, Philippines beBeeReliability Full time ₱900,000 - ₱1,200,000
Our ideal candidate will be a visionary who excels in designing and building high-quality solutions that improve application health and performance, enable visibility, and automate processes to enhance system reliability and customer experience. Key Responsibilities: Collaborate with cross-functional teams to provide strategic application monitoring...
-
Reliability Systems Architect
1 day ago
Marikina City, National Capital Region, Philippines beBeeInfrastructure Full time $120,000 - $180,000
Job Opportunity: The Site Reliability Engineering Manager position offers a chance to lead a team of engineers in implementing and maintaining the company's infrastructure and applications. This role requires a strong understanding of Linux operations, software engineering, and product development. Key Responsibilities: Oversee daily agile DevOps practices. Act as...