Senior Site Reliability Engineer - Kazakhstan - Almaty

Responsibilities

Design, implement, and maintain high-load, highly available systems with a focus on reliability, scalability, and performance (finding and eliminating bottlenecks)
Collaborate with teams to develop and optimize processes, tools, and automation for efficient incident response, monitoring, and continuous improvement
Integrate reliability best practices into key product components and infrastructure
Engage in the company’s developer and infrastructure community, promoting the adoption of best practices in observability, performance tuning, and automation
Actively participate in incident resolution, root cause analysis, and post-mortem processes, driving improvements to system reliability, fault tolerance, and operational efficiency across the company
On-call duty

Requirements

Deep knowledge and understanding of Linux OS (we use Ubuntu, Amazon Linux)
Knowledge of and experience with Kubernetes clusters
Experience with public and private clouds (AWS, GCP)
Deep knowledge of containerization technologies (Docker, containerd)
Experience in building CI/CD pipelines
Experience in using Ansible / SaltStack and writing roles / playbooks / states / pillars
Ability to independently break tasks down and see them through to completion
Ability to communicate directly with developers

Benefits

Stable salary, official employment
Health insurance
Hybrid work mode and flexible schedule
Relocation package offered for candidates from other regions
Access to professional counseling services including psychological, financial, and legal support
Discount club membership
Diverse internal training programs
Partially or fully paid additional training courses
All necessary work equipment