Senior Reliability Engineer Jobs at Oman Investment Authority
Oman Investment Authority (OIA) is seeking to hire a Senior Reliability Engineer to join its technical team in the Sultanate of Oman. This position presents a significant opportunity for experienced engineers to become part of a key national institution responsible for managing government assets and investments, contributing to the stability and efficiency of its critical technological infrastructure.
About Oman Investment Authority
The Oman Investment Authority is the primary investment arm of the Government of the Sultanate of Oman, managing a diversified portfolio of domestic and international assets aimed at generating sustainable returns that support the national economy. Given this broad scope of responsibility, the Authority relies heavily on reliable, secure, and always-available information technology systems and digital infrastructure.
Job Description and Key Responsibilities
The Senior Reliability Engineer will be entrusted with the following core responsibilities:
- Designing, implementing, and improving strategies to ensure the highest levels of availability and resilience for the Authority's critical technical systems and services.
- Developing and operating advanced monitoring and alerting solutions for early problem detection and automated response.
- Leading efforts for Root Cause Analysis (RCA) of system failures and establishing effective plans to prevent their recurrence.
- Automating repetitive operational tasks (toil) so the team works more efficiently and can focus on high-value tasks.
- Designing and leading Chaos Engineering experiments to test system resilience in controlled environments and uncover weaknesses before they impact operations.
- Collaborating closely with Development (Dev), Operations (Ops), and Security (Sec) teams to foster a "Reliability Engineering" culture throughout the development and product lifecycle.
- Contributing to the establishment of standards and best practice guides for operational quality.
- Analyzing metrics such as Uptime, Latency, and Error Rate to measure performance and make informed improvement decisions.
Qualifications and Core Requirements
To qualify for this role, candidates must have:
- A Bachelor’s degree in Computer Engineering, Information Technology, or a related field. A Master’s degree is an added advantage.
- 5 to 7 years of hands-on experience in Site Reliability Engineering (SRE), Systems Engineering, or Operations within complex, mission-critical technical environments.
- Deep knowledge of operating systems (Linux/Unix), networking, and cloud computing (e.g., AWS, Azure, or Google Cloud).
- Strong scripting skills using languages like Python, Go, or Shell for automation purposes.
- Familiarity with Infrastructure as Code (IaC) concepts and tools such as Terraform or CloudFormation.
- Experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Datadog, or New Relic).
- Solid understanding of distributed systems principles, containers (Docker), and container orchestration systems (Kubernetes).
- Excellent analytical and problem-solving skills, with the ability to troubleshoot complex issues under pressure.
- Strong communication and interpersonal skills, with the ability to work within a team and collaborate with various technical stakeholders.
- Ability to document processes and systems clearly and accurately.
Desired Skills and Personal Attributes
- Results-oriented and proactive mindset.
- Ability to learn quickly and adapt to new technologies.
- Attention to detail and accuracy.
- Leadership skills and ability to mentor and guide.
- Integrity and commitment to the highest standards of confidentiality and security.
Why Work at Oman Investment Authority?
Joining the Oman Investment Authority offers the chance to work in a strategic national institution with a significant impact on Oman's future. The successful engineer will have the opportunity to work with world-class systems within a dynamic environment that encourages innovation and continuous professional development. The Authority provides a competitive work environment with a benefits package commensurate with the importance of the role.
How to Apply for the Position
Applications for this position are submitted mainly through specialized online platforms such as Naukrigulf, where the advertisement was posted. Interested candidates should prepare an updated curriculum vitae (CV) in English that highlights their relevant experience and past projects in reliability and systems engineering, including measurable achievements and examples of complex problems solved. To apply, search for the job title "Senior Reliability Engineer" on the Oman Investment Authority website or on Naukrigulf, and follow the application instructions provided there.
Location: Sultanate of Oman.
Job Type: Full-time.