Course overview
About this course
This course is a structured program that introduces the principles and practices of Site Reliability Engineering (SRE) to IT professionals. It focuses on the core philosophy of SRE: building scalable and highly reliable software systems.
At course completion
You will be able to:
- Understand the fundamentals of Site Reliability Engineering and differentiate between SRE and DevOps.
- Develop and implement Service Level Objectives (SLOs) to measure and maintain reliability.
- Manage error budgets and create policies that balance the pace of innovation with system reliability.
- Identify toil in operational tasks and explore strategies to reduce it, enhancing productivity.
- Use Service Level Indicators (SLIs) for effective monitoring and observability.
- Explore the benefits of automation in SRE, understand the hierarchy of automation types, and learn about secure automation practices.
- Discover tools essential for SRE tasks and learn how to integrate them into the workflow.
- Embrace anti-fragility by learning from failures and shifting towards a more resilient organizational culture.
- Analyze the organizational impacts of adopting SRE, including on-call necessities and conducting blameless post-mortems.
- Discuss how SRE interacts with other frameworks and look ahead to the future of SRE in the industry.
Audience profile
- Business Managers
- Business Stakeholders
- Change Agents
- Consultants
- DevOps Practitioners
- IT Directors
- IT Managers
- IT Team Leaders
- Product Owners
- Scrum Masters
Course details
Module 1: SRE Principles & Practices
Module 2: Service Level Objectives & Error Budgets
Module 3: Reducing Toil
Module 4: Monitoring & Service Level Indicators
Module 5: SRE Tools & Automation
Module 6: Anti-Fragility & Learning from Failure
Module 7: Organizational Impact of SRE
Prerequisites
- Basic understanding of software development or IT operations processes and principles.
- Familiarity with DevOps concepts and practices.
- Awareness of system administration, network administration, or software development.
- Interest in improving service reliability and working within an SRE or DevOps team environment.
Enquiry
Course: Site Reliability Engineering Foundation