Site Reliability Engineering:
Site Reliability Engineering (SRE)® Foundation & Practitioner
12 months e-learning + exam voucher
€1.119 + VAT
“It does not do to leave a live dragon out of your calculations, if you live near him.”
― J.R.R. Tolkien, The Hobbit or There and Back Again
Site Reliability Engineering (SRE)® Foundation
E-learning + Web-based examination
Key information about this course:
Price: €1.119 + VAT (12 months e-learning + exam voucher)
- Fully accredited
- Exam voucher included
- Free handbook included
- Course duration: 30+ hours
- Access period: 12 months
- Tutor support
- Quizzes & practice exams
- Mobile compatible

Become a certified site reliability engineer with this fully accredited SRE Foundation (SREF)℠ & SRE Practitioner (SREP)℠ suite!
We cover the DevOps Institute’s SRE syllabus in its entirety, teaching candidates everything they need to know about site reliability engineering and how it enables businesses to provide and scale market-leading services. Following an introduction to the principles and practices of SRE, the suite covers how to implement them and fully optimize your pipeline. Kickstart your SRE training today!
- Software engineers, Scrum masters, system integrators, tool providers, change agents, consultants, IT directors, and anyone else involved in IT leadership, development, operations, scalability, and reliability
- DevOps and Site Reliability Engineers who wish to verify their knowledge with widely recognized qualifications
- Organizations wishing to comprehensively integrate SRE’s best practices, insight, tools, and vocabulary
- Leaders and managers focused on modern IT leadership and organizational change
- DevOps-powered companies that wish to optimize their cultures
- Everything required to pass the SRE Foundation and SRE Practitioner certification exams
- The principles, practices, and tools of site reliability engineering and how they enhance development and operations
- How SRE empowers organizations to scale services reliably and economically
- How an organization can be realigned to support SRE best practices
- How SRE is evolving, as well as how site reliability engineers can continue updating their knowledge
- How to understand, set, and track service level objectives (SLOs)
- The relationship between SRE and DevOps
- How to highlight, avoid, and repair antipatterns
- How to define service level objectives (SLOs) and service level indicators (SLIs) in distributed ecosystems
- The importance of having an error budget and how to perform error budget calculations
- How to make systems secure and reliable by design
- The importance of full-stack observability and how to review the health of your systems
- How to first implement SRE within an organization
- The advantages of building and operating a common platform as a product
- The role AIOps plays in boosting the efficiency of IT services
- How to leverage an incident command framework and OODA loops in incident response management
- The importance of chaos engineering for building confidence in a system
- This course suite is fully accredited by the DevOps Institute, the organization behind the Site Reliability Engineering certification
- This SRE course suite offers a detailed overview of SRE, its benefits, and how to implement it
- SRE optimizes efficiency by enhancing communication and collaboration between development and operations staff
- Qualified site reliability engineers are in increasingly high demand across all industries and sectors
- Good e-Learning is an award-winning online training provider, as well as a Trusted Education Partner for the DevOps Institute
- This SRE online training course suite comes with several engaging assets, including gamified knowledge checks, mock exams, case studies, and instructor-led videos
- Good e-Learning regularly provides free SRE training resources, including downloadable posters and webinars
- The GEL support team is fully qualified to answer questions on the SRE training syllabus
- Good e-Learning courses can be accessed from any web-enabled device thanks to the FREE Go.Learn app
- Worried about SRE certification costs? Good e-Learning offers FREE exam vouchers with this course
An Introduction to SRE Foundation (SREF)
This module provides an introduction to the course, explaining its rationale, introducing the subject matter and providing an overview of the ‘SRE Foundation’ syllabus.
This module also provides students with a toolkit:
- Table of contents
- Glossary
- Further resources
- Diagram pack
Module 1: SRE Principles & Practices
The first module of this SRE online course introduces students to the discipline of SRE and compares it with DevOps. It also offers an introduction to the principles and practices of site reliability engineering.
Topics include:
- What is Site Reliability Engineering?
- SRE & DevOps: What is the Difference?
- SRE Principles & Practices
Module 2: Service Level Objectives & Error Budgets
This module looks at service levels, service level objectives (SLOs), error budgets, and error budget policies.
Topics include:
- Service Level Objectives (SLOs)
- Error Budgets
- Error Budget Policies
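As a rough illustration of how an SLO translates into an error budget, here is a minimal sketch. It assumes a hypothetical availability SLO measured over a 30-day window; the function name and the 99.9% figure are illustrative and are not taken from the syllabus.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) that an availability SLO allows.

    slo_percent is an assumed availability target such as 99.9;
    window_days is the assumed measurement window.
    """
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be greater than 0 and at most 100")
    total_minutes = window_days * 24 * 60
    # The budget is the share of the window that is allowed to be unreliable.
    return total_minutes * (100 - slo_percent) / 100


if __name__ == "__main__":
    budget = error_budget_minutes(99.9)
    print(f"A 99.9% SLO over 30 days allows about {budget:.1f} minutes of downtime")
```

An error budget policy would then describe what a team does once this allowance has been used up, for example pausing feature releases until reliability recovers.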
Module 3: Reducing Toil
This module looks at the ‘Toil’ concept, why it is a problem, and how it can be managed.
Topics include:
- What is Toil?
- Why is Toil Bad?
- Doing Something About Toil
Module 4: Monitoring & Service Level Indicators
This module introduces service level indicators (SLIs), monitoring, and observability.
Topics include:
- Service Level Indicators (SLIs)
- Monitoring
- Observability
Module 5: SRE Tools & Automation
This module looks at automation, defining it in terms of DevOps and SRE. It also introduces different types of automation, as well as a number of automation tools.
Topics include:
- Automation Defined
- Automation Focus
- Hierarchy of Automation Types
- Secure Automation
- Automation Tools
Module 6: Anti-Fragility & Learning from Failure
This module examines the principle of learning from failure, and how it can be used for anti-fragility and chaos engineering.
Topics include:
- Why Learn from Failure
- Benefits of Anti-Fragility
- Shifting the Organizational Balance
Module 7: Organizational Impact of SRE
This module introduces how site reliability engineering is managed at an organizational level, as well as how it can be implemented.
Topics include:
- Why Organizations Embrace SRE
- Patterns for SRE Adoption
- Sustainable Incident Response
- Blameless Post-Mortems
- SRE & Scale
Module 8: SRE, Other Frameworks, Trends
This module looks at how SRE can incorporate frameworks such as ITIL, Agile, and IT4IT. It also examines emerging trends that will define the future of SRE, including ‘customer reliability engineering’.
Topics include:
- SRE & Other Frameworks
- SRE Evolution
- Exam Simulator
An Introduction to SRE Practitioner (SREP)
Module zero introduces you to the course’s main features, along with its learning plan, aims, objectives, and structure.
The module also offers a syllabus, diagram pack, glossary, further reading and links document, and links to download essential copies of the framework publications. It also contains some of the most frequently asked questions about the Site Reliability Engineering (SRE) Practitioner qualification, including what you can expect from the exam.
Finally, the module provides an assessment to help you see how much you remember from the Foundation syllabus.
Module 1: SRE Antipatterns
This module covers site reliability engineering antipatterns, which are patterns of behavior that are unproductive and have a negative impact on work.
Module 2: Service Levels and Error Budgets
This module looks at how to identify system boundaries, define capabilities for each system, define SLIs for each capability, define SLO targets, and measure the baseline. It also covers multi-service architecture, as well as how to calculate error budgets and use them effectively.
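The calculations this module describes could be sketched along the following lines. This is a simplified illustration under stated assumptions, not the course's own method: the SLI is assumed to be a ratio of good events to total events, the services in a multi-service architecture are assumed to be serial dependencies that fail independently, and all function names and figures are hypothetical.

```python
from math import prod
from typing import Iterable


def sli(good_events: int, total_events: int) -> float:
    """Measured SLI as the fraction of events that were good."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def budget_remaining(good_events: int, total_events: int, slo: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    if not 0 < slo < 1:
        raise ValueError("slo must be strictly between 0 and 1")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_bad = (1 - slo) * total_events   # bad events the SLO tolerates
    actual_bad = total_events - good_events  # bad events actually observed
    return 1 - actual_bad / allowed_bad


def serial_availability(availabilities: Iterable[float]) -> float:
    """Combined availability when every listed service must be up."""
    return prod(availabilities)


if __name__ == "__main__":
    print(f"SLI: {sli(999_500, 1_000_000):.4f}")
    print(f"Budget remaining: {budget_remaining(999_500, 1_000_000, 0.999):.2f}")
    print(f"Two 99.9% services in series: {serial_availability([0.999, 0.999]):.6f}")
```

The last function shows why multi-service architectures matter for SLO targets: a request that depends on several services in series can be less reliable than any single one of them.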
Module 3: Building Secure and Reliable Systems
This module discusses the role of site reliability engineers in systems design and the key considerations raised by today's changing technology landscape and its security needs. It also explores current approaches and technologies available for system design, as well as design patterns for building secure, resilient, scalable, and reliable systems.
Module 4: Full-Stack Observability
This module covers the key elements of observability and looks at how instrumentation makes systems more observable.
Module 5: Review: Modules 1-4
This module features an opportunity for learners to reflect on the key concepts and terms covered in modules 1-4. Students play a memory game and are also given access to a concept checker.
Module 6: Platform SRE and AIOps
This module looks at the benefits of adopting a platform-centric view and building and operating a common platform as a product. It also covers how artificial intelligence in IT operations works and how to implement it.
Module 7: SRE and Incident Management
This module covers the key elements of incident management based on the incident command framework and how the OODA loop can be used to integrate process, technology, and resources for incident responses.
Module 8: Chaos Engineering
Chaos engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s ability to withstand turbulent conditions. This module goes into detail on how to set up the game day exercises used to practice chaos engineering. It also dispels the myths around the subject.
Module 9: Implementing SRE Practices
This module covers the role of SRE in optimizing operations and realizing a DevOps culture. It also goes into detail on the various steps and models used for SRE implementation and execution.
Module 10: Review: Modules 6-9
This module provides an opportunity for learners to reflect on the key concepts and terms covered in Modules 6-9. Students play a memory game and are also given access to a concept checker.
Module 11: Site Reliability Engineering (SRE) Practitioner Wrap-Up
This module brings the course to an end. It revisits earlier modules in the course to help students prepare for the exam.
Practice Exams
This module contains two practice exams. These can help candidates get used to the conditions of the Site Reliability Engineering (SRE) Practitioner exam before attempting the real thing.
PeopleCert Exam Information
Your exam voucher will be sent to you shortly after you purchase your course. The exam voucher must be redeemed on the PeopleCert website before starting your course. Redeeming the exam voucher allows customers to access the official eBook.
PeopleCert exams can be taken online, through your web browser or using ExamShield proctoring software. Candidates choose how to take their exam in their account area on the PeopleCert website.
PeopleCert operate an exam insurance scheme called Take2. If you would like the opportunity to retake the exam, you must purchase Take2 before taking your exam.
Take2 can be purchased through your PeopleCert account up to 15 minutes before your exam.
Alternatively, you can become a PeopleCert Plus member to receive a free resit for all certifications you take.
SRE Foundation (SREF) exam:
- This is a multiple choice exam consisting of 40 questions
- There is a time limit of 60 minutes to complete the exam
- The exam is open book, with only the provided materials being permitted for use
- The pass mark for the exam is 65%: you must answer at least 26 of the 40 questions correctly
- Candidates can take the exam online or in person with an invigilator
SRE Practitioner (SREP) exam:
- This is a multiple choice exam consisting of 40 questions
- There is a time limit of 90 minutes to complete the exam
- The exam is open book, with only the provided materials being permitted for use
- The pass mark for the exam is 65%: you must answer at least 26 of the 40 questions correctly
- Candidates can take the exam online or in person with an invigilator
What is SRE?
Site Reliability Engineering (SRE) is the process of continuously testing the ‘reliability’ of a new product in development. This enables developers to better understand and adapt to the needs of operations teams.
How does SRE work?
There are several elements to SRE, including:
- A ‘Service Level Agreement (SLA)’ is outlined to define how reliable a service has to be for end users
- An ‘Error Budget’ is established to show how much unreliability can be tolerated before new releases must be paused
- Site reliability engineers make themselves available to help with development team workloads and vice versa
- Site reliability engineers actively find and repair problems during the development stage
- Developers take on Operations tasks if necessary
- Site reliability engineers create automation wherever possible for the sake of efficiency and reliability
What is a site reliability engineer?
A ‘site reliability engineer’ is an automation and coding specialist whose job is to find and solve problems within Development and Operations.
How can SRE benefit businesses?
An SRE team can not only make a DevOps pipeline more reliable, but also far more efficient and scalable. It can also free Development and Operations team members to focus on improving services elsewhere, boosting the quality of releases. Incorporating SRE will also further improve existing DevOps cultures by encouraging greater communication, clarity, and understanding between teams.
Finally, site reliability engineers are specialists in considering and conveying concerns in relation to the wider organization and can extract metrics that can prove extremely valuable for other departments.
Does SRE complement DevOps?
DevOps and SRE work extremely well together. This is largely because both are designed with automation, inter-team collaboration, and communication in mind, as well as boosting efficiency and reliability within IT pipelines. The SRE Foundation qualification even comes from the DevOps Institute.
What do I need to study site reliability engineering?
There are no prerequisites for taking this course. However, it can be helpful to have pre-existing knowledge of SRE, as well as DevOps.
Why is SRE necessary?
SRE was originally developed by Google. Its purpose is to quantify the relationship between Development and Operations teams, ensuring that code is created efficiently, reliably, and with operational factors in mind. This is particularly valuable in organizations where IT departments and teams have become siloed from one another.
Who can benefit from studying SRE?
SRE is ideal for organizations that rely on developing and releasing code. It works particularly well in DevOps environments and is a popular choice with DevOps engineers and DevOps Leaders. Given the growing popularity of SRE, a qualified and experienced practitioner will often find it easier to take the next step in their career.
Site Reliability Engineering (SRE®) Foundation is provided by Good e-Learning and accredited by PeopleCert.