About this course
70,247 recent views

100% online

Start immediately and learn on your own schedule.

Flexible deadlines

Reset deadlines to fit your schedule.

Advanced level

Approx. 22 hours to complete

Suggested: 4 weeks of study, estimated 2 hours per week....

English

Subtitles: English

What you will learn

  • How to make systems reliable

  • Understanding SLIs, SLOs and SLAs

  • Quantifying risks to and consequences of SLOs

Syllabus - What you will learn from this course

Week 1
27 minutes to complete

Introduction to SRE

This module is intended to bring you up to speed on the concepts underpinning SRE, CRE, and SLOs. If you're already familiar with these concepts, you may still find new information and perspectives in this module, but it is not necessary to complete it.

9 videos (Total 15 min), 1 quiz
Introduction (15s)
Intro (10s)
CRE's Three Reliability Principles (3min)
Reliability in the Cloud (3min)
How SLOs help your business make decisions (1min)
How SLOs help you build features faster (1min)
How SLOs help you balance operational and project work (1min)
Making SLOs work for your organization (59s)
1 practice exercise
DevOps/SRE (1min)

1 hour to complete

Targeting Reliability

In this module we’re going to talk about how you measure the desired reliability of a service. We will address what to consider when setting SLOs for your application within your organization. We'll look at the three principles we use to measure the desired reliability of a service: figuring out what you want to promise and to whom, figuring out the metrics you care about that make your service reliability “good”, and finally, deciding how much reliability is good enough.

7 videos (Total 14 min), 4 quizzes
SLOs vs SLAs (2min)
The happiness test (2min)
How do we measure reliability? (3min)
Edge cases (2min)
100% is the wrong target (1min)
Iterating (1min)
4 practice exercises
A working service (5min)
SLOs and SLAs (7min)
Reliability and iterating (1min)
Targeting Reliability Assessment (7min)

1 hour to complete

Operating for Reliability

In this module, we’ll start by introducing a mechanism for quantifying unreliability using something called an error budget. We'll show how error budgets help you decide when to focus on making a service more reliable. And then we'll learn about some of the engineering and operational improvements that can help you do that.
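As a rough illustration of the error budget idea described above, the sketch below treats the budget as the share of events an SLO allows to fail and reports how much of it is left. The function names, the 99.9% target, and the event counts are assumptions chosen for illustration; they are not taken from the course material.

```python
def error_budget_fraction(slo_target: float) -> float:
    """Fraction of events the SLO allows to fail, e.g. about 0.001 for a 99.9% target."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Share of the error budget still unspent; negative once the budget is overspent."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_bad = error_budget_fraction(slo_target) * total_events
    return 1.0 - bad_events / allowed_bad


if __name__ == "__main__":
    # Hypothetical numbers: 1,000,000 requests in the window, 250 of them bad.
    remaining = budget_remaining(0.999, total_events=1_000_000, bad_events=250)
    print(f"Error budget remaining: {remaining:.0%}")
```

A result near zero or below it is the kind of signal that could prompt a shift of effort from feature work toward reliability work, though where exactly to draw that line is a policy decision rather than something this sketch decides.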

7 videos (Total 19 min), 3 quizzes
Error budgets (3min)
Everything is a trade-off (3min)
Error budgets: advanced concepts (2min)
Axes of improvement (4min)
Operational approach to increasing reliability (2min)
Module summary (50s)
3 practice exercises
Error budgets (5min)
Increasing reliability (3min)
Operating for Reliability Assessment (5min)

Week 2
1 hour to complete

Choosing a Good SLI

In this module we will start off by taking a look at some characteristics of monitoring metrics that can make them useful as SLIs and contrast these against other metrics that are less useful. Because the choice of where to measure an SLI is a key variable, we'll cover the five main ways you can measure an SLI and compare their pros and cons.

14 videos (Total 41 min), 3 quizzes
User happiness in metric form (1min)
The properties of good SLI metrics (4min)
Ways of measuring SLIs (4min)
The SLI menu (2min)
The SLI equation (1min)
Request / Response SLIs (5min)
Data processing SLIs (6min)
"But my system is really complex!" (2min)
Managing complexity with aggregation (2min)
Managing complexity with bucketing (3min)
Achievable SLOs (1min)
Aspirational SLOs (1min)
Continuous improvement (1min)
3 practice exercises
Measuring happiness (1min)
Commonly used SLIs (2min)
Correctness and Coverage (2min)

Week 3
5 hours to complete

Developing SLOs and SLIs

In this module, we'll start off with an overview of our four step process for developing SLOs and SLIs for a user journey. We'll introduce the fictional company that created our example mobile game, the infrastructure that we'll be working with, and the simple user journey we'll be applying the four step process to.

7 videos (Total 18 min), 4 quizzes
The 4 step process (1min)
Our example game (1min)
Loading the profile page (1min)
Refining SLI specifications (4min)
Looking for observability gaps (2min)
Failure modes (4min)
2 practice exercises
Postmortem! (15min)
Setting Achievable SLO targets (15min)

Week 4
4 hours to complete

Quantifying Risks to SLOs

In this module we'll be taking a critical look at the availability risks for our example service. We want to answer the question: "are our SLO targets and error budgets realistic?"
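One way to approach that question is to estimate how many "bad minutes" per year each known risk is expected to cost and compare the total against the error budget. The sketch below is only an illustration under stated assumptions: the `Risk` fields, the cost formula (frequency × duration × share of users affected), and every number are hypothetical and are not the course's spreadsheet model.

```python
from dataclasses import dataclass
from typing import Iterable

MINUTES_PER_YEAR = 365 * 24 * 60


@dataclass(frozen=True)
class Risk:
    """A hypothetical availability risk; all fields are illustrative assumptions."""
    name: str
    incidents_per_year: float    # assumed how often the risk materializes
    minutes_per_incident: float  # assumed time to detect plus time to repair
    fraction_affected: float     # assumed share of users impacted, from 0 to 1

    def bad_minutes_per_year(self) -> float:
        return self.incidents_per_year * self.minutes_per_incident * self.fraction_affected


def budget_minutes_per_year(slo_target: float) -> float:
    """Error budget for an availability SLO, expressed as minutes per year."""
    return (1.0 - slo_target) * MINUTES_PER_YEAR


def fits_budget(risks: Iterable[Risk], slo_target: float) -> bool:
    """True if the expected cost of all risks stays within the yearly budget."""
    expected = sum(r.bad_minutes_per_year() for r in risks)
    return expected <= budget_minutes_per_year(slo_target)


if __name__ == "__main__":
    # Made-up risks for illustration only.
    risks = [
        Risk("bad release", incidents_per_year=12, minutes_per_incident=30, fraction_affected=0.5),
        Risk("zone outage", incidents_per_year=1, minutes_per_incident=240, fraction_affected=1.0),
    ]
    for target in (0.999, 0.9995):
        print(target, fits_budget(risks, target))
```

If the expected cost exceeds the budget, either the SLO target is unrealistic for the current system or some of the listed risks need to be reduced before the target can be met.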

4 videos (Total 20 min), 2 quizzes
Is your error budget realistic? (3min)
Modeling risks in our spreadsheet (5min)
Analyzing risk (9min)

1 hour to complete

Consequences of SLO Misses

In this module, we'll cover best practices for documenting your SLOs and the rationale behind a formal error budget policy, including how best to create one. Finally, we'll look at an example error budget policy to understand the trade-offs and incentives that play out when negotiating such a policy.

9 videos (Total 21 min), 3 quizzes
No surprises (2min)
A dashboard example (1min)
Why an error budget policy? (2min)
Fundamentals of an error budget policy (3min)
How to draft an error budget policy (3min)
Example policy thresholds (3min)
A hypothetical policy scenario (3min)
Course conclusion and video wrap up (47s)
3 practice exercises
Error budget policies (1min)
Error budget policy -- considerations (2min)
Consequences of SLO Misses (1min)

4.5
20 reviews

Top reviews from Site Reliability Engineering: Measuring and Managing Reliability

by RA, May 4th 2019

This is an excellent course that covers in-depth topics on Site Reliability Engineering

About Google Cloud

We help millions of organizations empower their employees, serve their customers, and build what’s next for their businesses with innovative technology created in—and for—the cloud. Our products are engineered for security, reliability, and scalability, running the full stack from infrastructure to applications to devices and hardware. Our teams are dedicated to helping customers apply our technologies to create success....

Frequently Asked Questions (FAQ)

  • Once you enroll for a Certificate, you'll have access to all videos, quizzes, and programming assignments (if applicable). Peer-reviewed assignments can only be submitted and reviewed once the session has started. If you choose to explore the course without purchasing it, you may not be able to access certain assignments.

  • When you purchase a Certificate, you get access to all course materials, including graded assessments. After you complete the course, your electronic Certificate will be added to your Accomplishments page, and you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course at no cost.

More questions? Visit the Learner Help Center.