Question: 1
You use Spinnaker to deploy your application and have created a canary deployment stage in the pipeline. Your application has an in-memory cache that loads objects at start time. You want to automate the comparison of the canary version against the production version. How should you configure the canary analysis?
Question: 2
You are managing an application that exposes an HTTP endpoint without using a load balancer. The latency of the HTTP responses is important for the user experience. You want to understand what HTTP latencies all of your users are experiencing. You use Stackdriver Monitoring. What should you do?
Question: 3
Your product is currently deployed in three Google Cloud Platform (GCP) zones with your users divided between the zones. You can fail over from one zone to another, but it causes a 10-minute service disruption for the affected users. You typically experience a database failure once per quarter and can detect it within five minutes. You are cataloging the reliability risks of a new real-time chat feature for your product. You catalog the following information for each risk:
* Mean Time to Detect (MUD} in minutes
* Mean Time to Repair (MTTR) in minutes
* Mean Time Between Failure (MTBF) in days
* User Impact Percentage
The chat feature requires a new database system that takes twice as long to successfully fail over between zones. You want to account for the risk of the new database failing in one zone. What would be the values for the risk of database failover with the new system?
A.
MTTD: 5
MTTR: 10
MTBF: 90
Impact: 33%
B.
MTTD:5
MTTR: 20
MTBF: 90
Impact: 33%
C MTTD:5
MTTR: 10
MTBF: 90
Impact 50%
D.
MTTD:5
MTTR: 20
MTBF: 90
Impact: 50%
Answer : C
Show Answer
Hide Answer
Question: 4
You are part of an organization that follows SRE practices and principles. You are taking over the management of a new service from the Development Team, and you conduct a Production Readiness Review (PRR). After the PRR analysis phase, you determine that the service cannot currently meet its Service Level Objectives (SLOs). You want to ensure that the service can meet its SLOs in production. What should you do next?
Adjust the SLO targets to be achievable by the service so you can bring it into production.
Question: 5
You encounter a large number of outages in the production systems you support. You receive alerts for all the outages that wake you up at night. The alerts are due to unhealthy systems that are automatically restarted within a minute. You want to set up a process that would prevent staff burnout while following Site Reliability Engineering practices. What should you do?