May 2-4, 2018 - Copenhagen, Denmark
Thursday, May 3 • 11:55 - 12:30
Stories from the Playbook - Tina Zhang & Fred van den Driessche, Google (Any Skill Level) (Slides Attached)

Have you ever wondered how GKE Site Reliability Engineers (SREs) manage an entire fleet of GKE clusters in 15 regions around the world? This talk provides an overview of how the SRE team approaches this challenge, which tools it uses, the problems it has encountered, and the war stories and learning experiences that resulted.

The talk introduces the most frequently used parts of our playbook and shows how SREs endeavour to save your cluster while on call, in an effort to meet our SLOs.

Speakers

Fred van den Driessche

Site Reliability Engineer, Google
Fred is an SRE at Google working on Google Kubernetes Engine, primarily focused on improving system observability at both the single-cluster and fleet-wide levels. Previously he worked at Microsoft, writing and wrangling Java web apps for their Yammer product.

Tina Zhang

Site Reliability Engineer, Google
Tina joined Google as a Site Reliability Engineer for GKE in March 2017 and has primarily been working on delivering High Availability Masters in GKE, bringing GKE to more cloud regions, and improving monitoring and alerting for the system. Prior to this, she had a previous life...



Thursday May 3, 2018 11:55 - 12:30 CEST
C1-M2
  Operations, Any