blob: 80a484d0ddc4b7cb75d94087d16353e529953488 [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="preconnect" href="https://www.googletagmanager.com">
<script >(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-W8MVQXG');</script>
<meta charset="utf-8">
<meta name="description" content="Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#00add8">
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Google+Sans:400,500,600|Work+Sans:400,500,600|Roboto:400,500,700|Open+Sans:Source+Code+Pro|Material+Icons">
<link rel="stylesheet" href="/css/styles.css">
<script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-W8MVQXG');</script>
<script src="/js/site.js"></script>
<title>Actuating Google Production: How Google’s Site Reliability Engineering Team Uses Go - go.dev</title>
</head>
<body class="Site">
<noscript><iframe src="https://www.googletagmanager.com/ns.html?id=GTM-W8MVQXG"
height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript>
<header class="Site-header js-siteHeader">
<div class="Banner">
<div class="Banner-inner">
<div class="Banner-message">Black Lives Matter</div>
<a class="Banner-action"
href="https://support.eji.org/give/153413/#!/donation/checkout"
target="_blank"
rel="noopener">
Support the Equal Justice Initiative
</a>
</div>
</div>
<div class="Header Header--dark">
<nav class="Header-nav">
<a href="https://go.dev/">
<img
class="js-headerLogo Header-logo"
src="/images/go-logo-white.svg"
alt="Go">
</a>
<div class="Header-rightContent">
<ul class="Header-menu">
<li class="Header-menuItem ">
<a href="/solutions/">Why Go</a>
</li>
<li class="Header-menuItem ">
<a href="/learn/">Getting Started</a>
</li>
<li class="Header-menuItem ">
<a href="https://pkg.go.dev">Discover Packages</a>
</li>
<li class="Header-menuItem ">
<a href="/about">About</a>
</li>
</ul>
<button class="Header-navOpen js-headerMenuButton Header-navOpen--white" aria-label="Open navigation.">
</button>
</div>
</nav>
</div>
</header>
<aside class="NavigationDrawer js-header">
<nav class="NavigationDrawer-nav">
<div class="NavigationDrawer-header">
<a href="https://go.dev/">
<img class="NavigationDrawer-logo" src="/images/go-logo-blue.svg" alt="Go.">
</a>
</div>
<ul class="NavigationDrawer-list">
<li class="NavigationDrawer-listItem NavigationDrawer-listItem--active">
<a href="/solutions/">Why Go</a>
</li>
<li class="NavigationDrawer-listItem ">
<a href="/learn/">Getting Started</a>
</li>
<li class="NavigationDrawer-listItem ">
<a href="https://pkg.go.dev">Discover Packages</a>
</li>
<li class="NavigationDrawer-listItem ">
<a href="/about">About</a>
</li>
</ul>
</nav>
</aside>
<div class="NavigationDrawer-scrim js-scrim" role="presentation"></div>
<main class="SiteContent SiteContent--default">
<div>
<div class="WhoUsesSubPage-hero--caseStudy">
<div class="WhoUsesSubPage-heroInner--caseStudy">
<div class="WhoUsesSubPage-heroContent--caseStudy">
<div class="WhoUsesSubPage-heroText--caseStudy">
<div class="BreadcrumbNav">
<ol class="BreadcrumbNav-inner">
<li class="BreadcrumbNav-li ">
<a class="BreadcrumbNav-link" href="/solutions/">
Why Go
</a>
</li>
<li class="BreadcrumbNav-li ">
<a class="BreadcrumbNav-link" href="/solutions/google/">
Google
</a>
</li>
<li class="BreadcrumbNav-li active">
<a class="BreadcrumbNav-link" href="/solutions/google/sitereliability">
Google Site Reliability Engineering (SRE)
</a>
</li>
</ol>
</div>
<h1>Actuating Google Production: How Google’s Site Reliability Engineering Team Uses Go</h1>
<div class="Article-author">Pierre Palatin, Site Reliability Engineer</div>
</div>
<div class="WhoUsesSubPage-heroImg">
<img src="/images/go_sitereliability_case_study.png" alt="Google Site Reliability Engineering (SRE)">
</div>
</div>
</div>
</div>
<article class="Article Article--solutions">
<div class="CaseStudy-content">
<div class="CaseStudy-contentBody">
<p>Google runs a small number of very large services. Those services are powered
by a global infrastructure covering everything a developer needs: storage
systems, load balancers, network, logging, monitoring, and much more.
Nevertheless, it is not a static system—it cannot be. Architecture evolves,
new products and ideas are created, new versions must be rolled out, configs
pushed, database schema updated, and more. We end up deploying changes to our
systems dozens of times per second.</p>
<p>Because of this scale and critical need for reliability, Google pioneered Site
Reliability Engineering (SRE), a role that many other companies have since adopted.
“SRE is what you get when you treat operations as if it’s a software problem.
Our mission is to protect, provide for, and progress the software and systems
behind all of Google’s public services with an ever-watchful eye on their
availability, latency, performance, and capacity.”
<a href="https://sre.google/" rel="noreferrer" target="_blank">Site Reliability Engineering (SRE)</a>.</p>
<div class="BackgroundQuote">
<p class="BackgroundQuote-body">
“Go promised a sweet spot between performance and readability that neither of
the other languages [Python and C++] were able to offer.”
</p>
</div>
<p>In 2013-2014, Google’s SRE team realized that our approach to production
management was not cutting it anymore in many ways. We had advanced far beyond
shell scripts, but our scale had so many moving pieces and complexities that a
new approach was needed. We determined that we needed to move toward a
declarative model of our production, called &ldquo;Prodspec&rdquo;, driving a dedicated
control plane, called &ldquo;Annealing&rdquo;.</p>
<p>When we started those projects, Go was just becoming a viable option for
critical services at Google. Most engineers were more familiar with Python
and C++, either of which would have been valid choices. Nevertheless, Go
captured our interest. The appeal of novelty was certainly a factor of
course. But, more importantly, Go promised a sweet spot between performance
and readability that neither of the other languages were able to offer. We
started a small experiment with Go for some initial parts of Annealing and
Prodspec. As the projects progressed, those initial parts written in Go found
themselves at the core. We were happy with Go—its simplicity grew on us, the
performance was there, and concurrency primitives would have been hard to
replace.</p>
<div class="BackgroundQuote">
<p class="BackgroundQuote-body">
“Now the majority of Google production is managed and maintained by our systems
written in Go.”
</p>
</div>
<p>At no point was there ever a mandate or requirement to use Go, but we had no
desire to return to Python or C++. Go grew organically in Annealing and
Prodspec. It was the right choice, and thus is now our language of choice.
Now the majority of Google production is managed and maintained by our systems
written in Go.</p>
<p>The power of having a simple language in those projects is hard to overstate.
There have been cases where some feature was indeed missing, such as the
ability to enforce in the code that some complex structure should not be
mutated. But for each one of those cases, there have undoubtedly been tens or
hundred of cases where the simplicity helped.</p>
<div class="BackgroundQuote">
<p class="BackgroundQuote-body">
“Go’s simplicity means that the code is easy to follow, whether it is to spot
bugs during review or when trying to determine exactly what happened during a
service disruption.”
</p>
</div>
<p>For example, Annealing impacts a wide variety of teams and services meaning
that we relied heavily on contributions across the company. The simplicity of
Go made it possible for people outside our team to see why some part or another
was not working for them, and often provide fixes or features themselves. This
allowed us to quickly grow.</p>
<p>Prodspec and Annealing are in charge of some quite critical components. Go’s
simplicity means that the code is easy to follow, whether it is to spot bugs
during review or when trying to determine exactly what happened during a
service disruption.</p>
<p>Go performance and concurrency support have also been key for our work. As our
model of production is declarative, we tend to manipulate a lot of structured
data, which describes what production is and what it should be. We have large
services so the data can grow large, often making purely sequential processing
not efficient enough.</p>
<p>We are manipulating this data in many ways and many places. It is not a matter
of having a smart person come up with a parallel version of our algorithm. It
is a matter of casual parallelism, finding the next bottleneck and
parallelising that code section. And Go enables exactly that.</p>
<p>As a result of our success with Go, we now use Go for every new development for
Prodspec and Annealing.</p>
<p>In addition to the Site Reliability Engineering team, engineering teams across
Google have adopted Go in their development process. Read about how the
<a href="/solutions/google/coredata/">Core Data Solutions</a>,
<a href="/solutions/google/firebase/">Firebase Hosting</a>, and
<a href="/solutions/google/chrome/">Chrome</a> teams use Go to build fast, reliable,
and efficient software at scale.</p>
</div>
<div class="CaseStudy-contentAside">
<div class="CaseStudy-aboutBlock">
<img src="/images/logos/sitereliability.svg" class="CaseStudy-aboutBlockImg" alt="Google Site Reliability Engineering (SRE)">
<h3 class="CaseStudy-aboutBlockTitle">
About Google Site Reliability Engineering (SRE)
</h3>
<p class="CaseStudy-aboutBlockBody"><p>Google’s Site Reliability Engineering team has a mission to protect, provide for, and progress the software and systems behind all of Google’s public services — Google Search, Ads, Gmail, Android, YouTube, and App Engine, to name just a few — with an ever-watchful eye on their availability, latency, performance, and capacity.</p>
<p>They shared their experience building core production management systems with Go, coming from experience with Python and C++.</p>
</p>
</div>
</div>
</div>
</article>
</div>
</main>
<footer class="Site-footer">
<div class="Footer">
<div class="Container">
<div class="Footer-links">
<div class="Footer-linkColumn">
<a href="/solutions/" class="Footer-link Footer-link--primary">
Why Go
</a>
<a href="/solutions/#use-cases" class="Footer-link">
Use Cases
</a>
<a href="/solutions/#case-studies" class="Footer-link">
Case Studies
</a>
</div>
<div class="Footer-linkColumn">
<a href="/learn/" class="Footer-link Footer-link--primary">
Getting Started
</a>
<a href="https://play.golang.org" class="Footer-link">
Playground
</a>
<a href="https://tour.golang.org" class="Footer-link">
Tour
</a>
<a href="https://stackoverflow.com/questions/tagged/go?tab=Newest" class="Footer-link">
Stack Overflow
</a>
</div>
<div class="Footer-linkColumn">
<a href="https://pkg.go.dev" class="Footer-link Footer-link--primary">
Discover Packages
</a>
</div>
<div class="Footer-linkColumn">
<a href="/about" class="Footer-link Footer-link--primary">
About
</a>
<a href="https://golang.org/dl/" class="Footer-link">
Download
</a>
<a href="https://blog.golang.org" class="Footer-link">
Blog
</a>
<a href="https://github.com/golang/go/issues" class="Footer-link">
Issue Tracker
</a>
<a href="https://golang.org/doc/devel/release.html" class="Footer-link">
Release Notes
</a>
<a href="https://blog.golang.org/go-brand" class="Footer-link">
Brand Guidelines
</a>
<a href="https://golang.org/conduct" class="Footer-link">
Code of Conduct
</a>
</div>
<div class="Footer-linkColumn">
<a href="https://www.twitter.com/golang" class="Footer-link Footer-link--primary">
Connect
</a>
<a href="https://www.twitter.com/golang" class="Footer-link">
Twitter
</a>
<a href="https://github.com/golang" class="Footer-link">
GitHub
</a>
<a href="https://invite.slack.golangbridge.org/" class="Footer-link">
Slack
</a>
<a href="https://reddit.com/r/golang" class="Footer-link">
r/golang
</a>
<a href="https://www.meetup.com/pro/go" class="Footer-link">
Meetup
</a>
<a href="https://golangweekly.com/" class="Footer-link">
Golang Weekly
</a>
</div>
</div>
</div>
</div>
<div class="Footer">
<div class="Container Container--fullBleed">
<div class="Footer-bottom">
<img class="Footer-gopher" src="/images/gophers/pilot-bust.svg" alt="The Go Gopher">
<ul class="Footer-listRow">
<li class="Footer-listItem">
<a href="/copyright">Copyright</a>
</li>
<li class="Footer-listItem">
<a href="/tos">Terms of Service</a>
</li>
<li class="Footer-listItem">
<a href="http://www.google.com/intl/en/policies/privacy/"
target="_blank"
rel="noopener">
Privacy Policy
</a>
</li>
<li class="Footer-listItem">
<a
href="https://golang.org/s/discovery-feedback"
target="_blank"
rel="noopener"
>
Report an Issue
</a>
</li>
<li class="Footer-listItem">
<a
href="https://golang.org"
target="_blank"
rel="noopener"
>golang.org
</a>
</li>
</ul>
<a class="Footer-googleLogo" target="_blank" href="https://google.com" rel="noopener">
<img class="Footer-googleLogoImg" src="/images/google-white.png" alt="Google logo">
</a>
</div>
</div>
</div>
<script src="/js/carousels.js"></script>
<script src="/js/searchBox.js"></script>
<script src="/js/misc.js"></script>
<script src="/js/hats.js"></script>
</footer>
</body>
</html>