Liz Fong-Jones (方禮真)

@lizthegrey

Mar 28, 2018 • 33 tweets • 15 min read

First plenary talk: @nicolefv and @jezhumble on measurement. "If you don't know where you're going, it doesn't matter how fast you get there." #SREcon

Outline of the talk: (1) where am I going, (2) why do we care, (3) improve performance/quality, (4) measure performance, (5) culture & how to measure. #SREcon

Maturity models are for chumps, says @nicolefv. Everyone has one, you're supposed to get to 5. Level caps in World of Warcraft as an example of level creep. [ed: this is a really interesting thing the CRE team at Google needs to consider in prod maturity assessments] #SREcon

The landscape has changed, and expectations have changed. Getting to level 40 or 60 or 80 is no longer good enough if you can get to level 110 now and there's extra land/tooling/technologies/gear to take advantage. #SREcon

Customers expect a lot more stuff. Docker didn't even exist 10 years ago, says @nicolefv.

The problem and the challenge: if maturity models point us to a shifting destination, what is right? Directions & continuous improvement instead. #SREcon

But what direction or single metric should we pick? "LOLNO". We do have some things that come close.

@jezhumble kicks in here. State of DevOps report. 27,000 data points from all over the world. #SREcon

IT performance metrics that held true throughout. Tempo category: lead time for changes (VCS commit to prod) and release frequency. Stability category: time to restore service, and percentage of changes that fail and have to be rolled back or fixed forward. #SREcon
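As a rough illustration of the four metrics named above, here is a minimal Python sketch that derives them from a list of deploy records. The `Deploy` record shape, the field names, and the `summarize` helper are assumptions made for this example; they do not come from the talk or from the State of DevOps report, which gathers these measures by survey.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deploy:
    """One production deploy (hypothetical record shape, not from the talk)."""
    committed_at: datetime                  # VCS commit time
    deployed_at: datetime                   # time the change reached prod
    failed: bool = False                    # needed a rollback or fix-forward
    restored_at: Optional[datetime] = None  # when service was restored


def hours(delta: timedelta) -> float:
    """Convert a timedelta to fractional hours."""
    return delta.total_seconds() / 3600


def summarize(deploys: list, window_days: float) -> dict:
    """Compute tempo and stability metrics over an observation window."""
    if not deploys:
        raise ValueError("need at least one deploy")
    failures = [d for d in deploys if d.failed]
    restores = [hours(d.restored_at - d.deployed_at)
                for d in failures if d.restored_at is not None]
    return {
        # Tempo: lead time for changes and release frequency.
        "median_lead_time_hours": median(
            hours(d.deployed_at - d.committed_at) for d in deploys),
        "deploys_per_day": len(deploys) / window_days,
        # Stability: change fail rate and time to restore service.
        "change_fail_rate": len(failures) / len(deploys),
        "median_time_to_restore_hours": median(restores) if restores else None,
    }


# Made-up sample data: four deploys over a two-day window, one of which failed.
t0 = datetime(2018, 3, 26, 9, 0)
h = timedelta(hours=1)
sample = [
    Deploy(t0, t0 + 2 * h),
    Deploy(t0 + 5 * h, t0 + 6 * h),
    Deploy(t0 + 24 * h, t0 + 27 * h, failed=True, restored_at=t0 + 27.5 * h),
    Deploy(t0 + 30 * h, t0 + 34 * h),
]
report = summarize(sample, window_days=2)
print(report)
```

Reporting all four numbers side by side matches the point of the talk that tempo and stability are tracked together rather than traded against each other.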

High performers do better at both of these categories rather than only one of them, says @jezhumble. They deploy on demand, at least once a day AND they also are able to restore service within an hour in event of breakage. #SREcon

Low performers deploy less frequently AND have bad time to restore. Why? Because you batch up work, and your fixes don't make it out. Technical debt accumulates. "Big bang release over the weekend, it'll all be sad." #SREcon

Emergency change processes tend to bypass testing. If you have reliable and fast CI/CD, you are going to be able to use normal change process to push emergency fixes. #SREcon

Both profit measure and customer satisfaction (esp for nonprofits where profit doesn't matter) are better in companies that have better practices. #SREcon

"There's no definition of DevOps, it's still evolving" says @jezhumble. But SRE and DevOps are solving practically the same thing: how can we develop, evolve, and operate secure, resilient systems at scale? So we're at #SREcon but talking #DevOps. And it's natural!

Now back to @nicolefv. How do we improve these metrics? CD, lean management, lean product development. Needs a *base* of transformational leadership. [ed: and CRE agrees -- without strong executive support, SRE has a much harder time taking root]. #SREcon

Decrease of burnout and increases to job satisfaction. "The more streamlined we make our processes, the better it makes our lives and culture". --@nicolefv #SREcon

"How do you change culture? By doing things differently. And changing culture affects your ability to deliver with speed and stability and drive organizational goals." --@jezhumble #SREcon

So onto quality. Balance of new work, unplanned work/rework, and other work. This is looking awfully familiar. c.f. the material by myself and @sethvargo on "Overhead vs. Toil vs. Projects". #SREcon

How should we be measuring performance? Outputs vs. outcomes. What is going to be different, rather than what did you do? [ed: see *good* Key Results/Objectives, not lists of projects] #SREcon

Don't measure lines of code as an output/productivity metric. It's easy but deceptive because it's an output measure not an outcome measure, says @nicolefv. More leads to bloated software with higher maintenance cost. #SREcon

"Code is a liability rather than an asset" on your balance sheet, says @jezhumble. But optimize for readability, not for compactness. #SREcon

Common mistakes about velocity: story points are a capacity planning tool, not a productivity tool. Don't say "I scored 100 points this sprint and your team only scored 50!" #SREcon

Don't wind up letting your feet get stuck in concrete. Zero-sum gamification results in creating silos and inhibiting collaboration, says @jezhumble. #SREcon

Focus on global productivity rather than local metrics.

Another fallacy: utilization, i.e. how much of your time is spent working. CFOs apparently love this. But you *need* slack for unplanned work. #SREcon

Approaching full utilization results in lead times approaching infinity. Lack of resiliency against unplanned work or misestimations. #SREcon
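The claim that lead time blows up near full utilization can be illustrated with a textbook queueing model. The sketch below assumes a single-server M/M/1 queue, where the mean time a work item spends in the system is the mean service time divided by (1 - utilization). The model choice, the `mean_lead_time` name, and the four-hour task size are assumptions made for this example; the talk did not name a specific model.

```python
def mean_lead_time(service_time_hours: float, utilization: float) -> float:
    """Mean time in system for an M/M/1 queue (assumed model).

    W = S / (1 - rho), where S is the mean service time and rho is the
    fraction of time the team (the single "server") is busy.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_hours / (1.0 - utilization)


# A task that takes 4 hours of actual work, at increasing utilization.
for rho in (0.5, 0.75, 0.9, 0.99):
    print(f"utilization {rho:.0%}: lead time ~{mean_lead_time(4, rho):.0f} h")
```

Under this model, moving from 50% to 75% utilization doubles the lead time, and each further step toward 100% costs more than the last, which is why slack for unplanned work matters.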

"The WIP-ocalypse". You heard it here first from @jezhumble and @nicolefv. #SREcon

Now, onto culture. What kind of culture are we talking about? High trust culture, learning from mistakes, accepting novelty. Generative/Bureaucratic/Pathological models of cultures by sociologist Ron Westrum. #SREcon

Touching specifically on failure. Scapegoating vs. justice vs. open enquiry. #SREcon

How can you measure this? Survey people in an organization. Likert scale asking about blame, sharing responsibility and information, and about failures. #SREcon
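A minimal sketch of how such survey responses might be aggregated, assuming each respondent rates a few statements on a 1 to 7 Likert scale (higher means stronger agreement). The statement keys, the scale width, and the sample answers are hypothetical and are not the survey instrument used by the speakers.

```python
from statistics import fmean

# Hypothetical statement keys; a real survey would use validated wording.
STATEMENTS = (
    "failure_leads_to_inquiry",
    "information_is_shared",
    "responsibilities_are_shared",
)

# Made-up responses: one dict per respondent, scores on a 1-7 scale.
responses = [
    {"failure_leads_to_inquiry": 6, "information_is_shared": 5, "responsibilities_are_shared": 7},
    {"failure_leads_to_inquiry": 4, "information_is_shared": 3, "responsibilities_are_shared": 5},
    {"failure_leads_to_inquiry": 5, "information_is_shared": 4, "responsibilities_are_shared": 6},
]


def item_means(rows: list) -> dict:
    """Average score per statement across all respondents."""
    return {s: fmean(r[s] for r in rows) for s in STATEMENTS}


def culture_score(rows: list) -> float:
    """Overall score: the mean of the per-statement means."""
    return fmean(item_means(rows).values())


print(item_means(responses))
print(culture_score(responses))
```

Looking at the per-statement means as well as the overall score shows which aspect (blame, information sharing, shared responsibility) is pulling the result down.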

c.f. Project Aristotle research by Google. Skills on the team were not significant; what mattered was psych safety first, then dependability, structure/clarity, meaning, and impact. c.f. Google's public re:Work site. #SREcon

[ed: be wary of the psych safety trap in teams undergoing diversity/composition changes. I disagree with some of how Google attempts to measure and metric-ify psych safety for this reason: plus.google.com/+lizthegrey/po…] #SREcon

Example from Etsy of old: @rynchantress was given an award for causing the largest outage rather than shamed for it. Everyone asking "How can I help?" in Slack #SREcon

Citing Kripa Krishnan (keynote speaker at #SREcon 2016): practice to make sure people interact with each other across teams *before* real outages. #SREcon

And a pitch for the Accelerate book by @nicolefv and @jezhumble in closing. #SREcon
