Notes on the 2019 DevOps Report

Some quick notes and callouts from the 2019 DevOps Report:

  • Four key metrics: lead time, deployment frequency, mean time to restore (MTTR) and change fail percentage.
    • Medium, high, and elite performers all have a change fail rate of 0-15%. So, expect 15% change fail as the worst-case benchmark to shoot for…?
  • Demographics: 30% are devs, 26% “DevOps or SRE” – [so, lots of ICs self-evaluating]. 16% “managers,” and then it goes down from there…
  • Top industries are Technology at 38% and FinServ at 12%. Retail is 9%.
  • Mostly North America (50%) and Europe (29%).
  • Org. size: 100-499 (21%), 500-1,999 (15%), and 10,000+ (26%)
  • “A key goal in digital transformation is optimizing software delivery performance: leveraging technology to deliver value to customers and stakeholders.”
  • [I’m not sure if the age of the company, and, thus, an indication of governance and tech debt, is tracked. With 38% being tech companies, it’d be good to know how young they are. But, most FinServ companies are large and old (unless it was mostly FinServ startups!).]
  • Very prescriptive this year, a maturity model to put a strategy in place, etc.
  • A lot on paying down tech debt:
    • Bounded contexts, APIs, SOA and microservices. Using and testing out-of-team services without having to work with that team (sort of like mocking for runtime).
    • Also: “Teams that manage code maintainability well have systems and tools that make it easy for developers to change code maintained by other teams, find examples in the codebase, reuse other people’s code, as well as add, upgrade, and migrate to new versions of dependencies without breaking their code”
  • Very little prod chaos monkey stuff: less than 10% across the board.
  • CABs still bad: those that have them are 2.6x more likely to be low performers.
    • Instead, do peer reviews and automate governance: “peer review-based approval during the development process. In addition to peer review, automation can be leveraged to detect, prevent, and correct bad changes much earlier in the delivery lifecycle. Techniques such as continuous testing, continuous integration, and comprehensive monitoring and observability provide early and automated detection, visibility, and fast feedback. In this way, errors can be corrected sooner than would be possible if waiting for a formal review.”
    • CABs should instead focus on process and practices change: “the CAB should focus instead on helping teams with process-improvement work to increase the performance of software delivery. This can take the form of helping teams implement the capabilities that drive performance by providing guidance and resources. CABs can also weigh in on important business decisions that require a trade-off and sign-off at higher levels of the business, such as the decision between time-to-market and business risk.”
    • [I’m pretty sure that was the original point, esp. when you look at RUP and ITIL stuff: setting the process to be used. Tooling to automate governance wasn’t really available. Policing those prescriptive processes took over, as it always does. And I’m not sure there are industry standard frameworks to use there yet either. There must be lots of hand-crafting.]
    • “Survey respondents with a clear change process were 1.8 times more likely to be in elite performers.” – [as ever, garbage in, garbage out.]
    • The people who work on governance are not the ones who can actually do the coding to automate it: “only our technical practitioners have the power to build and automate the change management solutions we design, making them fast, reliable, repeatable, and auditable…. Leaders at every level should move away from a formal approval process where external boards act as gatekeepers approving changes, and instead move to a governance and capability development role. After all, only managers have the power to influence and change certain levels of organizational policy. We have seen exponential improvements in performance— throughput, stability, and availability—in just months as a result of technical practitioners and organizational leaders working together.”
  • This is a different measure of “productivity”: “Productivity is the ability to get complex, time-consuming tasks completed with minimal distractions and interruptions.”
    • It doesn’t track amount of work done, but the environment people are working in…?
  • Tool use is all across the board: DIY stuff, COTS, open source, etc. [This sort of excludes the IaaS and other runtime layers, focusing on just CI/CD and test automation.]
  • “Multi-tasking” across roles and projects might be OK: “we cannot conclude that how well teams develop and deliver software affects the number of roles and projects that respondents juggle.”
  • Being able to find things and ask questions [and, presumably, getting answers!], having search, is important.
  • From my read (slide 74), the methods of transforming orgs are all across the board with Big Bang and Training Center as the only low ranked ones. Communities of practice are high, part of the Spotify model.
  • Pg. 75 tries to derive some advice nonetheless: mostly that separate education and training groups don’t work well/widely, that grassroots is used a lot, and that communities of practice are good, as well as PoCs that get cloned.
  • [This is an instance where the high level of individual contributors in the answers might have an effect. They see the positive change in their own team, but don’t have the big picture view to see if the practices scale up to 1,000s of people. On the other hand, they might follow the “my congressperson is perfect, all the other ones are corrupt and terrible” pattern. Also, those 5,000+ person orgs struggle.]
  • [We still don’t know how to change an engine in flight.]
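
For the curious, here is a rough sketch of how the four key metrics from the top of this list could be computed from your own deployment records. Everything about the record shape is my assumption, not something the report specifies: the `Deployment` class, its field names, and the `four_key_metrics` function are all hypothetical. I also measure time to restore from the moment of deployment, which is a simplification; a real incident usually starts (or is noticed) some time later.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape, not from the report)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it degrade service and need remediation?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def four_key_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarize lead time, deployment frequency, MTTR, and change fail percentage."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    # Lead time: commit to running in production.
    lead_times = [d.deployed_at - d.committed_at for d in deployments]

    # Failed changes, and how long each took to restore.
    # Assumption: restore time is counted from the deployment itself.
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    mean_restore = (
        sum(restore_times, timedelta()) / len(restore_times) if restore_times else None
    )

    return {
        "median_lead_time": median(lead_times),
        "deploys_per_day": len(deployments) / period_days,
        "mean_time_to_restore": mean_restore,
        "change_fail_pct": 100 * len(failures) / len(deployments),
    }
```

Feed it a list of `Deployment` records and the number of days they cover, then compare `change_fail_pct` against that 0-15% band. I use the median for lead time so one slow change doesn’t swamp the number; that is my choice, and a mean would work too.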

5 Definitions of DevOps, or, ¯\_(ツ)_/¯

https://flic.kr/p/MHemN8

I’ve tracked at least three different definitions of DevOps since the days of “agile infrastructure”:

  1. Using Puppet and Chef (and then Ansible and Chef) to replace Opsware and BladeLogic.
  2. Full stack engineers to set up EC2, load-balancers, and other Morlock shit.
  3. Full stack engineers are bad, but sort of the same thing. Also, you can’t have a DevOps “group” or title. But, you know, someone should do all that automation.
  4. Putting all the people on one team, having them focus on a product, and establishing a culture of caring and learning.
  5. SRE is not DevOps.

So…actually five. Maybe some of them are just footnotes on the evolving concept. (And, if you, dear reader, feel these are wrong, then let’s compromise and make the list six.)

All of them evolved around bringing down The Wall of Confusion, allowing “developers” to deploy their software to production more frequently, weekly, if not daily. And, of course, making sure production stays up. (You’re supposed to call that “resiliency” and instead of SLAs use SLOs and some other newly named metrics that answer the question “IS MY SHIT WORKING?” Whatever you do, just don’t say “uptime,” or you’re in for it and will be relegated to running the AS/400’s.)

I used to make the snide remark that the developers seemed to have been yanked out of DevOps, sometime around 2014 and 2015. All the talks I saw were, basically, operations talks. I haven’t really checked in on DevOps conference talks recently, but at the time, I don’t think there was much application development stuff. (I’m not sure if there ever was?)

None of this means that DevOps is not a thing. Not at all. It just means that the enterprise finds its own use for things. It also means there are still weekly write-ups of what DevOps is – you know, those ones that are always lists of ideas, things you’re getting wrong, and how to start.

Autonomous product teams

https://flic.kr/p/bJHkSX

Nowadays, I try to stick to that fourth one: you want to set up autonomous teams that have all the skills and responsibility/authority/tools needed to “own” the software being specified, designed, developed, and run. This means you have to, basically, remove-by-automating all the operations stuff it takes to stand up environments, deploy things, and do all that “day 2” stuff.

(HEY! HEY! WANT TO BUY SOME ENTERPRISE SOFTWARE?!)

Now, I think this product-centric notion of DevOps is, well, kind of an over-extension of the term “DevOps.” But since SRE has sucked out the “ops” part (but, remember, dear reader, don’t commit the embarrassing act of saying SRE is DevOps – no, no, you’d never do that, right? SO SHAMEFUL! (SRE is totally different – no overlap or similar goals shared between them at all. I mean, they have separate groups, silos! COME ON!)), slicing “DevOps” back to just “Dev,” but with a product-not-project focus isn’t too shabby.

Anyhow. I came across a good overview of this product notion of DevOps, all the way back from 2016, while re-reading Schwartz’s evergreen excellent The Art of Business Value:

Agile approaches attempt to bring together developers and the business in an atmosphere of mutual respect and joint contribution. Until now, however, the focus has been on users of the software, product visionaries, and developers. Recent developments in the Agile world—notably DevOps—have broadened this idea of respect and inclusion to encompass Operations and Security. The DevOps model, in other words, looks to break down the silos that have resulted from technical specialization over the last few decades. But the DevOps spirit goes further, looking to eliminate the conflicting incentives of organizational silos and the inhumane behaviors that can result from those conflicting incentives.

 

Perhaps we can take this idea even further still. There is no reason why the DevOps team’s responsibility needs to stop at the border of what used to be considered IT. The team is part of a broader enterprise, whose collective knowledge, skills, and judgment need to be part of the value creation process.

Look a’ that guy! Business Value just effortlessly jets out of his pores like a peripatetic thought-monarch!

This is from an executive’s perspective, but it drives home the point we’re always trying to get to with software: doing whatever it takes to figure out, create, and give users features that are actually useful to them. Somewhere beyond that, if you’re lucky, it’ll help out “the business.” Also, it should implement The Unspoken User Story: user would like software to actually work.