Four key metrics: lead time, deployment frequency, mean time to restore (MTTR) and change fail percentage.
Med, High, and Elite all have a change fail rate of 0-15%. So, expect 15% change fail as benchmark worst case to shoot for…?
Demographics: 30% are devs, 26% “DevOps or SRE” – [so, lots of ICs self-evaluating]. 16% “managers,” and then it goes down from there…
Top industries are Technology at 38% and FinServe at 12%. Retail is 9%.
Mostly North American (50%)and Europe (29%)
Org. size: 100-499 (21%), 500-1,999 (15%), and 10,000+ (26%)
“A key goal in digital transformation is optimizing software delivery performance: leveraging technology to deliver value to customers and stakeholders.”
[I’m not sure if age of company, and, thus, an indication of governance and tech debt, is tracked. With 38% being tech companies, it’d be good know how young they are. But, most FinServ companies are large and old (unless it was mostly FinServ startups!).
Very prescriptive this year, a maturity model to put a strategy in place, etc.
A lot on paying down tech debt:
Bounded contexts, APIs, SOA and microservices. Using and testing out of team services without having to work with that team (sort of like mocking for runtime).
Also: “Teams that manage code maintainability well have systems and tools that make it easy for developers to change code maintained by other teams, find examples in the codebase, reuse other people’s code, as well as add, upgrade, and migrate to new versions of dependencies without breaking their code”
Very little prod chaos monkey stuff: less than 10% across the board.
CABs still bad: those that have them are 2.6x more likely to be low performers.
Instead, do peer reviews and automate governance: “peer review-based approval during the development process. In addition to peer review, automation can be leveraged to detect, prevent, and correct bad changes much earlier in the delivery lifecycle. Techniques such as continuous testing, continuous integration, and comprehensive monitoring and observability provide early and automated detection, visibility, and fast feedback. In this way, errors can be corrected sooner than would be possible if waiting for a formal review.”
CABs should instead focus on process and practices change: ” the CAB should focus instead on helping teams with process- improvement work to increase the performance of software delivery. This can take the form of helping teams implement the capabilities that drive performance by providing guidance and resources. CABs can also weigh in on important business decisions that require a trade-off and sign-off at higher levels of the business, such as the decision between time-to- market and business risk.”
[I’m pretty sure that was the original point, esp. when you look at RUP and ITIL stuff: setting the process to be used. Tooling to automate governance wasn’t really available. Policing it those prescriptive processes took over as it always does. And I’m not sure there are industry standard frameworks to use there yet either. There must be lots of hand-crafting.]
“Survey respondents with a clear change process were 1.8 times more likely to be in elite performers.” – [as ever, garbage in, garbage out.]
The people who work on governance are not the ones who can actually do the coding to automate it: “only our technical practitioners have the power to build and automate the change management solutions we design, making them fast, reliable, repeatable, and auditable…. Leaders at every level should move away from a formal approval process where external boards act as gatekeepers approving changes, and instead move to a governance and capability development role. After all, only managers have the power to influence and change certain levels of organizational policy. We have seen exponential improvements in performance— throughput, stability, and availability—in just months as a result of technical practitioners and organizational leaders working together.”
This is a different measure of “productivity”: “Productivity is the ability to get complex, time-consuming tasks completed with minimal distractions and interruptions.”
It doesn’t track amount of work done, but the environment people are working in…?
Tools use is all across the board: DIY stuff, COTs, open source, etc. [This sort of excludes the IaaS and other runtime layers, focusing on just CI/CD and test automation]
“Multi-tasking” across roles and projects might be OK: “we cannot conclude that how well teams develop and deliver software affects the number of roles and projects that respondents juggle.”
Being able to find things and ask questions [and, presumably, getting answers!], having search, is important.
From my read (slide 74), the methods of transforming orgs are all across the board with Big Bang and Training Center as the only low ranked ones. Communities of practice are high, part of the Spotify model.
[This is an instance where the high level of individual contributors in the answers might have an effect. They see the positive change in their own team, but don’t have the big picture view to see if the practices scale up to 1,000’s of people. On the other hand, they might follow the “my congressperson is perfect, all the other ones are corrupt and terrible” pattern. Also, those 5,000+ people orgs struggle.]
[We still don’t know how to change an engine in flight.]
After at least five years of struggling to transformation, IT knows how to deliver better software, how to do the process and use the new tools needed for “digital transformation.” They may not actually do all that, but they know what should be done. However, “The Business” is not involved enough nor knows what to do. This prevents achieving the full benefits of digital transformation. The Business just knows that Amazon is coming to eat their lunch and that their boards are demanding a strategic response, like, yesterday. There are a handful of educational exceptions: companies like The Home Depot that are figuring it out and thriving. But, there’s a lot more organizations that are stumbling than succeeding. IT isn’t the bottleneck anymore, it’s finance, strategy, and management.
And, here’s a talk on the topic:
It’s a sequel to my previous book, Monolithic Transformation. That book looks more inward at the IT department and how it should change, while this one is trying to look at the rest of the organization: how does (should?) “The Business” need to change?
Here are some draft excerpts and related things I’ve been working on:
There’s near universal sentiment that traditional banks need to shift to improve and protect their businesses against financial startups, so called “FinTechs.” These startups create banks that are often 100% online, even purely as a mobile app. The release of Apple Pay highlights how these banks are different: they’re faster, more customer experience focused, and innovate new features.
The core reason FinTechs can do all of this is because they’re good at creating well designed software that feels natural to people and allows these FinTechs to optimize the banking experience and even start innovating new features. People like banking with them!
These FinTechs are growing quickly, For example, N26 grew from 100,000 accounts in 2015 to 3.5m this year. Still, existing banks don’t seem to be feeling too much pain. In that same period, JPMC went from 39.2m digital accounts to 49m, adding 19.8m accounts. Even if it’s small or hard to chart, market share is being lost and existing banks are eager to respond. And, of course, the FinTechs are eager to take advantage of slower moving banks with the $128bn of VC funding that’s fueled FinTech growth.
I wanted to get a better handle on all this, so I’ve put together this “hot take” on digital banking, FinTech’s, whatever. My conclusion is that these new banks take advantage of having a clean-slate – a lack of legacy baggage in business models and technology stacks – to focus most of their attention on customer experience, doing software really well. This is at the heart of most “tech companies” operational differentiation, and it’s no different in banking.
Large, existing banks may be “slow moving,” but they have deep competitive advantages if they can address the legacy of past success: those big, creaking backend systems and a culture of product development that, well, isn’t product development. Thankfully, there are several instances and case studies of banks keeping transforming how they do business.
The cash back amounts show up in your account by the end of the day. In contrast many credit cards offer cashback, it can take weeks or even months for that show up on your account – that cash back period, is perhaps not surprisingly hard to fine for cards.
The Apple Card has a really quick activation process. Traditionally, getting your account setup, activating a card, can take days to weeks – usually, you need a card snail mailed to you. But once you setup your account, you can start using tap-to-pay with your phone. When I moved to Amsterdam, I setup an ABN AMRO account, and last week I setup an N26 account. In both instances, I had to wait several days to get a physical debit card. I could start transferring money instantly, however.
There’s no guarantee that the Apple Card will be a competitive monster. Per usual, the huge customer base and trust Apple has boosts their chance. As Patrick McGee at The Financial Timesnotes: “JD Power survey published last week, before the card was even available, found that 52 per cent of those aged between 18 and 29 were aware of it; of those, more than half were likely to apply.” Apple usually has a great attach rate between the iPhone and new products. Signs point to the Apple Card working out well for Apple and their partners.
Shifting the market with innovation…right?
That snazzy UI and zippy features make me wonder, though, why is this new? Why aren’t these boring, commodified features in banking yet? Let’s broaden this question to banking in general, mostly retail or consumer banking for discussing here.
Perhaps we have an innovation gap in banking, something that’s likely been ignored by existing banks for many years. These FinTechs, and other innovation-focused companies like Apple, have been using innovation as crowbars to take market share, coming up with better ways of servicing customers and new features.
Is that innovation getting FinTechs new business and sucking away customers from existing banks? To get a handle on that kind of market share shift I like to use a chart I call The Dediu Cliff to think about startups vs. incumbents. It’s a simple, quick way of showing how market share shifts between those two, how startups gain share and incumbents lose it. You chart out as many years as you can in a 100% area graph showing the shift in market share between the various players. Getting that data for banking has so far proved difficult, but let’s take a swag at it anyhow.
Whatever the business models, financial services executives seem to think so as one PWC survey found: 73% of those executives “perceive consumer banking as the one most [banking products] likely to be disrupted by FinTech.” Being lazy, I found a pre-made data set to show this, in Sweden thanks to McKinsey:
As the report notes, Sweden is very advanced in digital banking. In comparison, they estimate that in the UK the “specialist” firms have less than 20% share. In this dataset, “specialist” isn’t exactly all new and fun FinTech startups, but this chart shows the shift from “universal,” traditional banks to new types of banks and services. There’s a market shift.
If I had more time, I’d want to make a similar Dediu Cliff for more than just Sweden. As a bad, but quick example, comparing JPMC’s retail banking customer growth to N26’s:
This chart is not too useful because it shows just one bank to one FinTech, though. And JPMC is much lauded for its innovation abilities. At the end, in the summer of 2019 JPMC has 62m household customers, with 49m being “digital,” and N26 has 3.5m, all “digital” we should assume. Here’s the breakdown:
Growth, as you’d expect, is something else: JPMC had a CAGR of 8%, while N26’s was 227%. If N26 survives, that of course means their growth will flatten, eventually.
Even if it’s hard to chart well, we should take it that the new bread of FinTechs are taking market share. Financial services executives seem to think so as one PWC survey found: 73% of those executives “perceive consumer banking as the one most [banking products] likely to be disrupted by FinTech.”
To compound the fogginess, as in the original Dediu Cliff, charting the dramatic shift from PCs to smart phones, the threat often comes from completely unexpected competitors. The market is redefined, from just PCs for example, to PCs and smart phones. This leaves existing businesses (PC manufacturers) blind-sided because their markets are redefined. Customer’s desires and buying habits change: they want to spend their computer share of wallet and time on iPhones, not Wintels.
Taking this approach in banking, there are numerous FinTechs going over underserved markets that are “underbanked” and usually deprioritized by existing banks. This is a classic, “Big D” disruption strategy. One of the more fascinating examples are ride-sharing companies that become de facto banks because they handle the money otherwise bankless drivers earn.
So, there is a shift going on. What are these FinTechs doing? Let’s simplify to three things:
Mobile – an emphasis on mobile as the core branch and workflow, often 100% mobile.
Speed – from signing up, to transferring money, to, as with the Apple Card, faster cash back. While it’ll take awhile to get my card, actually signing up with N26 was quick, including taking pictures of my Netherlands residency card for ID verification. I signed up at 11:29am and was ready to go at 4:05pm, on a Sunday no less.
Innovation – sort of. It’s not really about new features, but innovations in how people interact with the banks. N26 let you create “spaces” which are just sub accounts used to organize budgets and reports; bunq lets you create 25 new accounts; many FinTechs (like the Apple Card) bundle in transaction type reporting and budgeting tools. All of those are interesting, but not ground breaking…yet.
From a competitive analysis stand-point, what’s frustrating is that feature-by-feature, traditional banks and FinTechs seems to be on par. Throw in services like mint.com and all the supposedly new features that FinTechs have seem to be available don’t look so unique anymore. Paying with your phone amazing, to be sure, but that’s long been done by existing banks.
For all the charts and surveys you can pile on, the difference amounts to a subjective leap of faith. FinTech companies are more customer centric, focusing on the customer experience. When you look at the broader “tech companies” that enterprises aspire to imitate, customer experience is one of the primary differentiators. Their software is really good. More precisely, how their software helps people accomplish tasks is well designed and ever improving.
Responding to all of this seems easy on the face of it: if these FinTechs can do it, why not the thousands of developers with their bank-sized budgets do it?
As ever, banks suffer from the shackles of success: all the existing processes, IT, and thought technologies that was wildly successful and drives their billions in revenue….but hasn’t been modernized in years, or even decades.
In part 2, we’ll look at what banks can do to unshackle themselves, and maybe slip on some new shackles for the next ten years.
(There are some footnotes that didn’t get over here. For those, and if you want to see me wrastlin’ through part two, or leave a comment, check out the raw Google Doc of this.)
The traditional approach to corporate strategy is a poor fit for this new type of digital-driven business and software development. Having worked in corporate strategy I find that fitting its function to an innovation-led business is difficult. If strategy is done on annual cycles, predicting and proscribing what the business should be doing over the next 12 months, it seems a poor match for the weekly learning you get from a small batch process. Traditionally, strategy defines where a company focuses: which market, which part of the market, what types of products, how products are sold, and therefore, how money should be allocated. The strategy group also suggests mergers and acquisitions, M&A, that can make their plans and the existing business better. If you think of a company as a portfolio of businesses, the strategy group is constantly asses that each business in that portfolio to figure out if it should buy, sell, or hold.
The dominant strategy we care about here goes under the name “digital transformation.” Sort of. The idea that you should use software as a way of doing business isn’t new. A strategy group might define new markets and channels where software can be used: all those retail omnichannel combinations, new partnerships in open banking APIs, and new products. They also might suggest businesses to shut down, or, more likely divest to other companies and private equity firms, but that’s one of the less spoken about parts of strategy: no one likes the hand that pulls the guillotine cord.
A moment of pedantry
First, pardon a bit of strategy-splaining. Having a model of what strategy is, however, is a helpful baseline to discuss how strategy needs to change to realize all these “digital transformation” dreams. Also, I find that few people have a good grasp of what strategy is, nor, what I think it should be.
I like to think of all “markets” as flows of cash, big tubes full of money going from point A to point B. For the most part, this is money from a buyer’s my wallet flowing to a merchant. A good strategy figures out how to grab as much of that cash as possible, either by being the end-point (the merchant), reducing costs (the buyer), or doing a person-in-the-middle attack to grab some of that cash. That cash grabbing is often called “participating in the market.”
When it comes to defining new directions companies can take, “payments” is a good example. We all participate in that market. Payments is one of the more precise names for a market: tools people use to, well, pay for things.
First, you need to wrap your head around the payments industry, this largely means looking at cashless transactions because using cash requires no payment tool. “Most transactions around the world are still conducted in cash,” The Economist explains, “However, its share is falling rapidly, from 89% in 2013 to 77% [in 2019].” There’s still a lot of cash used, oddly in the US, but that’s changing quickly, especially in Asia, for example in China, The Economist goes on, “digital payments rose from 4% of all payments in 2012 to 34% in 2017.” That’s a lot of cash shifting and now shooting through the payments tube. So, let’s agree that “payments” is a growing, important market that we’d like to “participate” in.
There are two basic participants here:
New companies enter the market by creating new ways of paying for things that compete with existing ways to pay for things. For example, new entrants are services like Alipay, Bunq, Apple Pay, and GrabPay. While this is the domain of startups in most people’s minds, large companies play this role often.
“Strategy,” then, is (1.) deciding to participate in these markets, and, (2.) the exact way these companies should participate, how they grab money from those tubes of cash. Defining and nailing strategy, then, is key to success and survival. For example, an estimated 3.3 trillion dollars flowed through the credit card tube of money in 2016. As new ways of processing payments gain share, they grab more and more from that huge tube of cash. Clearly, this threatens the existing credit card companies, all of whom are coming up with new ways to defend their existing businesses and new payment methods.
As an example of a general strategy for incumbents, a recent McKinsey report on payments concludes:
The pace of digital disruption is accelerating across all components of the GTB value chain, placing traditional business models at risk. If they fail to pursue these disruptive technologies, banks could become laggards servicing less lucrative portions of the value chain as digital attackers address the friction points. To avoid this fate, banks must embrace digitized transaction banking with a goal of eliminating discrepancies, simplifying payments reconciliation, and streamlining infrastructure to operate profitably at lower price points. They must take proactive strategic steps to leverage their current favorable market position, or watch new market entrants pass them by.
New methods of payments will destroy your business, those pesky “tech companies.”
So you should create some new (not credit card) payment methods
At the same time make your back-end systems more efficient so they can drive down costs for your existing credit card based business, increasing your profit margins despite overall revenue declining as “tech companies” grab more and more money out of your cash-tubes.
Also, take advantage of your existing capabilities in security, fraud handling, and governance compliance to differentiate both your new, not credit card payment offerings and defend your existing credit card business.
That’s pretty good strategic direction and it comes, as you can see in the PDF, from a very deep analysis of market conditions, and trends – there’s even a Mekko chart!
Now, how you actually put all that into practice is what strategy is. Each company and industry has its own peccadilloes. The reason McKinsey puts out all those fine charts is to do the pre-sales work of getting to invite them in and ask “yes, but how?”
We came to the realization that, ultimately, we are a technology company operating in the financial-services business. So, we asked ourselves where we could learn about being a best-in-class technology company. The answer was not other banks, but real tech firms.
This type of thinking has gone on for years, but change in large organizations has been glacial. If you search for the phrase “digital transformation” you’ll daily find sponsored posts on tech news sites preaching this, as they so often say, “imperative.” They’re long on blood curdling pronouncements and short on explaining what to actually do.
We’re all tired of this facile, digital genuflection. But maybe it’s still needed.
If survey and sentiment are any indication, digital strategies are not being rolled out broadly across organizations as one survey, below, suggests. It shows that the part of the businesses that creates the actual thing being sold, product design and development, is being neglected:
As with all averages, this means that half of the firms are doing better…and half of them worse. Curiously, IT is getting most of the attention here: as I say, the IT bottleneck is fixed. My anecdotes-as-data studies match up with the attention customer service is getting: as many of my examples here show, like the Orange one, early digital transformation applications focus on moving people from call centers to apps. And, indeed, “improving customer experience” is one of the top goals of most app work I see.
But, it drops off after there. There’s plenty of room for improvement and much work to be done by strategy groups to direct and decide digital strategy. Let’s look at a two part toolkit for how they might could do it:
Sensing your market – how to observe your market to time and plan changes.
Validating strategy – a new method to safely and accurately define what your organization does.
Sensing your market
Changing enterprise strategy is costly and risky. Done too early, and you deliver perfectly on a vision but are unable to scale to more customers: the mainstream is not yet “ready.” Done too late, and you’re in a battle to win back customers, often with price cutting death spirals and comically disingenuous brand changes: you don’t have time for actual business innovation, so you put lipstick on discount pigs.
An innovation strategy relies on knowing the right time to enter the market. You need a strategy tool to continually sense and time the market. Like all useful strategy tools, it not only tells you when to change, but also when to stay the same, and how to prioritize funding and action. Based on our experience in the technology industry, we suggest starting with a simple model based on numerous tech market disruptions and market shifts. This model is Horace Dediu’s analysis of the post-2007 PC market. 2007, of course, is the year the iPhone was introduced. I’m not sure what to call it, but the lack of a label doesn’t detract from its utility. Let’s call it The Dediu Cliff:
To detect when a market is shifting, Dediu’s model emphasizes looking beyond your current definition of your market. In the PC market, this meant looking at mobile devices in addition to desktops and laptops. Microsoft Windows and x86 manufacturers had long locked down the definition and structure of the PC market. Analyst firms like IDC tracked the market based on that definition and attempted disruptors like Linux desktop aspirants competed on those terms.
When the iPhone and Android were introduced in 2007, the definition of the PC market changed without much of anyone noticing. In a short 10 years, these “phones” came to dominate the “PC” market by all measures that mattered: time spent staring at the screen, profits, share increases, corporate stability and high growth, and customer joy. Meanwhile, traditional PCs were seen mostly as work horses, as commodities like pens and copy machines bought on refresh cycles with little regard to differentiation.
Making your own charts will often require some art. For example, another way to look at the PC market changing is to look at screen time per device, that is, how much time people spend on each device:
You have to find the type of data that fits your industry and the types of trends you’re looking to base strategy on. Those trends could be core assumptions that drive how your daily business functions. For example, many insurance businesses are still based on talking with an agent. So, in the insurance industry, you might chart online vs. offline browsing and buying:
While more gradual than Deidu’s PC market chart, this slope will still allow you to track trends. Clearly, some companies aren’t paying attention to that cliff: as the Gartner L2 research goes on to say, once people look to go from quote to purchasing, only 38% of insurance companies allow for that purchase online.
Gaining this understanding of shifts in the very definition of your market is key. Ideally, you want to create the shift. If not, you want to enter the market once the shift is validated, as early as possible, even if the new entrant has single digit market share. Deploying your corporate resources (time, attention, and money) often takes multiple years despite the “overnight success” myths of startups.
Timing is everything. Nailing that, per industry, is fraught, especially in highly regulated industries like banking, insurance, pharmaceuticals, and other markets that can use regulations to, uh, artificially bolster barriers to entry. Don’t think that high barriers to entry will save you though: Netflix managed to wreak havoc in the cable industry, pushing top telcoes even more into being dumb pipes, moving them to massive content acquisitions to compete.
I suggest the following general tactics to keep from falling off The Dediu Cliff:
Know your customer – study their Jobs to be Done, maintain a good, “speaking” relationship with them.
Consider Cassandras that use footnotes – track trend spotting, especially year over year (over year, over year).
Try new things – experiment and incubate new ideas to continually test and participate in the market.
We’ll take a look at each of these, and then expand on how the third is generalized into your core innovation function.
Know your customer
Measuring what your customer things about you is difficult. Metrics like NPS and churn give you trailing indicators of satisfaction, but they won’t tell you when your customer’s expectations are changing, and, thus, the market.
You need to understand how your customer spends their time and money, and what “problems” they’re “solving” each day. For most strategy groups, getting this hands on is too expensive and not in their skill set. Frameworks like Jobs to Be Done and customer journey mapping can systemize this research, as we’ll see below, using a small batch process to implement your application allows you to direct strategy by observing what your customers actually interact with your business do day-to-day.
Case Study: “The front door of the store is in your pocket,” Home Depot
In the ever challenging retail world, The Home Depot has managed to prosper by knowing their customer in detail. The company’s omnichannel strategy provides an example. Customers expect “omnichannel” options in retail, the ability to order products online, buy them in-store, order online but pick-up in-store, return items from online in-store…you get the idea. Accomplishing all of those tasks seems simple from the outside, but integrating all of those inventory, supply-chain, and payment systems is extremely difficult. Nonetheless, as Forrester has documented, The Home Depot’s concerted, hard fought work to get better at software is delivering on their omnichannel strategy: “[a]s of fiscal year 2018, The Home Depot customers pick up approximately 50% of all online orders in the store” and a 28% growth in online sales.
Advances in this business have been fueled by intimate knowledge of The Home Depot’s customers and in-store staff by actually observing and talking with them. “Every week, my product and design teams are in people’s homes or [at] customer job sites, where we are bringing in a lot of real-time insights from the customers,” Prat Vemana, The Home Depot’s Chief Digital Office said at the time.
The company focuses on customer journeys, the full, end-to-end process of customers thinking to, researching, browsing, acquiring, installing, and then using a product. For example, to hone in on improving the experience of buying appliances, the product team working on this application spent hours in stores studying how customers bought appliances. They also spent time with customers at home to see how they browsed appliance options. The team also traveled with delivery drivers to see how the appliances are installed.
Here, we see a company getting to know their customer and their problems intimately. This leads to new insights and opportunities to improve the buying experience. In the appliances example, the team learned that customers often wanted to see the actual appliance and would waste time trying to figure out how they could see it in person. So, the team added a feature to show which stores had the appliances they were interested in, thus keeping the customer engaged and moving them along the sales process.
Spanning all these parts of the customer journey gives the team research-driven insights into how to deliver on The Home Depot’s omnichannel strategy. As customers increasingly start research on their phone, in social media, go instore to browse, order online, pick up instore, have items delivered, and so forth, many industries are figuring out their own types of omnichannel strategies.
All of those different combinations and changing options will be a fog to strategy groups unless they start to get to know their customers better. As Allianz’s Firuzan Iscan puts it: “When we think from the customer perspective, most of our customers are hybrid customers. They are starting in online, and they prefer an offline purchasing experience. So that’s why when we consider the journey end to end, we need to always take care of online and offline moments of this journey. We cannot just focus on online or offline.”
Corporate strategy didn’t sign up for this
The level of study done at The Home Depot may seem absurd for the strategy team to do. Getting out of the office may seem like a lot of effort, but the days spent doing it will give you a deep, ongoing understanding of what your customers are doing, how you’re fulfilling their needs, and how you can better their overall journey with you to keep their loyalty and sell more to them. Also, it’s a good excuse to get out of beige cubicle farms and dreary conference rooms. Maybe you can even expense some lunches!
As we’ll see, when the product teams building these applications are put in place, strategy teams will have a rich source of this customer information. In the meantime, if you’re working on strategy, you’d be wise to fill that gap however you can. We’ll discuss one method next, listening to those people yelling and screaming doom and disruption.
In Western mythos, Cassandra was cursed to always have 100% accurate prophecies but never be believed. For those of us in the tech industry, cloud computing birthed many Cassandras. Now, in 2019, the success of public cloud is indisputable. The on-premises market for hard and software is forever changed. Few believed that a “booker seller” would do much here or that Microsoft could reinvent itself as an infrastructure provider, turning around a company that was easily dismissed in the post-iPhone era.
Despite this, as far back as 2007, early Casandras were pointing out that software developers were using AWS in increasing numbers. Early on, RedMonk made the case that developers were the kingmakers of enterprise IT spend. And, if you tracked developer choice, you’d see that developers were choosing cloud. More Cassandras emerged over the years as cloud market share grew. Traditional companies heard these Cassandras,some eventually acting on the promises.
Finally, traditional companies took the threat seriously, but as Charles Fitzgerald wickedly chronicled, it was too late. As his chart above shows, entering the public cloud market at this stage would cost $100’s of billions of dollars, each year, to catch up. The traditional companies in the infrastructure market failed to sense and act on The Cliff early enough – and these were tech companies, those outfits that are supposed to outmaneuver and outsmart the market!
Now, don’t take this to mean that these barriers to entry are insurmountable. Historically, almost every tech leader has been disrupted. That’s what happened in this market. There’s no reason to think that cloud providers are immune. We just don’t know when and how they’ll succumb to new competitors or, like Microsoft, have to reinvent themselves. What’s important, rather is for these companies to properly sense and respond to that threat.
To consider Cassandras, you need a disciplined process that looks at year over year trends, primarily how your customers spend their time and money. Mary Meeker’s annual slide buffet is a good example: where are your customers spending their time? RedMonk’s analysis of developers is another example. A single point in time Cassandra is not helpful, but a Cassandra that reports at regular intervals gives you a good read on momentum and when your market shifts.
Finally, putting together your own Dediu Cliff can self-Cassandraize you. Doing this can be tricky as you need to imagine what your market will look like – or several scenarios. You’ll need to combine multiple market share numbers from industry analysts into a Cliff chart, updating it quarterly. Having managed such a chart, I can say it’s exhilarating (especially if someone else does the tedious work!) but can be disheartening when quarter by quarter you’re filed into an email inbox labeled “Cassandras.”
Thus far, our methods for sensing the market have been a research, even “assume no friction” methods. Let’s look at the final method that relies on actually doing work, and then how it expands into the core of the new type of strategy and breaking The Business Bottleneck.
Try new things
The best way to understand and call market shifts is to actually be in the market, both as a customer and a producer. Being a customer might be difficult if you’re, for example, manufacturing tractors, but for many businesses being a customer is possible. It means more than trying your competitor’s products. To the point of tracking market redefinition, you want to focus on the Jobs to Be Done, problems customers are solving, and try new ways of solving those problems. If this sounds like it’s getting close to the end goal of innovation, it’s because it is: but doing it in a smaller, lower cost and lower risk way.
For example, if you’re in the utility business, become a customer of in-home IoT devices, and how that technology can be used to steal your customer relationship, further pushing your business into a commodity position. In the PC market, some executives at PC companies made it a point of pride to never have tried, or “understood” the appeal of small screens – that kind of willful, proud ignorance isn’t helpful when you’re trying to be innovative.
You need to know the benefits of new technologies, but also the suffering your products cause. There’s a story that management at US car manufacturers were typically given a company car and free mechanical service during the day while their car was parked at the company parking lot. As a consequence, they didn’t know first hand how low quality affected the cars. As Nassem Talab would put, they didn’t have any skin in the game…and they lost the game. Regularly put your skin in the game: rent a car, file an insurance claim, fill out your own expenses, travel in coach, and eat at your in-store delis.
Ket to trying new things is to be curious, not only in finding these things, but in thinking up new products to improve and solve the problems you are, now, experiencing first hand.
The goal of trying new things is to experiment with new products, using them to direct your strategy and way of doing business. If you have the capability to test new products, you can systematically sense changes in market definition. Tech companies regularly gloat new ideas as test products to sense customer appetite and, thus, market redefinitions. If you’ve ever used an alpha or beta app, or an invite only app, you’ve played a part in this process. These are experiments, ways the company tries new things. We laud companies like Google for their innovation successes, but we easily forget the long list of failed experiments. The website killedbygoogle.com catalogs 171 products that Google killed. Not all of these are “experiments,” some were long-running products that were killed off. Nonetheless, once Google sensed that an experiment wasn’t viable or a product no longer valid, they killed it, moving on.
When it comes to trying things, we must be very careful about the semantics of “failure.” Usually, “failure” is bad, but when it comes to trying new things, “failure” is better thought of as “learning.” When you fail at something, you’ve learned something that doesn’t work. When you’re feeling your way through foggy, frenetic market shifts requires tireless learning. So, in fact, “failing” is often the fastest way to success. You just need a safe, disciplined system to continually learn.
Innovation requires failure. There are few guarantees that all that failure will lead to success, but without trying new things, you’ll never succeed at creating new businesses and preventing disruption. Historically, the problems with strategy has been the long feedback cycles required to tell you if your strategy “worked.”
First, budgets are allocated annually, meaning your strategy cycle is annual as well.Worse, to front-load the budget cycle, you need to figure out your strategy even earlier. Most of the time, this means the genesis of your current strategy was two, even three years ago. The innovation and business rollout cycles at most organizations are huge. TK( some long roll out figure). It can be even worse: five years, if not ten years in many military projects. Clearly, in “fast moving markets,” to use the cliché, that kind of idea-to-market timespan is damaging. Competing against companies that have shorter loops is key for organizations now. As one pharmacy executive put it, taking six months to release competitive features isn’t much use if Amazon can release them in two months.
Your first instinct might be the start trying many new things, creating an incubation program as a type of beta-factory of your own. The intention is good, but the risks and costs are too high for most large organizations. Learning-as-failure is expensive and can look downright stupid and irresponsible to share holders. Instead, you need a less costly, lower risk way to fail than throwing a bunch of things at the wall and seeing what sticks.
The small batch cycle
Many organizations using what we’ll call the small batch cycle. This is a feedback loop that relies on four simple steps:
Identify a problem to solve.
Create a theory of how to solve the problem.
Validate this theory by trying it out in real life.
Analyze the results to see if the theory is valid or not.
This is, essentially, the scientific method. The lean startup method and, later, lean design has adapted this model to software development. This same loop can be applied “above the code” to strategy. This is how you can use failure-as-learning to create validated strategy and, then, start innovating like a tech company.
As described above, due to long cycles, most corporate strategy is theoretical, at worse, PowerPoint arts and crafts with cut-and-pasting from a few web searches. The implementation details can become dicey and then there’s seeing if customers will actually buy and use the product. In short, until the first customer buys and uses the “strategy,” you’re carrying the risk of wasting all your budget and time on this strategy, often a year or more.
That risk might pay off, or it might not. Not knowing either way is why it’s a risk. A type of corporate “double up to catch up” mentality adds to the risk as well. Because the timeline is so long, the budget so high, and the risk of failure so large, managers will often seek the biggest bang possible to make the business case’s ROI “work.” Taking on a year’s time and $10m budget must have a significant pay off. But with such high expectations, the risk increases because more must be done, and done well. And yet, the potential downside is even higher as well.
This risky mentality has been unavoidable in business for the most part – building factories, laying phone lines, manufacturing, etc. require all sorts of up-front spending and planning. Now, however, when your business relies on software, you can avoid these constraints and better control the risks. Done well, software costs relatively little and is incredibly malleable. It’s, as they say, “agile.” You just need to connect the agile nature of software to strategy. Let’s look at an example.
Case Study: Most viable strategy: Duke Energy validates RFID strategy
They’re applying the mechanics of small batches and agile software to their strategy creation. “Journey teams” are used to test out strategies before going through the full-blown, annual planning process. “They’re small product-type teams led by design thinkers that help them really map out that new [strategic] journey and then identify [what] are the big assumptions,” Duke’s John Mitchell explained. Once identified, the journey teams test those assumptions, quickly proving or disproving the strategy’s viability.
Mitchell gives a recent example: labor is a huge part of the operating costs for a nuclear power plant, so optimizing how employees spend their time can increase profits and the time it takes to address issues. For safety and compliance reasons, employees work in teams of five on each job in the plant, typically scheduled in hour-long blocks. Often, the teams finish in much less than an hour, creating spare capacity that could be used on another job.
If Duke could more quickly, in near real-time, move those teams to new jobs they could optimize each person’s time. “So the idea was, ‘How can we use technology?’” Mitchell explains. “What if we had an RFID chip on all of our workers? Not to ‘Big Brother’ check in on them,” he quickly clarifies, but to better allocate the spare capacity of thousands of people. Sounds promising, for sure.
Not so fast though, Mitchell says: “You need to validate, will that [approach] work? Will RFID actually work in the plant?” In a traditional strategy cycle, he goes on, “[You’d] order a thousand of these things, assuming the idea was good.” Instead, Duke took a validated strategy approach. As Mitchell says, they instead thought, “let’s order one, let’s take it out there and see if it actually works in plant environment.” And, more importantly, can you actually put in place the networking and software needed: “Can we get the data back in real time? What do we do with data?” The journey team tested out the core strategic theories before the company invested time and money into a longer-term project and set of risks.
Key to all this, of course, is putting these journey teams in place and making sure they have the tools needed to safely and realistically test out these prototypes. “[T]he journey team would have enough, you know, a very small amount of support from a software engineer and designer to do a prototype,” Mitchell explains. “[H]opefully, a lot of the assumptions can be validated by going out and talking to people,” he goes on, “and, in some cases there’s a prototype to be taken out and validated. And, again, it’s not a paper prototype—unless you can get away with it—[it’s] working software.”
Once the strategic assumptions are validated (or invalidated, the entire company has a lot more confidence in the corporate strategy. “Once they … validate [the strategy],” Mitchell explains, “you’ve convinced me—the leader, board, whatever—that you know you’re talking about.”
With software, as I laid out in Monolithic Transformation, the key ways to execute the loop are short release cycles, smaller amounts of code in each release, and the infrastructure capabilities to reliably reverse changes and maintain stability if things go wrong.
These IT changes lead directly to positive business outcomes. Using a small batch cycle increases the design quality and cost savings of application design, directly improving your business. First, the shorter, more empirical, customer-centered cycles mean you better match what your customers actually want to do with your software. Second, because your software’s features are driven by what customers actually do, you avoid overspending on your software by putting in more features than are actually needed.
For example, The Home Depot kept close to customers and “found that by testing with users early in the first two months, it could save six months of development time on features and functionality that customers wouldn’t use.” That’s 4 months time and money saved, but also functionality in the software that better matches what customers want.
As you mature, these capabilities lead to even wider abilities to experiment with new features like A/B testing, further honing down the best way to match what your software does to how your customers want to use it, and, thus, engage with your business. TK( quick example would be nice here ).
Software is the reason we call tech companies tech companies. They rely on software to run, even define their business. Thus, it’s TK( maybe? ) software strategy where we need to look at next.
This is a draft excerpt from a book I’m working on, tentatively titled The Business Bottleneck. If you’re interested in the footnotes, leaving a comment, and the further evolution of the book, check out the Google Doc for it.
The Business Bottleneck
All businesses have one core strategy: to stay alive. They do this by constantly offering new reasons for people to buy from them and, crucially, stay with them. Over the last decade, traditional businesses have been freaked by competitors that are figuring out better offerings and stealing those customers. The super-clever among these competitors innovate entirely new business models: hourly car rentals, next day delivery, short term insurance for jackets, paying for that jacket with your phone, banks with only your iPhone as a branch, incorporating real-time weather information into your reinsurance risk analysis.
In the majority (maybe all) of these cases, surviving and innovating is done well with small business and software development cycles. The two work hand-in-hand are ineffective without the other. I’d urge you think of them as the same thing. Instead of business development and strategy using PowerPoint and Machiavellian meeting tactics as their tool, they now use software.
You innovate by systematically failing weekly, over and over, until you find the thing people will buy and the best way to deliver it. We’ve known this for a long time and enshrined it in processes like The Lean Startup, Jobs to Be Done, agile development and DevOps, and disruption theory. While these processes are known and proven, they’ve hit several bottlenecks in the rest of the organization. In the past, we had IT bottlenecks. Now we have what I’ve been thinking of as The Business Bottleneck. There’s several of them. Let’s start by looking at the first, and, thus, most pressingly damaging one. The bottleneck that cuts off business health and innovation before it even starts: finance.
Most software development finance is done wrong and damages business. Finance seeks to be accurate, predictable, and works on annual cycles. This is not at all what business and software development is like.
Business & software development is chaos
Software development is a chaotic, unpredictable activity. We’ve known this for decades but we willfully ignore it like the advice to floss each day. Mark Schwartz has a clever take on the Standish software project failure reports. Since the numbers in these reports stay the same each year, basically, the chart below shows that that software is difficult and that we’re not getting much better at it:
What this implies, though, is something even more wickedly true: it’s not that these project failed, it was that we had false hopes. In fact, the red and yellow in the original chart actually shows that software is performs consistent to its true nature. Let me rework the chart to show this:
What this second version illustrates is that the time and budget it takes to get software software right can’t be predicted with any useful accuracy. The only useful accuracy is that you’ll be wrong in your predictions. We call it software engineering, and even more accurately “development” because it’s not scientific. Science seeks to describe reality, the be precise and correct – to discover truths that can be repeated. Software isn’t like that at all. There’s little science to what software organizations do, there’s just the engineering mentality of what works with what we have time and budget to do.
What’s more, business development is chaotic as well. Who knows what new business idea, what exact feature will work and be valuable to customers? Worse, there is no science behind business innovation – it’s all trial and error, constantly trying to both sense and shape what people and businesses will buy and at what price. Add in competitors doing the same, suppliers gasping for air in their own chaos quicksand, governments regulating, and culture changing people’s tastes, and it’s all a swirling cipher.
In each case, the only hope is rigorously using a system of exploration and refining. In business, you can study all the charts and McKinsey PDFs you want, but until you actually experiment by putting a product other there, seeing what demand and pricing is, and how your competitors will respond, you know nothing. The same is true for software.
Each domain has tools for this exploration. I’m less familiar with business development, and only know the Jobs to Be Done tool. This tool studies customer behaviors to discover what products they actually will spend money on, to find the “job” they hire your company to solve, and then change the business to profit from that knowledge.
The discovery cycle in software follows a simple recipe: you reduce your release cycle down to a week and use a theory-driven design process to constantly explore and react to customer preferences. You’re looking to find the best way to implement a specific feature in the UI to maximize revenue and customer satisfaction. That is, to achieve whatever “business value” you’re after. It has many names and diagrams, but I call this process the “small batch cycle.”
For example, Orange used this cycle when perfecting its customer billing app. Orange wanted to reduce traffic to call centers, thus lower costs but also driving up customer satisfaction (who wants to call a call center?). By following a small batch cycle, the company found that its customers only wanted to see the last two month’s worth of bills and their employees current data usage. That drove 50% of the customer base to use the app, helping remove their reliance on actual call centers, driving down costly and addressing customer satisfaction.
These business and software tools start with the actual customers, people, who are doing the buying and use these people as the raw materials and lab to run experiments. The results of these experiments are used to validate, more often invalidate theories of what the business should be and do. That’s a whole other story, and the subject of my previous book, Monolithic Transformation.
We were going to talk about finance, though, weren’t we?
The Finance Bottleneck
Finance likes certainly – forecasts, plans, commits, and smooth lines. But if you’re working in the chaos of business and software development, you can’t commit to much. The only certainty is that you’ll know something valuable once you get out there and experiment. At first all you’ll learn is that your idea was wrong. In this process, failure is as valuable as success. Knowing what doesn’t work, a failure, is the path to finding what does work, a success. You keep trying new things until you find success. To finish the absurd truth: failure creates success.
Software organizations can reliably deliver this type of learning each week. The same is true for business development. We’ve known this for decades, and many organizations have used it as their core differentiation engine.
But finance doesn’t work in these clever terms. “What they hell do you mean ‘failure creates success’? How do I put that in a spreadsheet?” we can hear the SVP of Finance saying, “Get the hell out of this conference room. You’re insane.”
Instead, when it comes to software development, finance focuses only on costs. These are easy to know: the costs of staff, the costs of their tools, and the costs of the data centers to run their software. Business development has similar easy to know costs: salary, tools, travel, etc.
When you’re developing new businesses and software, it’s impossible to know the most important number: revenue. Without that number, knowing if costs are good or bad is difficult. You can estimate revenue and, more likely, you can wish-timate it. You can declare that you’re going to have 10% of your total addressable market (TAM). You can just declare – ahem, assume – that you’re chasing a $9bn market opportunity. Over time, once you’ve discovered and developed your business, you can start to use models like consumer spending vs. GDP growth, or the effect of weather and political instability on the global reinsurance market. And, sure, that works as a static model so long as nothing ever changes in your industry.
For software development, things are even worse when it comes to revenue. No one really tells IT what the revenue targets are. When IT is asked to make budgets, they’re rarely involved in, nor given revenue targets. Of course, as laid out here, these targets in new businesses can’t be known with much precision. This pushes IT to just focus on costs. The problem here, as Mark Schwartz points out in all of his books, is that cost is meaningless if you don’t know the “value” you’re trying to achieve. You might try to do something “cheaply,” but without the context of revenue, you have no idea what “cheap” is. If the business ends up making $15m, is $1m cheap? If it ends up making $180m, is $5m cheap? Would it have been better to spend $10m if it meant $50m more in revenue?
IT is rarely involved in the strategic conversations that narrow down to a revenue. Nor are they in meetings about the more useful, but abstract notion of “business value.” So, IT is left with just one number to work with: cost. This means they focus on getting “a good buy” regardless of what’s being bought. Eventually, this just means cutting costs, building up a “debt” of work that should have been done but was “too expensive” at the time. This creates slow moving, or completely stalled out IT.
A rental car company can’t introduce hourly rentals because the back office systems are a mess and take 12 months to modify – but, boy, you sure got a good buy! A reinsurance company can’t integrate daily weather reports into its analytics to reassess its risk profile and adjust its portfolio because the connection between simple weather APIs and rock-solid mainframe processing is slow – but, sister, we sure did get a good buy on those MIPS! A bank can’t be the first in its market to add Apple Pay support because the payments processing system takes a year to integrate with, not to mention the governance changes needed to work with a new clearinghouse, and then there’s fraud detection – but, hoss, we reduced IT costs by $5m last year – another great buy!
Worse than shooting yourself in the foot is having someone else shoot you in the foot. As one pharmacy executive put it, taking six months to release competitive features isn’t much use if Amazon can release them in two months. But, hey! Our software development processes cost a third less than the industry averages!
Business development is the same, just with different tools and people who wear wing-tips instead of toe-shoes. Hopefully you’re realizing that the distinction between business and software development is unhelpful – they’re the same thing.
The business case is wrong from the start
So, when finance tries to assign a revenue number, it will be wrong. When you’re innovating, you can’t know that number, and IT certainly isn’t going to know it. No one knows the business value that you’re going to create: you have to first discover it, and then figure out how to deliver it profitably.
As is well known, the problem here is the long cycle that finance follows: at least a year. At that scope, the prediction, discovery, and certainty cycle is sloppy. You learn only once a year, maybe with indicators each quarter of how it’s going. But, you don’t really adjust the finance numbers: they don’t get smarter, more accurate, as you learn more each week. It’s not like you can go get board approval each week for the new numbers. It takes two weeks just to get the colors and alignment of all those slides right. And all that pre-wiring – don’t even get me started!
In business and software development, each week when you release your software you get smarter. While we could tag shipping containers with RFID tags to track them more accurately, we learn that we can’t actually collect and use that data – instead, it’s more practical to have people just enter the tracking information at each port, which means the software needs to be really good. People don’t actually want to use those expensive to create and maintain infotainment screens in cars, they want to use their phones – cars are just really large iPhone accessories. When buying a dishwasher, customers actually want to come to your store to touch and feel them, but first they want to do all their research ahead of time, and then buy the dishwasher on an app in the store instead of talking with a clerk.
These kinds of results seem obvious in hindsight, but business development people failed their way to those success. And, as you can imagine, strategy and finance assumptions made 12 to 18 months ago that drove businesses cases often seem comical in hindsight.
A smaller cycle means you can fail faster, getting smarter each time. For finance, this means frequently adjusting the numbers instead of sticking to the annual estimates. Your numbers get better, more accurate over time. The goal is to make the numbers adjust to reality as you discover it, as you fail your way to success, getting a better idea of what customers want, what they’ll pay, and how you can defend against competition.
Small batch finance
Some companies are lucky to just ignore finance and business models. They burn venture capital funding as fuel to rocket towards stability and profitability. Uber is a big test of this model – will it become a viable business model (profitable), or will it turn out that all that VC money was just subsidizing a bad business model? Amazon is a positive example here, over the past 20 years cash-as-rocket-fuel launched them to boatloads of profit.
Most organizations prefer a less expensive, less risky methods. In these organizations, what I see are programs that institutionalize these failure driven cycles. They create new governance and financing models that enforce smaller business cycles, allowing business and software development to take work in small batches. Allianz, for example, used 100 day cycles discover and validate new businesses. Instead of one chance every 365 days to get it right, they have three, almost four. As each week goes by, they get smarter, there’s less waste and risk, and finance gets more accurate. If their business theory is validated, the new business is graduated from the lab and integrated back into the relevant line of business. The Home Depot, Thales, Allstate, and many others institutionalize similar practices.
Each of these cycles gives the business the chance to validate and invalidate assumptions. It gives finance more certainly, more precision, and, thus, less errors and risk when it comes to the numbers. Finance might even be able to come up with a revenue number that’s real. That understanding makes funding business and software development less risky: you have ongoing health checks on the viability of the financial investment. You know when to stop throwing good money after bad when you’ve invalidated your business idea. Or, you can change your assumptions and try again: maybe no one really wants to rent cars by the hour, maybe they want scooters, or maybe they just want a bus pass.
Finance has to get involved in this fail-to-success cycle. Otherwise, business and software development will constantly be driven to be the cheapest provider. We saw how this generally works out with the outsourcing craze of my youth. Seeking to be the cheapest, or the synonomic phrase, the “most cost effective,” option ends up saving money, but paralyzing present and future innovation.
The problem isn’t that IT is too expensive, or can’t prove out a business case. As the Gartner study above shows, the problem is that most financing models we use to gate and rate business and software development are a poor fit. That needs to be fixed, finance needs to innovate. I’ve seen some techniques here and there, but nothing that’s widely accepted and used. And, certainly, when I hear about finance pushing back on IT businesses cases, it’s symptomatic of a disconnect between IT investment and corporate finance.
Businesses can certainly survive and even thrive. The small, failure-to-success learning cycles used by business and software developers works, are well known, and can be done by any organization that wills it. Those bottlenecks are broken. Finance is the next bottleneck to solve for.
I don’t really know how to fix it. Maybe you do!
Crawl into the bottleneck
After finance, for another time, my old friends: corporate strategy. And if you peer past that blizzard of pre-wired slides and pivot tables, you can see just in past the edges of the next bottleneck, that mysterious cabal called “The C-Suite.” Let’s start with strategy first.
Here’s a recording of one of my talks. It’s on what the operations team does when running in a platform, DevOps-y, whatever style:
Developers don’t need “services” from ops, they need products: continuously innovated platforms that evolve weekly. Once ops toil is removed, ops can focus on their customers’ – development – needs. Using stories & tactics from the real-world, this talk helps launch a platform-as-a-product strategy.
Most ops groups can’t give developers what they need. Ops is limited by traditional service delivery mindset and tools. Stability & reliability are now table-stakes when you’re releasing software daily. What developers need now from ops is innovation. Operations has rarely takes this innovation-driven, product approach to providing services, & instead focuses on delivering to specification & limiting SLAs. As with development, ops creates value with continuous operations, product managing their platforms and releasing frequently.
This talk covers how ops groups are transforming from a service delivery mindset a platform-as-a-product approach. With examples from Discover Financial Services, Rabobank, the US Air Force, & others the talk covers the concept, technologies & tools commonly used, & ops tactics needed to kick-off a platform-as-a-product strategy.
The cliché we all recite is that technology isn’t the problem, culture is. Put another way: if the hardware and software are fine and fresh, it must be the meatware that smells. Come hear several de-funking recipes from the world’s largest companies whose meat now smells proper.
I answered a few attendee questions in the webinar, and answered the rest in a Twitter thread afterwards.
Using Puppet and Chef (and then Ansible and Chef) to replace Opsware and BladeLogic.
Full stack engineers to setup EC2, load-balancers, and other Morlock shit.
Full stack engineers are bad, but sort of the same thing. Also, you can’t have a DevOps “group” or title. But, you know, someone should do all that automation.
Putting all the people on one team, having them focus on a product, and establishing a culture of caring and learning.
SRE is not DevOps.
So…actually five. Maybe some of them just being footnotes on the evolving concept. (And, if you, dear reader, feel these are wrong, then let’s compromise and make the list six.)
All of them evolved around bringing down The Wall of Confusion, allowing “developers” to deploy their software to production more frequently, weekly, if not daily. And, of course, making sure production stays up. (You’re supposed to call that “resiliency” and instead of SLAs use SLOs and some other newly named metrics that answer the question “IS MY SHIT WORKING?” Whatever you do, just don’t say “uptime,” or you’re in for it and will be relegated to running the AS/400’s.)
I used to snide that the developers seemed to have been yanked out of DevOps, sometime around 2014 and 2015. All the talks I saw were, basically, operations talks. I haven’t really checked in on DevOps conference talks recently, but at the time, I don’t think there was much application development stuff. (I’m not sure if there ever was?)
Nowadays, I try to stick to that forth one: you want to setup autonomous teams that have all the skills and responsibility/authority/tools needed to “own” the software being specified, designed, developed, and run. This means you have to, basically, remove-by-automating all the operations stuff it takes to stand-up environments, deploy things, and do all that “day 2” stuff.
Now, I think this product-centric notion of DevOps is, well, kind of an over-extension of the term “DevOps.” But since SRE has sucked out the “ops” part (but, remember, dear reader, don’t commit the embarrassing act of saying SRE is DevOps – no, no, you’d never do that, right? SO SHAMEFUL! (SRE is totally different – no overlap or similar goals shared between them at all. I mean, they have separate groups, silos! COME ON!)), slicing “DevOps” back to just “Dev,” but with a product-not-project focus isn’t too shabby.
Anyhow. I came across a good overview of this product notion of DevOps, all the way back from 2016, while re-reading Schwartz’s evergreen excellent The Art of Business Value:
Agile approaches attempt to bring together developers and the business in an atmosphere of mutual respect and joint contribution. Until now, however, the focus has been on users of the software, product visionaries, and developers. Recent developments in the Agile world—notably DevOps—have broadened this idea of respect and inclusion to encompass Operations and Security. The DevOps model, in other words, looks to break down the silos that have resulted from technical specialization over the last few decades. But the DevOps spirit goes further, looking to eliminate the conflicting incentives of organizational silos and the inhumane behaviors that can result from those conflicting incentives.
Perhaps we can take this idea even further still. There is no reason why the DevOps team’s responsibility needs to stop at the border of what used to be considered IT. The team is part of a broader enterprise, whose collective knowledge, skills, and judgment need to be part of the value creation process.
Look a’ that guy! Business Value just effortlessly jets out of his pores like a peripatetic thought-monarch!
This is from an executives perspective, but it drives home the point we’re always trying to get to with software: doing whatever it takes to figure out, create, and give users features that are actually useful to them. Somewhere beyond that, if you’re lucky, it’ll help out “the business.” Also, it should implement The Unspoken User Story: user would like software to actually work.
Eirini For DevelopersFor Developers there are two big wins from Eirini. Firstly, if you want a Cloud Foundry cluster and you have access to Kubernetes but not VMs, Eirini lets you get it and kick the tires really fast. Secondly when you do need or want to pull the escape hatch and drop down to Kubernetes, everything you’ve cf push-ed is available as native Kubernetes objects under the covers.
Eirini For OperatorsThe big win from Eirini, though, is for Operators. Many platform operators already need to maintain a Kubernetes stack, for the stateless services their Cloud Foundry uses. Today, in order to provide an Easy Switch for developers, those operators need to manage two schedulers (Diego and Kubernetes), and any tooling and monitoring they use needs to be duplicated between the two. Deploying both the Diego and Kubernetes via Bosh can make this a bit better, but it doesn’t solve the bulk of the problem. Eirini standardises the underlying infrastructure so it’s all Kubernetes under the covers.
Inside this interview, there’s an excellent explanation of what product management means in an enterprise. By “enterprise,” I mean a company who’s product is not technology. That is, most every company and organization out there. To that end, there’s a great example of doing product management and design at a food services company: discovering the actual problem to solve to meet business needs, and solving it by experimenting with a small batch loop.
I get asked to talk with “executives” more and more. That’s part of why Pivotal moved me over to Europe. People make lots of claims about what executives want to hear, the conversations you can have with them as a vendor. They don’t have time. You have have to be concise. They don’t want to hear the details. They just want to advance their careers.
None of those are really my style, even part of my core epistemes. When I have a good conversation with anyone, it’s because we’re both curious about something we don’t know. The goal is to understand it, sort of hold it out on a meat-selfie-stick and look at it from all angles. This find that most people, especially people in management positions charged with translating corporate strategy to cash enjoy this. Some don’t, of course.
Anyhow, I’ve been writing down some common themes and “unknowns” for IT executives:
Innovation – use IT to help change how the current business functions and create new businesses. Rental car companies want to streamline the car pick-up process, governments want to go from analog and phone driven fulfillment to software, insurers want to help ranchers better track and protect the insured cows. Innovation is now a vacuous term, but when an organization can reliably create and run well designed software, innovation can actually mean something real, revenue producing, and strategic.
Keep making money – organizations already have existing, revenue producing businesses, often decades old. The IT supporting those businesses has worked for all that time – and still works! While many people derisively refer to this as “keeping the lights on,” it’s very difficult to work in the dark. Ensuring that the company can keep making money from their existing IT assets is vital – those lights need to stay on.
Restoring trust in IT’s capabilities – organizations expect little from IT and rarely trust them with critical business functions, like innovating. After decades of cost cutting, outsourcing, and managing IT like a series of projects instead of a continuous stream of innovation. The IT organization has to rebuild itself from top to bottom – how it runs infrastructure, how it developer and runs software, and the culture of IT. Once that trust is built, the business needs to re-set its expectations of what IT can do, reinventing IT back into everyday business.
What happens next is the fun part: how do executives reprogram their organization to do the above?
That’s my take on “to talk with executives,” then: learning what they’re doing, even validating my assumptions like the above. This is, or course, filled in with all sorts of before/afterr performance anecdotes (“proof points” and “cases”). Those are just conversational accelerants, though. They’re the things that move the narrative forward by keeping the reader engaged, so to speak, by keeping you interested (my self as well).
Anyhow. Even all this is a theory on my part, something to be validated. As I have more of these conversations, we’ll see what happens.
I’m too wordy when I reply to reporters. This is mostly true everywhere I produce content. I don’t like trite, simple answers. Brevity and clarity makes me suspicious, especially on topics I know well. As a consequence, I don’t think this interview by email was ever published.
What’s a DevOps advocate?
If you mean what I do, it means studying people and organizations who are trying trying to improve how they do software, summarizing all those, ongoing, into several different types of content, and then trying to help, advise, educate people on how they can improve how they do software. A loop of learning and then trying to teach, in a limited way. For example, I’m working on finish up a book that contains a lot of this stuff that I’ve found over the past couple of years.
What is the foundation of DevOps: automation, agility, tools, continuous or all of them?
Yes, those are the core tools. The traditional foundation is “CALMS” which means Culture, Automation, Lean, Measurement, and Sharing. Ultimately, these are things any innovation-driven process follows, but they’re called out explicitly because traditional IT has lost its way and doesn’t usually focus on these common sense thing. A lot of what DevOps is trying to do is just get people to follow better software development and delivery practices…ones they should have been doing all along but got distracted from with outsourcing, SLAs, cost cutting, and the idea of treating IT like a service, or utility rather than an innovation engine for “the business.”
Anyhow, CALMS means:
Culture – the norms, processes, and methodology IT follows. You want to shift from a project delivery culture to a product culture, from service management to innovation. Defining “culture,” let along how to change it and how to use it is slippery. I wrote up what I’ve figured out so far here.
Automation – this is the easiest to understand of all the DevOps things. It means, to focus on automating as much as possible. If you find yourself manually doing some configuration or whatever, or relying on people opening a ticket to get something
(Like a database, etc.), figure out how to automate that instead.
Lean – software development has been borrowing a lot from Lean for the past 15 years. DevOps takes most all of it, but the key concepts it brings in are eliminated waste (effort spent that has “no value” to customers, in IT, often wait time for things like setting up servers and such) and working on incremental, more frequent (like weekly) releases rather than big, yearly releases.
Measurement – DevOps, like agile, is actually very disciplined if done properly. In addition to monitoring your applications and such in production, in order to continuously improve, DevOps is interested in measuring metrics around process. How many bugs are in each release? How frequently do we deploy software? And so forth. The point is to use these measurements to indicate areas of improvement and figure out if you’re actually improving or not.
Sharing – this was added after the initial four concepts. It’s straight forward and means that people across groups and even across organizations should share knowledge with each other. It also means, within organizations, having more unified teams of people rather than different groups that try to work with each other.
Today, we can ship every day. What impact for the teams and developers?
Shipping more frequently means you have more input on the usefulness of your software and it also adds much more stability and predictably into your software process. Because you’re shipping weekly, or daily, you can observe how people use your software and make very frequent changes to improve your software. There’s a loop of trying our a new feature, releasing it and observing how people use it, and then coming up with a new way to solve that problem better.
Stability and predictability are introduced because you establish a realistic rate of feature delivery each week. When you’re delivering each week, you quickly learn how much code (or features) you can do each week. This means that rather than having developers estimate how many features they can deliver in a year, for example, you learn how much they can actually deliver each week. Estimates are pretty much always wrong, and complete folly. But, once you calibrate and know how many features the team can deliver each week, they’re predictable and the overall process is more stable.
Monolithic’ architecture vs modular’ approach. Are we talking micro-service? Container?
Yes, a monolithic architecture implies software that’s made of many different parts, but that all depend strongly on each other. To be frank, it also means software that’s complex, poorly tested, and, thus, not well understood. “Monolith” is often used for “software I’m scared to change,” that is, “legacy software.” In contrast, if you’re fine to change software and don’t fear doing so, you just call it “software.”
A microservice architecture is the current approach to break up “monoliths” into more independent components, different services that evolve on their own but are composed together for an application. Buying a product online is a classic example. If you look at the product page, it could be composed of many different services: pictures of the product, figuring out the pricing for your region, checking inventory for the product, listing reviews, etc. A monolithic architecture would find all of that information all at once, in “one” piece of code. An application following a microservices architecture would treat all of these things as third party, not under your control services and compose the page from calling all those services.
To over simplify it, we used to call this idea “mashups” in the Web 2.0 era: pulling data from a lot of different sources and “mashing” that data up into a web page. All the rotating ads and suggested content you see on news sites are a metaphoric example as well: each of those components are pulled in from some other service rather than managed and collected together by the news site CMS. This is why the ads and suggested content are often awful, of course: there’s no editorial control over them.
Infra as Code? Another thing?
“Infrastructure as code” means using automation tools the building and configuring of servers (the software parts, not the hardware) and other “infrastructure” and then treating those automation workflows as if they were software code: you check them into version control and track them like a version of your application. This means that you can check out, for example, a version of the server you’re configuring and automatically create it. The point of doing this is get more visibility and control over that configuration by removing manual, human-driven configuring and such. Humans create errors, forget how things were done, have bad hair days, and otherwise foul things up. Computers don’t (unless those annoying humans tell them to).
For you, what is the ideal architecture?
An annoying, though accurate answer would be “it depends. I don’t really code anymore, so I couldn’t really say. Usually, you start with the minim needed and just add in more complex architectures as needed. That sounds like the opposite of architecture, but it’s worse to end up with something like all those giant, built out cities that end up having few people living in them.
Kanban, craftsmanship: friend or enemy of DevOps?
Kanban is used a lot in DevOps, maybe not fully. But, the idea of having cards that represent a small feature, a backlog that contains those cards ranked by some priority, and then allowing people to pull those cards and put them in columns marked something like “working on” and “complete” is used all the time.
I’m not sure what “craftsmanship” is in this context, but it it means perfecting things like some master furniture maker, most DevOps people would encourage you to instead “release” the cabinets more frequently to find out how they should be designed than assuming you knew what was needed and working on it all at once: maybe they want brutalist square legs instead of elegant rounded legs topped with a swan.
And, of course, if “craftsmanship” means “doing a good job and being conscious of how you’re evolving your trade,” well, everyone would say they do that, right? :)
Anywhere there is lack of speed, there is massive business vulnerability:
Speed to deliver a product or service to customers.
Speed to perform maintenance on critical path equipment.
Speed to bring new products and services to market.
Speed to grow new businesses.
Speed to evaluate and incubate new ideas.
Speed to learn from failures.
Speed to identify and understand customers.
Speed to recognize and fix defects.
Speed to recognize and replace business models that are remnants of the past.
Speed to experiment and bring about new business models.
Speed to learn, experiment, and leverage new technologies.
Speed to solve customer problems and prevent reoccurrence.
Speed to communicate with customers and restore outages.
Speed of our website and mobile app.
Speed of our back-office systems.
Speed of answering a customer’s call.
Speed to engage and collaborate within and across teams.
Speed to effectively hire and onboard.
Speed to deal with human or system performance problems.
Speed to recognize and remove constructs from the past that are no longer effective.
Speed to know what to do.
Speed to get work done.
— John Mitchell, Duke Energy.
When enterprises need to change urgently, in most cases, The Problem is with the organization, the system in place. Individuals, like technology, are highly adaptable and can change. They’re both silly putty that wiggle into the cracks as needed. It’s the organization that’s obstinate and calcified.
How the organization works, it’s architecture, is the totally the responsibility of the leadership team. That ream owns it just like a product team owns their software. Leadership’s job is to make sure the organization is healthy, thriving, and capable.
DevOps’ great contribution to IT is treating culture as programmable. How your people work is as agile and programmable as the software. Executives, management, and enterprise architects — leadership — are product managers, programmers, and designers. The organization is their product. They pay attention to their customers — the product teams and the platform engineers — and do everything possible to get the best outcomes, to make the product, the organization, as productive and well designed as possible.
I’ve tried to collect together what’s worked for numerous organizations going through — again, even at the end, gird your brain-loins, and pardon me here — digital transformation. Of course, as in all of life, the generalized version of Orwell’s 6th rule applies: “break any of these rules rather than doing anything barbarous.
As you discover new, better ways of doing software I’d ask you to share those learnings a widely as possible, especially outside of your organization. There’s very little written on the topic of how regular, large organization managing the transformation to becoming software-driven enterprises.
Know that if your organization is dysfunctional, is always late and over budget, that it’s your fault. Your staff may be grumpy, may seem under-skilled, and your existing infrastructure and application may be pulling you down like a black-hole. All of that is your product: you own it.
As I recall, a conclusion is supposed to be inspirational instead of a downer. So, here you go. You have the power to fix it. Hurry up and get to work.
We had assumed that alignment would occur naturally because teams would view things from an enterprise-wide perspective rather than solely through the lens of their own team. But we’ve learned that this only happens in a mature organization, which we’re still in the process of becoming. — Ron van Kemenade, ING.
The enterprise architect’s role in all of this deserves some special attention. Traditionally, in most large organizations, enterprise architects define the governance and shared technologies. They also enforce these practices, often through approval processes and review boards. An enterprise architect (EA) is seldom held in high regard by developers in traditional organizations. Teams (too) often see EAs as “enterprise astronauts,” behind on current technology and methodology, meddling too much in day-to-day decisions, sucking up time with change-advisory boards (CABs), and forever working on work that’s irrelevant to “the real work” done in product teams.
If cruel, this sentiment often has truth to it. “If I’m doing 8 or 15 releases a week,” HCSC’s Mark Ardito says, “how am I going to get through all those CABs?” While traditional EAs may do “almost nothing” of value for high performing organizations, the role does play a significant part in cloud native leadership.
First, and foremost, EAs are part of leadership, acting something like the engineer to the product manager on the leadership team. An EA should intimately know the current and historic state of the IT department, and also should have a firm grasp on the actual business IT supports.
While EAs are made fun of for ever defining their enterprise architecture diagrams, that work is a side-effect of meticulously keeping up with the various applications, services, systems and dependencies in the organization. Keeping those diagrams up-to-date is a hopeless task, but the EAs who make them at least have some knowledge of your existing spaghetti of interdependent systems. As you clean-up this bowl of noodles, EAs will have more insights into the overall system. Indeed, tidying up that wreckage is an under appreciate task.
The EA’s dirty hands
I like to think of the work EAs do as gardening the overall organization. This contrasts with the more tops-down idea of defining and governing the organization, down to technologies and frameworks used by each team. Let’s look at some an EAs gardening tasks.
Setting technology & methodology defaults
Even if you take an extreme, developer friendly position, saying that you’re not going to govern what’s inside each application, there are still numerous points of governance about how the application is packaged, deployed, how it interfaces and integrates with other applications and services, how it should be instrumented to be managed, and so on. In large organizations, EAs should play a large role in setting these “defaults.” There may be reasons to deviate, but they’re the prescribed starting points.
I think that it’s important that as you’re doing this you do have to have some standards about providing a tap, or an interface, or something to be able to hook anything you’re building into a broader analytics ecosystem called a data-lake — or whatever you want to call it — that at least allows me to get at your data. It’s not you know, like “hey I wrote this thing using a gRPC and golang and you can’t get at my data!” No you got to have something where people can get at it, at the very least.
Beyond software, EAs can also set the defaults for the organization’s meatware, all the process, methodology, and other “code” that actual people execute. Before Home Depot started standardizing their process, Tony McCully says, “everyone was trying to be agile and there was this very disjointed fragmented sort of approach to it… You know I joke that we know we had 40 scrum teams and we were doing it 25 different ways.” Clearly, this is not ideal, and standardizing how your product teams operate is better.
First, someone has to define all the applications and services that all those product teams form around. At a small scale, the teams themselves can do this, but as you scale up to 1,000’s of people and 100’s of teams, gathering together aStar Wars scale Galactic Senate is folly. EAs are well suited to define the teams, often using domain-driven design (DDD) to first find and then form the “domains” that define each team. A DDD analysis can turn quickly into its own crazy wall of boxes and arrows, of course. Hopefully, EAs can keep the lines as helpfully straight as possible.
Rather than checking in on how each team is operating, EAs should generally focus on the outcomes these teams have. Following the rule of team autonomy (described elsewhere in this booklet), EAs should regularly check on each team’s outcomes to determine any modifications needed to the team structures. If things are going well, whatever’s going on inside that black box must be working. Otherwise, the team might need help, or you might need to create new teams to keep the focus small enough to be effective.
Most cloud native architectures use microservices, hopefully, to safely remove dependencies that can deadlock each team’s progress as they wait for a service to update. At scale, it’s worth defining how microservices work as well, for example: are they event based, how is data passed between different services, how should service failure be handled, and how are services versioned?
Again, a senate of product teams can work at a small scale, but not on the galactic scale. EAs clearly have a role in establishing the guidance for how microservices are done and what type of policy is followed. As ever, this policy shouldn’t be a straight-jacket. The era of SOA and ESBs has left the industry suspicious of EAs defining services. Those systems became cumbersome and slow moving, not to mention expensive in both time and software licensing. We’ll see if microservices avoid that fate, but keeping the overall system light-weight and nimble is clearly a gardening that EAs are well suited for.
As we’ll discuss later, at the center of every cloud native organization is a platform. This platform standardizes and centralizes the runtime environment, how software is packaged and deployed, how it’s managed in production, and otherwise removes all the toil and sloppiness from traditional, bespoke enterprise application stacks. Most of the platform cases studies I’ve been using, for example, are from organizations using Pivotal Cloud Foundry.
Occasionally, EAs become the product managers for these platforms. The platform embodies the organization’s actual enterprise architecture and evolving the platform, thus, evolves the architecture. Just as each product team orients their weekly software releases around helping their customers and users, the platform operations team runs the platform as a product.
EAs might also get involved with the tools groups that provide the build pipeline and other shared services and tools. Again, these tools embody part of the overall enterprise architecture, more of the running cogs behind all those boxes and arrows.
As a side-effect of product managing the platform and tools, EAs can establish and enforce governance. The packaging, integration, runtime, and other “opinions” expressed in the platform can be crafted to force policy compliance. That’s a command-and-control way of putting it, and you certainly don’t want your platform to be restrictive. Instead, by implementing the best possible service or tool, you’re getting product teams to follow policy and best practices by bribing them with easy of use and toil-reduction.
It’s the same as always
I’ve highlighted just three areas EA contribute to in a cloud native organization. There are more, many of which will depend on the peccadilloes of your organization, for example:
Identifying and solving sticky cultural change issues is one such, situational topic. EAs will often know individual’s histories and motivations, giving them insights into how to deal with grumps that want to stall change.
EA groups are well positioned to track, test, and recommend new technologies and methodologies. This can become an “enterprise astronaut” task of being too far afield of actual needs and not understanding what teams need day-to-day, of course. But, coupled with being a product manager for the organizations’ platform, scouting out new technologies can be grounded in reality.
EAs are well positioned to negotiate with external stakeholders and blockers. For example, as covered later, auditors often end-up liking the new, small batch and platform-driven approach to software because it affords more control and consistency. Someone has to work with the auditors to demonstrate this and be prepared to attend endless meetings that product team members are ill-suited and ill-tempered for.
What I’ve found is that EAs do what they’ve always done. But, as with other roles, EAs are now equipped with better process and technology to do their jobs. They don’t have to be forever struggling eyes in the sky and can actually get to the job of architecting, refactoring, and programming the enterprise architecture. Done well, this architecture becomes a key asset for the organization, often the key asset of IT.
The CIO is the enterprise architect and arbitrates the quality of the IT systems in the sense that they promote agility in the future. The systems could be filled with technical debt but, at any given moment, the sum of all the IT systems is an asset and has value in what it enables the company to do in the future. The value is not just in the architecture but also in the people and the processes. It’s an intangible asset that determines the company’s future revenues and costs and the CIO is responsible for ensuring the performance of that asset in the future.
Hopefully the idea of architecting and then actually creating and gardening that enterprise asset is attractive to EAs. In most cases, it is. Like all technical people, they pine for the days when they actually wrote software. Now’s their chance to get back to it.
In banking, you don’t often get a clean slate like you would at some of the new tech companies. To transform banking, you not only need to be equipped with the latest technology skills, you also need to transform the culture and skill sets of existing teams, and deal with legacy infrastructure. — Siew Choo Soh, DBS Bank
Most organizations have a damaging mismatch between the culture of service management and the strategic need to become a product organization. In a product culture, you need the team to take on more responsibility, essentially all of the responsibility, for the full life cycle of the product. Week-to-week they need to experiment with new features and interpret feedback from users. In short, they need to become innovators.
Service delivery cultures, in contrast, tend more towards a culture of following up-front specification, process, and verification. Too often when put into practice, IT Service Management (ITSM) becomes a governance bureaucracy that drives project decision. This governance-driven culture tends to be much slower at releasing software than a product culture.
[A] key goal for DevOps teams is the establishment of a high cadence of trusted, incremental production releases. The CAB meeting is often seen as the antithesis of this: a cumbersome and infrequent process, sucking a large number of people into a room to discuss whether a change is allowed to go ahead in a week or two, without in reality doing much to ensure the safe implementation of that change.
Mainstream organizational management work has helpful definitions of culture: “Culture can be seen in the norms and values that characterize a group or organization,” O’Reilly and Tushman write, “that is, organizational culture is a system of shared values and norms that define appropriate attitudes and behaviors for its members.”
Jez Humble points out another definition, from Edgar Schein:
[Culture is] a pattern of shared tacit assumptions that was learned by a group as it solved its problems of external adaptation and internal integration, that has worked well enough to be considered valid and, therefore, to be taught to new members as the correct way to perceive, think, and feel in relation to those problems.
We should take “culture,” then, to mean the mindset used by people in the organization to make day-to-day decisions, policy, and best practices. I’m as guilty as anyone else for dismissing “culture” as simple, hollow acts like allowing dogs under desks and ensuring that there’s six different ways to make coffee in the office. Beyond trivial pot-shots, paying attention to culture is important because it drives of how people work and, therefore, the business outcomes they achieve.
Year after year, the DevOps reports show that “high performing” organizations are much more generative than pathologically…as you would suspect from the less than rosy words chosen to describe “power-oriented” cultures. It’s easy to identify your organization as pathological and equally easy to realize that’s unhelpful. Moving from the bureaucratic column to the generative column, however, is where most IT organizations struggle.
Core values of product culture
There are two layers of product culture, at least that I’ve seen over the years and boiled down. The first layer describes the attitudes of product people, the second the management tactics you put in place to get them to thrive.
Risk takers — I don’t like this term much, but it means something very helpful and precise in the corporate world, namely, that people are willing to do something that has a high chance of failing. The side that isn’t covered enough is that they’re also focused on safety. “Don’t surf if you can’t swim,” as Andrew Clay Shafer summed it up. Risk takers ensure they know how to “swim” and they build safety nets into their process. They follow a disciplined approach that minimizes the negative consequences of failure. The small batch process, for example, with its focus on a small unit of work (a minimal amount of damage if things go wrong and an easier time diagnosing what caused the error) and studying the results, good and bad, creates a safe, disciplined method for taking risks.
People focused — products are meant to be used by people, whether as “customers” or “employees.” The point of everything I’m discussing here is to make software that better helps people, be that delivering a product the like using or one that allows them to be productive, getting banking done as quickly as possible so they can get back to living their life, to lengthen DBS Bank’s vision. Focusing on people, then, is what’s needed. Too often, some people are focused on process and original thinking, sticking to those precepts even if they prove to be ineffective. People-focused staff will instead be pragmatic, looking to observe how their software is helping or hindering the people we call “users.” They’ll focus on making people’s lives better, not achieving process excellence, making schedules and dates, or filling out request tickets correctly.
Finding people like this can seem like winning the lottery. Product-focused people certainly are hard to find and valuable, but they’re a lot less rare than you’d think. More importantly, you can create them by putting the right kind of management policy and nudges in place. A famous quip by Adrian Cockcroft then at Netflix, now at Amazon) illustrates this. As he recounts:
[A]t a CIO summit I got the comment “we don’t have these Netflix superstar engineers to do the things you’re talking about”, and when I looked around the room at the company names my response was “we hired them from you and got out of their way.”
There is no talent shortage, just shortage of management imagination and gumption. As most recently described in the 2018 DORA DevOps report, over and over again, research finds that the following gumptions give you the best shot at creating a thriving, product-centric culture: autonomy, trust, and voice. Each of these three support and feed into each other as we’ll see.
People who’re told exactly what to do tend not to innovate. Their job is not to think of new ways to solve problems more efficiently and quickly, or solve them at all. Instead, their job is to follow the instructions. This works extremely well when you’re building IKEA furniture, but following instructions is a port fit when the problem set is unknown, when you don’t even know if you know that you don’t know.
Your people and the product teams need to autonomy to study their users, theorize how to solve their problems, and fail their way to success. Pour on too much command-and-control, and they’ll do exactly what you don’t want: they’ll follow your orders perfectly. A large part of a product-centric organization’s ability to innovate is admitting that people closest to the users — the product team — are the most informed about what features to put into the software and even what the user’s problems are. You, the manager, should be overseeing multiple teams and supporting them by working with the rest of the organization. You’ll lack the intimate, day-to-day knowledge of the users and their problems. Just as a the business analysts and architects in a waterfall process are too distant from the actual work, you will be too and will make the same errors.
Establishing and communicating goals, but letting the team decide how the work will be done.
Removing roadblocks by keeping rules simple.
Allowing the team to change rules if the rules are obstacles to achieving the goals.
Letting the team prioritize good outcomes for customers, even if it means bending the rules.
This list is a good start. As ever, apply a small batch mentality to how you’re managing this change and adapt according to your findings.
There are some direct governance and technology changes needed to give teams this autonomy. The product teams need a platform and production tools that allow them to actually manage the full-life cycle of their product. “[I]f you say to your team that ‘when you build it you also run it,’” Rabobanks’ Vincent Oostindië says, “you cannot do that with a consolidated environment. You cannot say to a team ‘you own that stuff, and by the way somebody else can also break it.’”
Taking risks, suggesting new features, resolving problems in production, and otherwise innovating in software requires a great deal of trust, both from management and of management. The DORA report defines trust, in this context as “how much a person believes their leader or manager is honest, has good motives and intentions, and treats them fairly.”
To succeed at digital transformation, the people in the product teams must trust management. Changing from a services-driven organization of a product organization requires a great deal of upheaval and discomfort. Staff are being asked to behave much differently than they’ve been told to in the past. The new organization can seem threatening to careers. People will gripe and complain, casting doubt on success. Management needs to first demonstrate that their desire to change can be trusted. Doing things like celebrating failures, rewarding people for using the new methods, and spending money on the trappings of the new organization (like free breakfast or training) will demonstrate management commitments.
Just as staff must trust management, managers must trust the product teams to be responsible and independent. This means managers can’t constantly check in on and meddle in the day-to-day affairs of product teams. Successful managers will find it all too tempting to get their hands dirty and volunteer to help out with problems. Getting too involved on a day-to-day basis is likely to hurt more than help, however.
Felten Buma uses Finding Nemo as a metaphor for the trust managers must have in their product teams…if you’ll pardon a cartoon reference in this book. Nemo’s father, Marlin, is constantly worried about and micromanaging his son, having been shocked by the death of his wife, Nemo’s mother. They’re fish as you might recall, so his mother was eaten one day. Not only that, but Nemo has a weak flipped on one side. Overall, this means Nemo’s father is a helicopter parent, but is also forever telling Nemo that he’s not skilled enough can’t do risky things, like swimming beyond the reef. While most leaders haven’t experienced the loss of one of their parents from fish’s meal-making, they’ve likely experienced some disasters in the past that could make them helicopter managers, always looking to “help” staff with advice about what works and doesn’t work. As in the movie, until that manager actually trusts the product team and demonstrates that trust by backing off, the product teams will lack the full moral and self-trust needed to perform well.
Buma suggests an exercise to help transform helicopter managers. In a closed meeting of managers, ask them to each share one of their recent corporate failures. Whether or not you discuss how it was fixed is immaterial to the exercise, the point is to have the managers practice being vulnerable and then show them that their career doesn’t end. Then, to practice giving up control, ask them to deligrate an important task of theirs to someone else. Buma says that surprisingly, most managers find these two tasks very hard and some outright reject it. Those managers who can go through these two exercises are likely mentally prepared to be good, transformational leaders.
The third leg of transformative leadership is giving product teams voice. Once teams trust management and start acting more autonomously, they’ll need to have the freedom to speak up and suggest ways to improve not only the product, but the way they work. A muzzled product team is much less valuable than one that can speak freely. As the DORA report defines it:
Voice is how strongly someone feels about their ability and their team’s ability to speak up, especially during conflict — for example, when team members disagree, when there are system failures or risks, and when suggesting ideas to improve their work.
Put another way, you don’t want people to be “courageous.” Instead, you want open discussions of failure and how to improve to be common and ordinary, “boring,” not “brave.” The opposite of giving your team’s voice is suppressing their suggestions, dismissing them, and explaining why such thinking is dangerous or “won’t work here.” Traditional managers tend to be deeply offended when “their” staff speaks to the rest of the organization independently, when they “go around” their direct line managers. This kind of thinking is a good indication that the team lacks true voice. While it’s certain more courteous to involve your manager in such discussions, management should trust teams to be autonomous enough to do the right thing.
In an organization like the US Air Force, where you literally have to ask permission to “speak freely,” giving product teams voice can seem impossible. To solve this problem, the Kessel Run team devised a relatively simple fix: they asked the airmen and women to wear civilian clothes when they were working on their products. Without the explicit reminder of rank that a uniform and insignia enforces, team members found it easier to talk freely with each other, regardless of rank. Of course, managers also explicitly told and encouraged this behavior. Other organizations like Allstate have used this same sartorial trick, encouraging managers to change from button-up shirts and suits to t-shirts and hoodies instead. Dress can be surprisingly key for changing culture. As a Nissan factory manager put it, “[i]f I go out to the plant in a $400 suit and tie, people don’t talk to me so freely.”
Managing ongoing culture change
Improving culture is a never ending process. Pivotal, for example has created an excellent, beloved culture over the past 25 years but is still constantly monitoring and improving it. And while I might sigh at yet another employee survey to fill out, the company has demonstrated that it actually listens and changes. This is very rare for any company and it shows how much work is needed to maintain a good culture.
Employee surveys are a good way to monitor progress. You should experiment with what to put in these surveys, and even other means of getting feedback on your organization’s culture. Dick’s Sporting Goods narrowed down to ENPS as small and efficient metric. Longer term, Dick’s Jason Williams says that they’ve seen some former employees come back to their team, another good piece of feedback for how well you’re managing your organization’s cultural change.
How you react to these surveys and feedback is even more important than gathering the feedback. Just as you expect your product teams to go through a small batch process, reacting to feedback from users, you should cycle through organizational improvement theories, paying close attention to the feedback you get from surveys and other means.
The ultimate feedback, of course, will be if you achieve the business goals derived from your strategy. But, you need to make sure that success isn’t at the cost of incurring cultural debt that will come due in the future. This debt often comes due in the form of stressed out staff leaving or, worse, going silent and no longer telling you about what they’re learning from failures. Then you’re back in the same situation you were trying to escape from all this digital transformation, an organization that’s scared and static, rather than savvy and successful.
As a DevOpsDays sponsor you’re often given the chance to give a one minute pitch to the entire audience. Back stage at DevOps Rex, this week, I was talking with a first timer. One minute seems like such a small amount of time: how could you say anything consequential in 60 seconds? You’re presenting in front of the full audience, anywhere between 150 to 500 people. They probably also loath vendors, or, at least are bored by them. The stage in Paris is intimidating. It’s a huge room in an old cinema, imagine the most stereotypical movie theater from whatever “the golden age” is: double decked seats, a huge screen. Plus, the organizers are meticulous: there’s a rehearsal for these 1 minute pitches in the morning. Like a full one where you’re given a minute to talk. Normally, these pitches are very informal. Overall, it can be a public speaking challenge…plus you have to get up an hour earlier than normal.
People get rattled by these 1 minute talks and they can also give boring pitches. Here’s what how I think about them and what I do:
The goal of this pitch is to tell people the name of your company, what you do, and to get them to come to your table to talk more.
So, tell them the name of your company, what you do, if you have time, the story of a customer who did something remarkable, and give people a reason to come to your booth.
People will come to your booth if you give them something: books, socks, flash lights, whatever. More senior, “decision makers” also want free stuff (they’re not monsters!), but they’ll also come to see how you can help them accomplish their work goals.
If you can tell a joke, even a lame one (of course, not an offensive one), do it, usually towards the start of your pitch. Getting a room full of people to laugh gets them engaged and listening closer. Your pitch more memorable, and also will give you a confidence boost to coast off for the next 45 to 30 seconds.
Finally, if you screw up, it does’t matter. It’s just 60 seconds so it’ll be over quickly and people will still come by your booth if the organizers have done their job for vendors: arranged the sponsor booth placement to drive foot traffic.
Figuring out what to say
Despite how disorganized and spontaneous I may appear (that’s part of my well planned out and cultivated schtick, a safety valve for when I haven’t prepared, plus it’s a fantastically caustic feedback loop for my self-loathing — yay!) I usually prepare content before each talk.
I write a bunch of points down and reduce it to three points that I want to make. As I wait to get on the stage, I go over these three points my head; I usually write them down and look at them. Ask the local sales people what the make up of the audience is (are the developers, ops people, management, or just a general audience?), and any local events they want to drive people to.
Now, I often forget most, if not all, of that content, but that’s fine, really. Some of it will show-up. And definitely don’t let your three points constrict you, just use them as a fallback and a suggestion.
Being at a DevOps event, you should probably talk about how your company relates to DevOps. I tell people that Pivotal Cloud Foundry removes all the toil of lower-level automation that DevOps is looking to eliminate, the A in CAMS. It makes DevOps real, solves you DevOps problems, et. al., so you can get to the whole reason (the “outcome,” in business speak) for doing DevOps: creating better software and running it reliability in production.
The main thing you want to avoid is being stuffy. If you’re wearing a sports jacket (without being ironic), I’ve found that you’re more likely to give a stuffy talk — someone like Damon Edwards can sports-jacket all day, but he’s the exception that proves the rule.
If you’re just naturally wooden in public speaking situations, try to say something about your involvement in the pitch: how does it make you feel and how do you relate to it? Talking about yourself is easy as you’re the expert on the topic and have hopefully been there the whole time.
A little bit of humor goes a long way in these tiny talks. For example, Pivotal’s main product is well known for being more expensive than free, but it works and changes the fortunes of organizations that use it. That’s a good thing to joke about (“good thing it actually works ’cause it ain’t cheap”), or weird branding names (“for some reason, we call these ‘platform engineers’ rather than ‘SREs’”). I sometimes make a joke about PaaS, the cloud category Pivotal Cloud Foundry is in: “remember PaaS from five or so years back? It was terrible! Well, we’re a PaaS, but we doesn’t suck so much this time, it actually works!”
Don’t worry about it
The stakes of this pitch are extremely low. Look at it as more of a learning experience for yourself, practice for next time. If you biff, nothing bad will happen unless you work for shitty management that punishes you for 60 seconds of time (start looking for a new job — Pivotal is hiring!).
Some people like to memorize pitches, which is fine if that helps you. Most of all, the way to succeed at these pitches it to have fun, be playful.
When you’re changing, you need to know what you’re changing to. It’s also handy to know how you’re going to change, and, equally, how you’re not going to change. In organizations, vision and strategy are the tools management uses to define why and how change happens.
Use vision to set your goals and inspiration
“Vision” can be a bit slippery. Often it means a concise phrase of hope that can actually happen, if only after a lot of work. Andy Zitney’s vision of starting on Monday and shipping on Friday is a classic example of vision. Vision statements are often more than a sentence, but they give the organization a goal and the inspiration needed to get there. Everyone wants to know “why I’m here,” which the vision should do, helping stave off any corporate malaise and complacency.
Vision refers to a picture of the future with some implicit or explicit commentary on why people should strive to create that future. In a change process, a good vision serves three important purposes. First, by clarifying the general direction for change, by saying the corporate equivalent of “we need to be south of here in a few years instead of where we are today,” it simplifies hundreds or thousands of more detailed decisions. Second, it motivates people to take action in the right direction, even if the initial steps are personally painful. Third, it helps coordinate the actions of different people, even thousands and thousands of individuals, in a remarkably fast and efficient way.
Creating and describing this vision is one of the first tasks a leader, and then their team, needs to do. Otherwise, your staff will just keep muddling through yesterday’s success, unsure of what to change, let alone, why to change. In IT, a snappy vision also keeps people focused on the right things instead of focusing on IT for IT’s sake. “Our core competency is ‘fly, fight, win’ in air and space,” says the US Air Force’s Bill Marion, for example, “It is not to run email servers or configure desktop devices.”
The best visions are simple, even quippy sentences. “Live more, bank less” is a great example from DBS Bank. “[W]e believe that our biggest competitors are not the other banks,” DBS’s Siew Choo Soh says. Instead, she continues, competitive threats are coming new financial tech companies “who are increasingly coming into the payment space, as well as the loan space.”
DBS Bank’s leadership believes that focusing on the best customer experience in banking will fend off these competitors and, better, help DBS become one of the leading banks in the world. This isn’t just based on rainbow whimsey, but strategic data: in 2017, 63% of total income and a 72% of profits came from digital customers. Focusing on that customer set and spreading whatever magic brought in that much profit to the “analog customers” is clearly a profitable course of action.
“We believe that we need to reimagine banking to make banking simple, seamless, as well as invisible to allow our customers to live more bank less,” Soh says. A simple vision like that is just the tip the of the iceberg but it can easily be expanded into strategy and specific, detailed actions that will benefit DBS Bank for years to come. Indeedm DBS has already won several awards, including Global Finance Magazine’s best bank in the world for 2018.
Creating an actionable strategy
“Strategy” has many, adorably nuanced and debated definitions. Like enterprise architecture, it’s a term that at first seems easily knowable, but becomes more obtuse as you stare into the abyss. A corporate strategy defines how a company will create, maintain, and grow business value. At the highest level, the strategy is usually increasing investor returns, usually through increasing the company’s stock price (via revenue, profits, or investor’s hopes and dreams thereof), paying out dividends, or engineering the acquisition of the company at a premium. In not-for-profit organizations, “value” often means how effective and efficiently the organization can execute its mission, be that providing clean water, collecting taxes, or defending a country. The pragmatic part of strategy is cataloging the tools the organization has at its disposal to achieve, maintain, and grow that value. More than specifying which tools to use, strategy also says what the company will not do.
People often fail at writing down useful strategy and vision. They want to serve their customers, be the best in their industry, and other such thin bluster. I like to use the check cashing test to start defining an organization’s strategy. Your organization always want to make more money with good profits. Well, check cashing is a profit rich, easy business. You just need a pile of cash and good insurance for when you get robbed. Do you want to cash checks? No? OK, then we know at least one thing you don’t want to do…
How broad or narrow is your product or service offering?
Why should customers prefer your product or service to a competitor’s?
What are the competencies you possess that others can’t easily imitate?
How do you make money in these segments?
Strategy should explain how to deliver on the vision with your organization’s capabilities, new capabilities enabled by technologies, customers needs and jobs to be done, your market, and your competitors. “This is where strategy plays an important role,” Kotter says, “Strategy provides both a logic and a first level of detail to show how a vision can be accomplished.”
A strategy for the next 10 years of growth at Dick’s Sporting Goods
Dick’s Sporting Goods, the largest sporting good retailer in the US, provides a recent example of translating higher level vision and strategy. As described by Jason Williams, over the past 10 years Dick’s rapidly built out its e-commerce and omni-channel capabilities, an enviable feat for any retailer. As always, success created a new set of problems, esp. for IT. It’s worth reading William’s detailed explanation of these challenges:
With this rapid technological growth, we’ve created disconnects in our overall enterprise view. There were a significant number of store technologies that we’ve optimized or added on to support our e-commerce initiatives. We’ve created an overly complex technology landscape with pockets of technical debt, we’ve invested heavily in on premise hardware — in the case of e-commerce you have to plan for double peak, that’s a lot of hardware just for one or two days of peak volume. Naturally, this resulted in a number of redundant services and applications, specifically we have six address verification services that do the same thing. And not just technical issues, we often had individuals and groups that have driven for performance, but it doesn’t align to our corporate strategy. So why did we start this journey? Because of our disconnect in enterprise view, we lack that intense product orientation that a lot of our competitors already had.
These types of “disconnects” and “pockets of technical debt” are universal problems in enterprises. Just as with Dick’s, these problems are usually not the result of negligence and misfeasance, but of the actions needed to achieve and maintain rapid growth.
To clear the way for the next 10 years of success, Dick’s put a new IT strategy in place, represented by 4 pillars:
Product architecture — creating an enterprise architecture based around the business, for example, pricing, catalog, inventory, and other business functions. This focus helps shift from a function and service centric mindset to product-centric mindset.
Modern software development practices — using practices like test-driven development, pairing, CI/CD, lean design, and all the proven, agile best practices.
Software architecture — using a microservices architecture, open source, following 12 factor principles to build cloud native applications on-top of Pivotal Cloud Foundry. This defines how software will be created, reducing the team’s toil so that they can focus on product design and development.
Balanced teams — finally, as Williams describes it, having a unified, product-centric team is the “the most critical part” of Dick’s strategy. The preceding three provider the architectural and infrastructural girding to shift IT from service delivery over to product delivery.
Focusing on these four areas gives staff very clear goals which translate easily into next steps and day-to-day work. Nine months into executing this strategy, Dick’s has achieved tangible success: they’ve created 31 product teams, increased developer productivity by 25%, ramped their testing up to 70% coverage, and improved the customer experience by increasing page load time and delivering more features, more frequently.
Keep your strategy agile
Finally, keep your strategy agile. While your vision is likely to remain more stable year to year, how you implement it might need to change. External forces will put pressure on a perfectly sound strategy: new government regulations or laws could change your organization’s needs, Amazon might finally decide to bottom out your market. Figure out a strategy review cycle to check your assumptions and course correct your strategy as needed. That is, apply a small batch approach to strategy.
Organizations usually review and change strategy on an annual basis as part of corporate planning, which is usually little more than a well orchestrated fight between business units for budget. While this is an opportunity to review and adjust strategy, it’s at the whim of finance’s schedule and the mercurial tactics of other business units.
Annual planning is also an unhelpfully waterfall-centric process, as pointed out by Mark Schwartz in The Art of Business Value. “The investment decision is fixed,” he writes, but “the product owner or other decision-maker then works with that investment and takes advantage of learnings to make the best use possible of the investment within the scope of the program. We learn on the scale of single requirements, but make investment decisions on the scale of programs or investment themes — thus the impedance mismatch.”
A product approach doesn’t thrive in that annual, fixed mindset. Do at least an additional strategy review each year, and many more in the first few years as you’re learning about your customers and product with each release. Don’t let your strategy get hobbled by the fetters of the annual planning and budget cycle.
By now, the reasons to improve how your organization does software are painfully obvious. Countless executives feel this urgency in their bones, and have been saying so for years:
“There’s going to be more change in the next five to ten years than there’s been in the last 50” — Mary Barra, CEO, GM
Intuitively, we know that business cycles are now incredibly fast: old companies die out, or are forced to dramatically change, and new companies rise to the top…soon to be knocked down by the new crop of sharp-toothed ankle biters.
Innosight’s third study of companies’ ability to maintain leadership positions estimates that by 2018, 50% of the companies on the S&P 500 will drop off, replaced by competitors and new market entrants. Staying at the top of your market-heap is getting harder and harder.
Profesor Rita McGrath has dubbed this the age of “transient advantage,” which is an apt way of describing how long — not very! — a company can rely on yesterday’s innovations. A traditional approach to corporate strategy is too slow moving, as she says: “[t]he fundamental problem is that deeply ingrained structures and systems designed to extract maximum value from a competitive advantage become a liability when the environment requires instead the capacity to surf through waves of short-lived opportunities.” Instead, organizations must be more agile: “to win in volatile and uncertain environments, executives need to learn how to exploit short-lived opportunities with speed and decisiveness.”
Software defined businesses
“We’re in the technology business. Our product happens to be banking, but largely that’s delivered through technology.” — Brian Porter, CEO, Scotiabank
We’re now solidly in an innovation phase of the business cycle. Organizations must become faster and more agile in strategy formulation, execution, and adaptation to changing markets. Again and again, IT is at the center of how startups enter new markets (often, disruptively) and how existing enterprises retain and grow market-share.
Organizations are seeking to become software defined businesses. In this mode of thinking, custom written software isn’t just a way of “digitizing” analog process (like making still lengthy mortgage applications or insurance claims processes “paperless”), but the mission critical tool for executing and evolving business models.
While software might have played merely a supporting role in the business for so long, successful organizations are casting software as the star. “It’s no longer a business product conversation, it’s a software product that drives the business and drives the market,” McKesson’s Andy Zitney says, later adding, “[i]t’s about the business, but business equals software now.”
Retail is the most obvious example. There’s an anecdote that Home Depot realized how important innovation was to them when they found out that Amazon sold more hammers than Homer. While other retailers languish, Home Depot grew revenue 7.5% year-over-year in Q4 2017. This isn’t solely due to software, but controlling its own software destiny has played a large part. As CIO Matt Carey says of competition from Amazon, “I don’t run their roadmap; I run my roadmap.”
These cases can seem pedestrian compared to self-driving cars and AIs that will (supposedly) create cyber-doctors. However, unlike these gee-whiz technologies, these small changes work incredibly fast and have large impacts.
Organizations often focus on the process, not the software
Most large organizations have massive IT departments, and equally large pools of developers working on software. However, many of these organizations haven’t updated their software practices and technologies for a decade or more. The results are predictable as three years of a Cutter Consortium survey shows. The study found that just 30% of respondents felt that IT helped their business innovate. As the chart below shows, this has fallen from about 50 percent in 2013:
This usefulness gap continues because IT departments are using an old approach to software. IT departments still rely on three-tier architectures, process hardened, dedicated infrastructure “service management” processes, and use functional organizations and long release cycles to (they believe) reliably produce software. I have to assume that this “waterfall” method was highly innovative and better than alternatives at the time…years and years ago.
In trying to be reliable and cost effective, IT departments have become excellent at process, even projects. In the 1990s, IT was in chaos with a shift from mainframes to Unix, then to Linux and Windows Server. On the desktop, the Windows GUI took over, but then the web browser hit mid-decade and added a whole new set of transitions and worries. Oh, and then there was the Internet, and the tail-end of massive ERP stand-ups that were changing core business processes. With all this chaos, IT often failed even on the simplest task like changing a password. Addressing this, the IT community created a school of thought called IT Service Management (ITSM) that sought to understand and codify each thing IT did for the business, conceptualizing those things as “services”: email, supply chain management, CRM, and, yes, changing passwords. Ticket desks were used to manage each process, and project management practices erected to lovingly cradle requests to create and change each IT service.
The result was certainly better than nothing, and much better than chaos. However, the ITSM age too often resulted in calcified IT departments that focused more on doing process perfectly than delivering useful services, that is, “business value.” The paladins of ITSM are quick to say this was never the intention, of course. It’s hard to know who’s the blame, or if we just need Jeffersonian table-flipping every ten years to hard reboot IT. Regardless, the traditional way of running IT is now a problem.
Most militaries, for example, can take anywhere between five to 12 years to roll out a new application. In this time, the nature of warfare can change many times over, a generations of soldiers can churn through the ranks, and the original requirements can change. Release cycles of even a year often result in the paradox of requirements perfection. In the best case scenario, the software you specified a year ago is delivered completely to spec, well tested, and fully function. But now, a year later, new competitor and customer demands nullifies the requirements from 12 months ago: that software is no longer needed.
Stretch this out to ten years, and you can see why the likes of US Air Force are treating transforming their software capabilities as a top priority. As General James “Mike” Holmes, Commander, Air Combat Command put it, “[y]ears of institutional risk aversion have led to the strategic dilemma plaguing us today: replacing our 30- year old fleet on a 30-year timeline.”
It’s easy to dismiss this as government work at its worst, clearly nothing like private industry. I’d challenge you, though, to find a large, multinational enterprise that doesn’t suffer from a similar software malaise. This misalignment is clearly unacceptable. IT needs to drastically change or it risks slowing down their organization’s innovation.
Small Batch Thinking
“If you aren’t embarrassed by the first version of your product, you shipped too late.” — Reid Hoffman, LinkedIn co-founder and former PayPal COO
How is software done right, then? Over the past 20 years, I’ve seen successful organizations use the same, general process: continuously executing small batches of software, over short iterations that put a rapid feedback loop in place. IT organizations that follow this process are delivering a different type of outcome than a set of requirements. They’re giving their organization the ability to adapt and change monthly, weekly, even daily.
By “small batches,” I mean identifying the problem to solve, formulating a theory of how to solve the problem, creating a hypothesis that can prove or disprove the theory, doing the smallest amount of application development and deployment needed to test your hypothesis, deploying the new code to production, observing how users interact with your software, and then using those observations to improve your software. The cycle, of course, repeats itself.
This whole process should take at most a week — hopefully just a day. All of these small batches, of course, add up over time to large pieces of software, but in contrast to a “large batch” approach, each small batch of code that survives the loop has been rigorously validated with actual users. Schools of thought such asLean Startup reduce this practice to helpfully simple sayings like “think, make, check.” Meanwhile, the Observe, Orient, Decide, Act (OODA) loop breaks the cycle down into even more precision. However you label and chart the small batch cycle, make sure you’re following a hypothesis driven cycle instead of assuming up-front that you know what how your software should be implemented.
As Liberty Mutual’s’ Chris Bartlow says, “document this hypothesis right because if you are disciplined in doing that you actually can have a more measurable outcome at the end where you can determine was my experiment successful or not.” This discipline gives you a tremendous amount of insight into decisions about the your software — features to add, remove, or modify. A small batch process gives you a much richer, fact-based ability to drive decisions.
“When you get to the stoplight on the circle [the end of a small batch loop] and you’re ready to make a decision on whether or not you want to continue, or whether or not you want to abandon the project, or experiment [more], or whether you want to pivot, I think [being hypothesis driven] gives you something to look back on and say, ‘okay, did my hypothesis come true at all,” Bartlow says, “is it right on or is it just not true at all?”
Long-term, you more easily avoid getting stuck in the “that’s the way we’ve always done it” lazy river current. The record of your experiments will also serve as an excellent report of your progress, even something auditors will cherish once you explain that log to them. These well-documented and tracked records are also your ongoing design history that you rely on to improve your software. The log helps makes even your failures valuable because you’ve proven something that does not work and, thus, should be avoided in the future. You avoid the cost and risk of repeating bad decisions.
This is the realm of multi-year projects that either underwhelm or are consistently late. As one manager at a large organization put it, “[w]e did an analysis of hundreds of projects over a multi-year period. The ones that delivered in less than a quarter succeeded about 80 percent of the time while the ones that lasted more than a year failed at about the same rate.”
No stranger to lengthy projects with, big, up-front analysis, the US Air Force is starting to think in terms of small batches for its software as well. “A [waterfall] mistake could cost $100 million, likely ending the career of anyone associated with that decision. A smaller mistake is less often a career-ender and thus encourages smart and informed risk-taking,” said M. Wes Haga.
Shift to user-centric design
If a small batch approach is the tool your organization now wields, a user-centric approach to software design is the ongoing activity you enable. There’s little new about taking a user-centric approach to software. What’s different is how much more efficient and fast creating good user experience and design is done thanks to highly networked applications and cloud-automated platforms.
When software was used exclusively behind the firewall and off networks as desktop applications, software creators had no idea how their software was being used. Well, they knew when there were errors because users fumed about bugs. Users never reported how well things were going when everything was working as planned. Worse, users didn’t report when things were just barely good enough and could be improved. This meant that software teams had very little input into what was actually working well in their software. They were left to, more or less, just make it up as they went along.
This feedback deficit was accompanied by slow release cycles. The complex, costly infrastructure used required a persnickety process of hardware planning, staging, release planning, and more operations work before deploying to production. Even the developers’ environments, needed to start any work, often took months to provision. Resources were scarce and expensive, and the lack of comprehensive automation across compute, storage, networking, and overall configuration required much slow, manual work.
The result of these two forces was, in retrospect, a dark age of software design. Starting in the mid-2000s, the ubiquity of always-on users and cloud automation removed these two hurdles.
Because applications were always hooked up to the network, it was now possible to observe every single interaction between a user and the software. For example, a 2009 Microsoft study found that only about one third of features added to the web properties achieved the team’s original goals — that is, were useful and considered successful. If you can quickly know which features are effective and ineffective, you can more rapidly improve your software, even eliminating bloat and the costs associated with unused, but expensive to support code.
By 2007, it was well understood that cloud automation dramatically reduced the amount of manual work needed to deploy software. The problem was evenly distributing those benefits beyond Silicon Valley and companies unfettered by the slow cycles of large enterprise. Just over 10 years later, we’re finally seeing cloud efficiencies spreading widely through enterprises. For example, Comcast realized a 75 percent lift in velocity and time to market when they used a cloud platform to automated their software delivery pipeline and production environment.
When you can gather, and thus, analyze all user interactions as well as deploy new releases at will, you can finally put a small batch cycle in place. And. this, you can create better user interaction and product design. And as we’ve seen in the past ten years, well designed products handily win out and bring in large profits.
Good design is worth spending time on. As Forrester consistently finds, organizations that focus on design tend to perform better financially than those that don’t. As such, design can be a highly effective competitive tool. Looking at the relationship between good design and revenue growth, Forrester found that organizations that focus on better design have a 14% lead on those that don’t. For example, “in two industries, cable and retail, leaders outperformed laggards by 24 percentage and 26 percentage points, respectively.”
I haven’t done a great job at describing what exactly good design looks like, let alone what the day-to-day work is. Let’s next look at simple case study with clear business results as an example.
The IRS historically used call centers to provide basic account information and tax payment services. Call centers are expensive and error prone: one study found that only 37% of calls were answered. Over 60% of people calling the IRS for help were simply hung-up on! With the need to continually control costs and deliver good service, the IRS had to do something.
In the consumer space, solving this type of account management problem has long been taken care of. It’s pretty easy in fact; just think of all the online banking systems you use and how you pay your monthly phone bills. But at the IRS, viewing your transactions had yet to be digitized.
When putting software around this, the IRS first thought that they should show you your complete history with the IRS, all of your transactions, as seen in the before UI example above. This confused users and most of them still wanted to pick up the phone. Think about what a perfect failure that is: the software worked exactly as designed and intended, it was just the wrong way to solve the problem.
Thankfully, because the IRS was following a small batch process, they caught this very quickly, and iterated through different hypotheses of how to solve the problem until they hit on a simple finding: when people want to know how much money they owe the IRS, they only want to know how much money they owe the IRS. When this version of the software was tested, most people didn’t want to use the phone.
Now, if the IRS was on a traditional 12 to 18 months cycle (or longer!) think of how poorly this would have gone, the business case would have failed, and you would probably have a dim view of IT and the IRS. But, by thinking about software in an agile, small batch way, the IRS did the right thing, not only saving money, but also solving people’s actual problems.
This project has great results: after some onerous up-front red-tape transformation, the IRS put an app in place which allows people to look up their account information, check payments due, and pay them. As of October 2017, there have been over 2 million users and the app has processed over $440m in payments. Clearly, a small batch success.
Create business agility with small batches
A small batch approach delivers value very early in the process with incremental releases of feature to production. This contrasts to a large batch approach which waits until the very end to (attempt to) deliver all of the value in one big lump. Of course, delivering early doesn’t delivering 1 year’s worth of work in one week. Instead, it means delivering just enough software to validate your design with user feedback.
Delivering early also allows you to prioritize your backlog, the list of requirements to implement. Organizations delivering weekly often find that a feature has been implemented “enough” and further development on the feature can be skipped. For example, to give people their hotel stay invoice, just allowing them to print a stripped down webpage might suffice instead of writing the code that creates and downloads a PDF. Once further development on that feature is de-prioritised, the team can decided to bring a new feature to the top of the backlog, likely ahead of schedule. This flexibility in priorities is one of the core of reasons agile software delivery makes business more agile and innovative.
Done properly a small batch approach also gives you a steady, reliable release train. This concept means that each week, your product teams will deliver roughly the same amount of “value” to production. Here, “value” means whatever changes they make to the software in production: typically, this is adding code that creates new features of modifies existing ones, but it could also be performance, security improvements, patches that ensure the software runs properly.
A functioning small batch process, then, gives you business agility and predictability. Trying out multiple ideas is now much cheaper, one of the keys to innovating new products and business models. The traditional, larger batch approach often requires millions of dollars in budget, driving the need for high-level approval, driving the need…to wait for the endless round of meetings and finance decisions, often on the annual budget cycle. This too often killed off ideas, as Allstate’s Opal Perry explains: “by the time you got permission, ideas died.” But with an MVP approach, as she contrasts, “a senior manager has $50,000 or $100,000 to do a minimum viable product” and can explore more ideas.
Case study: the lineworker knows best at Duke Energy
The team working on this went further than just trusting the VP’s first instincts, doing some field research with the actual line-workers. After getting to know the line-workers, they discovered a solution that redinfed the business problem. While the VP’s map would be a fine dashboard and give more information to central office, what really helped was developing a job assignment application for line-workers. This app would let line-workers locate their peers to, for example, partner with them on larger jobs, and avoid showing up at the same job. The app also introduced an Uber-like queue of work where line-workers could self-select which job to do next.
In retrospect this change seems obvious, but it’s only because the business paid attention to the feedback loop and user research and then reprioritized their software plans accordingly.
Transforming is easy…right?
Putting small batch thinking in place is no easy task: how long would it take you, currently, to deploy a single line of code, from a whiteboard to actually running in production? If you’re like most people, following the official process, it’d take weeks — just getting on the change review board’s schedule would take a week or more, and hopefully key approvers aren’t on vacation. This single-line of code thought experiment will start to flesh out what you need to do — rather, fix — to switch over to doing small batches.
Transforming one team, one piece of software isn’t easy, but it’s often very possible. Improving two applications usually works. How do you go about switching 10 applications over to a small batch process? How about 500?
Supporting hundreds of applications and teams — plus the backing services that support these applications — is a horse of a different color, rather, a drove of horses of many different colors. There’s no comprehensive manual for doing small batches at large scale, but in recent years several large organizations have been stampeding through the thicket. Thankfully, many of them have shared their success, failures, and, most importantly, lessons learned. We’ll look at their learnings next, with an eye, of course, at taming your organization’s big batch bucking.
Speed is the currency of business today and speed is the common attribute that differentiates companies and industries going forward. Anywhere there is lack of speed, there is massive business vulnerability:
● Speed to deliver a product or service to customers.
● Speed to perform maintenance on critical path equipment.
● Speed to bring new products and services to market.
● Speed to grow new businesses.
● Speed to evaluate and incubate new ideas.
● Speed to learn from failures.
● Speed to identify and understand customers.
● Speed to recognize and fix defects.
● Speed to recognize and replace business models that are remnants of the past.
● Speed to experiment and bring about new business models.
● Speed to learn, experiment, and leverage new technologies.
● Speed to solve customer problems and prevent reoccurrence.
● Speed to communicate with customers and restore outages.
● Speed of our website and mobile app.
● Speed of our back-office systems.
● Speed of answering a customer’s call.
● Speed to engage and collaborate within and across teams.
● Speed to effectively hire and onboard.
● Speed to deal with human or system performance problems.
● Speed to recognize and remove constructs from the past that are no longer effective.
● Speed to know what to do.
● Speed to get work done.
Continuous innovation only works with an enterprise that embraces speed and the data required to measure it. By creating conditions for continuous innovation, we must bring about speed. While this is hard, it has a special quality that makes the job a little easier. Through data, speed is easy to measure.
Innovation, on the other hand, can be extremely difficult to measure. For example, was that great quarterly revenue result from innovation or market factors? Was that product a one hit wonder or result of innovation? How many failures do we accept before producing a hit? These questions are not answerable. But we can always capture speed and measure effects of new actions. For example, we can set compliance expectations on speed and measure those results.
Speed is not only the key measurement, it becomes a driver for disruptive innovation. Business disruption has frequently arisen from startups and new technologies, not seeking optimization, but rather discovering creative ways to rethink problems to address speed. Uber is about speed. Mobile is about speed. IoT is about speed. Google is about speed. Drones are about speed. AirBnB is about speed. Amazon is about speed. Netflix is about speed. Blockchain is about speed. Artificial Intelligence is about speed.
Continuous Innovation then is the result of an enterprise, driven by speed, which is constantly collecting data, developing and evaluating ideas, experimenting and learning, and through creativity and advancing technologies, is constructing new things to address ever evolving customer needs.
Skilled, experienced team members are obviously valuable and can temper the risk failure by quickly delivering software. Everyone would like the mythical 10x developer, and would even settle for a 3 to 4x “full stack developer.” Surely, management often thinks, doing something as earth-shattering as “digital transformation” only works with highly skilled developers. You see this in surveys all the time: people say that lack of skills is a popular barrier to improving their organization’s software capabilities.
This mindset is one of the first barriers to scaling change. Often, an initial, team of “rockstars” has initial success, but attempts to clone them predictably fails and scaling up change is stymied. It’s that “lack of skills” chimera again. It’s impossible to replicate these people, and companies rarely want to spend the time and money to actually train existing staff.
Worse, when you use the only ninjas need apply tactic, the rest of the organization loses faith that they could change as well. “When your project is successful,” Jon Osborn explains, “and they look at your team, and they see a whole bunch of rockstars on it, then the excuse comes out, ‘well, you took all the top developers, of course you were successful.’”
Instead of only recruiting elite developers, also staff your initial teams with a meaningful dose of normals. This will not only help win over the rest of the organization as you scale, but also means you can actually find people. A team with mixed skill levels also allows you train your “junior” people on the job, especially when they pair with your so called “rockstars.”
Rockstars known to destroy hotel rooms
I met a programmer with 10x productivity once. He was a senior person and required 10 programmers to clean up his brilliant changes. –Anonymous on the c2 wiki
Usually what people find, of course, is that this rockstar/normal distinction is situational and the result of a culture that awards the lone wolf hero instead of staff that helps and supports each other. Those mythical 10x developers are lauded because of a visual cycle of their own creation. At some point, they spaghetti coded out some a complicated and crucial part of the system “over the weekend,” saving the project. Once in production, strange things started happening to that piece of code, and of course our hero was the only one who could debug the code, once again, over the weekend. This cycle repeats itself, and we laud this weekend coder, never realizing they’re actually damaging our business.
Relying on these heros, ninjas, rockstars, or what have you is a poor strategy in a large organization. Save the weekend coding for youngsters in Ramen chomping startups that haven’t learned better yet. “Having a team dynamic and team structure that the rest of the organization can see themselves in,” Osborn goes on to say, “goes a long way towards generating a buy in that you’re actually being successful and not cheating by using all your best resources.”
When possible, recruiting volunteers is the best option for your initial projects, probably for the first year. Forcing people to change how they work is a recipe for failure, esp. at first. You’ll need motivated people who are interested in change or, at least, will go along with it instead of resisting it.
Osborn describes this tactic at Great American Insurance Group: “We used the volunteer model because we wanted excited people who wanted to change, who wanted to be there, and who wanted to do it. I was lucky that we could get people from all over the IT organisation, operations included, on the team… it was a fantastic success for us.”
This might be difficult at first, but as a leader of change you need to start finding and cultivating these change-ready volunteers. Again, you don’t necessarily want rockstars, so much as open minded people who enjoy trying new things.
Rotating out to spread the virus of digital transformation
Few organizations have the time or budget-will to train their staff. Management seems to think that a moist bedding of O’Reilly books in a developer’s dark room will suddenly pop-up genius skills like mushrooms. Rotating pairing in product teams addresses this problem in a minimally viable way inside a team: team members learn from each other on a daily basis. Event better, staff is actually producing value as they learn instead of sitting in a neon-light buzzing conference room working on dummy applications.
To scale this change, you can selectively rotate staff out of a well functioning team into newer teams. This seeds their expertise through the organization, and once you repeat this over and over, knowledge will spread faster. One person will work with another, becoming two skilled people, who each work with another person, become four skilled people, then eight, and so on. Organizations like Synchrony go so far as the randomly shuffle desks every six months to ensure people are moving around.
More than just skill transfer and on the job training, rotating other staff through your organization will help spread trust in the new process. People tend to trust their peers more than leaders throwing down change from high, and much more than external “consultants,” and worse, vendor shills like myself. As ever, building this trust through the organization is key to scaling change.
Orange France is one of the many examples of this strategy in practice. After the initial success revitalizing their SMB customer service app, Orange started rotating developers to new teams. Developers that worked on the new mobile application pair with Orange developers from other teams, the website team. As ever with pairing, they both teach the peers how to apply agile and improve the business with better software at the same time. Talking about his experience with rotating pairing, Orange’s Xavier Perret says that “it enabled more creativity in the end. Because then you have different angles, [a] different point of view. As staff work on new parts of the system they get to know the big picture better and being “more creative problem solving” to each new challenge, Perret ads.
While you may start with ninjas, you can take a cadre of volunteers and slowly by surely build up a squad of effective staff that can spread transformation throughout your organization. All with less throwing stars and trashed hotel rooms than those 10x rockstars leave in their wake.
Every journey begins with a single step, they say. What they don’t tell you is that you need to pick your first step wisely. And there’s also step two, and three, and then all of the n + 1 steps. Picking your initial project is important because you’ll be learning the ropes of a new way of developing and running software, and hopefully, of running your business.
When it comes to scaling change, choosing your first project wisely is also important for internal marketing and momentum purposes. The smell of success is the best deodorant, so you want your initial project to be successful. And…if it’s not, you quietly sweep it under the rug so no one notices. Few things will ruin the introduction of a new way of operating into a large organization than initial failure. Following Larman’s Law, the organization will do anything it can — consciously and unconsciously — to stop change. One sign of weakness early, and your cloud journey will be threatened by status quo zombies.
The USAF had been working for at least 5 years to modernize the 43 applications used in Central Air Operations Command, going through several hundreds of millions of dollars. These applications managed the US’s and allie’s daily air missions throughout Iraq, Syria, Afghanistan, and nearby countries. No small task of import. The applications were in sort need of modernizing, and some weren’t even really applications: the tanker refueling scheduling team used a combination of Excel spreadsheets and a whiteboard to plan the daily jet refueling missions.
Realizing that they’re standard 5 to 12 years cycle to create new applications wasn’t going to cut it, the US Air Force decided to try something new: a truly agile, small batch approach. Within 120 days, a suitable version of the tanker refueling application was in production. The tanker team continued to release new features on a weekly, even daily basis. The project was considered a wild success: the time to make the tanker schedule was reduced from 8 hours to 2, from 8 airmen to 1, and the USAF ended up saving over $200,000 a day in fuel that no longer needed to be flown around as backup for error in the schedule.
The success of this initial project, delivered in April of 2017, called JIGSAW, proved that a new approach would work, and work well. This allowed the group driving change at the USAF to start another project, and then another one, eventually getting to 13 projects in May of 2018 (5 in production and 8 in development. The team estimates that by January of 2018 they’ll have 15 to 18 applications in production.
The team’s initial success, though just a small part of the overall 43 applications, gave them the momentum to starting scale change to the rest of the organization and more applications.
Project picking peccadilloes
Picking the right projects to start with is key. They should be material to the business, but low risk. They should be small enough that you can quickly show success in the order of months and also technically feasible for using cloud technologies. These shouldn’t be science projects or automation of low value office activities — no augmented reality experiments or conference room schedulers (unless those are core to your business). On the other hand, you don’t want to do something too big, like migrate the .com site. Christopher Tretina recounts Comcast’s initial cloud-native ambitionsin this way:
We started out with a very grandiose vision… And it didn’t take us too long to realize we had bitten off a little more than we could chew. So around mid-year, last year, we pivoted and really tried to hone in and focus on what were just the main services we wanted to deploy that’ll get us the most benefit.
Your initial projects should also enable you to test out the entire software life cycle — all the way from conception to coding to deployment to running in production. Learning is a key goal of these initial projects and you’ll only do that by going through the full cycle.
The Home Depot’s Anthony McCulley describes the applications his company chose in the first six or so months of its cloud-native roll-out. “They were real apps. I would just say that they were just, sort of, scoped in such a way that if there was something wrong, it wouldn’t impact an entire business line.” In The Home Depot’s case, the applications were projects like managing (and charging for!) late tool rental returns and running the in store, custom paint desk.
A special case for initial projects is picking a microservice to deploy. Usually, such a service is a critical backend service for another application. A service that’s taken forever to actually deliver, or has been unchanged and ancient for years is an impactful choice. This is not as perfect a use case as a full-on, human-facing project, but it will allow you to test out cloud-native principals and rack up a success to build momentum. The microservice could be something like a fraud detection or address canonicalization service. This is one approach to migrating legacy applications in reverse order, a strangler from within!
Picking projects by portfolio pondering
There are several ways to select your initial projects. Many Pivotal customers use a method perfected over the past 25 years by Pivotal Labs called discovery. In the abstract, it follows the usual BCG matrix approach, flavored with some Eisenhower matrix. This method builds in intentional scrappiness to do a portfolio analysis with the limited time you can secure from all of the stakeholders. The goal is to get a ranked list of projects based on your organization’s priorities and the easiness of the projects.
First, gather all of the relevant stakeholders. This should include a mix of people from the business and IT sides, as well as the actual team that will be doing the initial projects. A discovery session is typically led by a facilitator, preferably someone familiar with coaxing a room through this process.
The facilitator typically hands out stacks of sticky notes and markers, asking everyone to write down projects that they think are valuable. What “valuable” means will depend on each stakeholder. We’d hope that the more business minded of them would have a list of corporate initiatives and goals in their heads (or a more formal one they brought to the meeting). One approach used in Lean methodology is to ask management this question: “If we could do one thing better, what would it be?” Start from there, maybe with some five whys spelunking.
Participants then put up their sticky notes in the quadrant, forcing themselves not to weasel out and put the notes on the lines. Once everyone finishes, you get a good sense of projects that all stakeholders think are important, sorted by the criteria I mentioned, primarily that they’re material to the business (important) and low risk (easy). If all of the notes are clustered in one quadrant (usually, in the upper right, of course), the facilitator will redo the 2×2 lines to just that quadrant, forcing the decision of narrowing down to just projects to do now. The process might repeat itself over several rounds. To enforce project ranking, you might also use techniques like dot voting which will force the participants to really think about how they would prioritize the projects given limited resources.
At the end, you should have a list of projects, ranked by the consensus of the stakeholders in the room.
Planning out the initial project
You may want to refine your list even more, but to get moving, pick the top project and start breaking down what to do next. How you proceed to do this is highly dependent on how your product teams breaks down tasks into stories, iterations, and releases. More than likely, following the general idea of a small batch process you’ll
Create an understanding of the user(s) and the challenges they’re trying to solve with your software through personas and approaches like scenarios or Jobs to be Done.
Come up with several theories for how those problems could be solved.
Distill the work to code and test your theories into stories.
Add in more stories for non-functional requirements (like setting up build processes, CI/CD pipelines, testing automation, etc.).
Arrange stories into iteration-sized chunks without planning too far ahead (least you’re not able to adapt your work to the user experience and productivity findings from each iteration)
The chart measures application instances in Pivotal Cloud Foundry, which does not map exactly to a single application. As of December 2016, The Home Depot had roughly 130 applications deployed in Pivotal Cloud Foundry. What’s important is the general shape and acceleration of the curve. By starting small, with real applications, The Home Depot became learned the new process and at the same time delivered meaningful results that helped them scale their transformation.
The phrase “digital transformation” is mostly bull-shit, but then again, it’s perfect. The phrase means executing a strategy to innovate new business models driven by rapidly delivered, well designed, and agile software. For many businesses, fixing their long dormant, lame software capabilities is an urgent need: companies like Amazon loom as over-powering competitors in most every industry. More threatening, clever, existing enterprises have honed their ability software capabilities over the past five years.
Liberty Mutual, for example, entered a new insurance market on the other side of the world in 6 months, doubling the average close rate. Home Depot has grown it’s online business by around $1bn each of the past four years, is the #2 ranked digital retailer by Gartner L2, and is adding more than 1,000 technical hires in 2018. The US Air Force modernized their air tanker scheduling process in 120 days, driving $1m in fuel savings each week, and leading to canceling a long-standing $745m contract that hadn’t delivered a single line of code in five years.
Whatever businesses you’re in, able, ruthless competition is coming from all sides: new entrants and existing behemoths. Their success is driven by an agile, cloud-driven software strategy that transforms their organizations into agile businesses.
Let’s take a breath.
That’s some full tilt bluster, but we’ve been in an era of transient advantage for a long time. Businesses need every tool they can lay hands on to grow their business, sustain their existing cash-flows, and fend off competitors. IT has always been a powerful tool for enabling strategies, as they say, but in the past 10 years seemingly helpful but actually terrible practices like outsourcing have ruined most IT department’s ability to create useful software for the businesses they supposedly support.
These organizations need to improve how they do software to transform their organizations into programmable businesses.
Studying how large organizations plan for, initially fail, and then succeed at this kind of transformation is what I spend my time doing. This book (which I’m still working on) collects together what I’ve found so far, and is constructed from the actual experiences and stories of people of who’ve suffered through the long journey to success.
Enjoy! And next time someone rolls their eyes at the phrase “digital transformation,” ask them, “well, what better phrase you got, chuckle-head?”
I’m posting draft chapters of this book as I MVP-polish them up. In sort of the right order, here they are:
If a strategy is presented in the boardroom but employees never see, is it really a strategy? Obviously, not. Leadership too often believes that the strategy is crystal clear but staff usually disagree. For example, in a survey of 1,700 leaders and staff, 69% of leaders said their vision was “pragmatic and could easily translated into concrete projects and initiatives.” Employees, had a glummer picture: only 36% agreed.
Your staff likely doesn’t know the vision and strategy. More than just understanding it, they rarely know how they can help. As Boeing’s Nikki Allen put it:
In order to get people to scale, they have to understand how to connect the dots. They have to see it themselves in what they do — whether it’s developing software, or protecting and securing the network, or provisioning infrastructure — they have to see how the work they do every day connects back to enabling the business to either be productive, or generate revenue.
There’s little wizardry to communicating strategy. First, it has to be compressible. But, you already did that when you established your vision and strategy…right? Next, you push it through all the mediums and channels at your disposal to tell people over and over again. Chances are, you have “town hall” meetings, email lists, and team meetings up and down your organization. Recording videos and podcasts of you explaining the vision and strategy is helpful. Include strategy overviews in your public speaking because staff often scrutinizes these recordings. While “Enterprise 2.0” fizzled out several years ago, Facebook has trained all us to follow activity streams and other social flotsam. Use those habits and the internal channels you have to spread your communication.
You also need to include examples of the strategy in action, what worked and didn’t work. As with any type of persuasion, getting people’s peers to tell their stories are the best examples. Google and others find that celebrating failure with company-wide post mortems is instructive, career-ending crazy as that may sound. Stories of success and failure are valuable because you can draw a direct line between high-level vision to fingers on keyboard. If you’re afraid of sharing too much failure, try just opening up status metrics to staff. Leadership usually underestimates the value of organization-wide information radiators, but staff usually wants that information to stop prairie dogging through their 9 to 5.
As you’re progressing, getting feedback is key: do people understand it? Do people know what to do to help? If not, then it’s time to tune your messages and mediums. Again, you can apply a small batch process to test out new methods of communicating. While I find them tedious, staff surveys help: ask people if they understand your strategy. Be to also ask if know how to help execute the strategy.
Manifestos can help decompose a strategy into tangible goals and tactics. The insurance industry is on the cusp of turbulent competitive landscape. To call it “disruptive,” would be too narrow. To pick one sea of chop, autonomous vehicles are “changing everything about our personal auto line and we have to change ourselves,” says Liberty Mutual’s Chris Bartlow. New technologies are only one of many fronts in Liberty’s new competitive landscape. Every existing insurance company and cut-throat competitors like Amazon are using new technologies to both optimize existing business models and introduce new ones.
“We have to think about what that’s going to mean to our products and services as we move forward,” Bartlow says. Getting there required re-engineering Liberty’s software capabilities. Like most insurance companies, mainframes and monoliths drove their success over past decades. That approach worked in calmer times, but now Liberty is refocusing their software capability around innovation more than optimization. Liberty is using a stripped down set of three goals to make this urgency and vision tangible.
“The idea was to really change how we’re developing software. To make that real for people we identified these bold, audacious moves — or ‘BAMS,’” says Liberty Mutual’s John Heveran:
These BAMs grounded Liberty’s strategy, giving staff very tangible, if audacious, goals. With these in mind, staff could start thinking about how they’d achieve those goals. This kind of manifesto, makes strategy actionable.
So far, it’s working. “We’re just about cross the chasm on our DevOps and CI/CD journey,” says Liberty’s Miranda LeBlanc. “I can say that because we’re doing about 2,500 daily builds, with over a 1,000 production deployments per a day,” she adds. These numbers are tracers of putting a small batch process in place that’s used to improve the business. They now support around 10,000 internal users at Liberty and are better provisioned for the long ship ride into insurance’s future.
Choosing the right language is important for managing IT transformation. For example, most change leaders suggest dumping the term “agile.” At this point, near 25 years into “agile,” everyone feels like they’re agile experts. Whether that’s true is irrelevant. You’ll faceplam your way through transformation if you’re pitching switching to a methodology people believe they’ve long mastered.
It’s better to pick your own branding for this new methodology. If it works, steal the buzzwords du jour, from “cloud native,” DevOps, or serverless. Creating your own brand is even better. As we’ll discuss later, Allstate created a new name, CompoZed Labs, for its transformation effort. Using your own language and branding can help bring smug staff onboard and involved. “Oh, we’ve always done that, we just didn’t call it ‘agile,’” sticks-in-the-mud are fond of saying as they go off to update their Gantt charts.
Make sure people understand why they’re going through all this “digital transformation.” And make even more sure they know how to implement the vision and strategy, or, as you start thinking, our strategy.
Lone wolves rarely succeed at transforming business models and behavior at large organizations. True to the halo effect, you’ll hear about successful lone wolves often. What you don’t hear about are all the lone wolves who limped off to die alone. Even CEOs and boards often find that change-by-mandate efforts fail. “Efforts that don’t have a powerful enough guiding coalition can make apparent progress for a while,” as Kotter summarizes, “But, sooner or later, the opposition gathers itself together and stops the change.”
Organizations get big by creating and sustaining a portfolio of revenue sources, likey over decades. While these revenue sources may transmogrify from cows to dogs, if frightened or backed into a corner, hale but mettlesome upstarts will are usually trampled by the status quo stampede. At the very least, they’re constantly protecting their neck from frothy, sharp-tooth jackals. You have to work with those cows and canines, often forming “committees.” Oh, and, you know, they might actually be helpful.
How you use this committee is situation. It might be the placate enemies who’d rather see you fail than succeed, looking to salvage corporate resources from the HMS Transformation’s wreak. The old maxim to keep your friends close and your enemies closer summarizes this tactic well. Getting your “enemies” committed to and involved in your project is an obvious, facile suggestion, but it’ll keep them at bay. You’ll need to remove my cynical tone from your committee and actually rely on them for strategic and tactical input, support in budgeting cycles, and, eventually, involvement in your change.
For example, a couple years back I was working with all the C-level executives at a large retailer. They’d come together to understand IT’s strategy to become a software defined business. Of course, IT could only go so far and needed the the actual lines of business to support and adopt that change. The IT executives explained how transforming to a cloud native organization would improve the company’s software capabilities in the morning. In the afternoon, they all started defining a new application focused on driving repeat business, using the very techniques discussed in the morning. This workshopping solidified IT’s relationship with key lines of business and started working transforming those businesses. It also kicked off real, actual work on the initiative. By seeing the benefits of the new approach in action, IT also won over the CFO who’d been the most skeptical.
As this anecdote illustrates, building an alliance often requires serving your new friends. IT typically has little power to drive change, especially after decades of positioning themselves as a service bureau instead of a core enabler of growth. As seen in the Duke lineworker case above, asking the business what they’d like changed is more effective than presuming to know. As that case also shows, a small batch process discovers what actually needs to happen despite the business’ initial theories. But, getting there requires a more of a “the customer is always right” approach on IT’s part.
Now, there are many tactics for managing this committee; as ever Kotter does an excellent job of cataloging them in Leading Change. In particular, you want to make sure the committee members remain engaged. Good executives can quickly smell a waste of time and will start sending junior staff if the wind of change smells stale (wouldn’t you do the same?). You need to manage their excitement, treating them as stakeholder and customers, not just collaborators. Luckily, most organizations I’ve spoken with find that cloud native technologies and methodologies so vastly improve their software capabilities, in such a short amount of time that winning over peers is easy. As one executive a year intro their digital transformation program told me, “holy-@$!!%!@-cow we are starting to accelerate. It’s getting hard to not overdo it. I have business partners lined up out the door.”
Tracking the health of your overall innovation machine can be both overly simplified and overly complex. What you want to measure is how well you’re doing at software development and delivery as it relates to improving your organization’s goals. You’ll use these metrics to track how your organization is doing at any given time and, when things go wrong, get a sense of what needs to be fixed. As ever with management, you can look at this as a part of putting a small batch process in place: coming up with theories for how to solve your problems and verifying if the theory worked in practice or not.
All that monitoring
In IT most of the metrics you encounter are not actually business oriented and instead tell you about the health of your various IT systems and processes: how many nodes are left in a cluster, how much network traffic customers are bringing in, how many open bugs development has, or how many open tickets the help desk is dealing with on average.
All of these metrics can be valuable, just as all of them can be worthless in any given context. Most of these technical metrics, coupled with ample logs, are needed to diagnose problems as they come and go. In recent years, there’ve been many advances in end-to-end tracing thanks to tools like Zipkin and Spring Sleuth. Log management is well into its newest wave of improvements, and monitoring and IT management analytics are just ascending another cycle of innovation — they call it “observability” now, that way you know it’s different this time!
Instead of looking at all of these technical metrics, I want to look at a few common metrics that come up over and over in organizations that are improving their software capabilities.
Six common cloud native metrics
Some metrics consistently come up when measuring cloud native organizations:
Lead time is how long it takes to go from an idea to running code in production, it measures how long your small batch loop takes. It includes everything in-between from specifying the idea, writing the code and testing it, passing any governance and compliance needs, planning for deployment and management, and then getting it up and running in production.
If your lead time is consistent enough, you have a grip on IT’s capability to help the business by creating and deploying new software and features. Being this machine for innovation through software is, as you’ll hopefully recall, the whole point of all this cloud native, agile, DevOps, and digital transformation stuff.
As such, you want to monitoring your lead closely. Ideally, it should be a week. Some organizations go longer, up to two weeks, and some are even shorter, like daily. Target and then track an interval that makes sense for you. If you see your lead time growing, then you should stop everything, find the bottlenecks and fix them. If the bottlenecks can’t be fixed, then you probably need to do less each release.
Velocity shows how many features are typically deployed each week. Whether you call features “stories,” “story points,” “requirements,” or whatever else, you want to measure how many of them the team can complete each week; I’ll use the term “story.” Velocity tells you three things:
Your progress to improving and ongoing performance — at first, you want to find out what your team’s velocity is. They will need to “calibrate” on what they’re capable of doing each week. Once you establish this base line, if it goes down something is going wrong and you can investigate.
How much the team can deliver each week — once you know how many features your team can deliver each week, you can more reliability plan your road-maps. If a team can only deliver, for example, 3 stories each week, asking them to deliver 20 stories in a month is absurd. They’re simply not capable of doing that. Ideally, this means your estimates are no longer, well, always wrong.
If the the scope of features is getting too big or too small — if previously, reliability performing team’s velocity starts to drop, it means that they’re scoping their stories incorrectly: they’re taking on too much work, or someone is forcing them to. On the other hand, if the team is suddenly able to deliver more stories each week or finds themselves with lots of extra time each week, it means they should take on more stories each week.
There are numerous ways to first calibrate on the number of stories a team can deliver each week and managing that process at first is very important. As they calibrate, your teams will, no doubt, get it wrong for many releases, which is to be expected (and one of the motivations in picking small projects at first instead of big, important ones). Other reports like burn down charts can help illustrate how the team’s velocity is getting closer to delivering across major releases (or in each release) and help you monitor any deviation from what’s normal.
In general, you want your software to be as responsive as possible. That is, you want it to be fast. We often think of speed in this case, how fast is the software running and how fast can it respond to requests? Latency is a slightly different way of thinking about speed, namely, how long does a request take end-to-end to process, returning back to the user.
Latency is different than the raw “speed” of the network. For example, a fast network will send a static file very quickly, but if the request requires connecting to a database to create and then retrieve a custom view of last week’s Austrian sales, it will take awhile and, thus, the latency will be much longer than downloaded an already made file.
From a user’s perspective, latency is important because an application that takes 3 minutes to respond versus 3 milliseconds might as well be “unavailable.” As such, latency is often the best way to measure if your software is working.
Measuring latency can be tricky….or really simple. Because it spans the entire transaction, you often need to rely on patching together a full view — or “trace” — of any given user transaction. This can be done by looking at locks, doing real or synthetic user-centric tracing, and using any number of application performance monitoring (APM) tools. Ideally, the platform you’re using will automatically monitor all user requests and also put together catalog all of the sub-processes and sub-sub-processes that make up the entire request. That way, you can start to figure why things are so slow.
Often, your systems and software will tell when there’s an error: an exception is thrown in the application layer because the email service is missing, an authentication service is unreachable so the user can’t login, a disk is failing to write data. Tracking and monitoring these errors is, obviously, a good idea. Some of them will range from “is smoke coming out of the box?” to more obtuse ones like servicing being unreachable because DNS is misconfigured. Oftentimes, errors are roll-ups of other problems: when a web server fails, returning a 500 response code, it means something went wrong, but doesn’t the error doesn’t usually tell you what happened.
Error rates also occur before production, while the software is being developed and tested. You can look at failed tests as error rates, as well as broken builds and failed compliance audits.
Fixing errors in development can be easier and more straight forward, whereas triaging and sorting through errors in production is an art. What’s important to track with errors is not just that one happened, but the rate at which they happen, perhaps errors per second. You’ll have to figure out an acceptable level of errors because there will be many of them. What you do about all these errors will be driven by your service targets. These targets may be foisted on you in the form of heritage Service Level Agreements or you might have been lucky enough to negotiate some sane targets.
Chances are, a certain rate of errors will be acceptable (have you ever noticed that sometimes, you just need to reload a web-page?) Each part of your stack will throw off and generate different errors: some are meaningless (perhaps they should be more warnings or even just informative notices, e.g., “you’re using an older framework that might be deprecated sometime in the next 30 years) and others could be too costly, or even impossible to fix (“1% of user’s audio uploads fail because their upload latency and bandwidth is too slow”). And some errors may be important above all else: if an email server is just losing emails every 5 minutes…something is terribly wrong.
Generally, errors are collected from logs, but you could also poll the service in question and it might send alerts to your monitoring systems, be that an IT management system or just your phone.
If you can accept the reality that things will go wrong with software, how quickly you can fix those problems becomes a key metric. It’s bad when an error happens, but it’s really bad if it takes you a long time to fix it.
Tracking mean-time-to-repair is an ongoing measurement of how quickly you can recovering from errors. As with most metrics, this gives you a target to improve towards and then allows you to make sure you’re not getting worse.
If you’re following cloud native practices and using a good platform, you can usually shrink your MTTR with the ability to roll back changes. If a release turns out to be bad (an error), you can back it out quickly, removing the problem. This doesn’t mean you should blithely roll out bad releases, of course.
Measuring MTTR might require tracking support tickets and otherwise manually tracking the time between incident detection and fix. As you automate remediations, you might be able to easily capture those rates. As with most of these metrics, what becomes important in the long term is tracking changes to your acceptable MTTR and figuring out why the negative changes are happening.
Everyone wants to measure cost, and there are many costs to measure. In addition to the time spent developing software and the money spent on infrastructure, there are ratios you’ll want to track like number of applications to platform operators. Typically, these kinds of ratios give you a quick sense of how efficiently IT runs. If each application takes one operator, something is probably missing from your platform and process. T-Mobile, for example, manages 11,000 containers in production with just 8 platform operators.
There are also less direct costs like opportunity and value lost due to waiting on slow release cycles. For example, the US Air Force calculated that is saved $391M by modernizing it’s software methodology. The point is that you need to obviously track the cost of what you’re doing, but you also need to track the costs of doing nothing, which might be much higher.
Of course, none of the metrics so far has measured the most valuable, but difficult metric: value delivered. How do you measure your software’s contribution to your organization’s goals? Measuring how the process and tools you use contributes to those goals is usually harder. This is the dicey plain of correlation versus causation.
Somehow, you need to come up with a scheme that shows and tracks how all this cloud native stuff you’re spending time and money on is helping the business grow. You want to measure value delivered over time to:
Prove that you’re valuable and should keep living and get more funding,
Figure out when you’re failing to deliver so that you can fix it
There are a few prototypes of linking cloud native activities to business value delivered. Let’s look at a few examples:
As described in the case study above, when the IRS replaced call centers with poor availability with software, IT delivered clear business value. Latency and error rates decreased dramatically (with phone banks, only 37% of calls made it through) and the design improvements they discovered led to increased usage of the software, pulling people away from the phones. And, then, the results are clear: by the Fall of 2017, the this application had collected $440m in back taxes.
Running existing businesses more efficiently is a popular goal, especially for large organizations. In this case, the value you deliver with cloud native will usually be speeding up businesses processes, removing wasted time and effort, and increasing quality. Duke Energy’s lineworker case is a good example, here. Duke gave lineworkers a better, highly tuned application that queue and coordinate their work in the field. The software increased lineworker’s productivity and reduced waste, directly creating business value in efficiencies.
The US Air Force’s tanker scheduling case study is another good example here: by adapting a cloud native, software model they were able to ship the first version in 120 days and started saving $100,000’s in fuel costs each week. Additionally, the USAF computed the cost of delay — using the old methods that took longer — at $391M, a handy financial metric to consider.
And, then, of course, there comes raw competition. This most easily manifests itself as time-to-market, either to match competitors or get new features out before them. Liberty Mutual’s ability to enter the Australian motorcycle market, from scratch, in six months is a good example. Others, like Comcast demonstrate competing with major disruptors like Netflix.
It’s easy to get very nuanced and detailed when you’re mapping IT to business value. You need to keep things as simple as possible, or, put another way, only as complex as needed. As with the example above, clearly link your cloud native efforts to straight forward business goals. Simply “delivering on our commitment to innovation” isn’t going to cut it. If you’re suffering under vague strategic goals, make them more concrete before you start using them to measure yourself. On the other end, just lowering costs might be a bad goal to shoot for. I talk with many organizations who used outsourcing to deliver on the strategic goal of lowering costs and now find themselves incapable of creating software at the pace their business needs to compete.
Fleshing out metrics
I’ve provided a simplistic start at metrics above. Each layer of your organization will want to add more detail to get better telemetry on itself. Creating a comprehensive, umbrella metrics system is impossible, but there are many good templates to start with.
Pivotal has been developing a cloud native centric template of metrics, divided into 5 categories:
These metrics cover platform operations, product, and business metrics. Not all organizations will want to use all of the metrics, and there’s usually some that are missing. But, this 5 S’s template is a good place to start.
If you prefer to go down rabbit holes rather than shelter under umbrellas, there are more specialized metric frameworks to start with. Platform operators should probably start by learning how the Google SRE team measures and manages Google, while developers could start by looking at TK( need some good resource ).
Whatever the case, make sure the metrics you choose are
targeting the end goal of putting a small batch process in place to create better software,
reporting on your ongoing improvement towards that goal, and,
alerting you that you’re slipping and need to fix something…or find a new job.
After transforming management, comes changing theactual teams working on your software. Most organizations think about software as projects: something that has a finite set of requirements, schedule and budget, and a finite end date, the date it’s delivered. There’s some consideration put into updating the software, managing its full life. This project mentality can seem to make a lot of sense, esp. if you’re outsourcing much of your software.
Mark Schwartz calls this project approach the “contractor control” model. His theory is that this model is based on management’s mis-trust of the people doing the work, whether they’re contractors or internal staff. As he outlines in A Seat at the Table, this drives a heavy emphasis on status meetings, compliance, and lock-step road-maps:
How could you control a contractor? You asked for an estimate and pressured the contractor to deliver at or close to that estimate. Or you agreed on a fixed price. How could you control IT? Same model, but with the twist that the IT staff were your own employees who were paid a fixed salary — a bit awkward. Since their cost was fixed (at least in the short and medium terms), your biggest worry was that they would waste time on frivolous activities. How could you know that they weren’t? Simple: you insisted that they deliver on schedule, and kept the pressure on them to do so. Some of IT’s work was transactional: user support, device provisioning, updates, and maintenance. In those areas, costs and lead times could be benchmarked and monitored. But a good deal of IT’s work involved delivering capabilities — developing and integrating applications, rolling out ERP systems, installing collaboration tools, and so on. For IT to demonstrate that it was performing that type of work responsibly and for the business to verify that it was doing so, the scope of each task had to be defined precisely, bounded, and agreed upon in advance. The work had to be organized into projects, which are units of work with a defined set of deliverables, a beginning, and an end. You could establish control by making sure the project was completed within the bounds of its estimated cost and schedule. How perfect the Waterfall model is for this purpose! How perfectly it aligned with the business’s need to know that IT was under control.
The results are predictable: frustrated people, bad software, and slow failure like two drunks in a bar-brawl.
Sobering up starts with two words: product and team. Doing software well requires thinking about it as a product, not an ongoing project. This is how software vendors think, of course: the software they create it what they sell. A product is interested in accomplishing its user’s (and customer’s) goals, ongoing, until it’s no longer needed. A product evolves to match the user’s evolving needs. The focus is on continually making the best software possible.
Product-centric organizations move from the functional organization, contractor control model to an agile, product-centric model.
Focusing with a product team
Typically our developers, or business analysts, or designer, work on five projects at the same time, and if you are good you work on ten. That is tremendously inefficient. So the first task was to get them out of their traditional environment and put them into the garage, concentrating on one project. That is mandatory that the team members work co-located — all disciplines from market management to analysts, designers, and developers. — Dr Andreas Nolte, CIO, Allianz
One of the most important aspects of a product team is focusing on just one application. The contractor-control model typically drives an organization to be functional based: developers, QA, DBAs, networking engineers, support staff, and on and on. After all these years, I’m not sure why this is: it must have seemed like a better use of scarce resources, people and compute, at the time.
Organizations that are improving their software are finding that this “silo” approach doesn’t, however, optimize much. There are too many handoffs between each group, the lack of trust across silos drives both too much paperwork and a “not my problem” mind-set, all resulting in poor software.
A small batch approach to discovering, and then perfecting, your software requires a lot of trial and error experimentation. You’re operating in an incredibly chaotic, information-poor environment, where decisions need to be made constantly to keep moving ahead. In complex systems like that, speed and quick access to information are important to make decisions. You should always be exploring and looking for the best fit for your software to address the customer problem you’re looking to solve.
When you’re executing on a weekly, if not daily, deployment feedback loop, you don’t have a lot of time to synchronize across teams. You also don’t have time to continually “go up the chain” to ask permission to try something new.
Although the exact implementation of a balanced team can vary, in general, it’s one team of people per application with responsibility for and authority over the software you’re creating and delivering. This team is dedicated full time to the application for which they have ownership.
An innovative, product-centric approach is what’s needed in an exploratory process like software creation, where people don’t know what they want and are often (at first) completely wrong about what they want. Teams in this setting should be more closely attached to the software being written rather than parachuted in as needed. The team’s understanding of the problem being solved, approaches tried in the past, and the overall tribal knowledge will be invaluable in figuring out the right product to build. Integrated teams are not only important for product management continuity, but also for ensuring that the product is resilient in production. It’s vital to keep these teams small and they should have all the skills needed for the full life cycle of the product, including development, testing, design, and operations.
Staffing wise, this typically means teams of roughly 6–12 developers, at least one designer, a product manager, and often some part time roles like QA and operations support. In larger organizations, there will likely be shared resources that may start to look dangerously like traditional teams in silos (e.g.,security testers or domain experts). Ideally, you truly want every role and person on the same team, but that’s not always possible.
Next, let’s look at the types of roles that are typically on these teams.
Product team roles
The composition of product teams will change over time as each team gels, learning the necessary cloud native operations skills and mastering the domain knowledge needed to make good design choices.
As detailed below, the core team is composed of developers, designers, and a product owner. There are also some supporting roles that come and go as needed — testers, architects, DBAs, data scientists, and other specialists.
As developers are the ones implementing your applications, they are closely involved with estimating how long stories in your backlog and prioritizing them for releases. Similarly, developers can help product managers and designers walk through the technical trade-offs involved in deciding how to implement and design features.
Developers are not expected to be experts in all operations concerns. Instead, they rely on the self-service and automation capabilities of cloud platforms for the most common operations needs. This means they don’t need to wait for operations staff to perform configuration management tasks to deploy applications. There will, of course, be operations knowledge that developers need to learn, especially when it comes to designing highly networked, distributed applications. Initially, prescriptive platform patterns help here, as well as embedded operations staff. In addition to relying on the cloud platform to automate and enforce (now) routine operations tasks, over time, developers often gain enough operations knowledge to work without dedicated operations support.
The number of developers on each team is variable, but so far, following the two pizza team rule of thumb, we typically see anywhere from one to three pairs, that is two to six developers, and sometimes more.
Product Owner/Product Manager
This role defines and guides application requirements. It is also one of the roles that varies in responsibilities the most across products. At its core, this role is the owner of the software under development, though it’s more accurate to think of product managers as the one setting the vision week-to-week and pointing the team in the right direction. In that respect, product managers help prioritize, plan, and deliver software that meets requirements or, stories as they’re commonly called. Someone has to be the final word on what happens in the team. The amount of control versus consensus-driven management is the main point of variability in this role, plus the topic areas in which the product owner has knowledge.
It’s best to approach the product owner role as a breadth-first role: these individuals have to understand the business, the customer, and the technical capabilities. This broad knowledge helps them make sure they’re making the right prioritization decisions.
In organizations that are transitioning to cloud native, this role also serves as the barrier between the all-too-fragile new teams and the existing, legacy teams. The product owner becomes the gatekeeper that keeps all the helpful interest and requests at bay so that the teams can focus on their work.
One of the major lessons of contemporary software is that design matters a tremendous amount more than previously believed. While nice looking UIs are, well, nice to have, design in software is so much more than looks. The designer takes responsibility to deeply understand the needs and challenges that users have, and how to create solutions to overcome these challenges. You might think of designers as the empathizers in chief.
The designer focuses on identifying the feature set for the application and translating that to a user experience for the development team. As some put it, design is how it works, not (just) how it looks. Activities may include completing the information architecture, user flows, wireframes, visual design, and high-fidelity mock-ups and style guides. Most important, designers have to get out of the building and not only see what actual users are doing with the software, but get to know those users and their needs intimately.
In addition the core roles above there are many other roles that can, and do, exist in IT organizations. These are roles such as DBAs, security operations, network operations, and storage operations. In general, as with any tool, you should use what you need when you need it. Any given role must reorient itself to enabling the core teams rather than governing them. As the DevOps community has discussed at length for nearly 10 years, the more you divide up your staffing by function, the further you move from a small, integrated team, and achieving your goal of consistently and regularly building quality software will become harder.
Below are four common other roles on product teams. Many of them are “part time,” and some are used to help bootstrap the product team.
Until business capabilities teams in a cloud native environment have learned the necessary skills to operate applications on their own, they will need operations support. This support will come in the form of understanding (and co-learning!) how the cloud platform works, as well as gaining assistance troubleshooting applications in production. Early on, you should plan to have heavy operations involvement to help collaborate with developers and share knowledge, mostly around getting the best from the cloud platform in place. As with development, using rotating pairing will help quickly spread knowledge. You may need to assign operations staff to teams at the beginning, making them designated operations staff instead of dedicated, as explained in Effective DevOps.
In some organizations, the operations role never leaves the team, which is perfectly normal. Indeed, the desired end-state is that application teams have all of the development and operations skills and knowledge needed to be successful. Most organizations find that product teams need operations support early on as the cloud platform is built out and developers learn how to use the self-service functionality of the platform to manage infrastructure. These operations staff frequently rotate out once developers are better skilled.
Although the product manager and overall team are charged with testing their software, some organizations either want, or need, additional testing. Often this is exploratory testing where a third party (the tester[s]) are trying to systematically find the edge cases and other bugs the development team didn’t uncover.
Some Pivotal customers have reported that they’ve been able to dramatically reduce their QA staffing and thus, overall IT spend. While this may not always be the case, if you find yourself with a lot of QA staff, it’s worth questioning the need for separate testers. Much routine QA is now automated (and can be done by the team through automated CI/CD pipelines), but you may want exploratory, manual testing in addition to what the team is already doing to verify that the software does as promised, and functions under acceptable duress. Yet even that verification can be automated in some situations as the Chaos Monkey and Chaos Lemur show.
Traditionally, this role has been responsible for conducting enterprise analysis, design, planning, and implementation using a big picture approach to ensure the successful development and execution of strategy. While those goals can still exist in many large organizations, the role of an architect is evolving to be an enabler for more self-sufficient, decoupled teams. Too often, this role has become a Dr. No in most large organizations, so care must be taken to ensure that the architect supports the team, not the other way around.
Architects are typically more senior technical staff who are domain experts. They may also be more technically astute, and in a consultative way, help ensure the long-term quality and flexibility of the software that the team creates. They may also share best practices and otherwise enable teams to be successful. This last point is crucial for, yet often ignored by, large organizations as we’ll discuss in the section titled dealing with legacy.
If your application requires a large amount of data analysis, you should consider including a data scientist role on the team. This role can follow the dedicated/designated pattern as discussed previously with the operations role.
Data science today is where design was a few years ago. It’s not considered to be a primary role on the product team, but more and more products today are introducing a level of insight not seen before now. Mobile notifications surface contextual information to buyers about flash sales nearby; users are offered deals on movie rentals tuned to their viewing behavior; GE uses fast modeling and analysis to tune wind and jet turbines; and trucking companies are using analytics to program their fleets. These features help turn “dumb,” transactional products into “smart,” differentiated products.
“Compliance” will be one of your top bugbears as you improve how your organization does software. As numerous organizations have been finding, however, compliance is a solvable problem. You can even improve the quality of compliance and risk management in most cases with your new processes and tools, introducing more, reliable controls than traditional approaches.
I’ve seen three approaches to dealing with compliance, often used together as a sort of maturity model:
Ignore compliance, compliantly — select projects to work on that don’t need much compliance, if any. Eventually, you’ll want to work on projects that do, but this buys you time to learn by doing and building up a small series of successful projects.
Minimal Viable Compliance — often, the compliance requirements you must follow have built up over years, even decades. It’s very rare that any control is removed, but it’s very frequent that they should be. Find the smallest set of controls you actually need to satisfy.
Transform compliance — as you scale up your transformation efforts, like most organizations you’ll find that you have to work with auditors. Most organizations are finding that simply involving auditors in your software lifecycle from start to end not only helps you pass compliance with flying colors, but that it improves the actual compliance work.
But first, what exactly is “compliance”?
If you’re a large organization, chances are you’ll have a set of regulations you need to comply with. These are both self- and government-imposed. In software, the point of regulations is often to govern the creation of software, how it’s managed and in run in production, and how data is handled. The point of most compliance is risk management, e.g., making sure developers deliver what was asked for, making sure they follow protocol for tracking changes and who made them, making sure the code and the infrastructure is secure, and making sure that people’s personal data is not needlessly exposed.
Compliance often takes the form of a checklist of controls and verifications that must be passed. Auditors are staff that go through the process of establishing those lists, tracking down their status in your software, and also negotiating if each control must be followed or not. The auditors are often involved before and after the process to establish the controls and then verify that they were followed. It’s rare that auditors are involved during the process, which is a huge source of wasted time, it turns out. Getting involved after your software has been created requires much compliance archaeology and, sadly, much cutting and pasting between emails and spreadsheets, paired with infinite meeting scheduling.
When you’re looking to transform your software capabilities, this traditional approaches to compliance, however, often end up hurting businesses more than helping them. As Liberty Mutual’s David Ehringer describes it
The nature of the risk affecting the business is actually quite different: the nature of that risk is, kind of, the business disrupted, the business disappearing, the business not being able to react fast enough and change fast enough. So not to say that some of those things aren’t still important, but the nature of that risk is changing.
Ehringer says that many compliance controls are still important, but there are better ways of handling them without worsening the largest risk: going out of business because innovation was too late.
Let’s look at three ways that organizations are avoiding failure by compliance.
These may seem disconnected from anything that matters and, thus, not worth working on. Early on, though, the ability to get moving and prove that change is possible often trumps any business value concerns. You don’t want to eat these “empty calorie” projects too much, but it’s better than being killed off at the start.
Minimal Viable Compliance
Part of what makes compliance seem like toil is that many of the controls seem irrelevant. Over the years, compliance builds up like plaque in your steak-loving arteries. The various controls may have made sense at some time — often responding to some crisis that occured because this new control wasn’t followed. At other times, the controls may simply not be relevant to the way you’re doing software.
Clearing away old compliance
When you really peer into the audit abyss, you’ll often find out that many of the tasks and time bottlenecks are caused by too much ceremony and processes no longer needed to achieve the original goals of audibility. Target’s Heather Mickman recounts her experience with just such an audit abyss clean-up in The DevOps Handbook:
As we went through the process, I wanted to better understand why the TEAP-LARB [Target’s existing governance] process took so long to get through, and I used the technique of “the five whys”…which eventually led to the question of why TEAP-LARB existed in the first place. The surprising thing was that no one knew, outside of a vague notion that we needed some sort of governance process. Many knew that there had been some sort of disaster that could never happen again years ago, but no one could remember exactly what that disaster was, either.
As Boston Scientific’s CeeCee O’Connor says, finding your path to minimal viable compliance means you’ll actually need to talk with auditors and understand the compliance needs. You’ll likely need to negotiate if various controls are needed or not, more or less proving that they’re not. When working with auditors on an application that helped people manage a chronic condition, O’Connor group first mapped out what they called “the path to production.”
This was a value-stream like visual that showed all of the steps and processes needed to get the application into production, including, of course compliance steps. Representing each of these as sticky notes on a wall allowed the team to quickly work with auditors to go through each step — each sticky note — and ask if it was needed. Answering such a question requires some criteria, so applying lean they team asked the question “does this process add value for the customer?”
You’re already helping compliance
This mapping and systematic approach allowed the team and auditors to negotiate the actual set controls needed to get to production. At Boston Scientific, the compliance standards had built up over 15 years, growing thick, and this process helped thin them out, speeding up the software delivery cycle.
The opportunity to work with auditors will also let you demonstrate how many of your practices are already improving compliance. For example, pair programming means that all code is continuously being reviewed by a second person and detailed test suite reports show that code is being tested. Once you understand what your auditors need, there are likely other processes that you’re following that contribute to compliance.
Discussing his work at Boston Scientific, Pivotal’s Chuck D’Antonio describes a happy coincidence between lead design and compliance. When it comes to pacemakers and other medical devices, you’re only supposed to build exactly the software needed, removing any extraneous software that might bring bugs. This requirement matches almost exactly with one of the core ideas of minimum viable products and lean: only deliver the code needed. Finding these happy coincidences, of course, requires working closely with auditors. It’ll be worth a day or two of meetings and tours to show your auditors how you do software and ask them if anything lines up already.
Case Study: “It was way beyond what we needed to even be doing.”
Operating in five US states and insuring around 15 million people, health insurance provider HCSC is up to its eyeballs in regulations and compliance. As it started to transform, HCSC initially felt like getting over the compliance hurdle would be impossible. Mark Ardito recounts how easy it actually was once auditors were satisfied with how much better a cloud-native approach was:
Turns out it’s really easy to track a story in [Pivotal] Tracker to a commit that got made in git. So I know the SHA that was in git, that was that Tracker story. And then I know the Jenkins job that pushed it out to Cloud Foundry. And guess what? I have this in the tools. There’s logs of all these things happening. So slowly, I was able to start to prove out auditability just from Jenkins logs, git SHAs, things like that. So we started to see that it became easier and easier to prove audits instead of Word documents, Excel documents — you can type anything you want in a Word document! You can’t fake a log from git and you can’t fake a log in Jenkins or Cloud Foundry.
Automation makes auditors happier and removes huge, time-sucking bottlenecks.
While you may be able to avoid compliance or eliminate some controls, regulations are more likely unavoidable. Speeding up the compliance bottleneck, then, requires changing how compliance is done. Thankfully, using a build pipeline and cloud platforms provides a deep set of tools to speed up compliance. Even better, you’ll find cloud native tools and processes improve the actual quality and accuracy of compliance.
Compliance as code
Many of the controls auditors need can be satisfied by adding minor steps into your development process. For example, as Boston Scientific found, one of their auditors controls specified that a requirement had to be tracked through the development process. Instead of having to verify this after the team was code complete, they made sure to embed the story ID into each git commit, automated build, and deploy. Along these lines, the OpenControl project has put several years of effort into automating even the most complicated government compliance regimes. Chef’s InSpec project is also being used to automate compliance.
Pro-actively putting in these kinds of tracers is a common pattern form organizations that are looking to automate compliance. There’s often a small amount of scripting required to extract these tracers and present them in a human readable format, but that work is trivial in comparison to the traditional audit process.
This makes your entire stack of infrastructure and software a single, unique unit that must be audited each release. This creates a huge amount of compliance work that needs to be done even for a single line of code: everything must be checked from the dirt to screen. As Raytheon’s Keith Rodwell lays out, working with auditors, you can often show them that by using the same, centralized platform for all applications you can inherit compliance from the platform. This allows you to avoid the time taken to re-audit each layer in your stack.
The US federal government’s cloud.gov platform provides a good example of baking controls into the platform. 18F, the group that built and supports cloud.gov described how their platform, based on Cloud Foundry, takes care of 269 controls for product teams:
Out of the 325 security controls required for Moderate-impact systems, cloud.gov handles 269 controls, and 41 controls are a shared responsibility (where cloud.gov provides part of the requirement, and your applications provide the rest). You only need to provide full implementations for the remaining 15 controls, such as ensuring you make data backups and using reliable DNS (Domain Name System) name servers for your websites.
Organizations that bake controls into their platforms find that they can reduce the time to pass audits from months (if not years!) to just weeks or even days. The US Air Force has had similar success with this approach, bringing security certification down from 18 months to 30 days, sometimes even just 10.
Compliance as a service
Finally, as get deeper into dealing with compliance, you might even find that you work more closely with auditors. It’s highly unlikely that they’ll become part of your product team; though that could happen in some especially compliance-driven government and military work where being compliant is a huge part of the business value. However, organizations often find that auditors are involved closely throughout their software life-cycle. Part of this is giving auditors the tools to proactively check on controls first hand.
Home Depot’s Tony McCulley suggests giving auditors access to your continuous delivery process and deployment environment. This means auditors can verify compliance questions on their own instead of asking product teams to do that work. Effectively, you’re letting auditors peer into and even help out with controls in your software. Of course, this will only works if have a well-structured, standardized platform supporting your build pipeline with good UIs that non-technical staff can access.
Making compliance better
There have obviously been culture shocks. What is more interesting though is that the teams that tend to have the worst culture shock are not those typical teams that you might think of, audit or compliance. In fact, if you’re able to successfully communicate to them what you’re doing, DevOps and all of the associated practices seem like common sense. [Auditors] say, ‘Why weren’t we doing this before?’” — Manuel Edwards, E*TRADE, Jan 2016
Understanding and working with auditors gives the product team the chance to write software that more genuinely matches compliance needs.
The traceability of requirements, authorization, and automated test reports give auditors much more of the raw materials needed to verify compliance.
Automating compliance reporting and baking controls into the platform creates much more accurate audits and can give so called “controls” actual, programmatic control to enforce regulations.
As with any discussion that includes the word “automation,” some people take all of this to mean that auditors are no longer needed. That is, we can get rid of their jobs. This sentiment then gets stacked up into the eternal “they” antipattern: “well, they won’t change, so we can’t improve anything around here.
But, also as with any discussion that includes to word “automation,” things are not so clear. What all of these compliance optimizations point to is how much waste and extra work there is in the current approach to compliance.
This often means auditors working overtime, on the weekend, and over holidays. If you can improve the tools auditors use you don’t need to get rid of them. Instead, as we can do with previously overworked developers, you end up getting more value out of each auditor and, at the same time, they can go home on-time. As with developers, happy auditors mean a happier business.
In the cloud, DevOps, agile, whatever is hot and new era, the role of enterprise architects is rarely addressed. There’s probably plenty useful for them to do still. I’ve been trying to figure out what those things are recently.
Whether you’re doing waterfall, DevOps, PRINCE, SAFe, PMBOK, ITIL, or whatever process and certification-scheme you like, chances are you’re not using your time wisely. I’d estimate that most of the immediate, short-term benefit organizations get from switching to cloud native is simply because they’re now actually, truly following a process which both focuses your efforts on creating customer value (useful software that helps customers out, making them keep paying or pay you more) and managing your time wisely. This is like the first 10–20 pounds you lose on any diet: that just happens because you’re actually doing something where before you were doing nothing.
Less developer meetings, more pairing up
When it comes to time management, eliminating meetings is the easiest, biggest productivity booster you can do. Start with developers. They should be doing actual work (probably “coding”) 5–6 hours a day and go to only a handful of meetings a week. If the daily stand-up isn’t getting them all the information they need for the day, look to improve the information flow or limit it to just what’s needed.
Somewhat counter-intuitively, pairing up developers (and other staff, it turns out) will increase productivity as well. When they pair, developers are better synced up on most knowledge they need, learning how all parts of the system work with a built in tutor in their pair. Keeping up to speed like this means the developers have still less meetings to go to, those ones where they learn about the new pagination framework that Kris made. Pairing helps with more than just knowledge maintenance. While it feels like there’s a “halving” of developers by pairing them up, as one of the original pair programming studies put it: “the defect removal savings should more than offset the development cost increase.” Pairs in studies over the past 20+ years have consistently written higher quality code and written it faster than solo coders.
Coupled with the product mindset to software that involves the whole team in the process from start to end, they’ll be up to speed on the use cases and customers. And, by putting small batches in place, the amount of up-front study needed (requiring meetings) will be reduced to bite-sized chunks.
For example, at a large health insurance company, the product owner at first worked with business analysts, QA managers, and operations managers to get developers synced up and working. The product owner quickly realized that most of the content in the conversations was not actually needed, or was overkill. With some corporate slickness, the product owner removed the developers from this meeting-loop, and essentially /dev/null’ed the input that wasn’t needed.
Assign this story to management
Staff can try to reduce the amount of meetings they go to (and start practices like pairing), but, to be effective, managers have the responsibility to make it happen. At Allstate, managers would put “meetings” on developers calendars that said “Don’t go to meetings.” When you read results like Allstate going from 20% productivity to 90% productivity, you can see how effective eliminating meetings, along with all their other improvements, can be on an organization.
If you feel like developers must go to a meeting, first ask how you can eliminate that need. Second, track it like any other feature in the release, accounting for the time and cost of it. Make the costs of the miserable visible.
This concept of attending less meetings isn’t just for developers,The same productivity outcomes can be achieved to QA, the product owners, operations, and everyone else. Once you’ve done this, you’ll likely find having a balanced team easier and possible. Of course, once you have everyone on a balanced team, following this principle is easier.Reducing the time your staff spends in meetings and, instead, increasing the time they spend coding, designing, and doing actual product management (like talking with end users!) get you the obvious benefits of increasing productivity by 4x-5x.
If you feel you cannot do this, at least track the time you’re losing/using on meetings. A good rule of thumb is that context switching (going from one task to another) takes about 30 minutes. So, an hour long meeting will actually take out 2 hours of an employee’s time. To get ahold of how you’re choosing to spend your time, in reality, track these as tasks somehow, perhaps even adding in stories for “the big, important meeting.” And then, when you’re project tracking make sure you actually want to spend your organization’s time this way. If you do: great, you’re getting what you want! More than likely, spending time doing anything by creating and shipping customer value isn’t something you want to keep doing.
It may seem ridiculous to suggest that paying attention to time spent in meetings is even something that needs to be uttered. In my experience, management may feel like meetings are good, helpful, and not too onerous. After all, meetings are a major tool for managers to come to learn how their businesses are performing, discuss growth and optimization options, and reach decisions. Meetings are the whiteboards and IDEs of managers. Management needs to look beyond the utility meetings give them, and realize that for most everyone else, meetings are a waste of time.
I have a lot of tech t-shirts. Here’s an overview of my personal style and opinions. There’s a lot of politics in t-shirt selection, much of it good, still even more of it driven by aesthetics. I’m not seeking to win any points in those games (well, except maybe that all genders should have shirts that are designed for them), just telling you what I like.
Why? I get asked for input on t-shirts at least twice a year (often more). Here’s a URL for that input. And, I end-up getting a lot of tech t-shirts. Thankfully, my mom really likes them, so about 2–3 times a year I give her a couple t-shirt grocery bags full of t-shirts, the shitty ones.
Less shit on the shirt
First, some general comments:
I don’t like any shit on the back of the shirt, unless it’s a tiny brand name or URL right at the top.
I don’t like those shirts with a big, sticky feeling print thing on them (with an exception for pure awesomeness as you’ll see in a couple of them). I think that means I like “screen-print” shirts.
They, of course, have to be that super-soft material. Those “beefy-t” shirts go right into the plastic bag of shirts I give to my mom (well, actually, I just don’t pick them up in less I get tricked into doing so).
I’m overweight — and I think most people who get tech t-shirts are (ducks)— so I don’t like those “slim fit” shirts. No one wants to see me act out the hit song, “My Humps.”
Pictures and designs instead of just words are good, but words are fine.
In general, your company’s logo is crap on a shirt. And for God’s sake, don’t put in on the sleeve. Don’t put anything on the sleeve.
Speaking of logos, those t-shirts where you list a bunch of sponsor logos on the back are garbage.
Colors: this is tricky. I clearly like grey shirts instead of bright colors. Also, I generally don’t like black, as Dan Baskette put it: “I’m not at a Motley Crue concert, so don’t give me a black t-shirt.” Actual color (blue, green, red, etc.) is probably OK. But. I like grey.
Here’s a selection. These are not all, by far, t-shirts from tech conferences, but most of them could be and illustrate my taste:
Apparently, I buy a lot of (super-fucking-expensive-oh-my-God-I-should-just-be-a-dandy-fellow-and-shop-at-Nordstrom-oh-I’m-supporting-indepedent-aritsts-OK-then-here’s-my-wallet-and-ATM-PIN) Cotton Bureau shirts.
Occasionally, you get lucky and there’s a hoodie or jacket. First, hoodies and jackets are super-awesome to get at a conference. The OpenStack people are really good at this, and at Pivotal we’ve had several internal conferences that were awesome on this front too.
For me, hoodies and jackets have slightly different rules:
I’m not in a motorcycle gang, so I don’t want any shit on the back.
Same for the front.
That said, there are some exceptions if it’s subtle. The two OpenStack hoodies I have are good examples of this.
It’s OK to just discreetly put your company’s name and logo on the left breast.
A thin-hoodie is actually pretty nice — I have an OpenStack hoodie that’s an excellent example of this, it’s a good “layering” thing versus the ultra thick ones.
When it comes to fabric, I think “beefy-t” is fine.
My Pivotal hoodie has a clever feature: the Pivotal name is embroidered on the rim of the hood. Nifty!
Slowly but surely, the US government is improving how they do software. Working at Pivotal, I’m lucky to see some of this change and talk with the people who’ve actually done it. Just as we’re seeing huge improvements in the private sector with Pivotal’s cloud native approach, we’re now seeing successful examples of transformation in government. As with any sweeping transformation trend, there are several early case studies that have proven change is possible in the government. The cloud native practices of agility, DevOps, and relying on cloud platforms are spreading through the US Federal government and it is encouraging and cool to see the outcomes they have enabled.
People often complain about red-tape, funding problems, staff’s unwillingness to change, and an overall defeatist attitude. These cases show not only that the cloud native approach works, giving agencies and the military new, modernized capabilities with clear, positive ROI, but also show that it’s possible. In fact, it’s not as hard as it may seem.
If you’ve seen my talks, this IRS story is one of my favorite cases of what it means to do “digital transformation.”
The IRS had been using call centers for many, many years to provide basic account information and tax payment services. Call centers are expensive and error prone: one study found that only 37% of calls were answered. Over 60% of people calling the IRS for help were simply hung-up on! With the need to continually control costs and deliver good service, the IRS had to do something.
In the consumer space, solving this type of account management problem has long been taken care of. It’s pretty easy in fact; just think of all the online banking systems and paying your monthly cellphone bills. But at the IRS, viewing your transactions had yet to be digitized.
When putting software around this, the IRS first thought that they should show you your complete history with the IRS, all your transactions. This confused users and most of them still wanted to pick up the phone. Think about what a perfect failure that is: the software worked exactly as designed and intended, it was just the wrong way to solve the problem. Thankfully, because they were following a small batch process, they caught this very quickly, and iterated through different versions of it until they hit on a simple finding: when people want to know how much money they owe the IRS, they just want to know how much money they owe the IRS. When this version of the software was tested, people didn’t need to use the phone.
Now, if the IRS was on a traditional 12 to 18 months cycle (or longer!) think of how poorly this would have gone, the business case would have failed, you would probably continue to have a dim view of IT and the IRS. But, by thinking about software correctly — in an agile, small batch way — the IRS did the right thing, not only saving money, but also solving people’s actual problems.
Digitization projects like this, however, can be hard in the government due to the all too well meaning process and oversight. The IRS has been working with Pivotal to introduce a very advanced agile approach, e.g., shipping frequently, pairing across roles, and intense user-testing. Along the way, they had to manage various stakeholders expectations, winning over their trust, interest, and eventually support for transforming how the IRS does their software.
This project has great results: after some onerous up-front red-tape transformation, they put an app in place which allows people to look up their account information, check payments due, and pay them. As of October 2017, there have been over 2 million users and the app has processed over $440m in payments.
It’s rare to get details on military IT projects, so these stories are particularly delicious as it’s a literal case of “digital transformation,” going from analog to digital.
The US military has for a long time realized that they need to rapidly respond to changes in the field, not only a weekly or daily basis, but on an hourly basis. Software drives a huge amount of how the military operates now, “Everything we do in the military, and everything we do in combat, is now software based,” as Lt. Col. Enrique Oti put it. With so much reliance on software, when most IT projects take five to seven years to ship, there’s a bit of a crisis in how IT is done. “This idea of not taking action is not an option that the United States Army actually has,” said Army CIO Lt. Gen. Bruce Crawford in a recent talk.
Much can be blamed on the procurement process (and the associated needs of oversight, but overall the issue is putting more agile approach to software in place. The Air Force has several projects under its harness that are showing the way.
One of them is a story of literally going from analog to digital. They’d been planning out refueling schedules in the Middle East with a large white board. While the staff were working earnestly, it took about 8 hours and, clearly, was not the ideal state for planning something as vital as refueling.
After working with Pivotal, they digitized this process and dramatically reduced the time it took to prepare the whiteboard. They shipped their first version in 120 days (an amazing speed for any organization, private or public sector). Even better, they now regularly ship new features each week, continually improving the system. Moving from shipping every 5 years to every week, adding in the ability to adapt to new needs and operational challenges means this piece of software is directly supporting and improving the overall mission.
Because they could schedule more precisely, they were also able to remove one tanker from regular usage each day (see at about 1h47m in this video), saving about a million dollars a day. The ROI on this project, clearly, was off the charts. In fact, they were able to make back their investment in this project in seven days, based on the fuel savings. They were also able to cut the staff needed dramatically, while at the same time improving the service and freeing up staff to work on other important missions and tasks.
Previously, every time we added a new capability, we would have had to build, test, and deploy the entire IT stack. A mistake could cost $100 million, likely ending the career of anyone associated with that decision. A smaller mistake is less often a career-ender and thus encourages smart and informed risk-taking.
“You gave me what I asked for, but not really what I wanted.”
Raytheon is with the program as well, having recognized the need to need to become more agile in its delivery practices. The software needs to evolve as quickly as possible, years long contracts just won’t cut it. As one of Raytheon’s engineers put it: “employing Agile and DevOps is going to speed up the software lifecycle, getting new features into the hands of the men and women of the Armed Forces a lot quicker.”
They’ve been working with Pivotal to switch over to faster feedback cycles and apply DevOps practices to their software life-cycle.
Working with the Air Force, as with all these types of transformations, they started with one project, built up skills and knowledge, and have been expanding to other products. The first project was the Air Force’s Air and Space Operations Center Weapon System (AOC Pathfinder). They’re also working on one of the Air Forces intelligence systems, the Distributed Common Ground System.
Software release cycle speed (from years to months, if not weeks) is important in these systems, but matching the evolving and emerging needs for those systems is equally — perhaps even more! — important. “The DevOps model allows our customers to ask for the products they really want,” Raytheon’s Quynh Tran said, “The results [are that] we are shortening deployment times and prioritizing work based on their needs. We’re going to be better at meeting their expectations…. Military users get their requests changed in months instead of years and see the results of continuous feedback.”