You own it

This post is an early draft of a chapter in my book,  Monolithic Transformation.

Anywhere there is lack of speed, there is massive business vulnerability:

Speed to deliver a product or service to customers.

Speed to perform maintenance on critical path equipment.

Speed to bring new products and services to market.

Speed to grow new businesses.

Speed to evaluate and incubate new ideas.

Speed to learn from failures.

Speed to identify and understand customers.

Speed to recognize and fix defects.

Speed to recognize and replace business models that are remnants of the past.

Speed to experiment and bring about new business models.

Speed to learn, experiment, and leverage new technologies.

Speed to solve customer problems and prevent reoccurrence.

Speed to communicate with customers and restore outages.

Speed of our website and mobile app.

Speed of our back-office systems.

Speed of answering a customer’s call.

Speed to engage and collaborate within and across teams.

Speed to effectively hire and onboard.

Speed to deal with human or system performance problems.

Speed to recognize and remove constructs from the past that are no longer effective.

Speed to know what to do.

Speed to get work done.

— John Mitchell, Duke Energy.

When enterprises need to change urgently, in most cases, The Problem is with the organization, the system in place. Individuals, like technology, are highly adaptable and can change. They’re both Silly Putty that wiggles into the cracks as needed. It’s the organization that’s obstinate and calcified.

How the organization works, its architecture, is totally the responsibility of the leadership team. That team owns it just like a product team owns their software. Leadership’s job is to make sure the organization is healthy, thriving, and capable.

DevOps’ great contribution to IT is treating culture as programmable. How your people work is as agile and programmable as the software. Executives, management, and enterprise architects — leadership — are product managers, programmers, and designers. The organization is their product. They pay attention to their customers — the product teams and the platform engineers — and do everything possible to get the best outcomes, to make the product, the organization, as productive and well designed as possible.

I’ve tried to collect together what’s worked for numerous organizations going through (again, even at the end, gird your brain-loins, and pardon me here) digital transformation. Of course, as in all of life, the generalized version of Orwell’s 6th rule applies: “break any of these rules rather than doing anything barbarous.”

As you discover new, better ways of doing software, I’d ask you to share those learnings as widely as possible, especially outside of your organization. There’s very little written on how regular, large organizations manage the transformation to becoming software-driven enterprises.

Know that if your organization is dysfunctional, is always late and over budget, that it’s your fault. Your staff may be grumpy, may seem under-skilled, and your existing infrastructure and applications may be pulling you down like a black hole. All of that is your product: you own it.

As I recall, a conclusion is supposed to be inspirational instead of a downer. So, here you go. You have the power to fix it. Hurry up and get to work.

Enterprise architecture still matters

A typical enterprise CAB.

We had assumed that alignment would occur naturally because teams would view things from an enterprise-wide perspective rather than solely through the lens of their own team. But we’ve learned that this only happens in a mature organization, which we’re still in the process of becoming. — Ron van Kemenade, ING.

The enterprise architect’s role in all of this deserves some special attention. Traditionally, in most large organizations, enterprise architects define the governance and shared technologies. They also enforce these practices, often through approval processes and review boards. An enterprise architect (EA) is seldom held in high regard by developers in traditional organizations. Teams (too) often see EAs as “enterprise astronauts,” behind on current technology and methodology, meddling too much in day-to-day decisions, sucking up time with change-advisory boards (CABs), and forever working on work that’s irrelevant to “the real work” done in product teams.

It’s popular, even, for the DevOps community to poke fun at them, going so far as to show that the traditional, change advisory board methods of governance actually damage the organization. “Using external change approval processes such as a change advisory board, as opposed to peer-based code review techniques,” Jez Humble writes, summarizing the 2014 DevOps Report, “significantly impacts throughput while doing almost nothing to improve stability.”

If cruel, this sentiment often has truth to it. “If I’m doing 8 or 15 releases a week,” HCSC’s Mark Ardito says, “how am I going to get through all those CABs?” While traditional EAs may do “almost nothing” of value for high performing organizations, the role does play a significant part in cloud native leadership.

First, and foremost, EAs are part of leadership, acting something like the engineer to the product manager on the leadership team. An EA should intimately know the current and historic state of the IT department, and also should have a firm grasp on the actual business IT supports.

While EAs are made fun of for forever defining their enterprise architecture diagrams, that work is a side-effect of meticulously keeping up with the various applications, services, systems, and dependencies in the organization. Keeping those diagrams up-to-date is a hopeless task, but the EAs who make them at least have some knowledge of your existing spaghetti of interdependent systems. As you clean up this bowl of noodles, EAs will have more insights into the overall system. Indeed, tidying up that wreckage is an underappreciated task.

The EA’s dirty hands

I like to think of the work EAs do as gardening the overall organization. This contrasts with the more top-down idea of defining and governing the organization, down to the technologies and frameworks used by each team. Let’s look at some of an EA’s gardening tasks.

Setting technology & methodology defaults

Even if you take an extreme, developer friendly position, saying that you’re not going to govern what’s inside each application, there are still numerous points of governance about how the application is packaged, deployed, how it interfaces and integrates with other applications and services, how it should be instrumented to be managed, and so on. In large organizations, EAs should play a large role in setting these “defaults.” There may be reasons to deviate, but they’re the prescribed starting points.

As Stuart Charlton explains:

I think that it’s important that as you’re doing this you do have to have some standards about providing a tap, or an interface, or something to be able to hook anything you’re building into a broader analytics ecosystem called a data-lake — or whatever you want to call it — that at least allows me to get at your data. It’s not you know, like “hey I wrote this thing using a gRPC and golang and you can’t get at my data!” No you got to have something where people can get at it, at the very least.

Beyond software, EAs can also set the defaults for the organization’s meatware, all the process, methodology, and other “code” that actual people execute. Before Home Depot started standardizing their process, Tony McCully says, “everyone was trying to be agile and there was this very disjointed fragmented sort of approach to it. You know I joke that we know we had 40 scrum teams and we were doing it 25 different ways.” Clearly, this is not ideal, and standardizing how your product teams operate is better.

It may seem constricting at first, but setting good defaults leads to good outcomes like Allstate reporting going from 20% developer productivity to over 80%. As someone once quipped: they’re called “best practices” because they are the best practices.
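The idea of defaults that teams can deviate from, with the deviations kept visible, can be sketched in a few lines of Python. This is only an illustration: the setting names and values below are hypothetical and do not come from any of the organizations mentioned here.

```python
# Hypothetical organization-wide defaults; keys and values are invented for illustration.
ORG_DEFAULTS = {
    "packaging": "container-image",
    "deploy": "platform-pipeline",
    "metrics_endpoint": "/metrics",
    "data_export": "events-to-data-lake",
}


def effective_settings(team_overrides):
    """Merge a team's overrides onto the defaults and report each deviation."""
    unknown = set(team_overrides) - set(ORG_DEFAULTS)
    if unknown:
        # A team can deviate from a default, but not invent a new governance point.
        raise KeyError(f"no default defined for: {sorted(unknown)}")
    settings = {**ORG_DEFAULTS, **team_overrides}
    deviations = {k: v for k, v in team_overrides.items() if ORG_DEFAULTS[k] != v}
    return settings, deviations


# A hypothetical team that packages its application differently.
settings, deviations = effective_settings({"packaging": "jar"})
print(deviations)
```

The point of returning the deviations separately is that an EA can review the exceptions rather than every team's full configuration.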

Gardening product teams

First, someone has to define all the applications and services that all those product teams form around. At a small scale, the teams themselves can do this, but as you scale up to 1,000’s of people and 100’s of teams, gathering together a Star Wars scale Galactic Senate is folly. EAs are well suited to define the teams, often using domain-driven design (DDD) to first find and then form the “domains” that define each team. A DDD analysis can turn quickly into its own crazy wall of boxes and arrows, of course. Hopefully, EAs can keep the lines as helpfully straight as possible.

It’s always spaghetti.

Rather than checking in on how each team is operating, EAs should generally focus on the outcomes these teams have. Following the rule of team autonomy (described elsewhere in this booklet), EAs should regularly check on each team’s outcomes to determine any modifications needed to the team structures. If things are going well, whatever’s going on inside that black box must be working. Otherwise, the team might need help, or you might need to create new teams to keep the focus small enough to be effective.

Gardening microservices

Most cloud native architectures use microservices, hopefully, to safely remove dependencies that can deadlock each team’s progress as they wait for a service to update. At scale, it’s worth defining how microservices work as well, for example: are they event based, how is data passed between different services, how should service failure be handled, and how are services versioned?

@pczarkowski asks, “do you even microservice?”

Again, a senate of product teams can work at a small scale, but not on the galactic scale. EAs clearly have a role in establishing the guidance for how microservices are done and what type of policy is followed. As ever, this policy shouldn’t be a straitjacket. The era of SOA and ESBs has left the industry suspicious of EAs defining services. Those systems became cumbersome and slow moving, not to mention expensive in both time and software licensing. We’ll see if microservices avoid that fate, but keeping the overall system light-weight and nimble is clearly a gardening task that EAs are well suited for.

Platform operations

As we’ll discuss later, at the center of every cloud native organization is a platform. This platform standardizes and centralizes the runtime environment, how software is packaged and deployed, how it’s managed in production, and otherwise removes all the toil and sloppiness from traditional, bespoke enterprise application stacks. Most of the platform case studies I’ve been using, for example, are from organizations using Pivotal Cloud Foundry.

Occasionally, EAs become the product managers for these platforms. The platform embodies the organization’s actual enterprise architecture and evolving the platform, thus, evolves the architecture. Just as each product team orients their weekly software releases around helping their customers and users, the platform operations team runs the platform as a product.

EAs might also get involved with the tools groups that provide the build pipeline and other shared services and tools. Again, these tools embody part of the overall enterprise architecture, more of the running cogs behind all those boxes and arrows.

As a side-effect of product managing the platform and tools, EAs can establish and enforce governance. The packaging, integration, runtime, and other “opinions” expressed in the platform can be crafted to force policy compliance. That’s a command-and-control way of putting it, and you certainly don’t want your platform to be restrictive. Instead, by implementing the best possible service or tool, you’re getting product teams to follow policy and best practices by bribing them with ease of use and toil-reduction.

It’s the same as always

I’ve highlighted just three areas EAs contribute to in a cloud native organization. There are more, many of which will depend on the peccadilloes of your organization, for example:

  • Identifying and solving sticky cultural change issues is one such situational topic. EAs will often know individuals’ histories and motivations, giving them insights into how to deal with grumps that want to stall change.
  • EA groups are well positioned to track, test, and recommend new technologies and methodologies. This can become an “enterprise astronaut” task of being too far afield of actual needs and not understanding what teams need day-to-day, of course. But, coupled with being a product manager for the organization’s platform, scouting out new technologies can be grounded in reality.
  • EAs are well positioned to negotiate with external stakeholders and blockers. For example, as covered later, auditors often end up liking the new, small batch and platform-driven approach to software because it affords more control and consistency. Someone has to work with the auditors to demonstrate this and be prepared to attend endless meetings that product team members are ill-suited and ill-tempered for.

What I’ve found is that EAs do what they’ve always done. But, as with other roles, EAs are now equipped with better process and technology to do their jobs. They don’t have to be forever struggling eyes in the sky and can actually get to the job of architecting, refactoring, and programming the enterprise architecture. Done well, this architecture becomes a key asset for the organization, often the key asset of IT.

Though he poses it in terms of the CIO’s responsibility, Mark Schwartz describes the goals of enterprise architects well:

The CIO is the enterprise architect and arbitrates the quality of the IT systems in the sense that they promote agility in the future. The systems could be filled with technical debt but, at any given moment, the sum of all the IT systems is an asset and has value in what it enables the company to do in the future. The value is not just in the architecture but also in the people and the processes. It’s an intangible asset that determines the company’s future revenues and costs and the CIO is responsible for ensuring the performance of that asset in the future.

Hopefully the idea of architecting and then actually creating and gardening that enterprise asset is attractive to EAs. In most cases, it is. Like all technical people, they pine for the days when they actually wrote software. Now’s their chance to get back to it.

Creating a culture of change, continuous learning, & comfort

In banking, you don’t often get a clean slate like you would at some of the new tech companies. To transform banking, you not only need to be equipped with the latest technology skills, you also need to transform the culture and skill sets of existing teams, and deal with legacy infrastructure. — Siew Choo Soh, DBS Bank

Most organizations have a damaging mismatch between the culture of service management and the strategic need to become a product organization. In a product culture, you need the team to take on more responsibility, essentially all of the responsibility, for the full life cycle of the product. Week-to-week they need to experiment with new features and interpret feedback from users. In short, they need to become innovators.

Service delivery cultures, in contrast, tend more towards a culture of following up-front specification, process, and verification. Too often when put into practice, IT Service Management (ITSM) becomes a governance bureaucracy that drives project decisions. This governance-driven culture tends to be much slower at releasing software than a product culture.

The sadly maligned architectural change advisory boards (CABs) are an example, well characterized by Jon Hall:

[A] key goal for DevOps teams is the establishment of a high cadence of trusted, incremental production releases. The CAB meeting is often seen as the antithesis of this: a cumbersome and infrequent process, sucking a large number of people into a room to discuss whether a change is allowed to go ahead in a week or two, without in reality doing much to ensure the safe implementation of that change.

Recent studies have even suggested that too much of this process, in the form of change advisory boards, actually damages the business. Most ITSM experts don’t so much disagree as suggest that these governance bureaucracies are doing it wrong. ITSM has been evolving and can evolve more to fit all this new-fangled product think, they add.

Despite the best intentions of ITSM adherents, IT organizations that put service management into practice tend to become slow and ineffective, at least when it comes to change and innovation.

The most difficult challenge for leaders is changing this culture.

What even is culture?

Coffee is important, but not as much as culture.

Culture is a funny word in the DevOps, agile, and digital transformation world. I don’t particularly like it, but it’s the word we have.

Mainstream organizational management work has helpful definitions of culture: “Culture can be seen in the norms and values that characterize a group or organization,” O’Reilly and Tushman write, “that is, organizational culture is a system of shared values and norms that define appropriate attitudes and behaviors for its members.”

Jez Humble points out another definition, from Edgar Schein:

[Culture is] a pattern of shared tacit assumptions that was learned by a group as it solved its problems of external adaptation and internal integration, that has worked well enough to be considered valid and, therefore, to be taught to new members as the correct way to perceive, think, and feel in relation to those problems.

We should take “culture,” then, to mean the mindset used by people in the organization to make day-to-day decisions, policy, and best practices. I’m as guilty as anyone else of dismissing “culture” as simple, hollow acts like allowing dogs under desks and ensuring that there are six different ways to make coffee in the office. Beyond trivial pot-shots, paying attention to culture is important because it drives how people work and, therefore, the business outcomes they achieve.

For many years, the DevOps community has used the Westrum spectrum to describe three types of organizational culture, the worst of which ring too true with most people:

From continuousdelivery.com.

Year after year, the DevOps reports show that “high performing” organizations are much more generative than pathological…as you would suspect from the less than rosy words chosen to describe “power-oriented” cultures. It’s easy to identify your organization as pathological and equally easy to realize that’s unhelpful. Moving from the bureaucratic column to the generative column, however, is where most IT organizations struggle.

Core values of product culture

There are two layers of product culture, at least that I’ve seen over the years and boiled down. The first layer describes the attitudes of product people, the second the management tactics you put in place to get them to thrive.

Product people should be:

  • Innovative: they’re interested in solving problems, discovering problems, and coming up with new ways to accomplish inefficient tasks. These kinds of people also value continuously learning, without which innovation can’t happen except by accident: you don’t want to depend on accidentally dropping a burrito into a deep fryer to launch your restaurant chain.
  • Risk takers: I don’t like this term much, but it means something very helpful and precise in the corporate world, namely, that people are willing to do something that has a high chance of failing. The side that isn’t covered enough is that they’re also focused on safety. “Don’t surf if you can’t swim,” as Andrew Clay Shafer summed it up. Risk takers ensure they know how to “swim” and they build safety nets into their process. They follow a disciplined approach that minimizes the negative consequences of failure. The small batch process, for example, with its focus on a small unit of work (a minimal amount of damage if things go wrong and an easier time diagnosing what caused the error) and studying the results, good and bad, creates a safe, disciplined method for taking risks.
  • People focused: products are meant to be used by people, whether as “customers” or “employees.” The point of everything I’m discussing here is to make software that better helps people, be that delivering a product they like using or one that allows them to be productive, getting banking done as quickly as possible so they can get back to living their life, to lengthen DBS Bank’s vision. Focusing on people, then, is what’s needed. Too often, some people are focused on process and original thinking, sticking to those precepts even if they prove to be ineffective. People-focused staff will instead be pragmatic, looking to observe how their software is helping or hindering the people we call “users.” They’ll focus on making people’s lives better, not achieving process excellence, making schedules and dates, or filling out request tickets correctly.

Finding people like this can seem like winning the lottery. Product-focused people certainly are hard to find and valuable, but they’re a lot less rare than you’d think. More importantly, you can create them by putting the right kind of management policy and nudges in place. A famous quip by Adrian Cockcroft (then at Netflix, now at Amazon) illustrates this. As he recounts:

[A]t a CIO summit I got the comment “we don’t have these Netflix superstar engineers to do the things you’re talking about”, and when I looked around the room at the company names my response was “we hired them from you and got out of their way.”

There is no talent shortage, just a shortage of management imagination and gumption. As most recently described in the 2018 DORA DevOps report, over and over again, research finds that the following gumptions give you the best shot at creating a thriving, product-centric culture: autonomy, trust, and voice. These three support and feed into each other, as we’ll see.

Autonomy

People who’re told exactly what to do tend not to innovate. Their job is not to think of new ways to solve problems more efficiently and quickly, or solve them at all. Instead, their job is to follow the instructions. This works extremely well when you’re building IKEA furniture, but following instructions is a poor fit when the problem set is unknown, when you don’t even know if you know that you don’t know.

Your people and the product teams need autonomy to study their users, theorize how to solve their problems, and fail their way to success. Pour on too much command-and-control, and they’ll do exactly what you don’t want: they’ll follow your orders perfectly. A large part of a product-centric organization’s ability to innovate is admitting that the people closest to the users, the product team, are the most informed about what features to put into the software and even what the user’s problems are. You, the manager, should be overseeing multiple teams and supporting them by working with the rest of the organization. You’ll lack the intimate, day-to-day knowledge of the users and their problems. Just as the business analysts and architects in a waterfall process are too distant from the actual work, you will be too and will make the same errors.

The 2018 DORA DevOps report suggests a few techniques for helping product teams gain autonomy:

  • Establishing and communicating goals, but letting the team decide how the work will be done.
  • Removing roadblocks by keeping rules simple.
  • Allowing the team to change rules if the rules are obstacles to achieving the goals.
  • Letting the team prioritize good outcomes for customers, even if it means bending the rules.

This list is a good start. As ever, apply a small batch mentality to how you’re managing this change and adapt according to your findings.

There are some direct governance and technology changes needed to give teams this autonomy. The product teams need a platform and production tools that allow them to actually manage the full life cycle of their product. “[I]f you say to your team that ‘when you build it you also run it,’” Rabobank’s Vincent Oostindië says, “you cannot do that with a consolidated environment. You cannot say to a team ‘you own that stuff, and by the way somebody else can also break it.’”

Trust

Taking risks, suggesting new features, resolving problems in production, and otherwise innovating in software requires a great deal of trust, both from management and of management. The DORA report defines trust, in this context as “how much a person believes their leader or manager is honest, has good motives and intentions, and treats them fairly.”

To succeed at digital transformation, the people in the product teams must trust management. Changing from a services-driven organization to a product organization requires a great deal of upheaval and discomfort. Staff are being asked to behave much differently than they’ve been told to in the past. The new organization can seem threatening to careers. People will gripe and complain, casting doubt on success. Management needs to first demonstrate that their desire to change can be trusted. Doing things like celebrating failures, rewarding people for using the new methods, and spending money on the trappings of the new organization (like free breakfast or training) will demonstrate management’s commitment.

Just as staff must trust management, managers must trust the product teams to be responsible and independent. This means managers can’t constantly check in on and meddle in the day-to-day affairs of product teams. Successful managers will find it all too tempting to get their hands dirty and volunteer to help out with problems. Getting too involved on a day-to-day basis is likely to hurt more than help, however.

Felten Buma uses Finding Nemo as a metaphor for the trust managers must have in their product teams…if you’ll pardon a cartoon reference in this book. Nemo’s father, Marlin, is constantly worried about and micromanaging his son, having been shocked by the death of his wife, Nemo’s mother. They’re fish as you might recall, so his mother was eaten one day. Not only that, but Nemo has a weak flipper on one side. Overall, this means Nemo’s father is a helicopter parent, forever telling Nemo that he’s not skilled enough to do risky things, like swimming beyond the reef. While most leaders haven’t experienced the loss of one of their parents from fish’s meal-making, they’ve likely experienced some disasters in the past that could make them helicopter managers, always looking to “help” staff with advice about what works and doesn’t work. As in the movie, until that manager actually trusts the product team and demonstrates that trust by backing off, the product teams will lack the full morale and self-trust needed to perform well.

Buma suggests an exercise to help transform helicopter managers. In a closed meeting of managers, ask them to each share one of their recent corporate failures. Whether or not you discuss how it was fixed is immaterial to the exercise; the point is to have the managers practice being vulnerable and then show them that their career doesn’t end. Then, to practice giving up control, ask them to delegate an important task of theirs to someone else. Buma says that, surprisingly, most managers find these two tasks very hard and some outright reject them. Those managers who can go through these two exercises are likely mentally prepared to be good, transformational leaders.

Voice

The third leg of transformative leadership is giving product teams voice. Once teams trust management and start acting more autonomously, they’ll need to have the freedom to speak up and suggest ways to improve not only the product, but the way they work. A muzzled product team is much less valuable than one that can speak freely. As the DORA report defines it:

Voice is how strongly someone feels about their ability and their team’s ability to speak up, especially during conflict — for example, when team members disagree, when there are system failures or risks, and when suggesting ideas to improve their work.

Put another way, you don’t want people to be “courageous.” Instead, you want open discussions of failure and how to improve to be common and ordinary, “boring,” not “brave.” The opposite of giving your team voice is suppressing their suggestions, dismissing them, and explaining why such thinking is dangerous or “won’t work here.” Traditional managers tend to be deeply offended when “their” staff speaks to the rest of the organization independently, when they “go around” their direct line managers. This kind of thinking is a good indication that the team lacks true voice. While it’s certainly more courteous to involve your manager in such discussions, management should trust teams to be autonomous enough to do the right thing.

In an organization like the US Air Force, where you literally have to ask permission to “speak freely,” giving product teams voice can seem impossible. To solve this problem, the Kessel Run team devised a relatively simple fix: they asked the airmen and women to wear civilian clothes when they were working on their products. Without the explicit reminder of rank that a uniform and insignia enforces, team members found it easier to talk freely with each other, regardless of rank. Of course, managers also explicitly encouraged this behavior. Other organizations like Allstate have used this same sartorial trick, encouraging managers to change from button-up shirts and suits to t-shirts and hoodies instead. Dress can be surprisingly key for changing culture. As a Nissan factory manager put it, “[i]f I go out to the plant in a $400 suit and tie, people don’t talk to me so freely.”

Managing ongoing culture change

Improving culture is a never-ending process. Pivotal, for example, has created an excellent, beloved culture over the past 25 years but is still constantly monitoring and improving it. And while I might sigh at yet another employee survey to fill out, the company has demonstrated that it actually listens and changes. This is very rare for any company and it shows how much work is needed to maintain a good culture.

Employee surveys are a good way to monitor progress. You should experiment with what to put in these surveys, and even other means of getting feedback on your organization’s culture. Dick’s Sporting Goods narrowed down to eNPS (employee Net Promoter Score) as a small and efficient metric. Longer term, Dick’s Jason Williams says that they’ve seen some former employees come back to their team, another good piece of feedback for how well you’re managing your organization’s cultural change.
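For readers who haven’t computed an employee Net Promoter Score before, here is a minimal sketch in Python. It assumes the standard Net Promoter convention: answers on a 0 to 10 scale, where 9 and 10 count as promoters and 0 through 6 count as detractors. The text doesn’t say exactly how Dick’s Sporting Goods runs its survey, and the sample responses are made up.

```python
def enps(scores):
    """Return the employee Net Promoter Score for a list of 0-10 survey answers.

    Assumes the standard convention: 9-10 are promoters, 0-6 are detractors,
    and the score is the percentage of promoters minus the percentage of detractors.
    """
    if not scores:
        raise ValueError("need at least one survey response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


# Hypothetical answers from a ten-person product team.
responses = [10, 9, 9, 9, 8, 8, 7, 7, 6, 3]
print(enps(responses))  # 4 promoters, 2 detractors, 10 answers -> 20.0
```

The score runs from -100 to 100, which is part of why it works as a small metric: one question per survey, one number to track from one survey to the next.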

How you react to these surveys and feedback is even more important than gathering the feedback. Just as you expect your product teams to go through a small batch process, reacting to feedback from users, you should cycle through organizational improvement theories, paying close attention to the feedback you get from surveys and other means.

The ultimate feedback, of course, will be if you achieve the business goals derived from your strategy. But, you need to make sure that success isn’t at the cost of incurring cultural debt that will come due in the future. This debt often comes due in the form of stressed out staff leaving or, worse, going silent and no longer telling you about what they’re learning from failures. Then you’re back in the same situation you were trying to escape with all this digital transformation: an organization that’s scared and static, rather than savvy and successful.


Spraying the bullshit off “vision” & “strategy”


Start your project on Monday and ship it on Friday. It’s no longer that it’s going to take 9 months. — Andy Zitney, Allstate, at the time, and now McKesson

When you’re changing, you need to know what you’re changing to. It’s also handy to know how you’re going to change, and, equally, how you’re not going to change. In organizations, vision and strategy are the tools management uses to define why and how change happens.

Use vision to set your goals and inspiration

“Vision” can be a bit slippery. Often it means a concise phrase of hope that can actually happen, if only after a lot of work. Andy Zitney’s vision of starting on Monday and shipping on Friday is a classic example of vision. Vision statements are often more than a sentence, but they give the organization a goal and the inspiration needed to get there. Everyone wants to know “why I’m here,” which the vision should answer, helping stave off any corporate malaise and complacency.

Kotter has an excellent description of vision, as ever divided into a list:

Vision refers to a picture of the future with some implicit or explicit commentary on why people should strive to create that future. In a change process, a good vision serves three important purposes. First, by clarifying the general direction for change, by saying the corporate equivalent of “we need to be south of here in a few years instead of where we are today,” it simplifies hundreds or thousands of more detailed decisions. Second, it motivates people to take action in the right direction, even if the initial steps are personally painful. Third, it helps coordinate the actions of different people, even thousands and thousands of individuals, in a remarkably fast and efficient way.

Creating and describing this vision is one of the first tasks a leader, and then their team, needs to do. Otherwise, your staff will just keep muddling through yesterday’s success, unsure of what to change, let alone why to change. In IT, a snappy vision also keeps people focused on the right things instead of focusing on IT for IT’s sake. “Our core competency is ‘fly, fight, win’ in air and space,” says the US Air Force’s Bill Marion, for example, “It is not to run email servers or configure desktop devices.”

The best visions are simple, even quippy sentences. “Live more, bank less” is a great example from DBS Bank. “[W]e believe that our biggest competitors are not the other banks,” DBS’s Siew Choo Soh says. Instead, she continues, competitive threats are coming from new financial tech companies “who are increasingly coming into the payment space, as well as the loan space.”

DBS Bank’s leadership believes that focusing on the best customer experience in banking will fend off these competitors and, better, help DBS become one of the leading banks in the world. This isn’t just based on rainbow whimsy, but strategic data: in 2017, 63% of total income and 72% of profits came from digital customers. Focusing on that customer set and spreading whatever magic brought in that much profit to the “analog customers” is clearly a profitable course of action.

“We believe that we need to reimagine banking to make banking simple, seamless, as well as invisible to allow our customers to live more bank less,” Soh says. A simple vision like that is just the tip of the iceberg, but it can easily be expanded into strategy and specific, detailed actions that will benefit DBS Bank for years to come. Indeed, DBS has already won several awards, including Global Finance Magazine’s best bank in the world for 2018.

Creating an actionable strategy

“Strategy” has many adorably nuanced and debated definitions. Like enterprise architecture, it’s a term that at first seems easily knowable, but becomes more obscure as you stare into the abyss. A corporate strategy defines how a company will create, maintain, and grow business value. At the highest level, the strategy is usually increasing investor returns, usually through increasing the company’s stock price (via revenue, profits, or investors’ hopes and dreams thereof), paying out dividends, or engineering the acquisition of the company at a premium. In not-for-profit organizations, “value” often means how effectively and efficiently the organization can execute its mission, be that providing clean water, collecting taxes, or defending a country. The pragmatic part of strategy is cataloging the tools the organization has at its disposal to achieve, maintain, and grow that value. More than specifying which tools to use, strategy also says what the company will not do.

People often fail at writing down useful strategy and vision. They want to serve their customers, be the best in their industry, and other such thin bluster. I like to use the check cashing test to start defining an organization’s strategy. Your organization always wants to make more money with good profits. Well, check cashing is a profit-rich, easy business. You just need a pile of cash and good insurance for when you get robbed. Do you want to cash checks? No? OK, then we know at least one thing you don’t want to do…

The authors of Winning Through Innovation provide a more practical recipe for defining your strategy:

  1. Who are your customers and what are their needs?
  2. Which market segments are you targeting?
  3. How broad or narrow is your product or service offering?
  4. Why should customers prefer your product or service to a competitor’s?
  5. What are the competencies you possess that others can’t easily imitate?
  6. How do you make money in these segments?

Strategy should explain how to deliver on the vision with your organization’s capabilities, new capabilities enabled by technologies, customers’ needs and jobs to be done, your market, and your competitors. “This is where strategy plays an important role,” Kotter says, “Strategy provides both a logic and a first level of detail to show how a vision can be accomplished.”

There are endless tools for creating your strategy, from hiring management consulting firms to focusing on cost or better mouse traps, eating nothing but ramen noodles, drawing on napkins, and playing the boardroom version of The Oregon Trail. If you don’t already have a strategy definition method, it doesn’t really matter which one you choose. They’re all equally terrible if you do nothing and lack an actionable strategy.

A strategy for the next 10 years of growth at Dick’s Sporting Goods

Dick’s Sporting Goods, the largest sporting goods retailer in the US, provides a recent example of translating higher-level vision and strategy into action. As described by Jason Williams, over the past 10 years Dick’s rapidly built out its e-commerce and omni-channel capabilities, an enviable feat for any retailer. As always, success created a new set of problems, especially for IT. It’s worth reading Williams’s detailed explanation of these challenges:

With this rapid technological growth, we’ve created disconnects in our overall enterprise view. There were a significant number of store technologies that we’ve optimized or added on to support our e-commerce initiatives. We’ve created an overly complex technology landscape with pockets of technical debt, we’ve invested heavily in on premise hardware — in the case of e-commerce you have to plan for double peak, that’s a lot of hardware just for one or two days of peak volume. Naturally, this resulted in a number of redundant services and applications, specifically we have six address verification services that do the same thing. And not just technical issues, we often had individuals and groups that have driven for performance, but it doesn’t align to our corporate strategy. So why did we start this journey? Because of our disconnect in enterprise view, we lack that intense product orientation that a lot of our competitors already had.

These types of “disconnects” and “pockets of technical debt” are universal problems in enterprises. Just as with Dick’s, these problems are usually not the result of negligence and misfeasance, but of the actions needed to achieve and maintain rapid growth.

To clear the way for the next 10 years of success, Dick’s put a new IT strategy in place, represented by 4 pillars:

  1. Product architecture — creating an enterprise architecture based around the business, for example, pricing, catalog, inventory, and other business functions. This focus helps shift from a function- and service-centric mindset to a product-centric mindset.
  2. Modern software development practices — using practices like test-driven development, pairing, CI/CD, lean design, and all the proven, agile best practices.
  3. Software architecture — using a microservices architecture, open source, following 12-factor principles to build cloud native applications on top of Pivotal Cloud Foundry. This defines how software will be created, reducing the team’s toil so that they can focus on product design and development.
  4. Balanced teams — finally, as Williams describes it, having a unified, product-centric team is “the most critical part” of Dick’s strategy. The preceding three provide the architectural and infrastructural girding to shift IT from service delivery over to product delivery.

Focusing on these four areas gives staff very clear goals which translate easily into next steps and day-to-day work. Nine months into executing this strategy, Dick’s has achieved tangible success: they’ve created 31 product teams, increased developer productivity by 25%, ramped their testing up to 70% coverage, and improved the customer experience by speeding up page loads and delivering more features, more frequently.

Keep your strategy agile

Finally, keep your strategy agile. While your vision is likely to remain more stable year to year, how you implement it might need to change. External forces will put pressure on a perfectly sound strategy: new government regulations or laws could change your organization’s needs, or Amazon might finally decide to bottom out your market. Figure out a strategy review cycle to check your assumptions and course correct your strategy as needed. That is, apply a small batch approach to strategy.

Organizations usually review and change strategy on an annual basis as part of corporate planning, which is usually little more than a well orchestrated fight between business units for budget. While this is an opportunity to review and adjust strategy, it’s at the whim of finance’s schedule and the mercurial tactics of other business units.

Annual planning is also an unhelpfully waterfall-centric process, as pointed out by Mark Schwartz in The Art of Business Value. “The investment decision is fixed,” he writes, but “the product owner or other decision-maker then works with that investment and takes advantage of learnings to make the best use possible of the investment within the scope of the program. We learn on the scale of single requirements, but make investment decisions on the scale of programs or investment themes — thus the impedance mismatch.”

A product approach doesn’t thrive in that annual, fixed mindset. Do at least an additional strategy review each year, and many more in the first few years as you’re learning about your customers and product with each release. Don’t let your strategy get hobbled by the fetters of the annual planning and budget cycle.


Why change?


From Michael Gaida.

By now, the reasons to improve how your organization does software are painfully obvious. Countless executives feel this urgency in their bones, and have been saying so for years:

“There’s going to be more change in the next five to ten years than there’s been in the last 50” — Mary Barra, CEO, GM

Intuitively, we know that business cycles are now incredibly fast: old companies die out, or are forced to dramatically change, and new companies rise to the top…soon to be knocked down by the new crop of sharp-toothed ankle biters.

Innosight’s third study of companies’ ability to maintain leadership positions estimates that by 2018, 50% of the companies on the S&P 500 will drop off, replaced by competitors and new market entrants. Staying at the top of your market-heap is getting harder and harder.

Professor Rita McGrath has dubbed this the age of “transient advantage,” which is an apt way of describing how long — not very! — a company can rely on yesterday’s innovations. A traditional approach to corporate strategy is too slow moving, as she says: “[t]he fundamental problem is that deeply ingrained structures and systems designed to extract maximum value from a competitive advantage become a liability when the environment requires instead the capacity to surf through waves of short-lived opportunities.” Instead, organizations must be more agile: “to win in volatile and uncertain environments, executives need to learn how to exploit short-lived opportunities with speed and decisiveness.”

Software defined businesses

“We’re in the technology business. Our product happens to be banking, but largely that’s delivered through technology.” — Brian Porter, CEO, Scotiabank

We’re now solidly in an innovation phase of the business cycle. Organizations must become faster and more agile in strategy formulation, execution, and adaptation to changing markets. Again and again, IT is at the center of how startups enter new markets (often, disruptively) and how existing enterprises retain and grow market-share.

Organizations are seeking to become software defined businesses. In this mode of thinking, custom-written software isn’t just a way of “digitizing” analog processes (like making still lengthy mortgage applications or insurance claims processes “paperless”), but the mission-critical tool for executing and evolving business models.

While software might have played merely a supporting role in the business for so long, successful organizations are casting software as the star. “It’s no longer a business product conversation, it’s a software product that drives the business and drives the market,” McKesson’s Andy Zitney says, later adding, “[i]t’s about the business, but business equals software now.”

Retail is the most obvious example. There’s an anecdote that Home Depot realized how important innovation was to them when they found out that Amazon sold more hammers than Homer. While other retailers languish, Home Depot grew revenue 7.5% year-over-year in Q4 2017. This isn’t solely due to software, but controlling its own software destiny has played a large part. As CIO Matt Carey says of competition from Amazon, “I don’t run their roadmap; I run my roadmap.”

External competition isn’t the only reason organizations change, especially when it comes to optimizing their internal processes. Duke Energy, for example, realized that creating mobile versions of their internal applications would improve how line-workers coordinated their work in the field. A food service company improved the day-to-day reliability of cooks by introducing apps that walked staff through checklists and videos for food preparation and optimized kitchen staff’s time by better monitoring the temperature of stored food.

These cases can seem pedestrian compared to self-driving cars and AIs that will (supposedly) create cyber-doctors. However, unlike these gee-whiz technologies, these small changes work incredibly fast and have large impacts.

Organizations often focus on the process, not the software

Most large organizations have massive IT departments, and equally large pools of developers working on software. However, many of these organizations haven’t updated their software practices and technologies for a decade or more. The results are predictable, as three years of a Cutter Consortium survey show. The study found that just 30 percent of respondents felt that IT helped their business innovate. As the chart below shows, this has fallen from about 50 percent in 2013:

Source: “Stat of the Week: What is your IT organization’s role in business innovation?” Cutter Benchmark Review, Vol. 15, №1, July 2015.

This usefulness gap continues because IT departments are using an old approach to software. IT departments still rely on three-tier architectures and process-hardened, dedicated infrastructure “service management” processes, and use functional organizations and long release cycles to (they believe) reliably produce software. I have to assume that this “waterfall” method was highly innovative and better than alternatives at the time…years and years ago.

In trying to be reliable and cost effective, IT departments have become excellent at process, even projects. In the 1990s, IT was in chaos with a shift from mainframes to Unix, then to Linux and Windows Server. On the desktop, the Windows GUI took over, but then the web browser hit mid-decade and added a whole new set of transitions and worries. Oh, and then there was the Internet, and the tail-end of massive ERP stand-ups that were changing core business processes. With all this chaos, IT often failed even on the simplest tasks, like changing a password. Addressing this, the IT community created a school of thought called IT Service Management (ITSM) that sought to understand and codify each thing IT did for the business, conceptualizing those things as “services”: email, supply chain management, CRM, and, yes, changing passwords. Ticket desks were used to manage each process, and project management practices were erected to lovingly cradle requests to create and change each IT service.

The result was certainly better than nothing, and much better than chaos. However, the ITSM age too often resulted in calcified IT departments that focused more on doing process perfectly than delivering useful services, that is, “business value.” The paladins of ITSM are quick to say this was never the intention, of course. It’s hard to know who’s to blame, or if we just need Jeffersonian table-flipping every ten years to hard reboot IT. Regardless, the traditional way of running IT is now a problem.

Most militaries, for example, can take anywhere between five and 12 years to roll out a new application. In this time, the nature of warfare can change many times over, generations of soldiers can churn through the ranks, and the original requirements can change. Release cycles of even a year often result in the paradox of requirements perfection. In the best case scenario, the software you specified a year ago is delivered completely to spec, well tested, and fully functional. But now, a year later, new competitor and customer demands nullify the requirements from 12 months ago: that software is no longer needed.

Stretch this out to ten years, and you can see why the likes of the US Air Force are treating transforming their software capabilities as a top priority. As General James “Mike” Holmes, Commander, Air Combat Command put it, “[y]ears of institutional risk aversion have led to the strategic dilemma plaguing us today: replacing our 30-year old fleet on a 30-year timeline.”

It’s easy to dismiss this as government work at its worst, clearly nothing like private industry. I’d challenge you, though, to find a large, multinational enterprise that doesn’t suffer from a similar software malaise. This misalignment is clearly unacceptable. IT needs to drastically change or it risks slowing down its organization’s innovation.

Small Batch Thinking

“If you aren’t embarrassed by the first version of your product, you shipped too late.” — Reid Hoffman, LinkedIn co-founder and former PayPal COO

How is software done right, then? Over the past 20 years, I’ve seen successful organizations use the same, general process: continuously executing small batches of software, over short iterations that put a rapid feedback loop in place. IT organizations that follow this process are delivering a different type of outcome than a set of requirements. They’re giving their organization the ability to adapt and change monthly, weekly, even daily.

By “small batches,” I mean identifying the problem to solve, formulating a theory of how to solve the problem, creating a hypothesis that can prove or disprove the theory, doing the smallest amount of application development and deployment needed to test your hypothesis, deploying the new code to production, observing how users interact with your software, and then using those observations to improve your software. The cycle, of course, repeats itself.

The small batch loop.

This whole process should take at most a week — hopefully just a day. All of these small batches, of course, add up over time to large pieces of software, but in contrast to a “large batch” approach, each small batch of code that survives the loop has been rigorously validated with actual users. Schools of thought such as Lean Startup reduce this practice to helpfully simple sayings like “think, make, check.” Meanwhile, the Observe, Orient, Decide, Act (OODA) loop breaks the cycle down with even more precision. However you label and chart the small batch cycle, make sure you’re following a hypothesis driven cycle instead of assuming up-front that you know how your software should be implemented.

As Liberty Mutual’s Chris Bartlow says, “document this hypothesis right because if you are disciplined in doing that you actually can have a more measurable outcome at the end where you can determine was my experiment successful or not.” This discipline gives you a tremendous amount of insight into decisions about your software — features to add, remove, or modify. A small batch process gives you a much richer, fact-based ability to drive decisions.

“When you get to the stoplight on the circle [the end of a small batch loop] and you’re ready to make a decision on whether or not you want to continue, or whether or not you want to abandon the project, or experiment [more], or whether you want to pivot, I think [being hypothesis driven] gives you something to look back on and say, ‘okay, did my hypothesis come true at all,’” Bartlow says, “‘is it right on or is it just not true at all?’”

Long-term, you more easily avoid getting stuck in the “that’s the way we’ve always done it” lazy river current. The record of your experiments will also serve as an excellent report of your progress, even something auditors will cherish once you explain that log to them. These well-documented and tracked records are also your ongoing design history that you rely on to improve your software. The log helps make even your failures valuable because you’ve proven something that does not work and, thus, should be avoided in the future. You avoid the cost and risk of repeating bad decisions.
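
To make that log less abstract, here is a minimal sketch of what one entry might look like if you kept it in code rather than a wiki or spreadsheet. Everything in it is an assumption for illustration: the `Experiment` class, its field names, the "half the target means pivot" rule, and the sample numbers are all hypothetical, not a format used by any of the organizations quoted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Experiment:
    """One trip around the small batch loop, written down before you build."""
    problem: str                      # the problem you are trying to solve
    hypothesis: str                   # what you believe will solve it
    metric: str                       # what you will observe in production
    target: float                     # value that would confirm the hypothesis
    observed: Optional[float] = None  # filled in after users touch the release

    def decision(self) -> str:
        """The "stoplight" at the end of the loop (thresholds are assumptions)."""
        if self.observed is None:
            return "running"
        if self.observed >= self.target:
            return "continue"
        if self.observed >= 0.5 * self.target:
            return "pivot"
        return "abandon"


# Made-up entries: two attempts at the same problem.
log = [
    Experiment("Users phone in for account info", "Show full transaction history",
               "share of test users who skip the phone", target=0.6, observed=0.2),
    Experiment("Users phone in for account info", "Show only the balance owed",
               "share of test users who skip the phone", target=0.6, observed=0.8),
]

for entry in log:
    print(f"{entry.hypothesis}: {entry.decision()}")
```

The failed first entry stays in the log on purpose: it is the proof that an idea was tried and did not work, which is exactly what keeps you from trying it again.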

In contrast, a “large batch” approach follows a different process: teams document a pile of requirements up front, developers code away at implementing those features, perhaps creating “golden builds” each week or two (but not deploying those builds to production!), and once all of the requirements are implemented and QA’ed, code is finally deployed to production. With the large batch approach, this pile of unvalidated code creates a huge amount of risk.

This is the realm of multi-year projects that either underwhelm or are consistently late. As one manager at a large organization put it, “[w]e did an analysis of hundreds of projects over a multi-year period. The ones that delivered in less than a quarter succeeded about 80 percent of the time while the ones that lasted more than a year failed at about the same rate.”

No stranger to lengthy projects with big, up-front analysis, the US Air Force is starting to think in terms of small batches for its software as well. “A [waterfall] mistake could cost $100 million, likely ending the career of anyone associated with that decision. A smaller mistake is less often a career-ender and thus encourages smart and informed risk-taking,” said M. Wes Haga.

Shift to user-centric design

If a small batch approach is the tool your organization now wields, a user-centric approach to software design is the ongoing activity you enable. There’s little new about taking a user-centric approach to software. What’s different is how much faster and more efficiently good user experience and design can be created, thanks to highly networked applications and cloud-automated platforms.

When software was used exclusively behind the firewall and off networks as desktop applications, software creators had no idea how their software was being used. Well, they knew when there were errors because users fumed about bugs. Users never reported how well things were going when everything was working as planned. Worse, users didn’t report when things were just barely good enough and could be improved. This meant that software teams had very little input into what was actually working well in their software. They were left to, more or less, just make it up as they went along.

This feedback deficit was accompanied by slow release cycles. The complex, costly infrastructure used required a persnickety process of hardware planning, staging, release planning, and more operations work before deploying to production. Even the developers’ environments, needed to start any work, often took months to provision. Resources were scarce and expensive, and the lack of comprehensive automation across compute, storage, networking, and overall configuration required much slow, manual work.

The result of these two forces was, in retrospect, a dark age of software design. Starting in the mid-2000s, the ubiquity of always-on users and cloud automation removed these two hurdles.

Because applications were always hooked up to the network, it was now possible to observe every single interaction between a user and the software. For example, a 2009 Microsoft study found that only about one third of features added to the web properties achieved the team’s original goals — that is, were useful and considered successful. If you can quickly know which features are effective and ineffective, you can more rapidly improve your software, even eliminating bloat and the costs associated with unused, but expensive to support code.

By 2007, it was well understood that cloud automation dramatically reduced the amount of manual work needed to deploy software. The problem was evenly distributing those benefits beyond Silicon Valley and companies unfettered by the slow cycles of large enterprise. Just over 10 years later, we’re finally seeing cloud efficiencies spreading widely through enterprises. For example, Comcast realized a 75 percent lift in velocity and time to market when they used a cloud platform to automate their software delivery pipeline and production environment.

When you can gather, and thus analyze, all user interactions as well as deploy new releases at will, you can finally put a small batch cycle in place. And with this, you can create better user interaction and product design. And as we’ve seen in the past ten years, well designed products handily win out and bring in large profits.

Good user design practices are numerous and situational. Most revolve around talking with actual users, figuring out what their challenges are, and then iteratively working on ways to solve them.

“Instead of starting with the [preconceived] solution,” Pivotal designer Aly Blenkin says, “we start with a general understanding of the problem. We try unpacking that problem and understanding it from the user’s perspective and using that as a foundation to start building out our designs and our ideas. Once we have that foundation, it allows us to eliminate risk and we do that through a balanced team: so having designers, product managers, engineers, data scientists come together with a multi-disciplinary approach to the way we build software.”

Good design is worth spending time on. As Forrester consistently finds, organizations that focus on design tend to perform better financially than those that don’t. As such, design can be a highly effective competitive tool. Looking at the relationship between good design and revenue growth, Forrester found that organizations that focus on better design have a 14% lead on those that don’t. For example, “in two industries, cable and retail, leaders outperformed laggards by 24 percentage and 26 percentage points, respectively.”

I haven’t done a great job at describing what exactly good design looks like, let alone what the day-to-day work is. Let’s next look at a simple case study with clear business results as an example.

Case Study: no one wants to call the IRS

From “Minimum Viable Taxes: Lessons Learned Building an MVP Inside the IRS,” Dec 2015.

You wouldn’t think big government, particularly a tax collecting organization, would be a treasure trove of good design stories, but the IRS provides a great example of how organizations are reviving their approach to software.

The IRS historically used call centers to provide basic account information and tax payment services. Call centers are expensive and error prone: one study found that only 37% of calls were answered. Over 60% of people calling the IRS for help were simply hung up on! With the need to continually control costs and deliver good service, the IRS had to do something.

In the consumer space, solving this type of account management problem has long been taken care of. It’s pretty easy in fact; just think of all the online banking systems you use and how you pay your monthly phone bills. But at the IRS, viewing your transactions had yet to be digitized.

When putting software around this, the IRS first thought that they should show you your complete history with the IRS, all of your transactions, as seen in the before UI example above. This confused users and most of them still wanted to pick up the phone. Think about what a perfect failure that is: the software worked exactly as designed and intended, it was just the wrong way to solve the problem.

Thankfully, because the IRS was following a small batch process, they caught this very quickly, and iterated through different hypotheses of how to solve the problem until they hit on a simple finding: when people want to know how much money they owe the IRS, they only want to know how much money they owe the IRS. When this version of the software was tested, most people didn’t want to use the phone.

Now, if the IRS was on a traditional 12-to-18-month cycle (or longer!), think of how poorly this would have gone: the business case would have failed, and you would probably have a dim view of IT and the IRS. But, by thinking about software in an agile, small batch way, the IRS did the right thing, not only saving money, but also solving people’s actual problems.

This project has great results: after some onerous up-front red-tape transformation, the IRS put an app in place which allows people to look up their account information, check payments due, and pay them. As of October 2017, there have been over 2 million users and the app has processed over $440m in payments. Clearly, a small batch success.

Create business agility with small batches

A small batch approach delivers value very early in the process with incremental releases of features to production. This contrasts with a large batch approach, which waits until the very end to (attempt to) deliver all of the value in one big lump. Of course, delivering early doesn’t mean delivering a year’s worth of work in one week. Instead, it means delivering just enough software to validate your design with user feedback.

Delivering early also allows you to prioritize your backlog, the list of requirements to implement. Organizations delivering weekly often find that a feature has been implemented “enough” and further development on the feature can be skipped. For example, to give people their hotel stay invoice, just allowing them to print a stripped down webpage might suffice instead of writing the code that creates and downloads a PDF. Once further development on that feature is de-prioritised, the team can decide to bring a new feature to the top of the backlog, likely ahead of schedule. This flexibility in priorities is one of the core reasons agile software delivery makes business more agile and innovative.

Done properly a small batch approach also gives you a steady, reliable release train. This concept means that each week, your product teams will deliver roughly the same amount of “value” to production. Here, “value” means whatever changes they make to the software in production: typically, this is adding code that creates new features of modifies existing ones, but it could also be performance, security improvements, patches that ensure the software runs properly.

A functioning small batch process, then, gives you business agility and predictability. Trying out multiple ideas is now much cheaper, one of the keys to innovating new products and business models. The traditional, larger batch approach often requires millions of dollars in budget, driving the need for high-level approval, driving the need…to wait for the endless round of meetings and finance decisions, often on the annual budget cycle. This too often killed off ideas, as Allstate’s Opal Perry explains: “by the time you got permission, ideas died.” But with an MVP approach, as she contrasts, “a senior manager has $50,000 or $100,000 to do a minimum viable product” and can explore more ideas.

Case study: the lineworker knows best at Duke Energy

Duke Energy wanted to improve how line-workers coordinated their field-work. At first, the vice president in charge of the unit reckoned that a large map showing all the line-workers’ locations would help him improve scheduling and work queues.

The team working on this went further than just trusting the VP’s first instincts, doing some field research with the actual line-workers. After getting to know the line-workers, they discovered a solution that redefined the business problem. While the VP’s map would be a fine dashboard and give more information to the central office, what really helped was developing a job assignment application for line-workers. This app would let line-workers locate their peers to, for example, partner with them on larger jobs, and avoid showing up at the same job. The app also introduced an Uber-like queue of work where line-workers could self-select which job to do next.

In retrospect this change seems obvious, but it’s only because the business paid attention to the feedback loop and user research and then reprioritized their software plans accordingly.

Transforming is easy…right?

Putting small batch thinking in place is no easy task: how long would it take you, currently, to deploy a single line of code, from a whiteboard to actually running in production? If you’re like most people, following the official process, it’d take weeks — just getting on the change review board’s schedule would take a week or more, and hopefully key approvers aren’t on vacation. This single line of code thought experiment will start to flesh out what you need to do — rather, fix — to switch over to doing small batches.

Transforming one team, one piece of software isn’t easy, but it’s often very possible. Improving two applications usually works. How do you go about switching 10 applications over to a small batch process? How about 500?

Supporting hundreds of applications and teams — plus the backing services that support these applications — is a horse of a different color, rather, a drove of horses of many different colors. There’s no comprehensive manual for doing small batches at large scale, but in recent years several large organizations have been stampeding through the thicket. Thankfully, many of them have shared their successes, failures, and, most importantly, lessons learned. We’ll look at their learnings next, with an eye, of course, toward taming your organization’s big batch bucking.

Speed

From John Mitchell:

Speed is the currency of business today and speed is the common attribute that differentiates companies and industries going forward. Anywhere there is lack of speed, there is massive business vulnerability:

● Speed to deliver a product or service to customers.

● Speed to perform maintenance on critical path equipment.

● Speed to bring new products and services to market.

● Speed to grow new businesses.

● Speed to evaluate and incubate new ideas.

● Speed to learn from failures.

● Speed to identify and understand customers.

● Speed to recognize and fix defects.

● Speed to recognize and replace business models that are remnants of the past.

● Speed to experiment and bring about new business models.

● Speed to learn, experiment, and leverage new technologies.

● Speed to solve customer problems and prevent reoccurrence.

● Speed to communicate with customers and restore outages.

● Speed of our website and mobile app.

● Speed of our back-office systems.

● Speed of answering a customer’s call.

● Speed to engage and collaborate within and across teams.

● Speed to effectively hire and onboard.

● Speed to deal with human or system performance problems.

● Speed to recognize and remove constructs from the past that are no longer effective.

● Speed to know what to do.

● Speed to get work done.

Continuous innovation only works with an enterprise that embraces speed and the data required to measure it. By creating conditions for continuous innovation, we must bring about speed. While this is hard, it has a special quality that makes the job a little easier. Through data, speed is easy to measure.

Innovation, on the other hand, can be extremely difficult to measure. For example, was that great quarterly revenue result from innovation or market factors? Was that product a one hit wonder or result of innovation? How many failures do we accept before producing a hit? These questions are not answerable. But we can always capture speed and measure effects of new actions. For example, we can set compliance expectations on speed and measure those results.

Speed is not only the key measurement, it becomes a driver for disruptive innovation. Business disruption has frequently arisen from startups and new technologies, not seeking optimization, but rather discovering creative ways to rethink problems to address speed. Uber is about speed. Mobile is about speed. IoT is about speed. Google is about speed. Drones are about speed. AirBnB is about speed. Amazon is about speed. Netflix is about speed. Blockchain is about speed. Artificial Intelligence is about speed.

Continuous Innovation then is the result of an enterprise, driven by speed, which is constantly collecting data, developing and evaluating ideas, experimenting and learning, and through creativity and advancing technologies, is constructing new things to address ever evolving customer needs.

Power-line picture from Claudiu Sergiu Danaila.

Team composition: not all ninjas

By way of “A brief history of rockstars destroying guitars.”

Skilled, experienced team members are obviously valuable and can temper the risk of failure by quickly delivering software. Everyone would like the mythical 10x developer, and would even settle for a 3 to 4x “full stack developer.” Surely, management often thinks, doing something as earth-shattering as “digital transformation” only works with highly skilled developers. You see this in surveys all the time: people say that lack of skills is a common barrier to improving their organization’s software capabilities.

This mindset is one of the first barriers to scaling change. Often, an initial team of “rockstars” has early success, but attempts to clone them predictably fail and scaling up change is stymied. It’s that “lack of skills” chimera again. It’s impossible to replicate these people, and companies rarely want to spend the time and money to actually train existing staff.

Worse, when you use the “only ninjas need apply” tactic, the rest of the organization loses faith that they could change as well. “When your project is successful,” Jon Osborn explains, “and they look at your team, and they see a whole bunch of rockstars on it, then the excuse comes out, ‘well, you took all the top developers, of course you were successful.’”

Instead of only recruiting elite developers, also staff your initial teams with a meaningful dose of normals. This will not only help win over the rest of the organization as you scale, but also means you can actually find people. A team with mixed skill levels also allows you to train your “junior” people on the job, especially when they pair with your so-called “rockstars.”

Rockstars known to destroy hotel rooms

I met a programmer with 10x productivity once. He was a senior person and required 10 programmers to clean up his brilliant changes. –Anonymous on the c2 wiki

Usually what people find, of course, is that this rockstar/normal distinction is situational and the result of a culture that rewards the lone wolf hero instead of staff who help and support each other. Those mythical 10x developers are lauded because of a vicious cycle of their own creation. At some point, they spaghetti-coded a complicated and crucial part of the system “over the weekend,” saving the project. Once in production, strange things started happening to that piece of code, and of course our hero was the only one who could debug the code, once again, over the weekend. This cycle repeats itself, and we laud this weekend coder, never realizing they’re actually damaging our business.

Relying on these heroes, ninjas, rockstars, or what have you is a poor strategy in a large organization. Save the weekend coding for youngsters in Ramen chomping startups that haven’t learned better yet. “Having a team dynamic and team structure that the rest of the organization can see themselves in,” Osborn goes on to say, “goes a long way towards generating a buy in that you’re actually being successful and not cheating by using all your best resources.”

Volunteers

When possible, recruiting volunteers is the best option for your initial projects, probably for the first year. Forcing people to change how they work is a recipe for failure, especially at first. You’ll need motivated people who are interested in change or, at least, will go along with it instead of resisting it.

Osborn describes this tactic at Great American Insurance Group: “We used the volunteer model because we wanted excited people who wanted to change, who wanted to be there, and who wanted to do it. I was lucky that we could get people from all over the IT organisation, operations included, on the team… it was a fantastic success for us.”

This might be difficult at first, but as a leader of change you need to start finding and cultivating these change-ready volunteers. Again, you don’t necessarily want rockstars so much as open-minded people who enjoy trying new things.

Rotating out to spread the virus of digital transformation

Few organizations have the time or budget-will to train their staff. Management seems to think that a moist bedding of O’Reilly books in a developer’s dark room will suddenly pop up genius skills like mushrooms. Rotating pairing in product teams addresses this problem in a minimally viable way inside a team: team members learn from each other on a daily basis. Even better, staff is actually producing value as they learn instead of sitting in a neon-light buzzing conference room working on dummy applications.

To scale this change, you can selectively rotate staff out of a well functioning team into newer teams. This seeds their expertise through the organization, and once you repeat this over and over, knowledge will spread faster. One person will work with another, becoming two skilled people, who each work with another person, becoming four skilled people, then eight, and so on. Organizations like Synchrony go so far as to randomly shuffle desks every six months to ensure people are moving around.
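
The doubling arithmetic is easy to sketch. The snippet below is an idealized back-of-the-envelope model, not a prediction: it assumes every skilled person pairs with exactly one new person per rotation and that everyone comes out skilled, which real organizations will not quite manage. The function name and the 500-person target are made up for illustration.

```python
def rotations_to_reach(target_people, starting_people=1):
    """Count rotations until at least target_people are skilled,
    assuming the skilled group doubles every rotation (idealized)."""
    skilled = starting_people
    rotations = 0
    while skilled < target_people:
        skilled *= 2  # each skilled person pairs with one new person
        rotations += 1
    return rotations


print(rotations_to_reach(8))    # one, two, four, eight
print(rotations_to_reach(500))  # a hypothetical 500-person IT department
```

Even with generous slippage, the point stands: pairing and rotation compound, while classroom training scales linearly at best.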

More than just skill transfer and on-the-job training, rotating other staff through your organization will help spread trust in the new process. People tend to trust their peers more than leaders throwing down change from on high, and much more than external “consultants,” and worse, vendor shills like myself. As ever, building this trust through the organization is key to scaling change.

Orange France is one of the many examples of this strategy in practice. After the initial success revitalizing their SMB customer service app, Orange started rotating developers to new teams. Developers that worked on the new mobile application paired with Orange developers from other teams, such as the website team. As ever with pairing, they both teach their peers how to apply agile and improve the business with better software at the same time. Talking about his experience with rotating pairing, Orange’s Xavier Perret says that “it enabled more creativity in the end. Because then you have different angles, [a] different point of view.” As staff work on new parts of the system they get to know the big picture better and bring “more creative problem solving” to each new challenge, Perret adds.

While you may start with ninjas, you can take a cadre of volunteers and slowly but surely build up a squad of effective staff that can spread transformation throughout your organization. All with fewer throwing stars and trashed hotel rooms than those 10x rockstars leave in their wake.

Beyond digital transformation BS, improving your organization by fixing your software strategy

A large tire fire

This post lists early drafts of chapters in my now-published book, Monolithic Transformation.

Credit to Team Tirefi.re.

The phrase “digital transformation” is mostly bull-shit, but then again, it’s perfect. The phrase means executing a strategy to innovate new business models driven by rapidly delivered, well designed, and agile software. For many businesses, fixing their long dormant, lame software capabilities is an urgent need: companies like Amazon loom as over-powering competitors in most every industry. More threatening, clever, existing enterprises have honed their software capabilities over the past five years.

Liberty Mutual, for example, entered a new insurance market on the other side of the world in 6 months, doubling the average close rate. Home Depot has grown its online business by around $1bn each of the past four years, is the #2 ranked digital retailer by Gartner L2, and is adding more than 1,000 technical hires in 2018. The US Air Force modernized their air tanker scheduling process in 120 days, driving $1m in fuel savings each week, and leading to canceling a long-standing $745m contract that hadn’t delivered a single line of code in five years.

Whatever business you’re in, able, ruthless competition is coming from all sides: new entrants and existing behemoths. Their success is driven by an agile, cloud-driven software strategy that transforms their organizations into agile businesses.

Let’s take a breath.

That’s some full tilt bluster, but we’ve been in an era of transient advantage for a long time. Businesses need every tool they can lay hands on to grow their business, sustain their existing cash-flows, and fend off competitors. IT has always been a powerful tool for enabling strategies, as they say, but in the past 10 years seemingly helpful but actually terrible practices like outsourcing have ruined most IT departments’ ability to create useful software for the businesses they supposedly support.

These organizations need to improve how they do software to transform their organizations into programmable businesses.

Studying how large organizations plan for, initially fail, and then succeed at this kind of transformation is what I spend my time doing. This book (which I’m still working on) collects together what I’ve found so far, and is constructed from the actual experiences and stories of people who’ve suffered through the long journey to success.

Enjoy! And next time someone rolls their eyes at the phrase “digital transformation,” ask them, “well, what better phrase you got, chuckle-head?”

Draft chapters

I’m posting draft chapters of this book as I MVP-polish them up. In sort of the right order, here they are:

  1. Why change?
  2. Spraying the bullshit of “vision” & “strategy”.
  3. Communicate the digital vision and strategy.
  4. Creating a culture of change, continuous learning, & comfort.
  5. Enterprise architecture still matters.
  6. Creating alliances & holding zero-sum trolls at bay.
  7. A series of small projects, building momentum to scale.
  8. Product teams — agile done right.
  9. Team composition: not all ninjas.
  10. Tracking your improvement — “metrics.”
  11. Dealing with compliance — it might even be a good idea.
  12. You own it (conclusion)

There’s also the complete draft in progress if you can bear it. Also, there’s a previous “edition” of sorts, and the ever shifting talk I give on this content.

Communicate the digital vision and strategy

Your employees listening to yet another annual vision and strategy pitch.

If a strategy is presented in the boardroom but employees never see it, is it really a strategy? Obviously not. Leadership too often believes that the strategy is crystal clear, but staff usually disagree. For example, in a survey of 1,700 leaders and staff, 69% of leaders said their vision was “pragmatic and could easily be translated into concrete projects and initiatives.” Employees had a glummer picture: only 36% agreed.

Your staff likely doesn’t know the vision and strategy. More than just understanding it, they rarely know how they can help. As Boeing’s Nikki Allen put it:

In order to get people to scale, they have to understand how to connect the dots. They have to see it themselves in what they do — whether it’s developing software, or protecting and securing the network, or provisioning infrastructure — they have to see how the work they do every day connects back to enabling the business to either be productive, or generate revenue.

There’s little wizardry to communicating strategy. First, it has to be compressible. But, you already did that when you established your vision and strategy…right? Next, you push it through all the mediums and channels at your disposal to tell people over and over again. Chances are, you have “town hall” meetings, email lists, and team meetings up and down your organization. Recording videos and podcasts of you explaining the vision and strategy is helpful. Include strategy overviews in your public speaking because staff often scrutinizes these recordings. While “Enterprise 2.0” fizzled out several years ago, Facebook has trained all of us to follow activity streams and other social flotsam. Use those habits and the internal channels you have to spread your communication.

You also need to include examples of the strategy in action, what worked and didn’t work. As with any type of persuasion, stories told by people’s peers are the best examples. Google and others find that celebrating failure with company-wide post mortems is instructive, career-ending crazy as that may sound. Stories of success and failure are valuable because you can draw a direct line from high-level vision to fingers on keyboard. If you’re afraid of sharing too much failure, try just opening up status metrics to staff. Leadership usually underestimates the value of organization-wide information radiators, but staff usually wants that information to stop prairie dogging through their 9 to 5.

As you’re progressing, getting feedback is key: do people understand it? Do people know what to do to help? If not, then it’s time to tune your messages and mediums. Again, you can apply a small batch process to test out new methods of communicating. While I find them tedious, staff surveys help: ask people if they understand your strategy. Be sure to also ask if they know how to help execute the strategy.

Manifestos can help decompose a strategy into tangible goals and tactics. The insurance industry is on the cusp of a turbulent competitive landscape. To call it “disruptive” would be too narrow. To pick one sea of chop, autonomous vehicles are “changing everything about our personal auto line and we have to change ourselves,” says Liberty Mutual’s Chris Bartlow. New technologies are only one of many fronts in Liberty’s new competitive landscape. Every existing insurance company and cut-throat competitors like Amazon are using new technologies to both optimize existing business models and introduce new ones.

“We have to think about what that’s going to mean to our products and services as we move forward,” Bartlow says. Getting there required re-engineering Liberty’s software capabilities. Like most insurance companies, mainframes and monoliths drove their success over past decades. That approach worked in calmer times, but now Liberty is refocusing their software capability around innovation more than optimization. Liberty is using a stripped down set of three goals to make this urgency and vision tangible.

“The idea was to really change how we’re developing software. To make that real for people we identified these bold, audacious moves — or ‘BAMS,’” says Liberty Mutual’s John Heveran:

These BAMs grounded Liberty’s strategy, giving staff very tangible, if audacious, goals. With these in mind, staff could start thinking about how they’d achieve those goals. This kind of manifesto makes strategy actionable.

So far, it’s working. “We’re just about cross the chasm on our DevOps and CI/CD journey,” says Liberty’s Miranda LeBlanc. “I can say that because we’re doing about 2,500 daily builds, with over a 1,000 production deployments per a day,” she adds. These numbers are tracers of putting a small batch process in place that’s used to improve the business. They now support around 10,000 internal users at Liberty and are better provisioned for the long ship ride into insurance’s future.

Choosing the right language is important for managing IT transformation. For example, most change leaders suggest dumping the term “agile.” At this point, nearly 25 years into “agile,” everyone feels like they’re agile experts. Whether that’s true is irrelevant. You’ll facepalm your way through transformation if you’re pitching switching to a methodology people believe they’ve long mastered.

It’s better to pick your own branding for this new methodology. If it works, steal the buzzwords du jour, from “cloud native,” DevOps, or serverless. Creating your own brand is even better. As we’ll discuss later, Allstate created a new name, CompoZed Labs, for its transformation effort. Using your own language and branding can help bring smug staff onboard and involved. “Oh, we’ve always done that, we just didn’t call it ‘agile,’” sticks-in-the-mud are fond of saying as they go off to update their Gantt charts.

Make sure people understand why they’re going through all this “digital transformation.” And make even more sure they know how to implement the vision and strategy, or, as you start thinking, our strategy.

Creating alliances & holding zero-sum trolls at bay

Lone wolves rarely succeed at transforming business models and behavior at large organizations. True to the halo effect, you’ll hear about successful lone wolves often. What you don’t hear about are all the lone wolves who limped off to die alone. Even CEOs and boards often find that change-by-mandate efforts fail. “Efforts that don’t have a powerful enough guiding coalition can make apparent progress for a while,” as Kotter summarizes, “But, sooner or later, the opposition gathers itself together and stops the change.”

Organizations get big by creating and sustaining a portfolio of revenue sources, likely over decades. While these revenue sources may transmogrify from cows to dogs, if frightened or backed into a corner, hale but mettlesome upstarts are usually trampled by the status quo stampede. At the very least, they’re constantly protecting their neck from frothy, sharp-tooth jackals. You have to work with those cows and canines, often forming “committees.” Oh, and, you know, they might actually be helpful.

How you use this committee is situational. It might be to placate enemies who’d rather see you fail than succeed, looking to salvage corporate resources from the HMS Transformation’s wreck. The old maxim to keep your friends close and your enemies closer summarizes this tactic well. Getting your “enemies” committed to and involved in your project is an obvious, facile suggestion, but it’ll keep them at bay. You’ll need to remove my cynical tone from your committee and actually rely on them for strategic and tactical input, support in budgeting cycles, and, eventually, involvement in your change.

For example, a couple years back I was working with all the C-level executives at a large retailer. They’d come together to understand IT’s strategy to become a software defined business. Of course, IT could only go so far and needed the actual lines of business to support and adopt that change. The IT executives explained how transforming to a cloud native organization would improve the company’s software capabilities in the morning. In the afternoon, they all started defining a new application focused on driving repeat business, using the very techniques discussed in the morning. This workshopping solidified IT’s relationship with key lines of business and started the work of transforming those businesses. It also kicked off real, actual work on the initiative. By seeing the benefits of the new approach in action, IT also won over the CFO who’d been the most skeptical.

As this anecdote illustrates, building an alliance often requires serving your new friends. IT typically has little power to drive change, especially after decades of positioning themselves as a service bureau instead of a core enabler of growth. As seen in the Duke lineworker case above, asking the business what they’d like changed is more effective than presuming to know. As that case also shows, a small batch process discovers what actually needs to happen despite the business’ initial theories. But, getting there requires more of a “the customer is always right” approach on IT’s part.

Now, there are many tactics for managing this committee; as ever Kotter does an excellent job of cataloging them in Leading Change. In particular, you want to make sure the committee members remain engaged. Good executives can quickly smell a waste of time and will start sending junior staff if the wind of change smells stale (wouldn’t you do the same?). You need to manage their excitement, treating them as stakeholders and customers, not just collaborators. Luckily, most organizations I’ve spoken with find that cloud native technologies and methodologies so vastly improve their software capabilities, in such a short amount of time, that winning over peers is easy. As one executive a year into their digital transformation program told me, “holy-@$!!%!@-cow we are starting to accelerate. It’s getting hard to not overdo it. I have business partners lined up out the door.”

Tracking your improvement  – “metrics”

Tracking the health of your overall innovation machine can be both overly simplified and overly complex. What you want to measure is how well you’re doing at software development and delivery as it relates to improving your organization’s goals. You’ll use these metrics to track how your organization is doing at any given time and, when things go wrong, get a sense of what needs to be fixed. As ever with management, you can look at this as a part of putting a small batch process in place: coming up with theories for how to solve your problems and verifying if the theory worked in practice or not.

All that monitoring

In IT most of the metrics you encounter are not actually business oriented and instead tell you about the health of your various IT systems and processes: how many nodes are left in a cluster, how much network traffic customers are bringing in, how many open bugs development has, or how many open tickets the help desk is dealing with on average.

Example of Pivotal Cloud Foundry’s Healthwatch metrics dashboard.

All of these metrics can be valuable, just as all of them can be worthless in any given context. Most of these technical metrics, coupled with ample logs, are needed to diagnose problems as they come and go. In recent years, there’ve been many advances in end-to-end tracing thanks to tools like Zipkin and Spring Sleuth. Log management is well into its newest wave of improvements, and monitoring and IT management analytics are just ascending another cycle of innovation — they call it “observability” now, that way you know it’s different this time!

Instead of looking at all of these technical metrics, I want to look at a few common metrics that come up over and over in organizations that are improving their software capabilities.

Six common cloud native metrics

Some metrics consistently come up when measuring cloud native organizations:

Lead Time

Lead time is how long it takes to go from an idea to running code in production; it measures how long your small batch loop takes. It includes everything in between, from specifying the idea, writing the code and testing it, passing any governance and compliance needs, planning for deployment and management, and then getting it up and running in production.

If your lead time is consistent enough, you have a grip on IT’s capability to help the business by creating and deploying new software and features. Being this machine for innovation through software is, as you’ll hopefully recall, the whole point of all this cloud native, agile, DevOps, and digital transformation stuff.

As such, you want to monitor your lead time closely. Ideally, it should be a week. Some organizations go longer, up to two weeks, and some are even shorter, like daily. Target and then track an interval that makes sense for you. If you see your lead time growing, then you should stop everything, find the bottlenecks and fix them. If the bottlenecks can’t be fixed, then you probably need to do less each release.
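
To make that concrete, here is a minimal sketch in Python of tracking lead time. The feature names and dates are made up for illustration, and the one-week target is just the ideal mentioned above; swap in whatever interval makes sense for you. In practice the dates would come from your backlog tool and deployment pipeline rather than a hard-coded list.

```python
from datetime import date
from statistics import median

# Hypothetical records: (feature, day the idea was specified, day it ran in production).
FEATURES = [
    ("invoice-print", date(2018, 3, 1), date(2018, 3, 7)),
    ("job-queue", date(2018, 3, 5), date(2018, 3, 13)),
    ("peer-locator", date(2018, 3, 12), date(2018, 3, 30)),
]

TARGET_DAYS = 7  # assumed target: the one-week ideal


def lead_time_days(idea_day, production_day):
    """Days from specifying the idea to running in production."""
    return (production_day - idea_day).days


def over_target(features, target_days=TARGET_DAYS):
    """Names of features whose lead time exceeded the target."""
    return [name for name, idea, prod in features
            if lead_time_days(idea, prod) > target_days]


lead_times = [lead_time_days(idea, prod) for _, idea, prod in FEATURES]
print("lead times (days):", lead_times)
print("median lead time:", median(lead_times))
print("over target:", over_target(FEATURES))
```

A list of features that blew past the target is a reasonable starting point for the "stop everything and find the bottleneck" conversation.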

Velocity

Velocity shows how many features are typically deployed each week. Whether you call features “stories,” “story points,” “requirements,” or whatever else, you want to measure how many of them the team can complete each week; I’ll use the term “story.” Velocity tells you three things:

  1. Your progress toward improving and ongoing performance — at first, you want to find out what your team’s velocity is. They will need to “calibrate” on what they’re capable of doing each week. Once you establish this baseline, if it goes down something is going wrong and you can investigate.
  2. How much the team can deliver each week — once you know how many features your team can deliver each week, you can more reliably plan your road-maps. If a team can only deliver, for example, 3 stories each week, asking them to deliver 20 stories in a month is absurd. They’re simply not capable of doing that. Ideally, this means your estimates are no longer, well, always wrong.
  3. If the scope of features is getting too big or too small — if a previously reliable team’s velocity starts to drop, it means that they’re scoping their stories incorrectly: they’re taking on too much work, or someone is forcing them to. On the other hand, if the team is suddenly able to deliver more stories each week or finds themselves with lots of extra time each week, it means they should take on more stories each week.

There are numerous ways to first calibrate on the number of stories a team can deliver each week and managing that process at first is very important. As they calibrate, your teams will, no doubt, get it wrong for many releases, which is to be expected (and one of the motivations in picking small projects at first instead of big, important ones). Other reports like burn down charts can help illustrate how the team’s velocity is getting closer to delivering across major releases (or in each release) and help you monitor any deviation from what’s normal.
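To make the three uses of velocity concrete, here is a minimal Python sketch. The weekly story counts, the four-week month, and the 25% tolerance are assumptions for illustration, not numbers from any real team.

```python
from statistics import mean


def baseline_velocity(weekly_stories):
    """Average stories completed per week during calibration."""
    return mean(weekly_stories)


def is_plan_realistic(stories_requested, weeks, velocity):
    """Can the team plausibly deliver this many stories in this many weeks?"""
    return stories_requested <= velocity * weeks


def classify_week(stories_done, velocity, tolerance=0.25):
    """Label a week as 'drop', 'surplus', or 'normal' relative to the baseline."""
    if stories_done < velocity * (1 - tolerance):
        return "drop"      # stories may be scoped too big
    if stories_done > velocity * (1 + tolerance):
        return "surplus"   # the team can probably take on more
    return "normal"


calibration = [2, 4, 3, 3]                 # hypothetical first four weeks
velocity = baseline_velocity(calibration)  # 3 stories per week

# The example from the list above: 20 stories in a (four-week) month.
realistic = is_plan_realistic(20, weeks=4, velocity=velocity)
```

With a baseline of 3 stories a week, the 20-story month comes back as unrealistic, since the team's capacity over four weeks is about 12.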

Latency

In general, you want your software to be as responsive as possible. That is, you want it to be fast. We often think of speed in this case: how fast is the software running and how fast can it respond to requests? Latency is a slightly different way of thinking about speed, namely, how long does a request take to process end-to-end, from the user sending it to the response coming back?

Latency is different from the raw “speed” of the network. For example, a fast network will send a static file very quickly, but if the request requires connecting to a database to create and then retrieve a custom view of last week’s Austrian sales, it will take a while and, thus, the latency will be much longer than downloading an already-made file.

From a user’s perspective, latency is important because an application that takes 3 minutes to respond versus 3 milliseconds might as well be “unavailable.” As such, latency is often the best way to measure if your software is working.

Measuring latency can be tricky…or really simple. Because it spans the entire transaction, you often need to rely on patching together a full view — or “trace” — of any given user transaction. This can be done by looking at logs, doing real or synthetic user-centric tracing, and using any number of application performance monitoring (APM) tools. Ideally, the platform you’re using will automatically monitor all user requests and also catalog all of the sub-processes and sub-sub-processes that make up the entire request. That way, you can start to figure out why things are so slow.
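If your platform doesn't do this for you, the idea of a trace can be sketched in a few lines of Python. This is a toy, not how any particular APM tool works: the Trace class, the step names, and the sleeps standing in for real work are all invented for illustration.

```python
import time
from contextlib import contextmanager


class Trace:
    """Collects (name, seconds) pairs for each timed step of one request."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the span even if the step raised an exception.
            self.spans.append((name, time.perf_counter() - start))

    def duration(self, name):
        return dict(self.spans)[name]

    def slowest_step(self, root="request"):
        """Name of the slowest step, ignoring the span that wraps the whole request."""
        steps = [s for s in self.spans if s[0] != root]
        return max(steps, key=lambda s: s[1])[0]


def handle_request(trace):
    # time.sleep stands in for real work; the step names are made up.
    with trace.span("request"):
        with trace.span("authenticate"):
            time.sleep(0.01)
        with trace.span("query sales database"):
            time.sleep(0.05)
        with trace.span("render view"):
            time.sleep(0.01)


trace = Trace()
handle_request(trace)
print(f"end-to-end latency: {trace.duration('request') * 1000:.0f} ms")
print(f"slowest step: {trace.slowest_step()}")
```

The outer "request" span is the latency the user feels. The inner spans are what tell you where to look when that number gets too big.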

Error Rates

Often, your systems and software will tell you when there’s an error: an exception is thrown in the application layer because the email service is missing, an authentication service is unreachable so the user can’t login, a disk is failing to write data. Tracking and monitoring these errors is, obviously, a good idea. Some of them will range from “is smoke coming out of the box?” to more obtuse ones like services being unreachable because DNS is misconfigured. Oftentimes, errors are roll-ups of other problems: when a web server fails, returning a 500 response code, it means something went wrong, but the error doesn’t usually tell you what happened.

Errors also occur before production, while the software is being developed and tested. You can count failed tests toward your error rates, as well as broken builds and failed compliance audits.

Fixing errors in development can be easier and more straightforward, whereas triaging and sorting through errors in production is an art. What’s important to track with errors is not just that one happened, but the rate at which they happen, perhaps errors per second. You’ll have to figure out an acceptable level of errors because there will be many of them. What you do about all these errors will be driven by your service targets. These targets may be foisted on you in the form of heritage Service Level Agreements or you might have been lucky enough to negotiate some sane targets.

Chances are, a certain rate of errors will be acceptable (have you ever noticed that sometimes, you just need to reload a web-page?). Each part of your stack will throw off and generate different errors: some are meaningless (perhaps they should be more warnings or even just informative notices, e.g., “you’re using an older framework that might be deprecated sometime in the next 30 years”) and others could be too costly, or even impossible to fix (“1% of users’ audio uploads fail because their upload latency and bandwidth is too slow”). And some errors may be important above all else: if an email server is just losing emails every 5 minutes…something is terribly wrong.

Generally, errors are collected from logs, but you could also poll the service in question and it might send alerts to your monitoring systems, be that an IT management system or just your phone.
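Here is a minimal Python sketch of turning logs into an errors-per-second number and comparing it to a target. The log format (ISO timestamp, level, message), the sample lines, and the acceptable rate are all assumptions; your logs and your service targets will look different.

```python
from datetime import datetime

# A made-up log excerpt: "<ISO timestamp> <LEVEL> <message>".
LOG = """\
2018-03-01T12:00:00 INFO request served
2018-03-01T12:00:02 ERROR email service missing
2018-03-01T12:00:05 INFO request served
2018-03-01T12:00:07 ERROR authentication service unreachable
2018-03-01T12:00:10 INFO request served
"""

ACCEPTABLE_ERRORS_PER_SECOND = 0.5  # hypothetical service target


def error_rate(log_text):
    """Errors per second over the time span the log covers."""
    stamps, errors = [], 0
    for line in log_text.splitlines():
        stamp, level, _message = line.split(" ", 2)
        stamps.append(datetime.fromisoformat(stamp))
        if level == "ERROR":
            errors += 1
    # Avoid dividing by zero when every line shares one timestamp.
    seconds = (max(stamps) - min(stamps)).total_seconds() or 1
    return errors / seconds


rate = error_rate(LOG)
within_target = rate <= ACCEPTABLE_ERRORS_PER_SECOND
```

Two errors over ten seconds works out to 0.2 errors per second, which is inside this made-up target. What matters over time is watching whether that rate creeps up.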

Mean-time-to-repair (MTTR)

If you can accept the reality that things will go wrong with software, how quickly you can fix those problems becomes a key metric. It’s bad when an error happens, but it’s really bad if it takes you a long time to fix it.

Tracking mean-time-to-repair is an ongoing measurement of how quickly you can recover from errors. As with most metrics, this gives you a target to improve towards and then allows you to make sure you’re not getting worse.

If you’re following cloud native practices and using a good platform, you can usually shrink your MTTR with the ability to roll back changes. If a release turns out to be bad (an error), you can back it out quickly, removing the problem. This doesn’t mean you should blithely roll out bad releases, of course.

Measuring MTTR might require tracking support tickets and otherwise manually tracking the time between incident detection and fix. As you automate remediations, you might be able to easily capture those rates. As with most of these metrics, what becomes important in the long term is tracking changes to your acceptable MTTR and figuring out why the negative changes are happening.
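If you are pulling detection and fix times out of support tickets, the calculation itself is simple. A minimal Python sketch, with three invented incidents:

```python
from datetime import datetime, timedelta


def mttr(incidents):
    """Mean time between detecting an incident and fixing it."""
    durations = [fixed - detected for detected, fixed in incidents]
    return sum(durations, timedelta()) / len(durations)


# Hypothetical (detected, fixed) timestamps taken from support tickets.
incidents = [
    (datetime(2018, 3, 1, 9, 0), datetime(2018, 3, 1, 9, 30)),     # 30 minutes
    (datetime(2018, 3, 4, 14, 0), datetime(2018, 3, 4, 15, 30)),   # 90 minutes
    (datetime(2018, 3, 9, 22, 15), datetime(2018, 3, 9, 23, 15)),  # 60 minutes
]

current_mttr = mttr(incidents)
```

For these three incidents the mean comes out to one hour. The hard part in practice is agreeing on when an incident was "detected" and when it was "fixed," not the arithmetic.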

Costs

Everyone wants to measure cost, and there are many costs to measure. In addition to the time spent developing software and the money spent on infrastructure, there are ratios you’ll want to track like number of applications to platform operators. Typically, these kinds of ratios give you a quick sense of how efficiently IT runs. If each application takes one operator, something is probably missing from your platform and process. T-Mobile, for example, manages 11,000 containers in production with just 8 platform operators.

There are also less direct costs like opportunity and value lost due to waiting on slow release cycles. For example, the US Air Force calculated that it saved $391M by modernizing its software methodology. The point is that you need to obviously track the cost of what you’re doing, but you also need to track the costs of doing nothing, which might be much higher.

Business Value

“Comcast Cloud Foundry Journey — Part 2,” Greg Otto, Comcast, June 2017.

Of course, none of the metrics so far has measured the most valuable, but difficult metric: value delivered. How do you measure your software’s contribution to your organization’s goals? Measuring how the process and tools you use contribute to those goals is usually harder. This is the dicey plain of correlation versus causation.

Somehow, you need to come up with a scheme that shows and tracks how all this cloud native stuff you’re spending time and money on is helping the business grow. You want to measure value delivered over time to:

  1. Prove that you’re valuable and should keep living and get more funding, and
  2. Figure out when you’re failing to deliver so that you can fix it.

There are a few prototypes of linking cloud native activities to business value delivered. Let’s look at a few examples:

  1. As described in the case study above, when the IRS replaced call centers with poor availability with software, IT delivered clear business value. Latency and error rates decreased dramatically (with phone banks, only 37% of calls made it through) and the design improvements they discovered led to increased usage of the software, pulling people away from the phones. And, then, the results are clear: by the Fall of 2017, this application had collected $440m in back taxes.
  2. Sometimes, delivering “value” means satisfying operational metrics rather than contributing dollars. This isn’t the best of all situations to be in, but if you’re told, for example, that in the next two years 60% of applications need to be “on the cloud,” then you know the business value you’re supposed to deliver on. In such cases, simply tracking the replatforming of applications to a cloud platform will probably suffice.
  3. Running existing businesses more efficiently is a popular goal, especially for large organizations. In this case, the value you deliver with cloud native will usually be speeding up business processes, removing wasted time and effort, and increasing quality. Duke Energy’s lineworker case is a good example here. Duke gave lineworkers a better, highly tuned application that queues and coordinates their work in the field. The software increased lineworkers’ productivity and reduced waste, directly creating business value in efficiencies.
  4. The US Air Force’s tanker scheduling case study is another good example here: by adopting a cloud native software model, they were able to ship the first version in 120 days and started saving $100,000’s in fuel costs each week. Additionally, the USAF computed the cost of delay — using the old methods that took longer — at $391M, a handy financial metric to consider.
  5. And, then, of course, there comes raw competition. This most easily manifests itself as time-to-market, either to match competitors or get new features out before them. Liberty Mutual’s ability to enter the Australian motorcycle market, from scratch, in six months is a good example. Others, like Comcast, demonstrate competing with major disruptors like Netflix.

It’s easy to get very nuanced and detailed when you’re mapping IT to business value. You need to keep things as simple as possible, or, put another way, only as complex as needed. As with the examples above, clearly link your cloud native efforts to straightforward business goals. Simply “delivering on our commitment to innovation” isn’t going to cut it. If you’re suffering under vague strategic goals, make them more concrete before you start using them to measure yourself. On the other end, just lowering costs might be a bad goal to shoot for. I talk with many organizations who used outsourcing to deliver on the strategic goal of lowering costs and now find themselves incapable of creating software at the pace their business needs to compete.

Fleshing out metrics

I’ve provided a simplistic start at metrics above. Each layer of your organization will want to add more detail to get better telemetry on itself. Creating a comprehensive, umbrella metrics system is impossible, but there are many good templates to start with.

Pivotal has been developing a cloud native centric template of metrics, divided into 5 categories:

BuiltToAdapt Benchmark.

These metrics cover platform operations, product, and business metrics. Not all organizations will want to use all of the metrics, and there’s usually some that are missing. But, this 5 S’s template is a good place to start.

If you prefer to go down rabbit holes rather than shelter under umbrellas, there are more specialized metric frameworks to start with. Platform operators should probably start by learning how the Google SRE team measures and manages Google, while developers could start by looking at TK( need some good resource ).

Whatever the case, make sure the metrics you choose are

  1. targeting the end goal of putting a small batch process in place to create better software,
  2. reporting on your ongoing improvement towards that goal, and,
  3. alerting you that you’re slipping and need to fix something…or find a new job.

Dealing with compliance — it might even be a good idea

“Compliance” will be one of your top bugbears as you improve how your organization does software. As numerous organizations have been finding, however, compliance is a solvable problem. You can even improve the quality of compliance and risk management in most cases with your new processes and tools, introducing more reliable controls than traditional approaches.

I’ve seen three approaches to dealing with compliance, often used together as a sort of maturity model:

  1. Ignore compliance, compliantly — select projects to work on that don’t need much compliance, if any. Eventually, you’ll want to work on projects that do, but this buys you time to learn by doing and building up a small series of successful projects.
  2. Minimal Viable Compliance — often, the compliance requirements you must follow have built up over years, even decades. It’s very rare that any control is removed, but it’s very frequent that they should be. Find the smallest set of controls you actually need to satisfy.
  3. Transform compliance — as you scale up your transformation efforts, like most organizations you’ll find that you have to work with auditors. Most organizations are finding that simply involving auditors in your software lifecycle from start to end not only helps you pass compliance with flying colors, but that it improves the actual compliance work.

But first, what exactly is “compliance”?

Paul tells you what compliance is.

If you’re a large organization, chances are you’ll have a set of regulations you need to comply with. These are both self- and government-imposed. In software, the point of regulations is often to govern the creation of software, how it’s managed and run in production, and how data is handled. The point of most compliance is risk management, e.g., making sure developers deliver what was asked for, making sure they follow protocol for tracking changes and who made them, making sure the code and the infrastructure is secure, and making sure that people’s personal data is not needlessly exposed.

Compliance often takes the form of a checklist of controls and verifications that must be passed. Auditors are staff that go through the process of establishing those lists, tracking down their status in your software, and also negotiating if each control must be followed or not. The auditors are often involved before and after the process to establish the controls and then verify that they were followed. It’s rare that auditors are involved during the process, which is a huge source of wasted time, it turns out. Getting involved after your software has been created requires much compliance archaeology and, sadly, much cutting and pasting between emails and spreadsheets, paired with infinite meeting scheduling.

When you’re looking to transform your software capabilities, however, these traditional approaches to compliance often end up hurting businesses more than helping them. As Liberty Mutual’s David Ehringer describes it:

The nature of the risk affecting the business is actually quite different: the nature of that risk is, kind of, the business disrupted, the business disappearing, the business not being able to react fast enough and change fast enough. So not to say that some of those things aren’t still important, but the nature of that risk is changing.

Ehringer says that many compliance controls are still important, but there are better ways of handling them without worsening the largest risk: going out of business because innovation was too late.

Let’s look at three ways that organizations are avoiding failure by compliance.

Ignore compliance, compliantly

While just a quick fix, engineering a way to avoid compliance is a common first approach. Early on, when you’re learning a new mindset for software and building up a series of small successes, you’ll likely work on applications that require little to no compliance. These kinds of applications often contain no customer data, don’t directly drive or modify core processes, or otherwise touch anything that’d need compliance scrutiny.

These may seem disconnected from anything that matters and, thus, not worth working on. Early on, though, the ability to get moving and prove that change is possible often trumps any business value concerns. You don’t want to eat these “empty calorie” projects too much, but it’s better than being killed off at the start.

Minimal Viable Compliance

Part of what makes compliance seem like toil is that many of the controls seem irrelevant. Over the years, compliance builds up like plaque in your steak-loving arteries. The various controls may have made sense at some time — often responding to some crisis that occurred because this new control wasn’t followed. At other times, the controls may simply not be relevant to the way you’re doing software.

Clearing away old compliance

When you really peer into the audit abyss, you’ll often find out that many of the tasks and time bottlenecks are caused by too much ceremony and processes no longer needed to achieve the original goals of auditability. Target’s Heather Mickman recounts her experience with just such an audit abyss clean-up in The DevOps Handbook:

As we went through the process, I wanted to better understand why the TEAP-LARB [Target’s existing governance] process took so long to get through, and I used the technique of “the five whys”…which eventually led to the question of why TEAP-LARB existed in the first place. The surprising thing was that no one knew, outside of a vague notion that we needed some sort of governance process. Many knew that there had been some sort of disaster that could never happen again years ago, but no one could remember exactly what that disaster was, either.

As Boston Scientific’s CeeCee O’Connor says, finding your path to minimal viable compliance means you’ll actually need to talk with auditors and understand the compliance needs. You’ll likely need to negotiate if various controls are needed or not, more or less proving that they’re not. When working with auditors on an application that helped people manage a chronic condition, O’Connor’s group first mapped out what they called “the path to production.”

Boston Scientific’s “Path to Production.”

This was a value-stream-like visual that showed all of the steps and processes needed to get the application into production, including, of course, compliance steps. Representing each of these as sticky notes on a wall allowed the team to quickly work with auditors to go through each step — each sticky note — and ask if it was needed. Answering such a question requires some criteria, so, applying lean, the team asked the question “does this process add value for the customer?”

You’re already helping compliance

This mapping and systematic approach allowed the team and auditors to negotiate the actual set of controls needed to get to production. At Boston Scientific, the compliance standards had built up over 15 years, growing thick, and this process helped thin them out, speeding up the software delivery cycle.

The opportunity to work with auditors will also let you demonstrate how many of your practices are already improving compliance. For example, pair programming means that all code is continuously being reviewed by a second person and detailed test suite reports show that code is being tested. Once you understand what your auditors need, there are likely other processes that you’re following that contribute to compliance.

Discussing his work at Boston Scientific, Pivotal’s Chuck D’Antonio describes a happy coincidence between lean design and compliance. When it comes to pacemakers and other medical devices, you’re only supposed to build exactly the software needed, removing any extraneous software that might bring bugs. This requirement matches almost exactly with one of the core ideas of minimum viable products and lean: only deliver the code needed. Finding these happy coincidences, of course, requires working closely with auditors. It’ll be worth a day or two of meetings and tours to show your auditors how you do software and ask them if anything lines up already.

Case Study: “It was way beyond what we needed to even be doing.”

Operating in five US states and insuring around 15 million people, health insurance provider HCSC is up to its eyeballs in regulations and compliance. As it started to transform, HCSC initially felt like getting over the compliance hurdle would be impossible. Mark Ardito recounts how easy it actually was once auditors were satisfied with how much better a cloud-native approach was:

Turns out it’s really easy to track a story in [Pivotal] Tracker to a commit that got made in git. So I know the SHA that was in git, that was that Tracker story. And then I know the Jenkins job that pushed it out to Cloud Foundry. And guess what? I have this in the tools. There’s logs of all these things happening. So slowly, I was able to start to prove out auditability just from Jenkins logs, git SHAs, things like that. So we started to see that it became easier and easier to prove audits instead of Word documents, Excel documents — you can type anything you want in a Word document! You can’t fake a log from git and you can’t fake a log in Jenkins or Cloud Foundry.

Automation makes auditors happier and removes huge, time-sucking bottlenecks.

Transform compliance

While you may be able to avoid compliance or eliminate some controls, regulations are more likely unavoidable. Speeding up the compliance bottleneck, then, requires changing how compliance is done. Thankfully, using a build pipeline and cloud platforms provides a deep set of tools to speed up compliance. Even better, you’ll find cloud native tools and processes improve the actual quality and accuracy of compliance.

Compliance as code

Many of the controls auditors need can be satisfied by adding minor steps into your development process. For example, as Boston Scientific found, one of their auditors’ controls specified that a requirement had to be tracked through the development process. Instead of having to verify this after the team was code complete, they made sure to embed the story ID into each git commit, automated build, and deploy. Along these lines, the OpenControl project has put several years of effort into automating even the most complicated government compliance regimes. Chef’s InSpec project is also being used to automate compliance.

Pro-actively putting in these kinds of tracers is a common pattern for organizations that are looking to automate compliance. There’s often a small amount of scripting required to extract these tracers and present them in a human readable format, but that work is trivial in comparison to the traditional audit process.
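To give a feel for how small that scripting can be, here is a Python sketch that maps story IDs to the commits that mention them. The "[#1234]" convention for embedding a story ID, the commit hashes, and the messages are assumptions for illustration, not Boston Scientific's actual format. In practice you might feed it the output of something like git log --pretty=format:"%h %s" rather than a hard-coded list.

```python
import re

# Hypothetical convention: each commit message carries its story ID as "[#1234]".
STORY_ID = re.compile(r"\[#(\d+)\]")


def trace_report(commits):
    """Map each story ID to the commits mentioning it, and list commits with no story."""
    by_story, untraced = {}, []
    for sha, message in commits:
        story_ids = STORY_ID.findall(message)
        if not story_ids:
            untraced.append(sha)
        for story_id in story_ids:
            by_story.setdefault(story_id, []).append(sha)
    return by_story, untraced


# Made-up (abbreviated hash, subject line) pairs.
commits = [
    ("a1b2c3d", "[#1001] Add login audit log"),
    ("d4e5f6a", "[#1001] Fix typo in audit message"),
    ("0f9e8d7", "[#1002] Encrypt backups"),
    ("7c6b5a4", "Tidy up README"),
]

by_story, untraced = trace_report(commits)
for story_id, shas in sorted(by_story.items()):
    print(f"story {story_id}: {', '.join(shas)}")
print(f"commits without a story ID: {', '.join(untraced) or 'none'}")
```

The list of untraced commits is arguably the more useful half of the report: it is the part an auditor would otherwise have to hunt down by hand.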

Put compliance in the platform

Another common tactic is to put as much control enforcement into your cloud platform as possible. In a traditional approach, each application comes with its own set of infrastructure and related configuration: not only the “servers” needed, but also systems and policy for networking, data access, security settings, and so forth.

This makes your entire stack of infrastructure and software a single, unique unit that must be audited each release. This creates a huge amount of compliance work that needs to be done even for a single line of code: everything must be checked from the dirt to screen. As Raytheon’s Keith Rodwell lays out, working with auditors, you can often show them that by using the same, centralized platform for all applications you can inherit compliance from the platform. This allows you to avoid the time taken to re-audit each layer in your stack.

The US federal government’s cloud.gov platform provides a good example of baking controls into the platform. 18F, the group that built and supports cloud.gov, described how their platform, based on Cloud Foundry, takes care of 269 controls for product teams:

Out of the 325 security controls required for Moderate-impact systems, cloud.gov handles 269 controls, and 41 controls are a shared responsibility (where cloud.gov provides part of the requirement, and your applications provide the rest). You only need to provide full implementations for the remaining 15 controls, such as ensuring you make data backups and using reliable DNS (Domain Name System) name servers for your websites.

Organizations that bake controls into their platforms find that they can reduce the time to pass audits from months (if not years!) to just weeks or even days. The US Air Force has had similar success with this approach, bringing security certification down from 18 months to 30 days, sometimes even just 10.

Compliance as a service

Finally, as you get deeper into dealing with compliance, you might even find that you work more closely with auditors. It’s highly unlikely that they’ll become part of your product team, though that could happen in some especially compliance-driven government and military work where being compliant is a huge part of the business value. However, organizations often find that auditors are involved closely throughout their software life-cycle. Part of this is giving auditors the tools to proactively check on controls first hand.

Home Depot’s Tony McCulley suggests giving auditors access to your continuous delivery process and deployment environment. This means auditors can verify compliance questions on their own instead of asking product teams to do that work. Effectively, you’re letting auditors peer into and even help out with controls in your software. Of course, this will only work if you have a well-structured, standardized platform supporting your build pipeline with good UIs that non-technical staff can access.

Making compliance better

“There have obviously been culture shocks. What is more interesting though is that the teams that tend to have the worst culture shock are not those typical teams that you might think of, audit or compliance. In fact, if you’re able to successfully communicate to them what you’re doing, DevOps and all of the associated practices seem like common sense. [Auditors] say, ‘Why weren’t we doing this before?’” — Manuel Edwards, E*TRADE, Jan 2016

The net result of all these efforts to speed up compliance often improves the quality of compliance itself:

  1. Understanding and working with auditors gives the product team the chance to write software that more genuinely matches compliance needs.
  2. The traceability of requirements, authorization, and automated test reports give auditors much more of the raw materials needed to verify compliance.
  3. Automating compliance reporting and baking controls into the platform creates much more accurate audits and can give so-called “controls” actual, programmatic control to enforce regulations.

As with any discussion that includes the word “automation,” some people take all of this to mean that auditors are no longer needed. That is, we can get rid of their jobs. This sentiment then gets stacked up into the eternal “they” antipattern: “well, they won’t change, so we can’t improve anything around here.”

But, also as with any discussion that includes the word “automation,” things are not so clear. What all of these compliance optimizations point to is how much waste and extra work there is in the current approach to compliance.

This often means auditors working overtime, on the weekend, and over holidays. If you can improve the tools auditors use you don’t need to get rid of them. Instead, as we can do with previously overworked developers, you end up getting more value out of each auditor and, at the same time, they can go home on-time. As with developers, happy auditors mean a happier business.

Rule 1: Don’t go to meetings. Rule 2: See rule 1.

Coffee is for coders.

Whether you’re doing waterfall, DevOps, PRINCE, SAFe, PMBOK, ITIL, or whatever process and certification-scheme you like, chances are you’re not using your time wisely. I’d estimate that most of the immediate, short-term benefit organizations get from switching to cloud native is simply because they’re now actually, truly following a process which both focuses their efforts on creating customer value (useful software that helps customers out, making them keep paying or pay you more) and manages their time wisely. This is like the first 10–20 pounds you lose on any diet: that just happens because you’re actually doing something where before you were doing nothing.

Less developer meetings, more pairing up

When it comes to time management, eliminating meetings is the easiest, biggest productivity booster you can do. Start with developers. They should be doing actual work (probably “coding”) 5–6 hours a day and go to only a handful of meetings a week. If the daily stand-up isn’t getting them all the information they need for the day, look to improve the information flow or limit it to just what’s needed.

Somewhat counter-intuitively, pairing up developers (and other staff, it turns out) will increase productivity as well. When they pair, developers are better synced up on most knowledge they need, learning how all parts of the system work with a built-in tutor in their pair. Keeping up to speed like this means the developers have still fewer meetings to go to, those ones where they learn about the new pagination framework that Kris made. Pairing helps with more than just knowledge maintenance. While it feels like there’s a “halving” of developers by pairing them up, as one of the original pair programming studies put it: “the defect removal savings should more than offset the development cost increase.” Pairs in studies over the past 20+ years have consistently written higher quality code and written it faster than solo coders.

Coupled with the product mindset to software that involves the whole team in the process from start to end, they’ll be up to speed on the use cases and customers. And, by putting small batches in place, the amount of up-front study needed (requiring meetings) will be reduced to bite-sized chunks.

It takes a long time to digest 300 pages

We’re going to need a lot more coffee to get through this requirements meeting.

The requirements process is a notorious source of wasteful meetings. This is especially true when companies are still doing big, up-front analysis to front-end agile development teams.

For example, at a large health insurance company, the product owner at first worked with business analysts, QA managers, and operations managers to get developers synced up and working. The product owner quickly realized that most of the content in the conversations was not actually needed, or was overkill. With some corporate slickness, the product owner removed the developers from this meeting-loop, and essentially /dev/null’ed the input that wasn’t needed.

Assign this story to management

Staff can try to reduce the number of meetings they go to (and start practices like pairing), but, to be effective, managers have the responsibility to make it happen. At Allstate, managers would put “meetings” on developers’ calendars that said “Don’t go to meetings.” When you read results like Allstate going from 20% productivity to 90% productivity, you can see how effective eliminating meetings, along with all their other improvements, can be on an organization.

If you feel like developers must go to a meeting, first ask how you can eliminate that need. Second, track it like any other feature in the release, accounting for the time and cost of it. Make the costs of the miserable visible.

This concept of attending fewer meetings isn’t just for developers. The same productivity outcomes can be achieved for QA, the product owners, operations, and everyone else. Once you’ve done this, you’ll likely find having a balanced team easier and possible. Of course, once you have everyone on a balanced team, following this principle is easier. Reducing the time your staff spends in meetings and, instead, increasing the time they spend coding, designing, and doing actual product management (like talking with end users!) gets you the obvious benefits of increasing productivity by 4x-5x.

If you feel you cannot do this, at least track the time you’re losing/using on meetings. A good rule of thumb is that context switching (going from one task to another) takes about 30 minutes. So, an hour-long meeting will actually take out 2 hours of an employee’s time: the hour itself plus a 30-minute switch on either side. To get a handle on how you’re choosing to spend your time, in reality, track these as tasks somehow, perhaps even adding in stories for “the big, important meeting.” And then, when you’re project tracking, make sure you actually want to spend your organization’s time this way. If you do: great, you’re getting what you want! More than likely, spending time doing anything but creating and shipping customer value isn’t something you want to keep doing.
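
If you want to see what that rule of thumb adds up to, here’s a minimal sketch in Python. The 30-minute context switch comes from the rule of thumb above; the meeting names, lengths, and attendee counts are made up for illustration, so plug in your own calendar.

```python
from dataclasses import dataclass

# Rule of thumb from the text: an estimate, not a measurement.
CONTEXT_SWITCH_MINUTES = 30


@dataclass
class Meeting:
    title: str
    scheduled_minutes: int
    attendees: int


def effective_minutes(meeting: Meeting) -> int:
    """Time lost per attendee: the meeting plus a context switch in and out."""
    return meeting.scheduled_minutes + 2 * CONTEXT_SWITCH_MINUTES


def total_person_hours(meetings: list[Meeting]) -> float:
    """Sum the effective time across every attendee of every meeting."""
    return sum(effective_minutes(m) * m.attendees for m in meetings) / 60


# Hypothetical week of meetings for one team.
meetings = [
    Meeting("The big, important meeting", scheduled_minutes=60, attendees=6),
    Meeting("Status sync", scheduled_minutes=30, attendees=4),
]

for m in meetings:
    print(f"{m.title}: {effective_minutes(m) / 60:.1f} hours per attendee")
print(f"Total: {total_person_hours(meetings):.1f} person-hours")
```

Treat the total as a story in your backlog: if you wouldn’t prioritize an 18 person-hour feature that ships nothing, you probably don’t want the meetings either.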

It may seem ridiculous to suggest that paying attention to time spent in meetings is even something that needs to be uttered. In my experience, management may feel like meetings are good, helpful, and not too onerous. After all, meetings are a major tool for managers to come to learn how their businesses are performing, discuss growth and optimization options, and reach decisions. Meetings are the whiteboards and IDEs of managers. Management needs to look beyond the utility meetings give them, and realize that for most everyone else, meetings are a waste of time.

For more on improving software in your organization check out my 49 pages in a fancy PDF on the topic.

De-shittifying Tech T-Shirts

I have a lot of tech t-shirts. Here’s an overview of my personal style and opinions. There’s a lot of politics in t-shirt selection, much of it good, still even more of it driven by aesthetics. I’m not seeking to win any points in those games (well, except maybe that all genders should have shirts that are designed for them), just telling you what I like.

Why? I get asked for input on t-shirts at least twice a year (often more). Here’s a URL for that input. And, I end up getting a lot of tech t-shirts. Thankfully, my mom really likes them, so about 2–3 times a year I give her a couple t-shirt grocery bags full of t-shirts, the shitty ones.

Less shit on the shirt

First, some general comments:

  1. I don’t like any shit on the back of the shirt, unless it’s a tiny brand name or URL right at the top.
  2. I don’t like those shirts with a big, sticky feeling print thing on them (with an exception for pure awesomeness as you’ll see in a couple of them). I think that means I like “screen-print” shirts.
  3. They, of course, have to be that super-soft material. Those “beefy-t” shirts go right into the plastic bag of shirts I give to my mom (well, actually, I just don’t pick them up unless I get tricked into doing so).
  4. I’m overweight — and I think most people who get tech t-shirts are (ducks)— so I don’t like those “slim fit” shirts. No one wants to see me act out the hit song, “My Humps.”
  5. You gotta have women sizes, of course. (Close followers will instantly notice that we don’t do that over at the podcast — need to add a card in Trello posthaste!)
  6. Pictures and designs instead of just words are good, but words are fine.
  7. In general, your company’s logo is crap on a shirt. And for God’s sake, don’t put it on the sleeve. Don’t put anything on the sleeve.
  8. Speaking of logos, those t-shirts where you list a bunch of sponsor logos on the back are garbage.
  9. Colors: this is tricky. I clearly like grey shirts instead of bright colors. Also, I generally don’t like black, as Dan Baskette put it: “I’m not at a Motley Crue concert, so don’t give me a black t-shirt.” Actual color (blue, green, red, etc.) is probably OK. But. I like grey.

Here’s a selection. These are not all, by far, t-shirts from tech conferences, but most of them could be and illustrate my taste:

The DevOpsDays Austin people always do well. The MSP shirt is good too.
The Pickle Rick shirt is an example of a bunch of shit on the front being OK because it’s awesome.
I like grey.
The Kansas City one is a good example of a bunch of shit on the front without being shitty.
Pretty basic, and both brand names, but both good. I have three of the Pivotal ones; they’re good.

Apparently, I buy a lot of (super-fucking-expensive-oh-my-God-I-should-just-be-a-dandy-fellow-and-shop-at-Nordstrom-oh-I’m-supporting-independent-artists-OK-then-here’s-my-wallet-and-ATM-PIN) Cotton Bureau shirts.

Bonus! Hoodies’n’shit

Occasionally, you get lucky and there’s a hoodie or jacket. First, hoodies and jackets are super-awesome to get at a conference. The OpenStack people are really good at this, and at Pivotal we’ve had several internal conferences that were awesome on this front too.

For me, hoodies and jackets have slightly different rules:

  1. I’m not in a motorcycle gang, so I don’t want any shit on the back.
  2. Same for the front.
  3. That said, there are some exceptions if it’s subtle. The two OpenStack hoodies I have are good examples of this.
  4. It’s OK to just discreetly put your company’s name and logo on the left breast.
  5. A thin-hoodie is actually pretty nice — I have an OpenStack hoodie that’s an excellent example of this, it’s a good “layering” thing versus the ultra thick ones.
  6. When it comes to fabric, I think “beefy-t” is fine.
  7. My Pivotal hoodie has a clever feature: the Pivotal name is embroidered on the rim of the hood. Nifty!

A selection (sadly, I don’t think anyone’s ever given me a jacket — you know who you are!):

Notice the subtle left breast brand, and the fun brand on the hood’s rim.
A good example of acceptable shit on the back. Putting city names of past conferences is also an ongoing, fun thing for OpenStack hoodies.
A thin hoodie, plus almost imperceptible shit on the back (it uses city names of previous conferences to write out “OpenStack”).
TaskTop has nice jackets, where a brand name up in the usual spot is fine. These were those somewhat hard-shell North Face jackets, or in that same style. Very nice.

T-shirt what thou wilt shall be the whole of the law

Like I said, it’s not like I have any opinions on the matter of tech conference t-shirts. Nope.

This is not me, but look how cool that dude looks! You can too!

Cloud Native Works in Government — the IRS, US Air Force, and contractors

“We have already slashed the time needed to implement new ideas by 70 percent while avoiding hundreds of millions of dollars in costs.” M. Wes Haga, Chief of Mission Applications and Infrastructure Programs for Air Force Research Lab

Slowly but surely, the US government is improving how they do software. Working at Pivotal, I’m lucky to see some of this change and talk with the people who’ve actually done it. Just as we’re seeing huge improvements in the private sector with Pivotal’s cloud native approach, we’re now seeing successful examples of transformation in government. As with any sweeping transformation trend, there are several early case studies that have proven change is possible in the government. The cloud native practices of agility, DevOps, and relying on cloud platforms are spreading through the US Federal government and it is encouraging and cool to see the outcomes they have enabled.

People often complain about red-tape, funding problems, staff’s unwillingness to change, and an overall defeatist attitude. These cases show not only that the cloud native approach works, giving agencies and the military new, modernized capabilities with clear, positive ROI, but also show that it’s possible. In fact, it’s not as hard as it may seem.

IRS

If you’ve seen my talks, this IRS story is one of my favorite cases of what it means to do “digital transformation.”

The IRS had been using call centers for many, many years to provide basic account information and tax payment services. Call centers are expensive and error prone: one study found that only 37% of calls were answered. Over 60% of people calling the IRS for help were simply hung-up on! With the need to continually control costs and deliver good service, the IRS had to do something.

In the consumer space, solving this type of account management problem has long been taken care of. It’s pretty easy in fact; just think of all the online banking systems and paying your monthly cellphone bills. But at the IRS, viewing your transactions had yet to be digitized.

When putting software around this, the IRS first thought that they should show you your complete history with the IRS, all your transactions. This confused users and most of them still wanted to pick up the phone. Think about what a perfect failure that is: the software worked exactly as designed and intended, it was just the wrong way to solve the problem. Thankfully, because they were following a small batch process, they caught this very quickly, and iterated through different versions of it until they hit on a simple finding: when people want to know how much money they owe the IRS, they just want to know how much money they owe the IRS. When this version of the software was tested, people didn’t need to use the phone.

Now, if the IRS had been on a traditional 12 to 18 month cycle (or longer!), think of how poorly this would have gone: the business case would have failed, and you would probably continue to have a dim view of IT and the IRS. But, by thinking about software correctly — in an agile, small batch way — the IRS did the right thing, not only saving money, but also solving people’s actual problems.

Digitization projects like this, however, can be hard in the government due to the all too well meaning process and oversight. The IRS has been working with Pivotal to introduce a very advanced agile approach, e.g., shipping frequently, pairing across roles, and intense user-testing. Along the way, they had to manage various stakeholders’ expectations, winning over their trust, interest, and eventually support for transforming how the IRS does their software.

This project has great results: after some onerous up-front red-tape transformation, they put an app in place which allows people to look up their account information, check payments due, and pay them. As of October 2017, there have been over 2 million users and the app has processed over $440m in payments.

Check out this interview with Andrea Schneider (IRS) & Lauren Gilchrist (Pivotal) for the story and details, and an older but helpful overview of the project from Andrea:

Keeping the Air Force Flying

It’s rare to get details on military IT projects, so these stories are particularly delicious as it’s a literal case of “digital transformation,” going from analog to digital.

The US military has for a long time realized that they need to rapidly respond to changes in the field, not only on a weekly or daily basis, but on an hourly basis. Software drives a huge amount of how the military operates now: “Everything we do in the military, and everything we do in combat, is now software based,” as Lt. Col. Enrique Oti put it. With so much reliance on software, when most IT projects take five to seven years to ship, there’s a bit of a crisis in how IT is done. “This idea of not taking action is not an option that the United States Army actually has,” said Army CIO Lt. Gen. Bruce Crawford in a recent talk.

Much can be blamed on the procurement process (and the associated needs of oversight), but overall the issue is putting a more agile approach to software in place. The Air Force has several projects under its harness that are showing the way.

One of them is a story of literally going from analog to digital. They’d been planning out refueling schedules in the Middle East with a large white board. While the staff were working earnestly, it took about 8 hours and, clearly, was not the ideal state for planning something as vital as refueling.

After working with Pivotal, they digitized this process and dramatically reduced the time it took to prepare the whiteboard. They shipped their first version in 120 days (an amazing speed for any organization, private or public sector). Even better, they now regularly ship new features each week, continually improving the system. Moving from shipping every 5 years to every week, adding in the ability to adapt to new needs and operational challenges means this piece of software is directly supporting and improving the overall mission.

Because they could schedule more precisely, they were also able to remove one tanker from regular usage each day (see at about 1h47m in this video), saving about a million dollars a day. The ROI on this project, clearly, was off the charts. In fact, they were able to make back their investment in this project in seven days, based on the fuel savings. They were also able to cut the staff needed dramatically, while at the same time improving the service and freeing up staff to work on other important missions and tasks.

Looking forward, this also opened up the possibility to integrate other data into this planning, and provide this schedule to other processes. But in a software-driven organization, there are plenty of other opportunities. They’re now working on seven more applications, including a dynamic targeting tool. More broadly, this approach to development reduces risks of all types, but especially blown-up budgets. As M. Wes Haga put it:

Previously, every time we added a new capability, we would have had to build, test, and deploy the entire IT stack. A mistake could cost $100 million, likely ending the career of anyone associated with that decision. A smaller mistake is less often a career-ender and thus encourages smart and informed risk-taking.

Contractors too…

“You gave me what I asked for, but not really what I wanted.”

Raytheon is with the program as well, having recognized the need to become more agile in its delivery practices. The software needs to evolve as quickly as possible; years-long contracts just won’t cut it. As one of Raytheon’s engineers put it: “employing Agile and DevOps is going to speed up the software lifecycle, getting new features into the hands of the men and women of the Armed Forces a lot quicker.”

They’ve been working with Pivotal to switch over to faster feedback cycles and apply DevOps practices to their software life-cycle.

Working with the Air Force, as with all these types of transformations, they started with one project, built up skills and knowledge, and have been expanding to other products. The first project was the Air Force’s Air and Space Operations Center Weapon System (AOC Pathfinder). They’re also working on one of the Air Force’s intelligence systems, the Distributed Common Ground System.

Software release cycle speed (from years to months, if not weeks) is important in these systems, but matching the evolving and emerging needs for those systems is equally — perhaps even more! — important. “The DevOps model allows our customers to ask for the products they really want,” Raytheon’s Quynh Tran said, “The results [are that] we are shortening deployment times and prioritizing work based on their needs. We’re going to be better at meeting their expectations…. Military users get their requests changed in months instead of years and see the results of continuous feedback.”

See also this interview with Keith Salisbury.

(Thanks to @dormaindrewitz who helped me track down many of the facts and figures above.)

Building trust with internal marketing, large and small

Most companies don’t realize the amount of work required to fully transform their approach to creating and caring for software. Scaling up the improvements learned and put into place by your initial teams relies on building trust and understanding in the overall organization. For whatever reason, most people in large organizations are resistant to change and, what with the frequent introduction of process improvement programs, skeptical of flavor-of-the-week syndrome. A large part of scaling up digital transformation, then, is internal marketing. And it’s a lot more than most people anticipate.

Beyond Newsletters

Once you nail down some initial, successful applications, start a program to tell the rest of the organization about these projects. This goes beyond the usual email newsletter mention, often quickly leading to internal “summits” with speakers from your organization going over lessons learned and advice for starting new cloud native projects.

You have to promote your change, educate people, and overall “sell” it to people who either don’t care, don’t know, or are resistant. These events can piggyback on whatever monthly “brown-bag” sessions you have and should be recorded for those who can’t attend. Somewhat early on, management should set aside time and budget to have more organized summits. You could, for example, do a half day event with three to four talks by team members from successful projects, having them walk through the project’s history and advice on getting started.

Building Trust

This internal marketing works hand-in-hand with starting small and building up a collection of successful projects. As you’re working on these initial projects, spend time to document the “case studies” of what worked and didn’t work, and track the actual business metrics to demonstrate how the software helped your organization. You don’t so much want to just show how fast you can now move, but you want to show how doing software in this new way is strategic for the business.

Content-wise, what’s key in this process is for staff to talk with each other about your organization’s own software, market, and challenges faced. I find that organizations often think that they face unique challenges. Each organization does have unique hang-ups and capabilities, so people in those organizations tend to be most interested in how they can apply the wonders of cloud native to their jobs, regardless of whatever success they might hear about at conferences or, worse, from vendors with an obvious bias. Hearing from each other often gets beyond this sentiment that “change can’t happen here.”

Once your organization starts hearing about these successes, you’ll be able to break down some of the objections that stop the spread of positive change. As Amy Patton at SPS Commerce put it, “having enough wins, like that, really helped us to keep the momentum going while we were having a culture change like DevOps.”

Winning over process stakeholders

The IRS provides an example of using release meetings to slowly win over resistant middle-management staff and stakeholders. Stakeholders felt uncomfortable letting these detailed requirements evolve over each iteration. As with most people who’re forced, er, encouraged to move from waterfall to agile, they were skeptical that the final software would have all the features they initially wanted.

While the team was, of course, verifying these evolving requirements with actual, in-production user testing, stakeholders were uncomfortable. These skeptics were used to the comfort of lots of up-front analysis and requirements, exactly spelling out which features would be implemented. To start getting over this skepticism, the team used their release meetings to show off how well the process was working, demonstrating working code and lessons learned along the way. These meetings went from five skeptics to packed, standing room only meetings with over 45 attendees. As success was built up and the organizational grape-vine filled with tales of wins, interest grew and with it, trust in the new system.

The next step: training by doing

As the organizations above and others like Verizon and Target demonstrate, internal marketing must be done “in the small,” like this IRS case and, eventually, “in the large” with internal summits.

Scaling up from marketing activities is often done with intensive, hands-on training workshops called “dojos.” These are highly structured, guided, but real release cycles that give participants the chance to learn the technologies and styles of development. And because they’re working on actual software, you’re delivering business value along the way: it’s training and doing.

These sessions also enable the organization to learn the new pace and patterns of cloud native development, as well as set management expectations. As Verizon’s Ross Clanton put it recently:

The purpose of the dojo is learning, and we prioritize that over everything else. That means you have to slow down to speed up. Over the six weeks, they will not speed up. But they will get faster as an outcome of the process.

Scaling up any change to a large organization is mostly done by winning over the trust of people in that organization, from individual contributors, to middle-management, to “leadership.” Because IT has been so untrustworthy for so many decades — how often are projects not only late and over-budget, but then anemic and lame when finally delivered? — the best way to win over that trust is to actually learn by doing and then market that success relentlessly.

This post is an early draft of a chapter in my book,  Monolithic Transformation.

Change is hard, but possible, or, It’s still the case that you should stop hitting yourself

In the corporate clip-art game, ain’t no one’s better than geralt.

Improving is never easy, and that’s certainly true when it comes to how large organizations improve how they do software. While it can seem like a curse, I’m lucky to talk with people at organizations who are struggling to improve. By the nature of the work Pivotal does, we spend a lot of time talking with organizations who want to be more agile, shift to a DevOps approach, and otherwise (to use a buzz-phrase) become “cloud native.”

As I’m fond of putting it, they just want to get better at software. The first step is to stop hitting yourself, as we’ve discussed before. But that’s just an eye-rolling bon mot, really. You actually have to do some work.

The road to better software is paved with white collar pain. It’s like (as I’m told) when you start working out: it hurts, for, like, several years, and then you sort of start to enjoy it and maybe can live 0.7 years longer.

Let’s look at a couple of those common pains.

The Pain of legacy process (aka “culture”)

Perhaps the most frequent question is something like:

“How do we reconcile [old processes we don’t like] with DevOps [or whatever new way of doing things we now want to do]?”

Well, that’s the 23% CAGR over 5 years question right there. I’d start with understanding what DevOps (or whatever process you want to switch to) is, and why it is (DevOps wants to ensure that you can deploy software weekly so you can always be improving it, and that it actually works [has uptime and resilience]).

At that point, you ask, “does our current process do that?” If not, then you have to get executives to change how the organization runs. There are no shortcuts or easy cuts, you just have to do it over the course of *years*.

In contrast, to “virtualize,” you sort of just install VMware and after a few years you have huge ROI and savings. (Granted, the truth of virtualization is that it ruffled all sorts of feathers in IT departments in the 2000s and people were all up in arms and chickens without heads running around and cats and dogs living together …we just forget all that ;>)

Put another way “you’re going to change and eliminate those processes. GET READY FOR A SURPRISE!”

“Come on, man! I got five kids to feed!”

Internal selling

One of the chief characteristics of large organizations is that you have to convince the organization to actually do anything. We have these visions that executives in large organizations can actually make the trains run on time, as it were. Nope.

Thus, when it comes to change, you have to spend much of your time doing internal marketing to sell it up the chain, to your peers, and to your own organization.

To my mind, the only ways to do internal marketing well are to either (1.) already be successful, or, (2.) get executives at other companies to tell you and your organization How They Did It And That It Worked.

The first is just a recursive noop (“success breeds success” and other such nonsense). Thankfully, when it comes to the second, there’s a lot of help nowadays, primarily in the form of these change agents who’ve gone through this themselves.

How did these executives succeed? By actually trying: picking small projects at first, learning the new way, succeeding (and hiding failures), then trying bigger things, and then telling people about it by making money. After all, success breeds success, right?

They also fire, er, “re-allocate” a lot of people, which they don’t talk about a lot in big glitzy keynotes but do over drinks in loud bars.

Of course, vendors (like myself) saying all this is pretty useless. We’re not trustworthy, after all, and are better at unicorn management and breeding programs than tending to the donkey ranches.

So, let me direct you to some “actual” people who’ve gone through all this:

There’s many more “talks” that aren’t recorded, you just have to find the right people and sit down with them to chat.

How do we migrate legacy software?

There are no good answers here. This is like someone with terminal lung cancer asking for help on stopping smoking. I suppose that’s gas-lighting…but if “legacy” is what’s holding you back it means you’re not managing technical debt well. Stick all your enterprise architects in a room (maybe even have an open bar!) and gently ask them, “so…what would you say you do around here?”

Updating legacy software is hard. The problem with “lift and shift” (which many vendors like to wrap fancy slides around) is that the “cloud native” benefits you’re looking to get are not only from the platform you run your software on, but from how the software itself is written (and then, obviously, how you manage and operate it in production).

Sure, you could just dump some three tier, MVC, hairball into a WAR file and spoot it out into some container orchestrated cloud thing…but all you’ll now have is a big lever that says “reboot” on it. With brute-migration there won’t be:

  1. All the resiliency advantages of little blue/green man deploys, canary parties, feature flag burnings, bulk-bin heads, etc.,
  2. The ability to start deploying weekly or even daily to improve how software is done (i.e., “you don’t operate in an agile way”)
  3. And, you know, you still have to make sure it all runs in production properly tomorrow.

Worse: the original problem still isn’t fixed. The next time you need to pay down your technical debt so you can improve/do things in a better way, you’ll still have the same old crap weighing you down, just with a different compression format and file extension.

I think there’s plenty of “hacks” to be had to extend legacy software’s value (that is, not having to spend time and money on updating/refactoring/rewriting it). I hear Oracle has some bridge-themed tools if you like your current parking arrangements, and there’s always queues, amirite? You could probably do a lot worse than doing some BCG matrix trust-falls to find your low-priority, little used apps and shipping them off to an MSP, AWS, or one of those data centers that’s sitting there purring like an old cat with crusty eyes and renal issues.

The point of Whatever The New Approach You Want To Do is: when you want to write software in the best way possible, do it the new way, not the way you’ve been doing.

There’s some more instructive help from people like my pals Kenny and Rohit, to be sure. You can find plenty of content like this that speaks to how to start picking away at the scabs of legacy. As with peeling off any scab, it’s important to know that the skin underneath it is healed, or you just re-bleed. That’s probably how you should treat migrating legacy applications.

Like I said: no good answers here, just lots of work and risk of bleeding.

That sounds great, but where the hell do I start?

Getting started is vexing. Essentially, you need to pick low-risk projects that are still “material” to the business. I just happen to have a draft of some advice here “from the streets” in this little excerpt from a new booklet I’m working on.

Good luck, be sure to tell us how it goes

If you’re struggling with stopping hitting yourself, the best next step is to find other people who’re struggling and to talk with them. You then need to “see it to believe it” and, then, really, just start trying. There’s no universal bromide or DVD you can install. There is, however, a way of thinking — a process even — you can apply, namely: learning and slowly changing towards the better.

(And, you know, there’s lots of people hiring if you find yourself a rat on a sinking ship.)

Getting Started — picking your first cloud native projects, or, Every Digital Transformation Starts with One Project

This post is pretty old and possibly out of date. There’s updates on this topic and more in my book, Monolithic Transformation.

Every journey begins with a single step, they say. What they don’t tell you is that you need to pick your step wisely. And there’s also step two, and three, and then all the n + 1 steps. Picking your initial project is important because you’ll be learning the ropes of a new way of developing and running software, and hopefully of running your business.

Choosing these first projects wisely is also important for internal marketing and momentum purposes: the smell of success is the best deodorant, as they say, so you want your initial projects to be successful. And…if they’re not, you want to quietly sweep them under the rug so no one notices. Few things will ruin the introduction of a new, proven way of operating into a large organization more than the foetid smell of failure. Following Larman’s Law, the organization will do anything it can — consciously and unconsciously — to stop change. One sign of weakness early, and your cloud journey will be threatened by status quo zombies.

Project picking peccadilloes

Your initial projects should be material to the business, but low risk. They should be small enough that you can quickly show success in the order of months, and also technically feasible for cloud technologies. These shouldn’t be “science projects” or automation of low value office activities: no virtual reality experiments or conference room schedulers (unless those are core to your business). On the other hand, you don’t want to do something too big, like “migrate the .com site.” As Christopher Tretina recounts Comcast’s initial cloud native ambitions:

We started out last year with a very grandiose vision. And it didn’t take us too long to realize we had bit off a little more than we could chew. So around mid-year, last year, we pivoted and really tried to hone in and focus on ‘what are just the main services we wanted to deploy that’ll get us the most benefit?’

Your initial projects should also allow you to test out the entire software lifecycle, all the way from conception, to coding, to deployment, to running in production. Learning is a key goal of these initial projects and you’ll only do that by going through the full cycle. As Home Depot’s Anthony McCulley describes the applications chosen in the first 6 or so months of their cloud native roll-out: “they were real apps, I would just say that they were just, sort of, scoped in such a way that if there was something wrong it wouldn’t impact an entire business line.” In Home Depot’s case, the applications chosen were projects like managing (and charging for!) late returns for tool rentals and running the in-store custom paint desk.

A special case for initial projects is picking a microservice to deploy. This is not as perfect a case as a full-on, human-facing project, but will allow you to test out cloud native principles. The microservice could be something like a fraud detection or address canonicalization service. This is one approach to migrating legacy applications in reverse order, a strangler from within!

Picking projects by portfolio analysis

There are several ways to select your initial projects following the above criteria. Many Pivotal customers use a method perfected over the past 25 years by Pivotal Labs called “discovery.” In the abstract, it follows the usual BCG matrix approach but builds in intentional scrappiness to ensure that you can quickly do a portfolio analysis with the limited time and attention you can secure from all the stakeholders. The goal is to get a ranked list of projects to do based on the organization’s priorities and the “easiness” of the projects.

First, gather all the relevant stakeholders. This should include a mixture of people from “the business” and IT side, as well as the actual team that will be doing the initial projects. This discovery session is typically led by a facilitator, usually a Pivotal Labs person familiar with coaxing a room through this process.

The facilitator will hand out stacks of sticky notes and markers, asking everyone to write down projects that they think are valuable. What “valuable” is will depend on each stakeholder. We’d hope that the more business minded of them would have a list of corporate initiatives and goals in their heads (or a more formal one they brought to the meeting). One approach used in Lean is to ask management “if we could do one thing better, what would it be?” and start from there, maybe with some five whys spelunking.

After writing down projects on sticky notes, the discovery process facilitator draws or tapes up a 2×2 matrix that looks like the following:

People in button up shirts prioritizing sticky notes.

The participants then put up their sticky notes on this matrix, forcing themselves not to weasel out and put the notes on the lines. Once everyone has done this, you get a good sense of projects that all stakeholders think are important, sorted by the criteria I mentioned above: material to the business (“important”) and low risk (“easy”).

If all of the notes are clustered in one quadrant (usually, in the upper right, of course), the facilitator will redraw the 2×2 lines within just that quadrant, forcing the decision and narrowing down to just the projects to “do now.” The process might repeat itself over several rounds. To force a ranking of projects you might also use techniques like dot voting, which will force the participants to really think about how they would prioritize the projects. At the end, you should have a list of projects, ranked by the consensus of the stakeholders in the room.

Like I said: “scrappy.”
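If it helps to see the bookkeeping, here is a minimal sketch of how the 2×2 placement and a round of dot voting could be turned into a ranked list. The project names, placements, and vote counts are all made up for illustration, and `rank_projects` is a hypothetical helper, not part of any Pivotal Labs tooling.

```python
from collections import Counter

# Hypothetical sticky notes: (project, important?, easy?) as placed on the 2x2.
placements = [
    ("tool rental late returns", True, True),
    ("custom paint desk", True, True),
    ("migrate the .com site", True, False),
    ("conference room scheduler", False, True),
]

# Hypothetical dot votes: one entry per dot a participant spent on a project.
dot_votes = [
    "tool rental late returns", "custom paint desk", "tool rental late returns",
    "migrate the .com site", "custom paint desk", "tool rental late returns",
]


def rank_projects(placements, dot_votes):
    """Keep only the "important and easy" quadrant, then order by dot count."""
    do_now = [name for name, important, easy in placements if important and easy]
    votes = Counter(dot_votes)
    # Most dots first; ties are broken alphabetically so the order is stable.
    return sorted(do_now, key=lambda name: (-votes[name], name))


ranked = rank_projects(placements, dot_votes)
print(ranked)
```

In a real session the sticky notes and the arguing in the room do this work; the point is only that the output is a short, ordered list drawn from the “important and easy” quadrant.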

Planning out the initial project

Of course, you may want to refine your list even more, but to get moving, the next step is to pick the top project and start breaking down what to do next. How you proceed here is highly dependent on how your product teams break down tasks into stories, iterations, and releases (or epics, sagas, or whatever cutesy terms you like for “bucket of stuff scoped at some hierarchical level with purposefully vague responsibility and temporal connotations”).

More than likely, following the general idea of a small batch process you’ll:

  1. Create an understanding of the user(s) and the “problems” they’re trying to solve with your software through personas and approaches like scenarios or Jobs to be Done,
  2. Come up with several theories for how those problems could be solved,
  3. Distill the work to code and test these into stories,
  4. Add in more stories for “non-functional” requirements (like setting up build processes, CI/CD pipelines, testing automation, getting the new ping-pong table set up, etc.),
  5. Arrange them into iteration sized chunks without planning too far ahead (lest you’re not able to adapt your work to the user experience and productivity findings from each iteration).

Crafting your hockey stick

Starting small ensures steady learning and helps contain the risk of a “fail fast” approach. But as you learn the cloud native approach better and string together a series of successful projects, you should expect to ramp up quickly. The below shows Home Depot’s ramp up in their first year:

This chart measures application instances in Pivotal Cloud Foundry, which do not map exactly to single applications. What’s important is the general shape and acceleration of this curve as they became more familiar with the approach and the platform.

Another Pivotal customer in the telco space started with about 10 unique applications at first and expanded to 100 applications just over half a year later. These were production applications used to manage millions of customer account and billing tasks.

How do you start: simple

It all sounds simple, and that’s part of the point. When learning something new, you want to start as simple as possible, but not simpler.

 

This post is pretty old and possibly out of date. There’s updates on this topic and more in my book, Monolithic Transformation.

These aren’t the ROI’s you’re looking for, or, “ROI: ¯\_(ツ)_/¯”

I have a larger piece on common objections to “cloud native” that I’ve encountered over the last year. Put more positively, “how to get your digital transformation started with agile, DevOps, and cloud native” or some such platitudinal title like that. Here’s a draft of the dread-ROI section.

The most annoying buzzkill for changing how IT operates (doing agile, DevOps, taking “the cloud native journey,” or whatever you think is the opposite of “waterfall”) is the ROI counter-measure. ROI is a tricky hurdle to jump because it’s:

  1. Highly situational and near impossible to properly prove at the right level — do you want to prove ROI just within the scope of the IT department, or at the entire business-level? What’s the ROI of missing out on transitioning Blockbuster to Netflix? What’s the ROI of a mobile app for a taxi company when Uber comes along? What’s the ROI for investing in a new product that may or may not work within three years, but will save the company’s bacon in five years?
  2. Compounded by the fact that the “value” of good software practices is impossible to predict. Drawing a causal line between “pair programming” and “we increased market-share by 3% in Canada” is hard. You can back-think a bunch of things like “we reduced defects and sped up code review time by pairing,” but does that mean you made more money, or did you make more money because the price of oil got halved?

In my experience, when people are trying to curb you on ROI, what they’re asking is “how will I know the time and money I’m going to spend on this will pay off and, thus, I won’t lose time and money? (I don’t want to look like a fool, you see, at annual review time)”

What they’re asking is “how do I know this will work better than what I’m currently doing or alternatives.” It also usually means “hey vendor, prove to me that I should pay you.”

As I rambled through last year, I am no ROI expert. However, I’ve found two approaches that seem to be more something than nothing: (1.) creating a business case and trusting that agile methods will let you succeed, and, (2.) pure cost savings from the efficiencies of agile and “cloud native.”

A Business Case

A business case can tell you if your approach is too expensive, but not if it will pay for itself because that depends on the business being successful.

Here, you come up with a new business idea, a product, service, or tweak to an existing one of those. “We should set up little kiosks around town where people can rent DVDs for $1 a day. People like renting DVDs. We should have a mobile app where you can reserve them because, you know, people like using mobile. We should use agile to do this mobile app and we’re going to need to run it somewhere, like ‘the cloud.’ So, hey, IT-nerds, what’s the ROI on doing agile and paying for a cloud platform on this?”

In this case, you (said “IT-nerds”) have some externally imposed revenue and profit objectives that you need to fit into. You also have some time constraints (that you’ll use to push back on bloated requirements and scope creep when they occur, hopefully). Once you have these numbers, you can start seeing if “agile” fits into it and if the cost of technology will fit your budget.

One common mis-step here is to think of “cost” as only the licensing or service fees for “going agile.” The time it takes to get the technology up and running and the chance that it will work in time are other “costs” to account for (and this is where ROI for tech stuff gets nasty: how do you put those concerns into Excel?).

To cut to the chase, you have to trust that “agile” works and that it will result in the DVD rental mobile app you need under the time constraints. There’s no spreadsheet friendly thing here that isn’t artfully dressed up qualitative thinking in quantitative costumes. At best you can point to things like the DevOps reports to show that it’s worked for other people. And for the vendor expenses, in addition to trusting that they work, you have to make sure the expenses fit within your budgets. If you’re building a $10m business, and the software and licensing fees amount to $11m, well, that dog won’t hunt. There are some simple, yet helpful numbers to run here like the TCO for on-premises vs. public cloud fees.
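The “does it fit the budget” arithmetic is about as simple as spreadsheets get. Here is a rough sketch: the $10m and $11m figures come from the example above, while the TCO inputs (up-front cost, annual fees, a three year horizon) are invented placeholders, and both function names are hypothetical.

```python
def fits_budget(expected_revenue, costs):
    """Return (total cost, whether that total leaves anything over)."""
    total = sum(costs.values())
    return total, total < expected_revenue


def total_cost_of_ownership(upfront, annual, years):
    """Simplistic TCO: one-time spend plus recurring fees over the horizon."""
    return upfront + annual * years


# The $10m business with $11m in software and licensing fees.
total, ok = fits_budget(10_000_000, {"software and licensing fees": 11_000_000})
print(total, ok)

# Made-up inputs comparing on-premises vs. public cloud over three years.
on_prem = total_cost_of_ownership(upfront=2_000_000, annual=500_000, years=3)
public_cloud = total_cost_of_ownership(upfront=0, annual=1_000_000, years=3)
print(on_prem, public_cloud)
```

None of this captures the nastier “costs” mentioned earlier, like the time to get the technology running or the chance it won’t work in time. It only tells you when a plan is obviously too expensive.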

Of course, a major problem with ROI thinking is that it’s equally impossible to get a handle on competing ways to solve the problem, esp. the “change nothing” alternative. What’s the ROI of how IT currently operates? It’d be good to know that so you can compare it to the proposed new way.

If you’re lucky enough to know a realistic, helpful budget like this, your ROI task will be pretty easy. Then it’s just down to horse trading with your various enterprise sales reps. Y’all have fun with that.

Efficiency

Focus on removing costs, not making money.

If you’re not up for the quagmire of business case driven ROI, you can also discuss ROI in terms of “savings” the new approach affords. For things like virtualizing, this style of ROI is simple: we can run 10 servers on one server now, cutting our costs down by 70–80% after the VMware licensing fees.

Doing “agile,” however, isn’t like dropping a new, faster, and cheaper component into your engine. Many people I encounter in conference rooms think about software development like those scenes from 80s submarine movies. Inevitably, in a submarine movie, something breaks and the officer team has to swipe all the tea cups off the officer’s mess table and unfurl a giant schematic. Looking over the dark blue curls of a thick Eastern European cigarette, the head engineer gestures with his hand, then slams a grimy finger onto the schematics and says “vee must replace the manifold reducer in the reactor.”

Solving your digital transformation problems is not like swapping “agile” into the reactor. It’s not a component based improvement like virtualization was. Instead, you’re looking at process change (or “culture,” as the DevOps people like to say), a “thought technology.” I think at best what you can do is try to calculate the before and after savings that the new process will bring. Usually this is trackable in things like time spent, tickets opened, number of staff needed, etc. You’re focusing on removing costs, not making money. As my friend Ed put it when we discussed how to talk about DevOps with the finance department:

In other words, if I’m going to build a continuous integration platform, I would imagine you could build out a good scaffolding for that and call it three or four months. In the process of doing that, I should be requiring less help desk tickets get created so my overtime for my support staff should be going down. If I’m virtualizing the servers, I’ll be using less server space and hard drive space, and therefore that should compress down. I should be able to point to cost being stripped out on the back end and say this is maybe not 100% directly related to this process, but it’s at least correlated with it.

In this instance, it’s difficult to prove that you’ll achieve good ROI ahead of time, but you can at least try to predict changes informed by the savings other people have had. And, once again, you’re left making a leap of faith that qualitative anecdotes from other people will apply to you.
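A before-and-after comparison like the one Ed describes can be as plain as the sketch below. The metrics and every number in it are hypothetical; you would plug in whatever you actually track (tickets, overtime, servers, and so on).

```python
def percent_saved(before, after):
    """Percentage reduction from the old way to the new way."""
    return round(100 * (before - after) / before, 1)


# Hypothetical measurements taken before and after the process change.
before = {"help desk tickets per month": 400, "overtime hours per month": 120, "servers": 10}
after = {"help desk tickets per month": 300, "overtime hours per month": 90, "servers": 1}

savings = {metric: percent_saved(before[metric], after[metric]) for metric in before}
for metric, pct in savings.items():
    print(f"{metric}: {pct}% lower")
```

As Ed says, a drop like this is correlation, not proof, but it is something concrete to point at when finance asks.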

For example, part of Pivotal’s marketing focuses on showing people the worth of buying a cloud platform to support an agile approach to software delivery (we call that “cloud native”). In that conversation, I cite figures like this:

  • Developers [at Allstate] used to spend only 20% of their time coding and now it’s closer to 90%
  • A federal government agency wanted to save money on call-centers by converting the workflow to a web app. They’d scheduled to complete the project in 9 months, but after converting to agile delivered it in 6 weeks.
  • When doing agile, because testing is pushed down to the team level and automated, you can expect to reduce your traditional QA spend. In fact, many shops on the cloud native journey have largely eliminated their QA department as a stand-alone entity.
  • ING’s savings from transforming to a more cloud-y IT setup: “Investment of €200m to further simplify, standardize and automate IT; Decommissioning 40% of application landscape; Moving 80% of applications to zero-touch private cloud.” Resulting in savings of €270m starting in 2018.
  • From Orange: “Who isn’t happy to continue working when projects are delivered on average six times faster than with a waterfall approach?”
  • “[R]espondents from a recent government study who have already used PaaS say they save 47% of their time, or 1 year and 8 months off a 3.5 year development cycle. For those who have not deployed PaaS, respondents believe it can shave 31% off development time frames and save 25% of their annual IT budget, a federal savings of $20.5 billion.”
  • 14 months down to 6 months, 16 staff down to 8 staff: “[w]hen planning the first product developed on Pivotal Cloud Foundry, CoreLogic allocated a team of 12 engineers with four quality assurance software engineers and a management team. The goal was to deliver the product in 14 months. Instead, the project ultimately required only a product manager, one user experience designer and six engineers who delivered the desired product in just six months.”
  • “We did an analysis of hundreds of projects over a multiyear period. The ones that delivered in less than a quarter succeeded about 80% of the time, while the ones that lasted more than a year failed at about the same rate. We’re simply not very good at large efforts.”
  • From a 1999 study: “software projects that use iterative development deliver working software 38% sooner, complete their projects twice as fast, and satisfy over twice as many software requirements.”
  • After switching over to “the new way,” one large retailer has already seen an 80% reduction in cycle time and scope, with cycle time falling from 123 days to 23 days.
  • One large insurance company can now manage 1,500 apps with just two operators. There were many, many more before that. Another large bank could manage 145 apps with just 2 operators, and so on.

In most of these cases, once you switch over to the new way, you end up with extra capacity because you can now “do IT” more efficiently. Savings, then, come from what you decide to do with that excess capacity: (a.) doing more with the new capacity like adding more functionality to your existing businesses, creating new businesses, or entering new markets, or, (b.) if you don’t want to “grow,” you get rid of the expense of that excess capacity (i.e., lay off the excess staff or otherwise get them out of the Excel sheet for your business case).

But, to be clear, you’re back into the realm of imagining and predicting what the pay-off will be (the “business case” driven ROI from above) or simply stripping out costs. It’s a top-line vs. bottom-line discussion. And, in each case, you have to take on faith the claims about efficiencies, plus trust that you can achieve those same savings at your organization.

With these kinds of numbers and ratios, the hope is, you can whip out a spreadsheet and make some sort of chart that justifies doing things the new way. Bonus points if you use Monte Carlo inspired ranges to illustrate the breadth of possibilities instead of stone-cold line-graph certainty.
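For those bonus points, here is a minimal Monte Carlo style sketch. It assumes you can guess a pessimistic, likely, and optimistic savings rate and treats them as a triangular distribution; the $5m annual cost and the 5%, 20%, and 40% rates are invented for illustration, and `simulate_savings` is a hypothetical helper.

```python
import random


def simulate_savings(current_annual_cost, low, likely, high, runs=10_000, seed=42):
    """Sample savings rates and report the 10th, 50th, and 90th percentiles."""
    rng = random.Random(seed)  # fixed seed so the chart is reproducible
    outcomes = sorted(
        current_annual_cost * rng.triangular(low, high, likely) for _ in range(runs)
    )

    def percentile(p):
        return outcomes[int(p * (runs - 1))]

    return percentile(0.10), percentile(0.50), percentile(0.90)


# Invented inputs: $5m annual cost, savings somewhere between 5% and 40%.
p10, p50, p90 = simulate_savings(5_000_000, low=0.05, likely=0.20, high=0.40)
print(f"Savings range: ${p10:,.0f} to ${p90:,.0f} (median ${p50:,.0f})")
```

The output is a range rather than a single number, which is a more honest picture of what you actually know.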

Everything is up when there’s no bottom

As an added note of snark: all of these situations assume you know the current finances for the status quo way of operating. Surely, with all that ITIL/ITSM driven, mode 1 thinking you have a strong handle on your existing ROI, right? (Pause for laughs.)

More seriously, the question of ROI for thought technologies is extremely tricky. In that conversation on this topic that I had with Ed last year, the most important piece of advice was simple: talk with the finance people more and explain to them what’s going on.

That’s the most effective (and least satisfying!) advice you get about any of this “doing things the new way” change management prattle. Whether it’s auditors, DBAs, finance, PMO people, or whoever is throwing chaff in your direction, just go and talk with them. Understand what it is they need, why they’re doing their job, and bring them onto the team instead of relegating them to the role of The Annoying Others.

Check out another take on this over in my September 2016 column at The Register.

Please teach my kid Spanish, or, What have the Romans ever done for us?

I got a survey from my son’s school district about foreign language preferences. I was predictably shocked that Spanish wasn’t listed in the rankings we were asked to do:

To be post-PC, I suppose Spanish isn’t a foreign language in the US, esp. in Texas. However, I wanted to drive home my point so left this comment.

I really, really would like my son (and daughter when she’s old enough) to be taught Spanish in school. I could go look up the stats, but given our geography (our hemisphere, really), Spanish and English seem like the most useful, functional languages.

I only know terrible gringo-Spanish and I wish I knew it much better. When I went to school, we were deathly afraid that the Japanese were going to take over, so I took Japanese in junior high, then French in high school. I finally wised up and took Spanish in college and now speak my crappy Spanish and barely understand it when others speak it. Spanish is such a valuable language for not only everyday life, but also understanding, empathizing, and therefore beneficially living with all our Spanish-speaking fellow citizens. I’m sure Chinese will be handy, which is why I ranked it as first in the options given, but if I could rank Spanish as my preferred #1–10, then I’d list Chinese as #11, followed by the ordering I had.

Latin seems like a waste, and I’m incredibly jingoistic about Western culture. If a dead language had to be taught, I’d rather Ancient Greek was taught so that kids could read Greek texts in their original state: I don’t think the Romans did that much that’s beyond remixes of Greek things (Marcus Aurelius and Lucretius aside) — I mean, can you imagine reading Plato, Aristotle, Heraclitus, and Homer in the original Greek? It’d be amazing!

Apologies for the “open letter,” please read it in the appropriate voice.

7 BigCo Anti-patterns — white collars doing it wrong

Awesome corporate clipart from geralt.

To wrap up this little run of Thriving in BigCo’s posts, here’s a quick listing of seven, often sad and unhelpful bad practices I’ve noticed people and large organizations doing. Try to avoid them.

  1. Sad bag of slides — you see a person use the same basic slides in their presentations over and over again, trying to argue for the same point each time they get an audience. They may put together new pitches, slightly masked, but you end up seeing that same old stuff. At some point, you can time how long it will take this person to suggest their idea, pull out their sad slides and go for it.
  2. No free coffee — the company doesn’t even understand the basics of the culture they want, e.g., software developers need free coffee and will be more prone to leave without it. Cf. “penny-wise, pound foolish.”
  3. Strategy by gratuitous differentiation — thinking that the way to compete is to do something your competitors don’t do…without understanding why they don’t do it. For example, back in the “what do we do about cloud?!” days (~2010–2014 for the first wave) many large IT vendors would want to compete with Amazon’s cloud services by being “more enterprise.” Instead of just jumping in the first blue-cloud-ocean you find, you need to carefully understand why Amazon may not do these “enterprise” things, e.g., they cost a lot more and kill margin.
  4. “What, you don’t like money?” — despite a business throwing off a lot of cash, you don’t want to acquire it. Like worrying about buying an otherwise successful software company because they have a large mainframe business…a business that generates good revenue at large margins with a captive market, so you should keep it and run it if you like money.
  5. 40/4 = 10, or, “4-up” — to reduce the number of slides in a presentation, you put the content of four slides on one. This “reduces” your 40 slide presentation to just 10 slides!
  6. Pay people to ignore them — BigCo’s love hiring new employees, paying them well, and then rarely listening to them. Instead, you hire outsiders and consultants who say similar things, but are listened to. In fact, the first task of any good management consultant team is to go interview all those bright, but ignored, employees you have and ask them what they’d do. The lesson is to track how many ideas come internally vs. externally and, rather than just blame your people for low internal idea generation, ask yourself if you’re just not listening.
  7. Death by Sync — the price of communication is high and you have to be judicious about how far in debt you go. Instead of doing this, most companies spend lots of time “syncing” with other groups and people in the company rather than just doing things. Part of what upper management needs to do is establish a culture and processes that prevent death-by-sync. Also known as “hanging yourself with pre-wire.”

And, if you missed them, here are the longer ones:

“Nope, that’s not a problem.”

One of the things that separates a more senior person from a more junior person is the statement, “nope, that’s not a problem.” At the very least, the statement may be more like, “that is a problem, but we’re not going to solve it now. Ship it!” I’ve been through this several times, on both sides of the seniority curve and now respect people who can wisely decide when not to care about a problem and go on with the work.

For example, let’s say you’ve done a fair amount of work on the deck for The Big Meeting, and at the 11th hour you found that something is slightly wrong: you’ve mis-categorized one of the 30 sub-categories in your market sizing. The worker bees are all upset that there’s an error and are having a call to see how to redo “everything.” A good manager can come into a situation like this and say, “yeah, it’d be too much chaos to change it now. We’ll do it later. Just put a little foot-note down there if you think it’s that bad. Ship it!” The worker bees are somewhat flummoxed, or maybe relieved: but that’s the power of management, to decide to go ahead with it instead of having to make it perfect.

There is some amount of risk to allowing such “bugs” to go through the process, but sometimes — oftentimes! — the risk is minimal and worth taking.

Asking questions often leads to more work, for you

From geralt.

Most of what we do as white collar workers is help our organization come to a decision. What new geographies to sell our enterprise software and toothpaste in, what pricing to make our electric razors and cranes, which people to fire and which to promote, or how much budget is needed according to the new corporate strategy. Even in the most cynical corporate environment, asking questions — and getting answers! — is the best, primary way to set the stage for making a decision.

You have to be careful, however, of how many questions you ask, and on what topic. If you ask too many questions, you may find that you’ll just create more work for yourself. Before asking just any old question, ask yourself if you’re willing to do the work needed to answer it…because as the asker, you’ll often be tasked with doing that work. We see this in life all the time: you ask someone “want to go to lunch?” and next thing you know, you’re researching all the restaurants within five miles that have gluten free, vegan, and steak options.

I’ve seen countless “staffers” fall prey to this in meetings with managers who’re putting together plans and strategies. The meeting has spent about 30 minutes coming up with a pretty good plan, and then an eager staffer pipes up and suggests 1–3 other options. The manager is intrigued! Quickly, that staffer is asked to research these other options, but, you know, the Big Meeting is on Monday, so can you send me a memo by Sunday morning?

In some situations, this is fine and expected. But in others, conniving management will just use up as much of your energy as possible: it’s always better to have done more research, so why not let that eager staffer blow up their Saturday to have more back-up slides? Co-workers can also let you self-assign homework into burnout if they find you annoying: you’ll notice that when you, the eager staffer, pipe up, they go suddenly quiet and add no input.

As always, you have to figure out your corporate culture. But, just make sure that before you offer up alternatives and otherwise start asking The Big Questions, you’re ready to back up those questions by doing the extra homework to answer them yourself.

Avoid fence painting by assigning homework

Early corporate culture pioneers.

One of the more eye-rolling tactics of white collar workers is what I call “fence painting”: an employee somehow gets someone else to do work for them. This can be as simple as coasting off budget, but the more insidious practice is to get others outside your chain of command to do work for you.

Most of us have experienced this: days after The Big Meeting you suddenly think “why am I up at 11pm working on this report for Scopentholler? I don’t even work in that division!”

What you, the whitewash encrusted white collar worker, want to do here is somehow still seem “up for anything” and fully capable, and yet not end up painting Scopentholler’s fences. I suggest assigning “homework” as the first rung of filters. If someone wants your input, or wants you to somehow get involved in a project, come up with some mini-project they need to do first. Have them write a brief for you, put on a meeting to bring you up to speed, do a report about how the regional sales have been going, or otherwise force them to do some “homework.”

Homework filters and prioritizes

Assigning homework does two things:

  1. It gauges their commitment to getting you involved, filtering out lazy-delegators. If all this person is looking to do is get you to do their work, it’s highly unlikely that they’ll do additional work. Mysteriously, the importance of you being involved in this project will disappear.
  2. If they do the homework, you get more information (one of the major currencies of corporate culture) and you can better gauge if it’s actually worth your time to get involved. If the quality of the “homework” is good, you’re interested in the work, and it aligns with your responsibilities, then you should probably consider the original fence painting task.

I’ve seen assigning homework cut out a huge amount of fence painting work, for me and others I’ve observed doing this. Also, once it’s known that you assign homework, people will stop preying on you: “oh, don’t ask Crantouzok to get involved, you’ll just have to make another deck before she lifts a finger!”

Of course, if you’re pure of heart and mind, you can always just be direct and say “that’s low in my priority queue and not really my responsibility.” But, as with all thriving in BigCo tips here, first make sure the BigCo you’re working in is equally pure of heart and mind.

Dead Horse Points

In corporate meetings, oftentimes one person figures out a problem and comes up with a solution. Equally often, multiple people in the meeting like to reiterate the point in their own words, adding 5–10 minutes more to the meeting.

Once the epiphany has happened and the decision is made, everyone should just close the issue and move on. No need for people to comment on it more.

For example, in one company I worked for we were discussing a software product name. The project had been called “APM.” But it turns out, it wasn’t an APM product, it was just exposing instrumented metrics in the software. This is, in itself, incredibly valuable, but not full blown APM. Someone initially pointed out, “we shouldn’t call that APM,” and everyone agreed.

Then 3 other people chimed in with their retelling of this point, basically embellishing and rephrasing the point. In most corporate meetings, and I’d argue in the never-ending meeting of Slack channels and email, there’s no need for all that extra talk after a realization and decision is made. Someone needs to pipe up and say “what’s the next issue?” or close out the meeting.

A presentation is just a document that’s been printed in landscape mode

Slides must stand on their own

Much presentation wisdom of late has revolved around the actual event of a speaker talking, giving the presentation. In a corporate setting, the actual delivery of the presentation is not the primary purpose of a presentation. Instead, a presentation is used to facilitate coming to a decision; usually you’re laying out a case for a decision you want the company to support. Once that decision is made, the presentation is often used as the document of record, perhaps being updated to reflect the decision in question better.

As a side-note, if your presentation doesn’t argue for a specific, “actionable” decision, you’re probably doing it wrong. For example, don’t just “put it all on the table” without suggesting what to do about it.

Think of presentations as documents which have been accidentally printed in landscape and create them as such. You will likely not be given the chance to go through your presentation from front to end like you would at a conference. You’ll be interrupted, go back and forth, and most importantly, end up emailing the presentation around to people who will look at it without you presenting.

You should therefore make all slides consumable without you being there. This leads to the use of McKinsey titles (titles that are one-liners explaining the point you’re making) and slides that are much denser than conference slides. The presentation should have a story-line, an opening summary of the points you want to make, and a concluding summary of what the decision should be (next steps, launching a new project, the amount needed for your budget, new markets to enter, “and therefore we should buy company X,” etc.).

This also gives rise to “back-up” slides which are not part of the core story-line but provide additional, appendix-like information for reference both during the presentation meeting and when others look at the presentation on their own. You should also put extensive citations in footnotes with links so that people consuming the presentation can fact check you; bald claims and figures will be defeated easily, nullifying your whole argument to come to your desired decision.

Also remember that people will take your slides and use them in other presentations; this is fine. And, of course, if successful, your presentation will likely be used as the document of record for what was decided and what the new “plan” was. It will be emailed to people who ask what the “plan” is and it must be able to communicate that accordingly.

Remember: in most corporate settings, a presentation is just a document that has been printed in landscape mode.

See more “surviving in BigCo’s” tips in this round-up post.

Getting Digital Transformation Wrong, Software Development Edition

Just figuring out how to piece it all together.

“Digital transformation,” we’re all supposed to be doing that, right? Most conversations I get involved in nowadays amount to “how do we do that?” and “how’s it going for other organizations?” When it comes to the custom-written software part of whatever “digital transformation” is, here are a few common “anti-patterns” I’ve been noticing:

1. Products, not projects.

Management has to change from a project-driven mindset to a product-driven one. Instead of annual budgets and plans, they’ll need to move to the weekly/monthly loop of shipping code, observing it, improving it, etc. I suspect this idea of “uncertainty” in long-term plans will seem baffling to most. Of course, the “uncertainty” is always there in long-term planning: you have the uncertainty of ever being successful — just because you have a plan doesn’t mean…well…anything when it comes to success in software.

2. Accept that you have no idea what you’re doing, and build a system that helps you figure it out.

A “waterfall” approach (plan, code, deliver, done) is very fragile when it comes to unexpected events and changes (e.g., executive and market whim). This is why, classically, development teams do not like scope and requirements documents changed. Changing and adding features “in the middle” of projects is very damaging as it screws up the order of work, budgets, and innumerable other things. It’s like if halfway through assembling an Ikea bookshelf you decided you wanted a desk instead.

Companies get in a whiplash vortex where they’re constantly late and over budget because they keep changing the plan (bookshelf-cum-desk). They never get anything done because they’re using the wrong process. This is where the idea of “small batches,” mixed with “code must always be shippable,” comes into play to reduce risk. If you do small batches, you can change your mind every week…in a less painful way than in a waterfall mentality. (Pro-tip: it’s actually not a good idea to change your idea much, but, instead, experiment and use this feedback loop to explore new ideas and features. “Pivoting” is fine early on when you’re trying to discover your product/market fit, but once you do, you don’t want to change your mind too much.)

3. “Good cooking takes time. If you are made to wait, it is to serve you better, and to please you.”

Most big companies — rather, people managing those companies — like to rush things and spend zero resources on them, where resources means “time, money, and corporate attention.” Software is hard work that must be studied and understood. That’s why we pay software people a lot. Management often sets “the game” incorrectly. The first thing they do is rush the work and create really short project timelines for too much stuff. “Creating a sense of urgency” can be more damaging than helpful.

Instead of lighting fires under everyone’s collective asses, management needs to start small, painfully small.

Companies also often don’t want to spend any incremental budget, and their IT Service Management practices treat computational resources as scarce, static, and expensive (Mainframe, Unix, Windows) rather than infinite, fungible, and cheap (cloud, Linux, etc.): you have to wait months to get a server. Once you layer in audit, compliance, and security (other resources that are treated as scarce and have endless downside/risk if something goes wrong [you get hacked!] and little upside to speeding it up), it delays things even further.

In short, management has framed the whole process of “doing software” in a way that’s slow and expensive. Thus, by (un)intention, doing software is slow and expensive.

(The next item is a consequence of too much resource control: the unwillingness to train and spend time on “the factory.”)

4. “You’re doing it wrong.”

Many of the productive practices in software (like using version control, doing CI and CD, writing tests before code, A/B testing, strict automation, etc.) seem like a lot of unnecessary work until you’ve experienced the benefits, mostly in the medium and the long term. It’s always easier to just SCP code onto a server, or demo your application on someone’s laptop instead of doing the proper CI/CD pipeline and automation modeling on a real production environment. I’m still shocked at this, but most large organizations follow very few, if any, good software development practices:

From: “Town Hall: Agile in the Enterprise,” Mike West, Nathan Wilson, Thomas Murphy, Dec 2015, Gartner AADI US conference.

Fixing this is a tedious, “hand-to-hand” battle: OK, so, start writing tests first… fully automate your builds and get a proper environment that you can deploy to at will, etc. That’s likely a good 3–6 months of calendar time right there (for training, screwing up, then re-training, and also building all your own platforms if you don’t just use off-the-shelf workflows) before you can start really getting into the flow of shipping code.
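
To make “start writing tests first” concrete, here’s a minimal sketch in Python. Everything in it is made up for illustration: the `shipping_cost` function, its threshold, and its flat rate are hypothetical and don’t come from any real project. The tests are written first and describe the behavior you want; the function below them is just enough code to make them pass. A call like `run_tests()` is the kind of thing an automated build would execute on every commit.

```python
import unittest


# Step 1: write the tests first. They describe the behavior we want from a
# hypothetical shipping_cost() function before that function exists.
class ShippingCostTest(unittest.TestCase):
    def test_orders_at_or_over_threshold_ship_free(self):
        self.assertEqual(shipping_cost(order_total=60.0), 0.0)

    def test_small_orders_pay_flat_rate(self):
        self.assertEqual(shipping_cost(order_total=20.0), 5.0)


# Step 2: write just enough code to make the tests pass.
# The threshold and flat rate are illustrative assumptions.
def shipping_cost(order_total, free_threshold=50.0, flat_rate=5.0):
    """Return the shipping charge for an order total."""
    return 0.0 if order_total >= free_threshold else flat_rate


# Step 3: run the suite automatically, the way a build server would.
def run_tests():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ShippingCostTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
print("build passes" if result.wasSuccessful() else "build fails")
```

The point is less the code than the habit: once the tests exist and run on their own, changing your mind next week is a lot less scary.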

5. Innovation takes a lot more time than you think.

Corporate managers and product owners don’t understand how much work is involved in innovation. In my experience, “management” thinks that their dictates and “requirements” are clear and straightforward. They think they’re dealing with the “known knowns” quadrant of Rumsfeldian knowledge, but at best they’re dealing with “known unknowns,” and likely deal with “unknown unknowns”…about their own wants and desires!

For example, as a requester of software, you often have no idea what you actually want until you see it several times. This means you need to set up a process to continually check in on progress (e.g., demos every 1–2 weeks with the option to change direction) and, more importantly, be involved: management must “spend” their attention on the software development process. And at the same time, you have to resist the urge to manage by table flipping and instead be happy “learning by failing.”

6. Technical debt and legacy are rarely addressed, let alone quantified.

Most any large organization will have a pile of legacy applications and processes. Most of this will be riddled with technical debt, if only in parts of the system that (a.) no one really understands and (b.) have no automated tests. This means, building on Michael Feathers’s Legacy Code Dilemma, that it’s too high risk to touch/change your legacy code, let alone do it quickly. Beyond just modifying those systems, this can include integrating with systems as well, usually highly customized ERP systems that any business needs to work with to actually do work. While it’s possible to “refactor” code, I’m becoming more and more of the opinion that at best you quarantine it (or do some “strangler pattern” mumbo jumbo if you’re really good), or just put it all on a piece of ice, light it on fire, and shove it off into the ocean.

7. Fix the Meatware.

As I’m ever going on about, many of these are meatware problems: technology is often just fine, it’s the people that need fixing. The confounding thing with this issue (how do you transform large organizations to [A|a]gile?) is that all actors in this system (except, sometimes, “top management”) know exactly what to do and what’s wrong. There’s some sort of complex game theory going on that ruins the easiness of fixing the problem, and everyone ends up becoming expert at telling you why nothing can improve instead of figuring out how to do better. I think that’s something of what Andrew’s always on about with his prisoner’s dilemma and Nash equilibrium who walk into a stone cutter bar.

For a more recent, detailed study on this topic, check out my book Monolithic Transformation.

Eventually, to do a developer strategy your execs have to take a leap of faith

A kingmaker in the making.

I’ve talked with an old colleague about pitching a developer-based strategy recently. They’re trying to convince their management chain to pay attention to developers to move their infrastructure sales. There’s a huge amount of “proof” and arguments you can make to do this, but my experience in these kinds of projects has taught me that, eventually, the executive in charge just has to take a leap of faith. There’s no perfect slide that proves developers matter. As with all great strategies, there’s a stack of work, but the final call has to be pure judgement, a leap of faith.

“Why are they using Amazon instead of our multi-billion dollar suite?”

You know the story. Many of the folks in the IT vendor world have had a great, multi-decade run in selling infrastructure (hardware and software). All of a sudden (well, starting about ten years ago), this cloud stuff comes along, and then things look weird. Why aren’t they just using our products? To cap it off, you have Apple in mobile just screwing the crap out of the analogous incumbents there.

But, in cloud, if you’re not the leaders, you’re obsessed with appealing to developers and operators. You know you can have a “go up the elevator” sale (sell to executives who mandate the use of technology), but you also see “down the elevator” people helping or hindering here. People complain about that SOAP interface, for some reason they like Docker before it’s even GA’ed, and they keep using these free tools instead of buying yours.

It’s not always the case that appealing to the “coal-facers” (developers and operators) is helpful, but chances are high that if you’re in the infrastructure part of the IT vendor world, you should think about it.

So, you have The Big Meeting. You lay out some charts, probably reference RedMonk here and there. And then the executive(s) still isn’t convinced. “Meh,” as one systems management vendor exec said to me most recently, “everyone knows developers don’t pay for anything.” And then, that’s the end.

There is no smoking gun

If you can’t use Microsoft, IBM, Apple, and open source itself (developers like it not just because it’s free, but because they actually like the tools!) as historic proof, you’re sort of lost. Perhaps someone has worked out a good, management consultant strategy-toned “lessons learned” from those companies, but I’ve never seen it. And believe me, I’ve spent months looking when I was at Dell working on strategy. Stephen O’Grady’s The New Kingmakers is great and has all the material, but it’s not in that much needed management consulting tone/style. (I’m ashamed to admit I haven’t read his most recent book yet, maybe there’s some in there.)

Of course, if Microsoft and Apple don’t work out as examples of “leaders,” don’t even think of deploying all the whacky consumer-space folks out like Twitter and Facebook, or something as detailed as Hudson/Jenkins or Oracle DB/MySQL/MariaDB.

I think SolarWinds might be an interesting example, and if Dell can figure out applying that model to their Software Group, it’d make a good case study. Both of these are not “developer” stories, but “operator” ones; same structural strategy.

Eventually, they just have to “get it”

All of this has led me to believe that, eventually, the executives have to just take a leap of faith and “get it.” There’s only so much work you can do (slides and meetings) before you’re wasting your time if that epiphany doesn’t happen.

The transformation is complete.

If this is your bag, come check out a panel on developer relations at the OpenStack Summit on April 28th, in Austin. I’ll be moderating it!

So you want to become a software company? 7 tips to not screw it up.

Hey, I’ve not only seen this movie before, I did some script treatments:

Chief Executive Officer John Chambers is aggressively pursuing software takeovers as he seeks to turn a company once known for Internet plumbing products such as routers into the world’s No. 1 information-technology company.

Cisco is primarily targeting developers of security, data-analysis and collaboration tools, as well as cloud-related technology, Chambers said in an interview last month.

Good for them. Cisco has consistently done a good job to fill out its portfolio and is far from the one-trick pony people think it is (last I checked, they do well with converged infrastructure, or integrated systems, or whatever we’re supposed to call it now). They actually have a (clearly from lack of mention in this piece) little known-about software portfolio already.

In case anyone’s interested, here’s some tips:

1.) Don’t buy already successful companies, they’ll soon be old tired companies

Software follows a strange loop. Unlike hardware where (more or less) we keep making the same products better, in software we like to re-write the same old things every five years or so, throwing out any “winners” from the previous regime. Examples here are APM, middleware, analytics, CRM, web browsers…well…every category except maybe Microsoft Office (even that is going bonkers in the email and calendaring space, and you can see Microsoft “re-writing” there as well [at last, thankfully]). You want to buy, likely, mid-stage startups that have proven that their product works and is needed in the market. They’ve found the new job to be done (or the old one and are re-writing the code for it!) and have a solid code-base, go-to-market, and essentially just need access to your massive resources (money, people, access to customers, and time) to grow revenue. Buy new things (which implies you can spot old vs. new things).

2.) Get ready to pay a huge multiple

When you identify a “new thing” you’re going to pay a huge multiple of 5x, 10x, 20x, even more. You’re going to think that’s absurd and that you can find a better deal (TIBCO, Magic, Actuate, etc.). Trust me, in software there are no “good deals” (except once-in-a-lifetime buys like the firesale for Remedy). You don’t walk into Tiffany’s and think you’re going to get a good deal, you think you’re going to make your spouse happy.

3.) “Drag” and “Synergies” are Christmas ponies

That is, they’re not gonna happen on any scale that helps make the business case, move on. The effort it takes to “integrate” products and, more importantly, strategy and go-to-market, together to enable these dreams of a “portfolio” is massive and often doesn’t pan out. Are the products written in exactly the same programming language, using exactly the same frameworks and runtimes? Unless you’re Microsoft buying a .Net-based company, the answer is usually “hell no!” Any business “synergies” are equally troublesome: unless they already exist (IBM is good at buying small and mid-size companies who have proven out synergies by being long-time partners), it’s a long shot that you’re going to create any synergies. Evaluate software assets on their own, stand-alone, not as fitting into a portfolio. You’ve been warned.

4.) Educate your sales force. No, really. REALLY!

You’re thinking your sales force is going to help you sell these new products. They “go up the elevator” instead of down so will easily move these new SKUs. Yeah, good luck, buddy. Sales people aren’t that quick to learn (not because they’re dumb, at all, but because that’s not what you pay and train them for). You’ll need to spend a lot of time educating them and also your field engineers. Your sales force will be one of your biggest assets (something the acquired company didn’t have) so baby them and treat them well. Train them.

5.) Start working, now, on creating a software culture, not acquiring one

The business and processes (“culture”) of software are very different and particular. Do you have free coffee? Better get it. (And if that seems absurd to you, my point is proven.) Do you get excited about ideas like “fail fast”? Study and understand how software businesses run and what they do to attract and retain talent. We still don’t really understand how it all works after all these years and that’s the point: it’s weird. There are great people (like my friend Israel Gat) who can help you, there’s good philosophy too: go read all of Joel’s early writing as a start, don’t let yourself get too distracted by Paul Graham (his is more about software culture for startups, who you are not; Graham-think is about creating large valuations, not extracting large profits), and just keep learning. I still don’t know how it works or I’d be pointing you to the right URL. Just like with the software itself, we completely forget and re-write the culture of software canon about every five years. Good on us. Andrew has a good check-point from a few years ago that’s worth watching a few times.

6.) Read and understand Escape Velocity

This is the only book I’ve ever read that describes what it’s like to be an “old” technology company and actually has practical advice on how to survive. Understand how the cash-cow cycle works and, more importantly for software, how to get senior leadership to support a cycle/culture of business renewal, not just customer renewal.

7.) There’s more, of course, but that’s a good start

Finally, I spotted a reference to Stall Points in one of Chambers’ talks the other day which is encouraging. Here’s one of the better charts you can print out and put on your wall to look at while you’re taking a pee-break between meetings:

That charts all types of companies. It’s hard to renew yourself, it’s not going to be easy. Good luck!

The Problem with PaaS Market-sizing

Figuring out the market for PaaS has always been difficult. At the moment, I tend to estimate it at $20–25bn sometime in the future (5–10 years from now?) based on the model of converting the existing middleware and application development market. Sizing this market has been something of an annual bug-bear for me across my time at Dell doing cloud strategy, at 451 Research covering cloud, and now at Pivotal.

A bias against private PaaS

This number is in contrast to numbers you usually see in the single digit billions from analysts. Most analysts think of PaaS only as public PaaS, tracking just Force.com, Heroku, parts of AWS, Azure, and Google, and a bunch of “Other.” This is mostly due, I think, to historical reasons: several years ago “private cloud” was seen as goofy and made-up, and I’ve found that many analysts still view it as such. Thus, their models started off being just public PaaS and have largely remained so.

I was once a “public cloud bigot” myself, but having worked more closely with large organizations over the past five years, I now see that much of the spending on PaaS is on private PaaS. Indeed, if you look at the history of Pivotal Cloud Foundry, we didn’t start making major money until we gave customers what they wanted to buy: a private PaaS platform. The current product/market fit, then, for PaaS for large organizations seems to be private PaaS.

(Of course, I’d suggest a wording change: when you end up running your own PaaS you actually end up running your own cloud and, thus, end up with a cloud platform. Also, things are getting even more ambiguous at the infrastructure layer all the time; perhaps “private PaaS” means more “owning” the PaaS layer, regardless of who “owns” the IaaS layer.)

How much do you have budgeted?

With this premise — that people want private PaaS — I then look at existing middleware and application development market-sizes. Recently, I’ve collected some figures for that:

  • IDC’s Application Development forecast puts the application development market (which includes ALM tools and platforms) at $24bn in 2015, growing to $30bn in 2019. The commentary notes that the influence of PaaS will drive much growth here.
  • Recently from Ovum: “Ovum forecasts the global spend on middleware software is expected to grow at a compound annual growth rate (CAGR) of 8.8 percent between 2014 and 2019, amounting to $US22.8 billion by end of 2019.”
  • And there’s my old pull from a Goldman Sachs report that pulled from Gartner, where middleware is $24bn in 2015 (that’s from a Dec 2014 forecast).

When dealing with large numbers like this and so much speculation, I prefer ranges. Thus, the PaaS TAM I tend to use nowadays is something like “it’s going after a $20–25bn market, you know, over the next 5 to 10 years.” That is, the pot of current money PaaS is looking to convert is somewhere in that range. That’s the amount of money organizations are currently willing to spend on this type of thing (middleware and application development) so it’s a good estimate of how much they’ll spend on a new type of this thing (PaaS) to help solve the same problems.
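
If you want to sanity-check the figures quoted above, the arithmetic is simple enough to sketch in a few lines of Python. This is my own back-of-the-envelope working, not anything from the analyst reports: it derives the growth rate implied by the IDC forecast, backs out the 2014 starting point implied by Ovum’s 8.8 percent CAGR, and checks which of the three headline figures land inside the $20–25bn range. The function and variable names are just illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1 / years) - 1


def back_out_base(end_value, rate, years):
    """Starting value implied by an end value and a constant annual growth rate."""
    return end_value / (1 + rate) ** years


# IDC: application development at $24bn in 2015, growing to $30bn in 2019.
idc_growth = cagr(24.0, 30.0, 2019 - 2015)

# Ovum: middleware growing at 8.8% a year from 2014, reaching $22.8bn by end of 2019.
ovum_2014_base = back_out_base(22.8, 0.088, 2019 - 2014)

# The three headline figures quoted in the list above, in $bn.
headline_figures = {
    "IDC appdev, 2015": 24.0,
    "Ovum middleware, 2019": 22.8,
    "Gartner middleware (via Goldman Sachs), 2015": 24.0,
}
tam_low, tam_high = 20.0, 25.0
within_range = {
    name: tam_low <= value <= tam_high for name, value in headline_figures.items()
}

print(f"IDC implied CAGR: {idc_growth:.1%}")
print(f"Ovum implied 2014 base: ${ovum_2014_base:.1f}bn")
print(within_range)
```

By this working, the IDC forecast implies growth of a bit under 6 percent a year, and Ovum’s numbers imply a 2014 middleware base of roughly $15bn. All three headline figures sit inside the $20–25bn range, which is the whole reason I use a range rather than a single number.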

Things get slightly dicey depending on including databases, ALM tools, and the underlying virtualization and infrastructure software: some PaaSes include some, none, or all of these in their products. Databases are a huge market (~$40bn), as is virtualization (~$4.5bn). The other ancillary buckets are pretty small, relatively. I don’t think “PaaS” eats too much database, but probably some “virtualization.”

So, if you accept that PaaS is both public and private PaaS and that it’s going after the middleware and appdev market, it’s a lot more than a few billion dollars.

(Ironic-clipart from my favorite source, geralt.)

Roles and Responsibilities for DevOps and Agile Teams

This post is pretty old and possibly out of date. There’s an updated version of it in my book, Monolithic Transformation.

Overview

There are a handful of “standard” roles used in typical Agile and DevOps teams. Any application of Agile and DevOps is highly contextual and crafted to fit your organization and goals. This document goes over the typical roles and their responsibilities and discusses how Pivotal often sees companies staffing teams. It draws on “the Pivotal Way,” our approach to creating software for clients, recommendations for staffing the operations roles for Pivotal Cloud Foundry (our cloud platform), and recent recommendations and studies from agile and DevOps literature and research.

I don’t discuss definitions for “Agile” or “DevOps” except to say: organizations that want to improve the quality (both in lack of bugs and “design” quality as in “the software is useful”) and uptime/resilience of their software look to the product development practices of agile software development and DevOps to achieve such results. I see DevOps as inclusive of “Agile” in this discussion, and so will use the terms interchangeably. For more background, see one of my recent presentations on this topic.

Finally, there is no fully correct and stable answer to the question of what roles and responsibilities are in teams like this. They fluctuate constantly as new practices are discovered and new technologies remove past operational constraints, or create new ones! This is currently my best understanding of all this, based on both organizations that are using Cloud Foundry and those that are using other cloud platforms. As you discover new, better ways, I’d encourage you to reach out to me so we can update this document.

Core Roles for DevOps Teams

There are generally two types of teams we see: “business capabilities teams” who work on the actual software or services (“the application”) in question and “agile operations teams” as seen below:

While not depicted in the diagram above, the number of staff in each layer dramatically reduces as you go “down” the stack. The largest number of people are in the business capability layer working on the actual applications and services; far fewer individuals are working on creating and customizing capabilities in Pivotal Cloud Foundry (e.g., a service to interact with a reservation or ERP system that is unique to the organization); and fewer still, the “operators,” work at keeping the cloud platform up and running and updated, and handle the hardware and networking issues.

Making the Applications — Business Capability Roles

These teams work on the application or services being delivered. The composition of these teams changes over time as each team “gels,” learning the necessary operations skills to be “DevOps” oriented and mastering the domain knowledge needed to make good design choices.

These are the core roles in this layer:

  • Developer/Engineer: those who not only “code,” but gain the operations knowledge needed to support the application in production, with the help of…
  • Operations: those who work with developers on production needs and help support the application in production. This role may lessen over time as developers become more operations aware, or it may remain a dedicated role.
  • Product Owner/Product Manager: the “owner” of the application in development who specifies what should be done each iteration, prioritizing the list of “requirements.”
  • Designer: studies how users interact with the software and systematically designs ways to improve that interaction. For user-facing applications this also includes visual design.

These roles are not always needed and can sometimes be fulfilled partially by shared staff who are designated to the team:

  • Tester: staff who help ensure the software does what was intended and functions properly.
  • Architect: in large organizations, the role of architect is often used to ensure that individual teams are aligning with the larger organization’s goals and strategy, while also acting as a consultative enabler to help teams be successful and share knowledge.
  • Data scientist: if data analysis is core to the application being developed, getting the right analysis and domain skills will be key.

Developer/Engineer

These are the programmers, or “software developers.” Through the practice of pairing, knowledge is quickly spread amongst developers, which ensures that no “empires” are built and addresses the risks of a low “bus factor.” Developers are also encouraged to “rotate” through various roles from front to back-end to get good exposure to all parts of the project. By using a cloud platform, like Pivotal Cloud Foundry, developers are also able to package and deploy code on their own through the continuous integration and continuous delivery tools.

Developers are not expected to be experts at operations concerns, but by relying on the self-service and automation capabilities of cloud platforms, they do not need to “wait” for operations staff to perform configuration management tasks to deploy applications. Over time, with this reliance on a cloud platform which cleanly specifies how to best build applications so that they’re easily supported in production, developers gain enough operations knowledge to work without dedicated operations support.

The number of developers on each team is variable, but so far, following the two pizza team rule of thumb, we see anywhere from 1 to 3 pairs, that is, 2 to 6 developers, and sometimes more.

Operations

In a cloud native mode, until business capabilities teams have learned the necessary skills to operate applications on their own, they will need operations support. This support will come in the form of understanding (and co-learning!) how the cloud platform works, and assistance troubleshooting applications in production. Early on you should plan to have heavy operations involvement to help collaboration with developers and share knowledge, mostly around getting the best from the cloud platform in place. You may need to “assign” operations staff to the team at the beginning, making them so called designated operations staff instead of dedicated, as explained in Effective DevOps.

Many teams find that the operations role never leaves the team, which is perfectly normal. Indeed, the desired end-state is that the application teams have all the development and operations skills and knowledge needed to be successful.

As two cases to guide your understanding of this role:

  • Etsy has approximately 15 operations engineers to a few hundred other engineers, according to Effective DevOps.
  • As re-told by Diego Lapiduz at 18F, early on when teams were learning how to use and operate cloud.gov, he and a handful of other operations staff spent much of their time with development teams, getting intimately familiar with each application. Now, because the practice of designated operations staff is less needed, he and his operations peers are less involved and have little knowledge of the applications in use…which is good, and as intended!

As a side note, it’s common for operations people to freak out at this point, thinking they’re being eliminated. While it’s true that margin-berzerked management could choose to look at operations staff as “waste,” it’s more likely that, following the Jevons paradox, operations staff will be needed even more as the number of applications and services multiplies.

Product Owner/Product Manager

This is the role that defines and guides the requirements of the application. It is also one of the roles that varies in responsibilities the most across products. At its core, this role is the “owner” of the software under development. In that respect, they help prioritize, plan, and deliver software that meets your requirements. Someone has to be “the final word” on what happens in high functioning teams like this. The amount of control vs. consensus driven management is the main point of variability in this role, plus the topic areas that the product owner must be knowledgeable about.

It’s best to approach the product owner as a “breadth first role”: they have to understand the business, the customer, and the technical capabilities. This broad knowledge helps them make sure they’re making the right prioritization decisions.

In Pivotal Labs engagements, this role is often performed by a Pivotal employee, pairing up with one of your staff to train and transfer knowledge. Whether during a project or through workshops, they help you understand the Pivotal, iterative development approach, as well as mentor and train your internal staff to help them learn agile methods and skills, which will enable them to move on with confidence when the engagement is complete.

Designer

One of the major lessons of contemporary software is that design matters, a tremendous amount more than previously believed. The “small batch” mentality of learning and improving software afforded by cloud platforms like Pivotal Cloud Foundry gives you the ability to design more rapidly and with more data-driven precision than ever before. Hence, the role of a designer is core to cloud native teams.

The designer focuses on identifying the feature set for the application and translating that to a user experience for the development team. Activities may include completing the information architecture, user flows, wireframes, visual design, and high-fidelity mock-ups and style guides. Most importantly, designers have to “get out of the building” and not only see what actual users are doing with the software, but get to know those users and their needs intimately.

Testers (partial/optional)

While the product manager and overall team are charged with testing their software, some organizations either want or need additional testing. Often this is “exploratory testing” where a third party (the tester[s]) is trying to systematically find the edge cases and other “bugs” the development team didn’t think of.

It’s worth questioning the need for separate testers if you find yourself in that situation to make sure you need them. Much routine “QA” is now automated (and can, thus, be done by the team and automated CI/CD pipelines), but you may want exploratory, manual testing in addition to what the team is already doing and verification that the software does as promised and functions under acceptable duress. But even that can be automated in some situations as the Chaos Monkey and Chaos Lemur show.

Architect (partial/optional)

Traditionally, this role is responsible for conducting enterprise analysis, design, planning, and implementation, using a “big picture” approach to ensure a successful development and execution of strategy. Those goals can still exist in many large organizations, but the role of an architect is evolving to be an enabler for more self-sufficient, and decoupled teams. Too often this role has become a “Dr. No” in most large organizations, so care must be taken to ensure that the architect supports the team, not the other way around.

Architects are typically more senior technical staff who are “domain experts.” They may also be more technically astute and, in a consultative way, help ensure the long-term quality and flexibility of the software that the team creates, share best practices with teams, and otherwise enable the teams to be successful. As such, this may be a fully dedicated role filled by someone who, hopefully, still spends much of their time coding so as not to “go soft” and lose not only the trust of developers but an intimate enough knowledge of technology to know what’s possible and not possible in contemporary software.

Data Science (partial/optional)

If your application includes a large amount of data analysis, you should consider including a data scientist role on the team. This role can follow the dedicated/designated pattern as discussed with the operations role above.

Data science today is where design might have been a few years ago. It is not considered to be a primary role within a product team, but more and more products today are introducing a level of insight not seen before. Google Now surfaces contextual information; SwiftKey offers word predictions based on swipe patterns; Netflix offers recommendations based on what other people are watching; and Uber offers predictive arrival times for its drivers. These features help turn transactional products into smart products.

Other Roles

There are many other roles that can and do exist in IT organizations. These are roles like database administrators (DBAs), security operations, network operations, or storage operations. In general, as with any “tool,” you should use what you need when you need it. However, as with the architect role above, any role must reorient itself to enabling the core teams rather than “governing” them. As the DevOps community has discussed at length for nearly ten years, the more you divide up your staffing by function, the further you move from a small, integrated team, and your goal of consistently and regularly building quality software will become harder.

Agile Operations Roles

Roles here focus on operating, supporting, and extending the cloud platform in use. These roles typically sit “under” the Agile and DevOps teams, so the discussion here is briefer. Each role is described in terms of roles and responsibilities typically encountered in Pivotal Cloud Foundry installs. These can vary by organization and deployment (public vs. private cloud, the need for multi-cloud support, types of IaaS used, etc.).

For a brief definition of a cloud platform, see “the use of a cloud platform” section below.

Application Operator

These are typically the “operations” people described above and serve as a supporting and oversight function to the business capabilities teams, whether designated or dedicated. Typical responsibilities are:

  • Manages lifecycle and release management processes for apps running in Pivotal Cloud Foundry.
  • Responsible for the continuous delivery process to build, deploy, and promote Pivotal Cloud Foundry applications.
  • Ensures apps have automated functional tests that are used by the continuous delivery process to determine successful deployment and operation of applications.
  • Ensures monitoring of applications is configured and has rules / alerts for routine and exceptional application conditions.
  • Acts as second level support for applications, triaging issues, and disseminating them to the platform operator, platform developer or application developer as required.

A highly related, sometimes overlapping, role is that of centralized development tool providers. This role creates, sources, and manages the tools used by developers, all the way from commonly used libraries, to version control and project management tools, to maintaining custom-written frameworks. Companies like Netflix maintain “tools teams” like this, often open sourcing projects and practices they develop.

Platform Operator

This is the typical “sys admin” for the cloud platform itself:

  • Manages IaaS infrastructure that Pivotal Cloud Foundry is deployed to, or co-ordinates with the team that does.
  • Installs and configures Pivotal Cloud Foundry.
  • Performs capacity, availability, issue, and change management processes for Pivotal Cloud Foundry.
  • Scales Pivotal Cloud Foundry, forecasting, adding, and removing IaaS and physical capacity as required.
  • Upgrades Pivotal Cloud Foundry.
  • Ensures management and monitoring tools are integrated with Pivotal Cloud Foundry and have rules / alerts for routine and exceptional operations conditions.

Platform Engineering

This team and its roles are responsible for extending the capabilities of the cloud platform in use. What this role does per organization can vary, but common tasks of this role for organizations using Pivotal Cloud Foundry are to:

  • Make enhancements to existing buildpack(s) and build new buildpack(s) for the platform.
  • Build service broker(s) to manage lifecycle of external resources and make them available to Pivotal Cloud Foundry apps.
  • Build Pivotal Cloud Foundry tiles with associated BOSH releases and service brokers to enable managed services in Pivotal Cloud Foundry.
  • Manage release and promotion process for buildpacks, service brokers, and tiles across Pivotal Cloud Foundry deployment topology.
  • Integrate Pivotal Cloud Foundry APIs with external tool(s) when required.

Physical Infrastructure Operations

While not commonly covered in this type of discussion, someone has to maintain the hardware and data centers. In a cloud native organization this function is typically so highly abstracted and automated (if not outsourced to a service provider or public cloud altogether) that it does not often play a major role in cloud native operations. However, especially at first as your organization is transforming to this new way of operating, you will need to work with physical infrastructure operations staff, whether in-house or with your outsourcer.

Supporting Best Practices

The use of a cloud platform

Much of the above is predicated on the use of a cloud platform. A cloud platform is a system, such as Pivotal Cloud Foundry, that provides the runtime and production needs to support applications on top of cloud infrastructure (public and private IaaS), often, as in the case of Pivotal Cloud Foundry, with fully integrated middleware services that are natively supported (such as databases, queues, and application development frameworks).

Two of the key ways of illustrating the efficiency of using Pivotal Cloud Foundry are (a.) the ratio of operators to applications running in the platform, and (b.) reduction in lead time. Here are some relevant metrics for Pivotal Cloud Foundry:

  • One large financial institution is currently supporting 145 applications with two operations staff.
  • 18F was able to reduce Authorization to Operate (ATO) from 9+ months to 3 days once auditors understood the automation in the open source Cloud Foundry instance at cloud.gov.
  • When planning the first product developed on Pivotal Cloud Foundry, CoreLogic allocated a team of 12 developers with four QA staff and a management team. The goal was to deliver the product in 14 months. Instead, the project ended up requiring only a product manager, one user experience designer and six engineers who delivered the desired product in just six months. As the second, third, and next projects roll on, you can expect delivery time to reduce even more.
  • A large retail user of Pivotal Cloud Foundry was running over 250 applications in non-production and nearly 100 applications in production after only 4 months using the platform.
  • Humana was able to launch an Apple Watch application in just five weeks.
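
To make those two measures concrete, here is a minimal sketch of the arithmetic behind a few of the figures above. The function names are made up for illustration, and the 30-day month used to turn 18F's “9+ months” into days is an assumption for the sake of the calculation, not a number from 18F.

```python
def apps_per_operator(applications: int, operators: int) -> float:
    """Ratio of applications running on the platform to operations staff."""
    return applications / operators


def lead_time_reduction(before: float, after: float) -> float:
    """Fractional reduction in lead time (before and after in the same unit)."""
    return (before - after) / before


# Financial institution: 145 applications, two operations staff.
print(apps_per_operator(145, 2))  # 72.5

# CoreLogic: planned 14 months, delivered in six.
print(f"{lead_time_reduction(14, 6):.0%}")  # 57%

# 18F: 9+ months down to 3 days (assumes a 30-day month, so 9 months is ~270 days).
print(f"{lead_time_reduction(9 * 30, 3):.1%}")  # 98.9%
```

Tracking the same two numbers for your own platform over time is a simple way to see whether the investment is paying off.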

Without a cloud platform like Pivotal Cloud Foundry in place, organizations will find it difficult, if not impossible, to achieve the infrastructure efficiencies in both development and at runtime needed to operate at full speed and effectiveness.

For reference, the below diagram based on Pivotal Cloud Foundry, is a “white board” of the functions a cloud platform provides:

For a gentle introduction to this stack, see the introductory discussion video at The New Stack. And, for even more discussion of the nature of cloud platforms, see Brian Gracely’s discussion of “structured cloud platforms” and Casey West’s write-up on the same topic.

Sizing: two pizza teams

While some sizing guidelines have been listed throughout, there are no hard and fast rules. The notion of a “two pizza team” is a well-accepted theory of software team sizes. As Amazon CEO Jeff Bezos is said to have decreed, if a team couldn’t be fed with two pizzas, it’s too big. Without making too many assumptions about the size of a “large” pizza or how many slices each individual needs to eat, you could estimate this at around six to fifteen people. This may vary, of course, but the point is to keep things as small as possible to minimize communication overhead, context switching, and responsibility evasion due to “that’s not my job,” among other “wastes.”

The assumption with teams this small is that many of them will organize to create big, overall results. Architectural approaches like microservices that emphasize loose coupling and defensive use of services help enable that coordination at a technical level, while making sure that teams are aware of and aligned to overall organizational goals helps coordinate the “meatware.” This coordination is difficult, to be sure, but start looking at your organization as a set of autonomous units working together, rather than one giant team. As an example, Pivotal Cloud Foundry engineering is composed of around 300 developers spread over 40 loosely coupled teams.
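
One rough way to see why small teams cut communication overhead is to count the possible person-to-person channels in a group, n(n-1)/2. This heuristic is brought in here purely for illustration (the two-pizza rule itself doesn't define it), and the function name is hypothetical.

```python
def communication_channels(people: int) -> int:
    """Number of distinct pairs in a group of the given size: n * (n - 1) / 2."""
    return people * (people - 1) // 2


for size in (6, 15, 300):
    print(size, communication_channels(size))
# 6 15
# 15 105
# 300 44850

# Average team size in the Pivotal Cloud Foundry engineering example:
# around 300 developers spread over 40 teams.
print(300 / 40)  # 7.5
```

The jump from 105 possible channels in a fifteen-person team to tens of thousands in one 300-person group is the overhead that many small, loosely coupled teams are meant to avoid.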

Dedicated, integrated teams lead to the best outcomes

Agile-think and DevOps teams seek to put every role and, thus, person, needed to deliver the software product from inception to running in production on the same team, with their time fully dedicated to that task. These are often called “integrated teams” or “balanced teams.”

IT has long organized itself in the opposite way, separating out roles (and, thus, people) into distinct teams like operators, QA, developers, and DBAs in the hopes of optimizing the usage of these roles. This is the so-called “silo” approach. When you’re operating in the more exploratory, rapid, agile cloud native fashion, this attempt at “local optimization” creates “global optimization” problems due to hand-offs between teams (leading to wasted time and communication errors). A silo approach can also lead to poor interactions between people across teams, which damages software quality and availability. “That’s not my responsibility” often leads to it being no one’s responsibility.

Most organizations have a “project” mindset when it comes to software as well. The functional roles emerge from their silos as needed to work on the project, and then disband once done. Good software development benefits from a “product” mindset where the team, instead, stays dedicated to the product.

Operating in this way, of course, is not always possible, if only at first as organizations switch over their mind-set from “siloed” teams to dedicated teams. So care must be taken when slicing a person up between different teams: intuitively we understand that people cannot truly “multitask,” and numerous studies have shown the high cost of context switching and spreading a person’s attention across many different projects. In discussing roles, keep in mind that the further you are from truly integrated, dedicated teams, the less efficiency and productivity you’ll get from people on those teams, and, thus, in the project as a whole.

More Reading

As described in the introduction, this is a quick overview of the roles and responsibilities for Agile and DevOps teams. The book Effective DevOps contains much discussion of these topics and background on studies and proof-points. For a slightly more “enterprise-y” take, see the concepts in disciplined agile delivery, which should be read as slightly “pre-cloud native” but nonetheless helpful.

Additionally, Pivotal Labs and the Pivotal Customer Success and Transformation teams can discuss these topics further and help transform your organization accordingly. With over twenty years of experience and as the maintainers of Pivotal Cloud Foundry, one of the leading cloud platforms, we have been learning and perfecting these methods for some time.

Thanks to the many reviewer comments from Pivotal folks, in particular Robbie Clutton, Cornelia Davis, and David McClure. And, as always, excellent post-post-ironic corporate clipart from geralt.

For a more recent, detailed study on this topic, check out my book Monolithic Transformation.

Self-motivated teams lead to better software

This post is pretty old and possibly out of date. There’s an updated version of it in my book, Monolithic Transformation.

In contrast to the way traditional organizations operate, cloud native enterprises are typically composed of self-motivated and self-directed teams. This reduces the amount of time it takes to make decisions, deploy code, and see if the results helped move the needle. More than just focusing on speed of decision making and execution, building up these intrinsically motivated teams helps spark creativity, fueling innovative thinking. Instead of being motivated by metrics like number of tickets closed or number of stories/points implemented, these teams are motivated by ideas like increased customer satisfaction with faster forms processing or helping people track and improve their health.

In my experience, one of the first steps in shifting how your people think, away from the Pavlovian KPI bell, is to clearly explain your strategy and corporate principles. Having worked in strategy roles in the past, I’ve learned how poorly companies articulate their goals, constraints, and strategy. While there are many beautiful (and not so beautiful!) presentations that extol a company’s top motivations and finely worded strategies, most employees are left wondering how they can help day to day.

Whether you’re a leader or an individual contributor, you need to know the actionable details of the overall strategy and goals. Knowing your company’s strategy, and how it will implement that strategy, is not only necessary for breeding self-motivated and self-managed people, but it also comes in handy when you’re looking to apply agile and lean principles to your overall continuous delivery pipeline. Tactically, this means taking the time to create detailed maps of how your company executes its strategy. For example, you might do value-stream mapping, alignment maps, or something like value chain mapping. This is a case where I, again, find that companies have much less than they think they do: often, organizations have piles of diagrams and documents but very rarely have something that illustrates everything, from having an idea for a feature to getting it into customers’ hands. A cloud native enterprise will always seek to be learning from and improving that end-to-end process. So, you need to map your entire business process out and have everyone understand it. Then, we all know how we deliver value.

The process of value-stream mapping, for example, can be valuable for simply finding waste (so much so that the authors of Learning to See cite only focusing on waste removal as a lean anti-pattern). As one anecdote goes, after working for several weeks to finally get everything up on the big white board, one of the executives looked up at all the steps and time wasted in their process to get software out the door and said, “there’s a whole lot of stupid up there.” The goal of these exercises is not only to remove stupid, but also to focus on and improve the processes.
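
As a minimal sketch of what such a map lets you calculate, the example below adds up a value stream's total lead time and its flow efficiency (the share of that time spent actually working rather than waiting). Every step name and number here is hypothetical, invented only to show the arithmetic; a real map would come from walking your own process.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    work_days: float  # time someone is actively working on the item
    wait_days: float  # time the item sits idle before the next step


# Hypothetical value stream, from idea to the customer's hands.
steps = [
    Step("Write up feature idea", 1, 10),
    Step("Approval and budgeting", 2, 30),
    Step("Development", 10, 5),
    Step("QA hand-off and testing", 3, 14),
    Step("Change review and release", 1, 21),
]


def lead_time(stream: list[Step]) -> float:
    """Total elapsed days, counting both work and waiting."""
    return sum(s.work_days + s.wait_days for s in stream)


def flow_efficiency(stream: list[Step]) -> float:
    """Fraction of the lead time spent on actual work."""
    return sum(s.work_days for s in stream) / lead_time(stream)


def biggest_wait(stream: list[Step]) -> Step:
    """The step with the longest idle time, a first candidate for waste removal."""
    return max(stream, key=lambda s: s.wait_days)


print(lead_time(steps))                   # 97
print(f"{flow_efficiency(steps):.1%}")    # 17.5%
print(biggest_wait(steps).name)           # Approval and budgeting
```

In this made-up stream, most of the 97 days is waiting, which is the kind of “stupid” the white board exercise is meant to expose.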

 


Addressing the DevOps compliance problem

Satisfying the mythical auditors is often one of the first barriers to spreading DevOps initiatives more widely inside an organization. While these process-driven barriers can be annoying and onerous, once you follow the DevOps tradition of empathetic inclusion (being all “one team”) they can not only stop slowing you down but actually help the overall quality of the product. Indeed, the very reason these audit checks were introduced in the first place was to ensure overall quality of the software and business. There are some excellent, exhaustive overviews out there of dealing with audits and the like in DevOps. In this column, I wanted to go through a little mental re-orientation for how to start thinking about and approaching the “compliance problem.”

Three-Ring Binder Ninjas

In this context, I think of “auditors” as falling into the category of governance, risk and compliance (GRC): any function that acts as a check on code, and on how that code is produced and run, as it goes through its lifecycle. I would put security in here as well, though that tends to be such a broad, important topic that it often warrants its own category (and the security people seem to like maintaining their occultic silo-tude, anyhow).

The GRC function(s) may impose self-created policies (like code and architectural review), third party and government imposed regulations (like industry standard compliance and laws such as HIPAA), and verification that risky behavior is being avoided (if you write the code, you can’t be the same person who then uses that code for cash payouts, perhaps, to yourself, for example). In all cases, “compliance” is there to ensure overall quality of the product and the process that created it. That “quality” may be the prevention of malicious and undesired behavior; that is, in a compliance-driven software development mindset, the ends rarely justify the means.

In many cases, the GRC function is more interested in proof that there is a process in place than actually auditing each execution of that process. This is a curious thing at first. Any developer knows that the proof is in the code, not the documentation. And, indeed, for some types of GRC the amount of automation that a DevOps mindset puts into place could likely improve the quality of GRC, ironically.

Establishing trust and automating compliance

Indeed, automation is one of the first areas to look at when reducing DevOps/GRC friction. First, treat complying with policies as you would any other feature. Describe it, prioritize it and track it. Once you have gotten your hands around it, you can start to figure out how to best implement that “feature.” Ideally, you can code and automate your way out of having to do too much manual work.

There’s work being done in the US Federal government along these lines that’s helpful because it’s visible and at scale. First, as covered in a recent talk by Diego Lapiduz, part of what auditors are looking for is to trust the software and infrastructure stack that apps are running on. This is especially true from a security standpoint. The current way that software is spec’d out and developed in most organizations follows a certain “do whatever,” or even YOLO, principle. App teams are allowed to specify which operating systems, orchestration layers and middleware components they want. This may be within an approved list of options, but more often than not it results in unique software stacks per application.

As outlined by Diego, this variation in the stack meant that government auditors had to review just about everything, taking months to approve even the simplest line of code. To solve this problem, 18F standardized on one stack, Cloud Foundry, to run applications on, not allowing for variance at the infrastructure layer. They then worked with the auditors to build trust in the platform. Then, when there was just the metaphoric or literal “one line of code” to deploy, auditors could focus on much less, certainly not the entire stack. This brought approval time down to just days: a huge speed-up.

When it comes to all the paperwork, also look to ways to automate the generation of the needed listings of certifications and compliance artifacts. This shouldn’t be a process that’s done in opaque documents, nor manually, if at all possible. Just as we’d now recoil in horror at manually deploying software into production, we should try to achieve “compliance as code” that’s as autogenerated (but accurate!) as possible. To that end, the work being done in the OpenControl project is showing an interesting and likely helpful approach.

The lesson for DevOps teams here is clear: Standardize your stack as much as possible and work with auditors to build their trust in that platform. Also, look into how you can automate the generation of compliance documents beyond the usual .docx and .pptx suspects. This will help your GRC process move at DevOps speed. And it will also allow your auditors to still act as a third party governing your code. They’ll probably even do a better job if they have these new, smaller batches of changes to review.
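
As a rough illustration of the “compliance as code” idea, the sketch below keeps control records as plain data and generates the listing from them, rather than editing a document by hand. The record fields, control IDs, and evidence text are all hypothetical; this is not the OpenControl project's schema or any regulator's format, just one way the idea could look.

```python
# Hypothetical control records; in practice these would live in version control
# next to the code and platform configuration they describe.
CONTROLS = [
    {
        "id": "CTRL-1",
        "title": "Separation of duties",
        "satisfied_by": "platform",
        "evidence": "Pipeline blocks deploys triggered by a commit author",
    },
    {
        "id": "CTRL-2",
        "title": "Audit logging",
        "satisfied_by": "platform",
        "evidence": "Platform ships application logs to a central store",
    },
    {
        "id": "CTRL-3",
        "title": "Input validation",
        "satisfied_by": "application",
        "evidence": "Automated tests in the application's test suite",
    },
]


def render_report(controls: list[dict]) -> str:
    """Render a plain-text listing, grouped by who satisfies each control."""
    lines = []
    for owner in ("platform", "application"):
        lines.append(f"Controls satisfied by the {owner}:")
        for control in controls:
            if control["satisfied_by"] == owner:
                lines.append(
                    f"  {control['id']} {control['title']}: {control['evidence']}"
                )
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


print(render_report(CONTROLS))
```

Grouping by owner mirrors the standardized-stack point: controls satisfied once by a trusted platform don't need to be re-argued for every application, so the per-application list stays short.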

Refactoring the compliance process

To address the compliance issue fully, you’ll need to start working with the actual compliance stakeholders directly to change the process. There’s a subtle point right there: Work with the people responsible for setting compliance, not those responsible for enforcing it, like IT. All too often, people in IT will take the strictest view of compliance rules, which results in saying “no” to virtually anything new — coupled with Larman’s Law, you’ll soon find that, mysteriously, nothing new ever happens and you’re back to the pre-DevOps speed of deployment, software quality levels and timelines. You can’t blame IT staff for being unimaginative here — they’re not experts in compliance and it’d be risky for them to imagine “workarounds.” So, when you’re looking to change your compliance process, make sure you’re including the actual auditors and policy setters in your conversations. If they’re not “in the room,” you’re likely wasting your time.

As an example, one of the common compliance problems is around “developers deploying to production.” In many cases and industries, a separation of duties is required between coding and deploying. When deploying code to production was a more manual, complicated process, this could be extremely onerous. But once deployments are push-button automated with a good continuous delivery pipeline, you might consider having the product manager or someone who hasn’t written code be the deployer. This ensures that you can “deploy at will,” but keeps the actual coders’ fingers off the button.
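
A pipeline could enforce that rule with a check along these lines. This is a minimal sketch under assumed inputs: the function name and the user names are hypothetical, and a real pipeline would pull the deployer and the commit authors from its own version control and identity systems.

```python
def may_deploy(deployer: str, commit_authors: set[str]) -> bool:
    """Allow a deploy only if the person pressing the button wrote none of the code."""
    return deployer not in commit_authors


# Hypothetical authors of the commits in this release.
release_authors = {"alice", "bob"}

print(may_deploy("carol", release_authors))  # True: the product manager wrote no code
print(may_deploy("alice", release_authors))  # False: a coder cannot deploy their own change
```

Because the check is automated, it also leaves a record of who deployed what, which is the kind of proof of process auditors tend to ask for.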

Another intriguing compliance strategy, suggested by Home Depot’s Tony McCulley (who also suggested the above approach to the separation of duties), is to give GRC staff access to your continuous delivery process and deployment environment. This means instead of having to answer questions and check for controls for them, you can allow GRC staff to just do it on their own. Effectively, you’re letting GRC staff peer into and even help out with controls in your software. I’d argue that this only works if you have a well-structured platform supporting your CD pipeline with good UIs that non-technical staff can access.

It might be a bit of a stretch, but inviting your GRC people into your DevOps world, especially early on, may be your best bet at preventing compliance slowdowns. And, if there’s any core lesson of DevOps, it’s that the real problems are not in the software or hardware, but the meatware. Figuring out how to work better with the people involved will go a long way towards addressing the compliance problem.

(I originally wrote this December 2015 for FierceDevOps, a site which has made it either impossible or impossibly tedious to find these articles. Hence, it’s now here.)

Barriers to DevOps in government

There’s just as much pull for DevOps in government as there is in the private sector. While most of our focus around adoption is on how businesses can and are using DevOps and continuous delivery, supported by cloud, to create better software, many government agencies are in the same position and would benefit greatly from figuring out how to apply DevOps in their organizations.

Just 13% of respondents in a recent MeriTalk/Accenture survey of 152 US Federal IT managers believed they could “develop and deploy new systems as fast as the mission requires.” The impact of improving on that could be huge. For example, the US Federal government, by conservative estimates, spends $84 billion a year on IT. And yet, the Standish Group believes that 94% of government IT projects fail. These are huge numbers that, with even small improvements, can have massive impact. And that’s before even considering the benefits of simply improving the quality of software used to provide government services.

As with any organization, the first filter for applicability is whether or not the government organization is using custom-written software to accomplish its goals. If all the organization is doing is managing desktops, mobile, and packaged software, it’s likely that just SaaS and BYOD are the important areas to focus on. DevOps doesn’t really apply unless there’s software being written and deployed in your organization or, as is more common in government agencies, for your organization, as we’ll get to when we discuss “contractors.”

When it comes to adopting and being successful with DevOps, the game isn’t too different than in the business world: much of the change will have to do with changing your organization’s process and “culture,” as well as adopting new tools that automate much of what was previously manual. You’ll still need to actually take advantage of the feedback loop that helps you improve the quality of your software with respect to defects, performance in production, and design quality. There are a few things that tend to be more common in government organizations that bear some discussion: having to cut through red-tape, dealing with contractors, and a focus on budget.

Living with red-tape

While “enterprise” IT management tasks can be onerous and full of change review boards and process, government organizations seem to have mastered the art of paperwork, three ring binders, and red tape in IT. As an example, in the US Federal government, any change needs to achieve “Authority To Operate” which includes updating the runbook covering numerous failure conditions, certifying security, and otherwise documenting every aspect of the change in, to the DevOps minded, infinitesimal detail. And why not? When was the last time your government “failed fast” and you said “gosh, I guess they’re learning and innovating! I hope they fail again!” No, indeed. Governments are given little leash for failure and when things go terribly wrong, you don’t just get a tongue lashing from your boss, but you might get to go talk to Congress and not in the fun, field-trip how a bill is made kind of way. Being less cynical, in the military, intelligence, and law enforcement parts of government, if things go wrong more terrible things than denying you the ability to upload a picture of your pot roast to Instagram can happen. It’s understandable — perhaps, “explainable” — that government IT would be wrapped up in red-tape.

However, when trying to get the benefits of continuous delivery, DevOps, and cloud (or “cloud native” as that triptych of buzzwords is coming to be known), government organizations have been demonstrating that the comforting mantle of red-tape can be stripped. For example, in the GSA, the 18F group has reduced the time it takes to get a change through from 9–14 months to just two to three days.

They achieved this because now, when they deploy applications on their cloud native platform (a Cloud Foundry instance that they run on Amazon Web Services), they are only changing the application, not the whole stack of software and hardware below the application layer. This means they don’t need to re-certify the middleware, runtimes and development frameworks, let alone the entire cloud platform, operating systems used, networking, hardware, and security configurations. Of course, the new lines of application code need to be checked, but because they’re following the small batch principles of continuous delivery, those net-new lines are few.

The lesson here is that you’ll need to get your change review process (the red-tape spinners) to trust the standard cloud platform you’re deploying your applications on. There could be numerous ways to do this, from using a widely used cloud platform like Cloud Foundry, to building up trusted automation build processes, to creating your own platform and software release pipelines that are trusted by your red-tape mavens.

Contractors & Lost Competency

If you want to get staff in a government IT department ranting at you all night long, ask them about contractors. They loathe them and despise them and will tell you that they’re “killing” government IT. Their complaint is that contractors cannot structurally deal with an Agile mentality that refuses to lock down a full list of features that will be delivered on a specific date. As you shift to not even a “DevOps mindset,” but an Agile mindset where the product team is discovering with each iteration what the product will be and how to best implement it, you need the ability to change scope throughout the project as you learn and adapt. There is no “fail fast” (read: learning) when the deliverables 12 months out are defined in a 300 page document that took 3–6 months to scope and define.

Once again, getting into this state is likely explainable: it’s not so much that any one actor is responsible for it, it’s more that the management in government IT departments is now responsible for fixing the problem. The problem is more than a square peg (waterfall mentalities from contractors) in a round hole (government IT departments that want to be more Agile). After several decades of outsourcing to contractors, there’s also a skills and cultural gap in the IT departments. Just as custom-written software is becoming strategically important to more organizations, many large IT departments find themselves with little experience and even less skill when it comes to software and product development. I hear these same complaints frequently from private-sector organizations that have outsourced IT for many years, if not decades.

The Agile community has long discussed this problem and there are always interesting, novel efforts to get back to insourcing. A huge part is simply getting the terms of outsourcing agreements to be more compatible. The flip-side of this is simplifying the process to become a government contractor: it’s sure not easy at the moment. Many of the newer, more Agile and DevOps minded contractors are smaller shops that will find the prospect of working with the government daunting and, well, less profitable than working with other organizations. Making it easier for more shops to sign up will introduce more competition than the smaller, paperwork-strangled market that exists now. The current pool of government contractors seems mostly dominated by larger shops that can navigate the government procurement process and seem to, for whatever reason, be the ones who are the most inflexible and waterfall-y.

Another part is refusing to cede project management and scoping management to external parties, and making sure you have the appropriate skills in-house to do so. Finally, the management layers in both public and private sector need to recognize this as a gap that needs to be filled and start recruiting more in-house talent. Otherwise, the highly integrated state of DevOps (let alone a product focus vs. a project focus) will be very hard to achieve.

Addressing budgetary concerns with waste removal

Every organization faces budget problems. We call the exceptions “unicorns” because they have this mythical quality of seemingly unlimited budget. The spiral horn-festooned are the exception that proves the rule that all organizations are expected to spend money wisely. Government, however, seems to operate in a permanent state of shrinking IT budgets. And even when government organizations experience the rare influx of cash, there’s hyper-scrutiny on how it’s spent. To me, the difference is that private sector companies can justify spending “a lot” of money if “a lot” of profit results, whereas government organizations can’t make such calculations as easily. Effectively, government IT departments have to prove that they’re spending only as much money as necessary and strategically plan to have their budget stripped down in each budgetary cycle.

Here, the Lean-think part of DevOps can actually be very helpful and, indeed, may become a core motivation for government to look to DevOps. My simplification of the goals of DevOps is to:

  1. Ensure that the software has good availability (which it does by focusing on resilience vs. perfection, the ability to recover from failure quickly rather than avoiding all failure by rarely changing anything). This is something that anyone who has watched recent failures in US federal government IT can appreciate.
  2. Enable the weekly, if not daily, deployment of new code into production with continuous delivery. The goal here is to improve the quality of the software, both bugs and “design” quality, ensuring that the software is what users actually want by iterating over features frequently.

Those two goals end up working harmoniously together (with smaller batches of code deployed more frequently, you reduce the risk of each causing major downtime, for example). For government organizations focused on “budget,” removing as much “waste” as possible from the system to speed up the delivery cycle starts to look very attractive to the cost-cutting minded. A well-functioning DevOps shop will spend much time analyzing the entire, end-to-end cycle with value-stream mapping, stripping out all the “stupid” from the process. The intention of removing waste in DevOps think is more about speeding up the software release process and helping ensure better resilience in production, but a “side effect” can be removing costs from the system.

Often, in the private sector we say that resources (time, money, and organizational attention) saved in this process can be reallocated to helping grow the business. This is certainly the case in government, where “the business” is, of course, understood not as seeking profits but delivering government services and fulfilling “mission” requirements. However, simply reducing costs by finding and removing unneeded “waste” may be a highly attractive outcome of adopting DevOps for governments.

“Bureaucracy” doesn’t have to be a bad word

As with any large organization, governments can be horrendous bureaucracies. Pulling out the DevOps empathy card, it’s easy to understand why people in such government bureaucracies can start to stagnate and calcify, themselves becoming grit in the gears of change if not outright monkey-wrenches.

In particular, there are two mind-sets that need to change as government staff adopt DevOps:

  1. Analysis paralysis: the almost default impulse to over-analyze and specify the shift to a more Agile and DevOps mindset in ponderous, multi-hundred-page documents. A large part of the magic of DevOps and Agile think is avoiding analysis paralysis and learning by doing rather than thinking in .docx. Government teams not familiar with smaller batch, experiment-based approaches to software development would do well to read up on Lean Startup think, perhaps checking out Lean Enterprise for a compendium of current best practices and, well, mindsets.
  2. Stagnant minds: large organizations, particularly government ones, can breed a certain learned helplessness and even laziness in individuals. If things are slow moving, impossible to change, and managed in a “tall blade of grass gets cut” style, individuals will tune out rapidly. If DevOps is understood as a practice to help jump-start all-too-slow IT organizations, it’ll often be the case that individuals in that organization are in this stagnated mindset. One of the key challenges becomes inspiring and then motivating staff to care enough to try something new and stick with it.

Again, these problems frequently happen in the private sector. But they seem to be larger problems in government and bear closer attention. Thankfully, it seems like leaders in government know this: in a recent global Gartner survey, 40% of government CIOs said they needed to focus more on developing and communicating their vision and do more coaching, and 60% said they needed to reduce the time spent in command-and-control mode. Leading, rather than just managing, the IT department, as ever, is key to the transformative use of IT.

More than rats dragging pizza

At any given time, it’s easy to be dismissive of government as wasteful and even incompetent. That’s the case in the U.S. at least, if you can judge based on the many politicians who seem to center their political campaigns around the idea of government waste. In contrast, we praise the private sector for its ability to wield IT to…better target ads to get us to buy sugar-coated corn flakes. Don’t get me wrong, I’m part of the private sector and I like my role chasing profit. But we in the “enterprise” who are busy roaming the halls of capitalism don’t often get the chance to positively affect, let alone simply help and improve the lives of, everyone on a daily basis. Government has that chance, and when you speak with most people who are passionate about using IT better in government, they want to do it because they are morally motivated to help society.

The benefits of adopting DevOps have been clearly demonstrated in recent years, and for businesses we’re seeing truth in the statement that you’re either becoming a software organization or losing to someone who is. As government organizations start to think about improving how they do IT, they have the chance to help all of us: “winning” isn’t zero-sum like it can be in the business world. To that end, as we in the industry find new, better ways to create and deliver software, it behooves us to figure out how government can benefit as well. That’ll get us even closer to making software suck less, something we’ll all benefit from.

(I originally wrote this September 2015 for FierceDevOps, a site which has made it either impossible or impossibly tedious to find these articles. Hence, it’s now here.)

Management’s role in DevOps: orchestrating the why

Donkey teamwork

What’s the point of it all? Why are we doing this? These questions pop up frequently in IT teams where the reason for doing your daily activities — like churning through tickets, whizzing up builds, or “doing the DevOps” — seems only that someone, somewhere told you to do it.

If you’re in this situation — you have no idea how your activities are helping your organization make money — you should stop and find out quickly what your company’s goals and strategies are to make sure you’re not wasting time. The good news is the confusion is probably not your fault; the bad news is that you’ll have to convince management that the fault is theirs.

Gratuitous optimization by technology

The adoption of things like DevOps or the cloud sometimes happens for wrong or unknown reasons — gratuitous plans without a tight connection to business goals. We used to call this “management by magazine,” and it happens now more than ever. A process — even “cultural” — change like DevOps is not like the easy improvement fodder of virtualization. But you can’t blame IT management for trying gratuitous optimization by technology. The magic of VMware was that you just installed it, and things got better because it improved resource utilization. You didn’t need to figure out why or match it to goals. If you inject DevOps into an organization expecting it to just improve things without tightly coupling to strategy, you’ll get weird results. You’ll probably just create more work!

If you don’t know where you are going, any road will get you there

Agile, DevOps, and now “cloud native” (I hope you’re updating your buzzword lexicons!) need strong connections to the business goals — some would say “strategy” — to be successful. In order to operate in a lean fashion, you want to only do things that are valuable and useful to the customer (or obligatory to stay in business, like compliance and auditability). Indeed, being able to sort out what’s valuable and useful to the business is a key tool for doing DevOps successfully. You want to cut out all the stuff that doesn’t matter, or at least minimize it. Otherwise, you just sort of do everything and anything because there’s no way to determine if any given activity is helpful.

So how do you align your work with the overall business strategy?

There are tried and true (though seemingly new to the IT department) techniques like value-stream mapping: take any given business process and map out all the activities that happen from end-to-end, questioning if each is needed. Most people are shocked at how much “stupid” is going on in such maps and it’s a great technique for finding and removing bottlenecks.
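To make the arithmetic behind a value-stream map concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the step names, the hour figures, and the `Step` and `summarize` helpers are hypothetical, not taken from any real process or tool. It simply adds up active work time and idle waiting time for each step, then reports the total lead time, the share of that time spent actually working (often called flow efficiency), and the step with the longest queue in front of it.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    work_hours: float  # time someone is actively working on the item
    wait_hours: float  # time the item sits idle before this step starts


def summarize(steps):
    """Return total lead time, flow efficiency, and the step with the longest wait."""
    work = sum(s.work_hours for s in steps)
    wait = sum(s.wait_hours for s in steps)
    lead_time = work + wait
    efficiency = work / lead_time if lead_time else 0.0
    bottleneck = max(steps, key=lambda s: s.wait_hours)
    return lead_time, efficiency, bottleneck.name


# Hypothetical numbers for one change moving from request to production.
release_process = [
    Step("write code", work_hours=16, wait_hours=0),
    Step("change review board", work_hours=1, wait_hours=79),
    Step("security audit", work_hours=4, wait_hours=36),
    Step("deploy", work_hours=4, wait_hours=20),
]

lead_time, efficiency, bottleneck = summarize(release_process)
print(f"lead time: {lead_time} hours")
print(f"flow efficiency: {efficiency:.0%}")
print(f"longest wait before: {bottleneck}")
```

With these made-up figures, most of the elapsed time is waiting rather than working, which is usually the kind of “stupid” a real map surfaces. A real exercise would gather the numbers by walking the process with the people who do the work, not by guessing.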

If you’re in the consumer business, like so many “unicorns” are, it’s easy to understand the mission and the goals: get more people buying books, downloading your app, streaming more videos, and so forth. But in other, more traditional settings, it’s common to find a willful disentanglement between how IT is used and how it contributes to customer value. More often than not, the stasis-inducing laudanum of time and success just numbs people’s collective minds and sets them into auto-pilot here.

You see this happen most often around decision-making processes in business: things that need approval, planning processes, and market assessments. People in large companies love cogitating and wrapping process around activities that cause change in the company; it feels like they almost like to slow down change and activity. You might even codify it in a whole process with change review board meetings and careful inspection of all the changed components by a panel of architectural and security audit wizards.

Cultivating complainers

You can also identify where your processes aren’t matching with business goals and strategies by cultivating squeaky wheels.

When change happens, individuals often pipe up asking, “Why are we doing this? Why is this valuable to the customer?” More than likely, they’re seen as troublemakers or sand in the gears, and are shut down by the group, Five Monkeys style. At best, these individuals cope with learned helplessness; at worst, they leave, kicking off a sort of Idiocracy effect in the remaining organization.

These “complainers” are actually a valuable source of data for testing out how well understood a company’s goals and strategies are. You want to court these types of people to continually test out how effective the organization is at setting goals and strategy. One fun practice, as mentioned by Ticketmaster’s Jody Mulkey, is to interview new employees a month after starting to ask them what seems “screwy around here” before they get used to it.

Blame management

So what do you do when the complainers, or any other process you’ve tried, identify real disconnects between what you’re doing and why? The fun begins, because it’s management’s job to fix this bug. The role of mid- and upper-level management in the cloud native era is poorly understood and documented (it’s always been so, of course, in creative-driven endeavors like software). To be successful at these types of initiatives, management has a lot of work to do, and the managers who are overseeing DevOps teams can’t assume things will just proceed as normal. This is why, as with software, you need to continually test the assumption that people know the business goals and strategy.

This point has been stuck in my brain after reading Leading the Transformation (an excellent book for managers figuring out how to apply DevOps “in the large”), which states the point more plainly than I can:

Management needs to establish strategic objectives that make sense and that can be used to drive plans and track progress at the enterprise level. These should include key deliverables for the business and process changes for improving the effectiveness of the organization.

What I like about this advice (and the rest in the book) is that it’s geared to defining management’s job in all this DevOps hoopla. In said hoopla, we spend a lot of time on what the team does, but we don’t spend too much time on what management should do. It’s lovely thinking about flattening the organization and having everyone act as responsible peers, but in large organizations, this isn’t done easily or quickly. Just as with all those zippy containers, you need an orchestration layer to make sense of all the DevOps teams working together. That’s the job of management in the cloud native era: setting goals and orchestrating all the teams to make sure they’re all pulling in the right direction.

(I originally wrote this August 2015 for FierceDevOps, a site which has made it either impossible or impossibly tedious to find these articles. Hence, it’s now here.)