In this episode, we talk about the start of my project to write-up tactics for doing DevOps and getting all cloud-native in organizations that are resistant.
If there’s one complaint that I hear consistently in my studies of IT in large organizations, it’s that government IT, as traditionally practiced, is fucked. Compared to the private sector, the amount of paperwork, the role of contractors, and the seeming separation between doing a good job and working software drives all sorts of angst and failure.
Mark Schwartz’s book on figuring out “business value” in IT is turning out to be pretty amazing and refreshing, especially on the topic of government IT. He’s put together one of the better “these aren’t the Droids you’re looking for” responses to ROI for IT.
You know that answer: you just want to figure out the business case, ROI, or whatever numbers driven thing, and all the DevOps-heads are like “doo, doo, doo-doo – driving through a tunnel, can’t hear you!” and then they pelt you with Goldratt and Deming books, blended in with some O’Reilly books and The Phoenix Project. “Also, you’re argument is invalid, because reasons.”
A Zen-like calm comes over them, they close their eyes and breath in, and then start repeating a mantra like some cowl-bedecked character in a Lovecraft story: “survival is not mandatory. Survival is not mandatory. Survival is not mandatory!”
Real helpful, that lot. I kid, I jest. The point of their maniacally confusing non-answers is, accurately, that your base assumptions about most everything are wrong, so before we can even approach something as precise as ROI, we need to really re-think what you’re doing. (And also, you do a lot of dumb shit, so let’s work on that.)
But you know, no one wants to hear they’re broken in the first therapy session. So you have to throw out some beguiling, mind-altering, Lemarchand’s boxes to change the state of things and make sure they come to the next appointment.
Works as Designed
Anyhow, back to Schwartz’s book. I’ll hopefully write a longer book review over at The New Stack when I’m done with it, but this one passage is an excellent representation of what motivates the book pelters and also a good unmasking of why things are the way we they are…because we asked for them to be so:
The US government is based on a system of “checks and balances”—in other words, a system of distrust. The great freedom enjoyed by the press, especially in reporting on the actions of the government, is another indication of the public’s lack of trust in the government. As a result, you find that the government places a high value on transparency. While companies can keep secrets, government is accountable to the public and must disclose its actions and decisions. There is a business need for continued demonstrations of trustworthiness, or we might as well say a business value assigned to demonstrating trustworthiness. You find that the government is always in the public eye—the press is always reporting on government actions, and the public is quick to outrage. Government agencies, therefore, place a business value on “optics”—how something appears to the observant public. In an oversight environment that is quick to assign blame, government is highly risk averse (i.e., it places high business value on things that mitigate risk).
And then summarized as:
…the compliance requirements are not an obstacle, but rather an expression of a deeper business need that the team must still address.
Which is to say: you wanted this, and so I am giving it to you.
The Agile Bureaucracy
The word “bureaucracy” is something like the word “legacy.” You only describe something as legacy software when you don’t like the software. Otherwise, you just call it your software. Similarly, as Schwartz outlines, agile (and all software processes) are insanely bureaucratic, full of rules, norms, and other “governance.” We just happen to like all those rules, so we don’t think of them as bureaucracy. As he writes:
While disavowing rules, the Agile community is actually full of them. This is understandable, because rules are a way of bringing what is considered best practices into everyday processes. What would happen if we made exceptions to our rules—for instance, if we entertained the request: “John wants to head out for a beer now, instead of fixing the problem that he just introduced into the build?” If we applied the rules capriciously or based on our feelings, they would lose some of their effectiveness, right? That is precisely what we mean by sine ira et studio in bureaucracy. Mike Cohn, for example, tells us that “improving technical practices is not optional.”15 The phrase not optional sounds like another way of saying that the rule is to be applied “without anger or bias.” Mary Poppendieck, coauthor of the canonical works on Lean software development, uses curiously similar language in her introduction to Greg Smith and Ahmed Sidky’s book on adopting Agile practices: “The technical practices that Agile brings to the table—short iterations, test-first development, continuous integration—are not optional.” I’ve already mentioned Schwaber and Sutherland’s dictum that “the Development Team isn’t allowed to act on what anyone else [other than the product owner] says.”17 Please don’t hate me for this, Mike, Mary, Ken, and Jeff, but that is the voice of the command-and-control bureaucrat. “Not optional,” “not allowed,” – I don’t know about you, but these phrases make me think of No Parking and Curb Your Dog signs.
These are the kind of thought-trains that only ever evoke “well, of course my intention wasn’t the awful!” from the other side. It’s like with the ITIL and the NRA, gun-nut people: their goal wasn’t to put in place a thought-technology that harmed people, far from it.
Gentled nestled in his wry tone and style (which you can image I love), you can feel some hidden hair pulling about the unintended consequences of Agile confidence and decrees. I mean, the dude is the CIO of a massive government agency, so he be throwing process optimism against brick walls daily and too late into the night.
The cure, as ever, is to not only to be smart and introspective, but to make evolution and change part of your bureaucracy:
Rules become set in stone and can’t change with circumstances. Rigidity discourages innovation. Rules themselves come to seem arbitrary and capricious. Their original purpose gets lost and the rules become goals rather than instruments. Bureaucracies can become demoralizing for their employees.
So, you know, make sure you allow for change. It’s probably good to have some rules and governance around that too.
I started a new booklet project, the Cloud Native Cookbook.
The premise is this:
The premise of this book is to collect specific, tactical advice transitioning to a cloud-native organization. The reader is someone who “gets it” when it comes to agile, DevOps, cloud native, and All the Great Things. Their struggle is actually putting it all in place. Any given organization has all of it’s own, unique advantages and disadvantages, so any “fix” will be situational, of course.
This cookbook draws from actual experiences of what worked and didn’t work to try to help organizations hack out a path to doing software better. While we’ll allow ourselves some “soft,” cultural things here and there, each of the “recipes” should be actionable, tangible items. At the very least, the rainbows and unicorns stuff should have concrete examples, e.g., how do you get people to actually pair program when they think it’s a threat to their self-worth?
As with my previous cloud-native booklet, I have this one open for comments as I’m working on it. It’d be great to get your input.
Here’s some slides I’ve been using around all this.
My colleague Richard has a nice post suggesting the new functions enterprise architects can play in a cloud-native organization. I like this one in particular, help make the change:
Champion new team organization patterns. As an architect, you can bring developers and operations teams together. Recognize that functional silos slow down delivery. A DevOps-type approach really works. Architects are perfectly positioned to pioneer new team structures that increase velocity and customer attentiveness.
It’s brief, but there’s plenty of other good chunks of advice in there.
In a report giving advice to mainframe folks looking to be more Agile, Gartner’s Dale Vecchio and Bill Swanton give some pretty good advice for anyone looking to change how they do software.
Here’s some highlights from the report, entitled “Agile Development and Mainframe Legacy Systems – Something’s Got to Give”
Chunking up changes:
- Application changes must be smaller.
- Automation across the life cycle is critical to being successful.
- A regular and positive relationship must exist between the owner of the application and the developers of the changes.
This kind of effort may seem insurmountable for a large legacy portfolio. However, an organization doesn’t have to attack the entire portfolio. Determine where the primary value can be achieved and focus there. Which areas of the portfolio are most impacted by business requests? Target the areas with the most value.
An example of possible change:
About 10 years ago, a large European bank rebuilt its core banking system on the mainframe using COBOL. It now does agile development for both mainframe COBOL and “channel” Java layers of the system. The bank does not consider that it has achieved DevOps for the mainframe, as it is only able to maintain a cadence of monthly releases. Even that release rate required a signi cant investment in testing and other automation. Fortunately, most new work happens exclusively in the Java layers, without needing to make changes to the COBOL core system. Therefore, the bank maintains a faster cadence for most releases, and only major changes that require core updates need to fall in line with the slower monthly cadence for the mainframe. The key to making agile work for the mainframe at the bank is embracing the agile practices that have the greatest impact on effective delivery within the monthly cadence, including test-driven development and smaller modules with fewer dependencies.
It seems impossible, but you should try:
Improving the state of a decades-old system is often seen as a fool’s errand. It provides no real business value and introduces great risk. Many mainframe organizations Gartner speaks to are not comfortable doing this much invasive change and believing that it can ensure functional equivalence when complete! Restructuring the existing portfolio, eliminating dead code and consolidating redundant code are further incremental steps that can be done over time. Each application team needs to improve the portfolio that it is responsible for in order to ensure speed and success in the future. Moving to a services-based or API structure may also enable changes to be done effectively and quickly over time. Some level of investment to evolve the portfolio to a more streamlined structure will greatly increase the ability to make changes quickly and reliably. Trying to get faster with good quality on a monolithic hairball of an application is a recipe for failure. These changes can occur in an evolutionary way. This approach, referred to in the past as proactive maintenance, is a price that must be paid early to make life easier in the future.
You gotta have testing:
Test cases are necessary to support automation of this critical step. While the tooling is very different, and even the approaches may be unique to the mainframe architecture, they are an important component of speed and reliability. This can be a tremendous hurdle to overcome on the road to agile development on the mainframe. This level of commitment can become a real roadblock to success.
Another example of an organization gradually changing:
When a large European bank faced wholesale change mandated by loss of support for an old platform, it chose to rewrite its core system in mainframe COBOL (although today it would be more likely to acquire an off-the-shelf core banking system). The bank followed a component-based approach that helped position it for success with agile today by exposing its core capabilities as services via standard APIs. This architecture did not deliver the level of isolation the bank could achieve with microservices today, as it built the system with a shared DBMS back-end, as was common practice at the time. That coupling with the database and related data model dependencies is the main technical obstacle to moving to continuous delivery, although the IT operations group also presents cultural obstacles, as it is satis ed with the current model for managing change.
A reminder: all we want is a rapid feedback cycle:
The goal is to reduce the cycle time between an idea and usable software. In order to do so, the changes need to be smaller, the process needs to be automated, and the steps for deployment to production must be repeatable and reliable.
The ALM technology doesn’t support mainframes, and mainframe ALM stuff doesn’t support agile. A rare case where fixing the tech can likely fix the problem:
The dilemma mainframe organizations may face is that traditional mainframe application development life cycle tools were not designed for small, fast and automated deployment. Agile development tools that do support this approach aren’t designed to support the artifacts of mainframe applications. Modern tools for the building, deploying, testing and releasing of applications for the mainframe won’t often t. Existing mainframe software version control and conguration management tools for a new agile approach to development will take some effort — if they will work at all.
Use APIs to decouple the way, norms, and road-map of mainframes from the rest of your systems:
wrapping existing mainframe functions and exposing them as services does provide an intermediate step between agile on the mainframe and migration to environments where agile is more readily understood.
Contrary to what you might be thinking, the report doesn’t actually advocate moving off the mainframe willy-nilly. From my perspective, it’s just trying to suggest using better processes and, as needed, updating your ALM and release management tools.
Read the rest of the report over behind Gartner’s paywall.
Instead, the post emphasized new ways NASA is thinking about acceptable risk for these launches. During the space shuttle era, launch sites were governed by some 2,200 safety requirements. NASA worked to “strip down the expensive rules and came up with just 500 core safety requirements for launches.”
But, the post notes, ”at commercial launch sites, there are only 55 safety requirements.” This “underscores perhaps the single biggest difference between NASA and its commercial partners: risk tolerance. NASA likes to have next to none, and pays for it, while the companies tend to embrace calculated risk as a way to iterate and learn while still developing their products.”
This a major point in companies fixing up their software process: you probably have all sorts of rules and regulations that no longer apply. Just like with managing a legacy application portfolio, you have to aggressively manage your compliance portfolio to eliminate rules that no longer apply.
Matt Curry covers this point with a funny boat-metaphor at the start of his brand talk from last year. You also hear some cases of federal agencies forcing this culling by simply not following the rules and seeing if anyone complained: the compliance equivilent of just turning off old, dusty servers to see who’s using it.
Under the auspices of “how can we DevOps around here?” I’ve had numerous conversations with organizations who feel like they can’t get their people to change over to the practices and mind-set that leads to doing better software. The DevOps community has spent a lot of time trying to gently bring along and even brain-hack resistant people and there are numerous anecdotes and studies that doing things in DevOps/cloud native/agile/etc. way not only make businesses run more smoothy and profitably, but make the actual lives of staff better (interested in leaving work at 5pm and not working on weekends?)
Still, for whatever reasons, the idea of “the frozen-middle” is reoccurring. In government work, there’s often the frozen-ground” with staff throwing down their heels as well.
Most of the chatter on this topic is at the team level, and usually teams in already flexible, new organizations (like all those logos on scare-slides in presentations from people like Pivotal: UBER! AIRBNB! TESLA! SOFTWARE-IS-EATING-THE-WORLD, INC.!) If you’re already in a company that can change its culture every year, the pain of these “frozen” people is almost nil in comparison to being in government, a 50-100 years old company, or otherwise “in the real world.”
Right here, I’m supposed to establish empathy with these frozen people to get them to change. I should stop calling them “frozen people” and treating them like “resources,” and think of them as people. That is all true, but what do I do after I’ve cavorted with them in empathy splash-pads for months on end and we still can’t get them to configure the firewall so we can do the DevOps? What do you do with enterprise architects who have become specialists in telling you how impossibly fucked up everything is? How do you work with all those “frozen middle-managers” who are acting as a “shit umbrella” (or maybe in this case, a “rainbow umbrella”?) to all your fanciful DevOps notions?
There’s one last ditch tactic: money and firing. When it comes to changing the culture in a large orginizations, this is the big hammer that upper-level management has. That is, the middle-manager’s manger is the one who has to tools to fix the problem. By design, the surest, easiest, sanctioned way to change behavior is to change cash pay-outs and other compensation based on not only the performance of people (management and staff), but their willingness to adapt and change to new, better ways of operating. That is: you reward and punish people based on them doing what you told them to do, or not.
There’s all sorts of “little and cheap” things to do instead of bringing out the big hammer:
- Giving out acrylic trophies, showering people in attention from high-level execs
- Fame and glory internally and externally
- T-shirts (no, really! They work amazingly well)
- Sanctioning goofing off (“we may have a 15 day holiday policy, but, you know, you should leave work at 4 every day and don’t log vacation”)
Matt Curry covers most of these management-hacks in a great talk on changing large organizations. But let’s assume those aren’t working.
At this point, “leadership” (middle-management’s management) needs to reward and punish the unfreezing or freezing. If middle-management changes (“unfreezes”), they get paid more. If they don’t unfreeze at least their pay needs to freeze, and at “best” they need to get quarantined or fired (as a tone-note, back when I was young and the idea of getting “laid-off” or “RIFed” started being used I reprogrammed myself to only ever say “fired” in reference to loosing your job: those soft, BS words obscure the fact that you’ve been, well, fired – so, feel free to s/// whatever more HR-correct phrase you like into there).
If you look at the patterns of change in most companies going cloud native, they do exactly this. Organizations that successful change so that they can start doing software better set up separate “labs” where the “digital transformation” happens. These are often logically and physically separate: they often have new names and are often “off-site” or well hidden, in the skunk-works style. They do this because if they stay mentally and physically in the “old” system, they end up getting frozen simply due to proximity. The long-term, managerial strategy, here, is an application of the strangler pattern to the “legacy” organization: eventually, the new “labs” organization, through value to the business and sheer size, becomes the “normal” organization.
The staff process they use – almost as a side-effect! – is to only move over people who are willing to change. I’d argue that “skills” and aptitude have little to do with the selection of people: any person in IT can learn new ways of typing and right clicking if the company trains them and helps them. We’re all smart, here.
What happens is that you cull the frozen people, either leaving them behind in the old organization (where there’s likely plenty of work to be done keeping the legacy systems alive!), or just outright firing them. That’s all brutal and not in the rainbows and sandals spirit of DevOps, but it’s what I see happening out there to successfully change the fate of large organizations’ IT departments.
You don’t really hear about this in cheery keynotes at conferences, but at dark bars and (I shit you not) in quiet parking garages I’ve talked with several executives who paint out this unfreezing-by-attrition process as something that’s worked in their organization. One of them even said “we should have gotten rid of more people.”
This pattern gets more difficult in highly labor-regulated industries – like government work and government contractors – but setting up brand new organizations to shed the frozen old ways is at least something to try.
And, barring all else, as I like to say, if nothing is working, and people won’t actually change, Pivotal is always hiring.
If that’s too grim, don’t despair! My buddies have a much light-hearted, hopefully whack at the problem in a recent discussion of ours:
At first, you’re like, “oh, another piece where someone makes fun of us nerds and misunderstands our damagingly sarcastic way of saying everything that belies the privilege we all live under.” Then you’re like, “oh, this is actually a good piece.” E.g.:
Human rights work attempts to prevent the abusive deployment of power against those who have little of it. While technology might disrupt some power structures, it might also reinforce them, and it is rarely designed to empower the most vulnerable populations. Human rights defenders are innovative, and they have used software to do work it wasn’t designed to do, such as live-streaming police violence against civilian populations to press for government accountability. But perpetrators of mass violence are innovative too. Software alone is unlikely to provide clear human rights victories.
Assuming that adopting the tool is feasible, do the benefits this tool provides outweigh the cost of disrupting your existing workflow?
People rarely consider that, in all walks of life.
Whether it’s “DevOps,” “digital transformation,” or even “cloud” and “agile,” middle-management is all too common an issue. They simply won’t budge and help out. This isn’t always the case for sure, but “the frozen middle” is a common problem.
With a big ol’ panel of people (including two folks from RedMonk), we talk about tactics for thawing the frozen middle.
There’s a pretty good, brief overview of what all this digital, IoT, etc. stuff could mean for the auto industry by Victor Kingery (of GE) over on DZone.
He outlines are three areas:
- Connected cars – gathering all the info onboard and using it for, at least, preventive maintenance and some auto insurance tweaking.
- Self-driving cars – “More than likely, semi-autonomous vehicles will allow for the mundane tasks of driving to be done by the vehicle’s computers, such as monotonous highway driving, but allow the driver to take over the vehicle when compelled to do so due to unforeseen circumstances.”
- Electric vehicles – “It will take more than automakers improving battery technology. It will take governments and private business working together to create the infrastructure to enable wide scale battery recharging possible.”