Addressing the DevOps compliance problem

Dec 15, 2015

Satisfying the mythical auditors is often one of the first barriers to spreading DevOps initiatives more widely inside an organization. While these process-driven barriers can be annoying and onerous, once you follow the DevOps tradition of empathetic inclusion — being all “one team” — they can not only stop slowing you down but actually help the overall quality of the product. Indeed, the very reason these audit checks were introduced in the first place was to ensure overall quality of the software and business. There’s some excellent, exhaustive overviews out there of dealing with audits and the like in DevOps. In this column, I wanted to go through a little mental re-orientation for how to start thinking about and approaching the “compliance problem.”

Three-Ring Binder Ninjas

In this context, I think of “auditors” as falling into the category of governance, risk and compliance (GRC) — any function that acts as a check on code as and how the code is produced and run as it goes through its lifecycle. I would put security in here as well, though that tends to be such a broad, important topic that it often warrants its own category (and the security people seem to like maintaining their occultic silo-tude, anyhow).

The GRC function(s) may impose self-created policies (like code and architectural review), third party and government imposed regulations (like industry standard compliance and laws such as HIPAA), and verification that risky behavior is being avoided (if you write the code, you can’t be the same person who then uses that code for cash payouts, perhaps, to yourself, for example). In all cases, “compliance” is there to ensure overall quality of the product and the process that created it. That “quality” may be the prevention of malicious and undesired behavior; that is, in a compliance-driven software development mindset, the ends rarely justify the means.

In many cases, the GRC function is more interested in proof that there is a process in place than actually auditing each execution of that process. This is a curious thing at first. Any developer knows that the proof is in the code, not the documentation. And, indeed, for some types of GRC the amount of automation that a DevOps mindset puts into place could likely improve the quality of GRC, ironically.

Establishing trust and automating compliance

Indeed, automation is one of the first areas to look at when reducing DevOps/GRC friction. First, treat complying with policies as you would any other feature. Describe it, prioritize it and track it. Once you have gotten your hands around it, you can start figure out how to best implement that “feature.” Ideally, you can code and automate your way out of having to do too much manual work.

There’s work being done in the US Federal government along these lines that’s helpful because it’s visible and at scale. First, as covered in a recent talk by Diego Lapiduz, part of what auditors are looking for is to trust the software and infrastructure stack that apps are running on. This is especially true from a security standpoint. The current way that software is spec’d out and developed in most organizations follows a certain “do whatever,” or even YOLO principal. App teams are allowed to specify which operating systems, orchestration layers and middleware components they want. This may be within an approved list of options, but more often than not it results in unique software stacks per application.

As outlined by Diego, this variation in the stack meant that government auditors had to review just about everything, taking up to months to approve even the simplest line of code. To solve this problem, 18F standardized on one stack — Cloud Foundry — to run applications on, not allowing for variance at the infrastructure layer. They then worked with the auditors to build trust in the platform. Then, when there was just the metaphoric or literal “one line of code” to deploy, auditors could focus on much less, certainly not the entire stack. This brought approval time down to just days. A huge speed up.

When it comes to all the paperwork, also look to ways to automate the generation of the needed listings of certifications and compliance artifacts. This shouldn’t be a process that’s done in opaque documents, nor manually, if at all possible. Just as we’d now recoil in horror at manually deploying software into production, we should try to achieve “compliance as code” that’s as autogenerated (but accurate!) as possible. To that end, the work being done in the OpenControl project is showing an interesting and likely helpful approach.

The lessons for DevOps teams here is clear: Standardize your stack as much as possible and work with auditors to build their trust in that platform. Also, look into how you can automate the generation of compliance documents beyond the usual .docx and .pptx suspects. This will help your GRC process move at DevOps speed. And it will also allow your auditors to still act as a third party governing your code. They’ll probably even do a better job if they have these new, smaller batches of changes to review.

Refactoring the compliance process

To address the compliance issue fully, you’ll need to start working with the actual compliance stakeholders directly to change the process. There’s a subtle point right there: Work with the people responsible for setting compliance, not those responsible for enforcing it, like IT. All too often, people in IT will take the strictest view of compliance rules, which results in saying “no” to virtually anything new — coupled with Larman’s Law, you’ll soon find that, mysteriously, nothing new ever happens and you’re back to the pre-DevOps speed of deployment, software quality levels and timelines. You can’t blame IT staff for being unimaginative here — they’re not experts in compliance and it’d be risky for them to imagine “workarounds.” So, when you’re looking to change your compliance process, make sure you’re including the actual auditors and policy setters in your conversations. If they’re not “in the room,” you’re likely wasting your time.

As an example, one of the common compliance problems is around “developers deploying to production.” In many cases and industries, a separation of duties is required between coding and deploying. When deploying code to production was a more manual, complicated process, this could be extremely onerous. But once deployments are push-button automated with a good continuous delivery pipeline,you might consider having the product manager or someone who hasn’t written code be the deployer. This ensures that you can “deploy at will,” but keeps the actual coders’ fingers off the button.

As another intriguing compliance strategy, suggested by Home Depot’s Tony McCulley (who also suggested the above approach to the separation of duties) is to give GRC staff access to your continuous delivery process and deployment environment. This means instead of having to answer questions and check for controls for them, you can allow GRC staff to just do it on their own. Effectively, you’re letting GRC staff peer into and even help out with controls in your software. I’d argue that this only works if you have a well-structured platform supporting your CD pipeline with good UIs that non-technical staff can access.

It might be a bit of a stretch, but inviting your GRC people into your DevOps world, especially early on, may be your best bet at preventing compliance slowdowns. And, if there’s any core lesson of DevOps, it’s that the real problems are not in the software or hardware, but the meatware. Figuring out how to work better with the people involved will go a long way towards addressing the compliance problem.

(I originally wrote this December 2015 for FierceDevOps, a site which has made it either impossible or impossibly tedious to find these articles. Hence, it’s now here.)

Coté

Coté

Addressing the DevOps compliance problem

Three-Ring Binder Ninjas

Establishing trust and automating compliance

Refactoring the compliance process