Introducing microservices

There are some good “how do I actually get my organization to do all this unicorn stuff” comments in this interview with DreamWorks Animation’s Doug Sherman.

Here’s one sample bit on winning people over to microservices. Instead of going into the lab for six months to work on a tool that they think will be useful, they do a lot more user-driven work upfront and then do (it sounds like) weekly small batches to keep the users apprised of the tools and, you’d guess, give continuous feedback:

You have to understand what people want to do in their domain. In the past, I’ve gotten it wrong. I’ll come up with an idea I think is sound – I think it’s the coolest thing ever – and I’ll work six months in isolation with my team, and then we’ll do this big reveal. And every time we’ve done that, it’s gone horribly wrong, because 1) people feel like we’re lecturing to them, like we know better than them. And then 2) we would typically have over-engineered it! It would be like the 747 cockpit, you know? There would be this overwhelming amount of knobs and bits and pieces that I think are great to have, but from their viewpoint, they only need to do a few things, and that’s an overwhelming amount of stuff to have to sign up to be able to do. So now, I’ve gotten into a habit: before I even write a single line of code, I interview everybody that potentially will use the solution that I’m going to write, and I keep them in lockstep with me and my team just about every week. We keep them engaged, helping to influence the direction – I’m basically trying to echo out in code all of what they want. It’s gone so much better, because they feel invested. They don’t feel like in six months I’m revealing this big, mysterious thing. They feel like this is just something they’ve seen through iterations. And what’s empowering about that, too, is if you can get the spiritual leaders of the different departments that you’re trying to encourage to use your solution, they’ll help sell it for you.

And then a bit on their progress:

We’re about 50% of the way in having some amount of production coverage powered by microservices which are deployable in cloud containers powered by technologies such as Spring and Spring Cloud.

There are more good cultural change stories in the interview.

Print media doesn’t translate well to online, still – a travel magazine case study

After all these years, print media still struggles versus the Internet. This long piece on how the travel magazine industry has been suffering covers many great topics. I suspect much of the analysis is the same for all of print media.

One of the problems is the new set of demands on writers in that field:

There is the pain point of figuring out an internal work flow that functions across platforms. Journalists, writers, and content creators often have specialized skillsets, so asking one to write a story, create a listicle, take photos, and film compelling videos about a trip is a major challenge.

“We just started working more efficiently that way and it really, it’s painful to integrate digital and print,” said Guzmán. “The plays are different, the workloads are different, the story ideation is different. In doing this, there’s this huge cultural shift that is exciting and difficult.”

And, then, even after suffering through all that “cultural shift,” the results are often disappointing:

“The iPad was just going to be this Jesus of magazines and I never really quite believed that because I knew how challenging it was to rejigger the content to fit that format,” said Frank, who oversaw Travel + Leisure’s digital strategy in the early 2010s. “Having just gone through the process of signing up and downloading a magazine, it took forever and was buggy and it just wasn’t necessarily a great solution. I was never really bought the gospel that the tablet was going to be our savior. But we did it. I mean, we created a great app and it was beautiful. It won awards, but that was knowing what the usership was is a little disheartening.”

And, as ever, there’s the tense line between blaming “most readers are dumb” and “rivals are evil” when it comes to what’s to blame:

“I could have written the greatest travel story ever known, and it would not have gotten on the cover of the traffic oriented site because a Swedish bikini teen saved a kitten from a tree; which is going to be more popular?”

Let them watch cats.

Still, as the article opens up with, it’s the old Curse of Web 2.0 – former readers, now just travelers – writing the useful content in the form of reviews on TripAdvisor and such:

“In general, people don’t read a review and make a decision,” said Barbara Messing, chief marketing officer of TripAdvisor. “Consumers will read six to eight reviews. They might dig in a certain characteristic that they are interested in, maybe they really are interested in what the quality of the beach is, or maybe they are really interested in whether it’s kid friendly or not kid friendly. In general, people will hone in on the characteristics of something that’s most important to them, find that answer on TripAdvisor, get that most recent insights, check out the photos, check the forums, and really be able to make an informed decision of whether something is right for them. I think that the notion that people could rely on the wisdom of the crowd and the wisdom of individuals to their detriment, I just think that’s false, and I don’t think the reality is that is going to happen.”

There’s also some M&A history of trading various assets like Lonely Planet, Zagat, and Frommer’s back and forth as different management figures out what to do with them.

As ever, I’m no expert on the media industry. It seems like the core issue is that “the Internet” is so much more efficient at the Job to be Done for travel (as outlined by the TripAdvisor exec above) that the cost structure and business process from print magazines is not only inefficient, but unneeded. Those magazines are now over-serving (and thus, over-spending) with a worse product. 

While the quality of TripAdvisor (and Yelp, for example) reviews is infinitely worse than glossy magazines, there’s an infinitely larger number of them: mostly crappy reviews, with the occasional helpful one…so it sort of more than evens out in favor of Swedish bikini cat rescuers. Plus, digital advertising has so much more spend (and overall, industry profit, if only by sheer volume if not margin) – it must be because it’s better at making the advertisers money and because it creates a larger market:

Link

HSBC’s Google Cloud use

A brief note, from William Fellows at 451, on HSBC’s use of Google Cloud’s big data/analytical services:

They have a lot of data, and it’s only growing:

6PB in 2014, 77PB in 2015 and 93PB in 2016

What they use it for:

In addition to anti-money-laundering workloads (identification and reducing false positives), it is also migrating other machine-learning workloads to GCP, including finance liquidity reporting (six hours to six minutes), risk analytics (raise compute utilization from 10% to actual units consumed), risk reporting and valuation services (rapid provisioning of compute power instead of on-premises grid).

As I highlighted over the weekend, it seems like incumbent banks are doing pretty well with all this digital disruption stuff.

Source: HSBC taps Google Compute Platform for Hadoop, is ‘cloud first’ for ML and big data

Banks are handling disruption well – Highlights

Thus far, it seems like the large banks are fending off digital disruption, perhaps embracing some of it on their own. The Economist takes a look:

  • “Peer-to-peer lending, for instance, has grown rapidly, but still amounted to just $19bn on America’s biggest platforms and £3.8bn in Britain last year”
  • “last year JPMorgan Chase spent over $9.5bn on technology, including $3bn on new initiatives”
  • From a similar piece in the NY Times: “The consulting firm McKinsey estimated in a report last month that digital disruption could put $90 billion, or 25 percent of bank profits, at risk over the next three years as services become more automated and more tellers are replaced by chatbots.”
  • But: “Much of this change, however, is now expected to come from the banks themselves as they absorb new ideas from the technology world and shrink their own operations, without necessarily losing significant numbers of customers to start-ups.”
  • Back to The Economist piece: “As well as economies of scale, they enjoy the advantage of incumbency in a heavily regulated industry. Entrants have to apply for banking licences, hire compliance staff and so forth, the costs of which weigh more heavily on smaller firms.”
  • Regulations and customer loyalty are weaker in China, resulting in more investment in new financial tech in Asia:
  • As another article puts it: “China has four of the five most valuable financial technology start-ups in the world, according to CB Insights, with Ant Financial leading the way at $60 billion. And investments in financial technology rose 64 percent in China last year, while they were falling 29 percent in the United States, according to CB Insights.”
  • Why? “The obvious reason that financial start-ups have not achieved the same level of growth in the United States is that most Americans already have access to a relatively functional set of financial products, unlike in Africa and China.”
  • There’s some commentary on how the speed of sharing blockchain updates can reduce multi-day bank transfers (and payments) to, I assume, minutes. Thus: ‘“Blockchain reduces the cost of trust,” says Mr Lubin of ConsenSys.’

Fixing legacy problems with new platforms, not easy

  • The idea of building banking platforms to clean up the decades of legacy integration problems.
  • Mainframes are a problem, as a Gartner report from last year puts it: “The challenge for many of today’s modernization projects is not simply a change in technology, but often a fundamental restructuring of application architectures and deployment models. Mainframe hardware and software architectures have defined the structure of applications built on this platform for the last 50 years. Tending toward large-scale, monolithic systems that are predominantly customized, they represent the ultimate in size, complexity, reliability and availability.”
  • But, unless/until there’s a crisis, changes won’t be funded: “Banks need to be able to justify the cost and risk of any modernization project. This can be difficult in the face of a well-proven, time-tested portfolio that has represented the needs of the banking system for decades.”
  • Sort of in the “but wasn’t that always the goal?” category, but from that same article, Gartner suggests the vision for new fintech: ‘Gartner, Hype Cycle for Digital Banking Transformation, 2015, says, “To be truly digital, banks must pair an emphasis on customer-facing capabilities with investment in the technical, architectural, analytic and organizational foundations that enable participation in the financial services ecosystem.”’
  • BCG has a prescriptive piece for setting the strategy for all this, from Nov. 2015.

Case studies

  • A bit correlation-y, but still useful, from that BCG piece: “While past performance is no guarantee of future results, and even though all the company’s results cannot be entirely attributed to BBVA’s digital transformation plan, so far many signs are encouraging. The number of BBVA’s digital customers increased by 68% from 2011 to 2014, reaching 8.4 million in mid-2014, of which 3.6 million were active mobile users. Because of the increasing use of digital channels and efforts to reconfigure the bank’s branch network—creating smaller branches that emphasize customer self-service and larger branches that provide higher levels of personalized advice through a remote cross-selling support system—BBVA achieved a reduction in costs of 8% in 2014, or €340 million, in the core business in Spain. Meanwhile, the bank’s net profits increased by 26% in 2014, reaching €2.6 billion.”
  • And a more recent write-up of JPMC’s cloud-native programs, e.g.: ‘“We aren’t looking to decrease the amount of money the firm is spending on technology. We’re looking to change the mix between run-the-bank costs versus innovation investment,” he said. “We’ve got to continue to be really aggressive in reducing the run-the-bank costs and do it in a very thoughtful way to maintain the existing technology base in the most efficient way possible.” …Dollars saved by using lower-cost cloud infrastructure and platforms will be reinvested in technology, he said.’ JPMC, of course, is a member of the Cloud Foundry Foundation which means, you know, they’re into that kind of thing.

Why Pivotal Serves Free Breakfast to All Employees

Free food, during a limited, half-hour window, both saves people some hassle and gets them to show up at the same time to kick off the workday.

To understand why this is so important, picture Pivotal without free breakfast. Let’s start with the obvious. Most developers would sleep late if it were up to them. They’d roll into the office around 10 or 11 AM. Which means they’d grab a coffee, maybe respond to a few emails, and then sync up with the team.

Before you know it, the morning is over and it’s time for lunch. But hey, that’s okay, we live in a digital world, and you can show up whenever, so long as you get your work done, right? Wrong. Pair programming only works when you have people to pair with. And that means you need to sync their schedules.

We ring a cowbell at 9:05 AM. (The Toronto office smacks a golden gong with a mallet.) It signals that breakfast is over and the office-wide meeting is about to start. After the five-minute standup, the teams have their own standup meetings, and then pairs break off to get rolling at their workstations.

While posed as a pair programming enabler, take out pairing from the above and it also gets at the point of having people show up on time, not dick around, and do actual work.

If you’ve seen me talk you know the joke of “how a developer spends their day,” which usually includes 1-2 hours of actual coding because of all the meetings, you know, those 30 minute sitdown-standup meetings, architectural reviews, deciding where to go to lunch, the post-lunch-buffet coma, “researching” on the Internet, etc…. it’s all just unsynchronized schedules and little attention spent on actually managing your staff’s time.

Source: Why My Company Serves Free Breakfast to All Employees

Red Hat OpenShift Momentum – Highlights

Brian Gracely of Red Hat (and formerly an analyst who did some of the best “cloud-native”/cloud platform work early on) has a momentum post on OpenShift. Here are my highlights:

Sizing up revenue and deal-size:

[Q3, FY 2017] Also of note, we closed our second OpenShift deal over $10 million and another OpenShift deal over $5 million. And significantly, we actually had over 50 OpenShift deals alone that were six or seven figures, so really strong traction. [Q4, FY 2017] with our largest deals in Q4 approximately one-third had an OpenShift container platform component.

Red Hat hasn’t yet been too clear on OpenShift revenue, so you have to tea-leave out these revenue spreads, which I haven’t really done. Earlier in April, Jeffrey Burt at The Next Platform had this to say:

During the final three months of last year, subscription revenue for Red Hat’s application development-related [JBoss, etc] and other emerging technologies – which includes OpenShift – hit $125 million, a 40 percent increase from the same period in 2015, and revenue for the group accounted for about 20 percent of Red Hat’s overall revenues for the fourth quarter.

Today, we also announced that Barclays Bank, the Government of British Columbia’s Office of the CIO, and Macquarie Bank are also using Red Hat OpenShift Container Platform to modernize application development…. airplane manufacturer Airbus about their DevOps journey, and digital travel platform Amadeus about their transformation of handling 2,000x the number of online transactions…. how Amsterdam’s Schiphol Airport (AMS) is using OpenShift to redefine the in-terminal travel experience, how Miles & More GmbH is better managing rewards programs for travelers, and how ATPCO is rethinking how they publish fare-related data to the airline and travel industry.

Much of the write-up focuses on community momentum, true to Red Hat, open source form:

The OpenShift Commons community has 260+ member organizations….

Red Hat engineers lead or co-lead in 10 of the 24 Kubernetes SIG activities.

Finally, some commentary on their strategic shift to Kubernetes:

The huge architectural shift that we made a few years ago in adopting open standards for containers and the Kubernetes container scheduler has allowed us to deliver a unified platform to containerize existing applications and deliver agility and scalability for cloud-native applications and microservices. We call this combination Enterprise Kubernetes+, or Enterprise-Ready Kubernetes.

Red Hat’s OpenShift is, of course, a competitor to us over at Pivotal.

Cloud-native at Comcast, working with Pivotal – Highlights

I’m doing a podcast with Comcast in a few weeks, so I’ve been going over all their public talks on their cloud-native efforts. They’ve been working with Pivotal since around 2014 and are one of the more impressive customer cases, with over 1,000 applications now on Pivotal Cloud Foundry.

Here are some highlights from the talks I’ve been watching. As always, things I put in square brackets are my own comments, the rest are quotes or summaries of what people said:

August, 2016 – Empowering Devops with Cloud Foundry – Sergey Matochkin, Neville George; Comcast

  • Sergey Matochkin.
  • Slides.
  • (17:00) Every deployment to production took at least 6 weeks, but most commonly around 2 months end-to-end. Which also means you need to plan capacity much in advance.
  • We started to use virtualization and containerization “well, well before Docker existed… it was some success, we had some improvements, but those improvements were marginal.”
  • Traditionally, it’d take at least 4-6 months to setup your dev/test infrastructure. But, luckily, virtualization came along.
  • (9:20) Business drivers… Comcast phone service, set-top boxes get DVRs, VoD, etc. All of these require apps on the backend, so the portfolio of apps starts to grow, and with the way things were before it meant they had to build a new datacenter every six months. Virtualization helped here, of course.
  • Also, virtualization allowed us to put a service layer [think “platform”] on-top of the infrastructure.
  • It’d take 4-6 weeks for a testing environment, but now it takes 10-15 minutes in a self-service portal.
  • Demo of using Pivotal Cloud Foundry for much of the automation needed to deploy and scale an application.
  • (~32:00) We used to have things like “order servers” and “make load-balancer changes” and somewhere in the bottom of the backlog was “write some code and do some testing.” [That is, they were focusing on items with low business value, below “the value line,” rather than customer features.]
  • “What Cloud Foundry essentially helped us with was to get all those unnecessary user stories out of our backlog so we can focus on the writing code, on testing, and deploying rather than managing infrastructure.”
  • (33:45) momentum/proof-points:
  • 9 PCF instances; 900+ developers; 2,000+ active apps, most of which are in “the critical path of our customer experience”; 4,100 application instances; 2,000 requests per second.
  • Lots of Slack/ChatOps usage for monitoring and such.

August 3rd, 2016 – Transforming the monolith at 20M tph – Nick Beenham, Comcast

  • Slides.
  • Existing state:
    • 250m transactions per day.
    • Would take 3 months to get a server useful, from moment of purchasing to using.
    • “Over a 100 services run by development teams.”
    • In functional, silo roles.
  • (3:45) “We knew we had that large, rigid infrastructure. [Pivotal] Cloud Foundry and its adoption really enables us to change that to gain the agility, to gain the elasticity at scale.”
  • Taking away roles to reduce finger-pointing and all the negative stuff, and unifying the team, of course.
  • (7:35) Anecdote of Nick going from “ops guy” to writing code and liking coding.
  • (12:18) The ESP router was a small router written in Go to translate SOAP requests as part of a strangler pattern. They had a decades-old SOA layer that they wanted to modernize, but they couldn’t strip it out; that would take too long. So, they were going to duck-type as SOA, but do REST and microservices underneath. Strangler pattern, etc. This is what the ESP router does: it marshals and unmarshals between microservices and SOAP stuff. But new things need to be done in the new style.
  • Also, “de-mingling data,” moving off Oracle RAC/GoldenGate for multi-site. Some simpler CRUD services to front the data.
  • (~15:00) Used to take a week+ to deploy the entire stack, but with Pivotal Cloud Foundry it takes minutes. It gives us a great deal of velocity that we’ve never had before. “Sometimes we’ll deploy multiple times an hour.”
  • (17:00) Deployment used to be 1,000’s of lines of bash pushing out to various WebLogic clusters; that has for the most part moved to Cloud Foundry.
  • Improving production updates: bringing new node up and shutting old node down slowly; canary updates, with a CI test suite, then switching over to a production install.
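
[The strangler pattern from the ESP router bullet above can be sketched in a few lines. This is only an illustration in Python, not Comcast’s Go router: the class, the handler functions, and the operation names are all hypothetical, and real SOAP marshalling is left out. The idea is that callers keep hitting one facade with the old interface while individual operations get re-pointed to new services one at a time.]

```python
from typing import Callable, Dict

# A handler takes a request dict and returns a response dict (assumed shape).
Handler = Callable[[dict], dict]


class StranglerRouter:
    """Facade that keeps the old entry point while migrated operations move to new services."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, operation: str, handler: Handler) -> None:
        # Re-point a single operation at its new implementation.
        self._migrated[operation] = handler

    def handle(self, request: dict) -> dict:
        # Anything not yet migrated still falls through to the legacy layer.
        handler = self._migrated.get(request["operation"], self._legacy)
        return handler(request)


def legacy_soa(request: dict) -> dict:
    # Stand-in for the old SOA layer (hypothetical).
    return {"served_by": "legacy-soa", "operation": request["operation"]}


def account_service(request: dict) -> dict:
    # Stand-in for a new REST microservice (hypothetical).
    return {"served_by": "microservice", "operation": request["operation"]}


router = StranglerRouter(legacy_soa)
router.migrate("GetAccount", account_service)

migrated = router.handle({"operation": "GetAccount"})
untouched = router.handle({"operation": "CreateOrder"})
print(migrated["served_by"], untouched["served_by"])
```

[Once every operation has been migrated, the legacy handler gets no traffic and can be retired, which is the “strangling” part.]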

August 1st, 2016 – James Taylor – The Power of Partnership & Building a Cloud Native Tier-1 Platform

  • @jctbmwi8
  • “Sparrow, Service Activation Platform.”
  • “Helping someone put a smile on their face is one of the greatest gifts we can give each other.”
  • Their VP provides the feedback loop of things to focus on. Right now: reducing technical debt, reducing incidents, increasing velocity, experimentation.
  • (~6:30) “You can’t move forward – innovate – if you don’t have time to try new things.”
  • (~18:35) “If you’re spending time configuring a Docker container, that’s time you’re not spending coding or solving a problem.”
  • (13:51): “At the end of the day, [business] value is what puts money in everyone’s pocket. If our company, Comcast, can’t create something of value, no one’s gonna pay for us…if we can’t create value. So it’s important for us to understand ‘how can you create value?’”
  • (~22:02, starting epic rant!) “Who is our customer and what value do we bring to our customers…”
  • If you’re spending money on support, that’s cutting into your margins. A call coming in costs $8 right off the bat, then more as it takes longer. So you want to figure out preventing customer support problems… which points to understanding your customers more.
  • [A good overview of thinking about “value” in the context of a specific application, their customer activation center, Sparrow.] “If you have a [support] call rate of 30%, you’re probably cutting out all the value… So we try to figure out, how do we prevent calls?” [Very similar to IRS cloud-native story.]
  • “We’ve been holding technical workshops”: Internal training things every month with Pivotal people, leveraging Pivotal knowledge. With our development teams every month: webinar, or on-site visit.
  • Sparrow: 5 junior Java developers… we built it from scratch in parallel while existing teams maintained the platform… we then had to integrate the processes together… figure out decomposing the monolith platforms, etc….then we had to just cut off stuff when it was too much of a hassle.

August 17th, 2016 – Greg Otto SpringOne Platform keynote

  • Slides.
  • X1 boxes – a new release about once a month.
  • Processing 10’s of millions of transactions daily on the new platform, Pivotal Cloud Foundry.
  • “About a 75% lift in velocity as well as time to market, and the business is really feeling it.”
  • Developer reactions:
  • [Image: comcast what customers are saying.png]
  • Momentum Stats:
  • [Image: comcast key state from otto.png]
    • 40 apps to 900 apps, 2015 to 2016
    • 300 AIs to 4,100 AIs, 2015 to 2016
  • All with “zero outbound marketing from my team, this all word of mouth from all those happy developers.”

June 9th, 2016 – Greg Otto CF Summit keynote

  • “Late last year in 2015” – live in production [on Pivotal Cloud Foundry] with business critical systems from our back-office systems on our Cloud Foundry environment.
  • We put Pivotal Cloud Foundry directly in the customer critical path.
  • Applications doing 30,000 events a second on Cloud Foundry.
  • Started in 2014, met with Pivotal.
  • Had sort of thrown all the people into the Pivotal Cloud Foundry pool, they had to do a lot of research and such.
  • But, people were really interested in the ease of working with the platform [the productivity improvements].
  • Successful prototype app 30 days after platform.
  • Idea to feature, before/after: “several weeks, at least”/“2-3 days”
  • Time-line and summary:
  • [Image: comcast otto summary.png]

June, 2016 – Open source at Comcast story

  • Write-up.
  • “If Comcast has a problem to solve, there are three possible approaches: solve it themselves by making an investment in teams and resources; solve it through a commercial vendor that could build a product for them; or work with the open source community.”
  • OpenStack: “In addition to Linux, Comcast is a heavy user of OpenStack. They use a KVM hypervisor, and then a lot of data center orchestration is done through OpenStack for the coordination of storage and networking resources with compute and memory resources. Muehl said that Comcast has roughly a petabyte of memory and around a million virtual CPU cores that they are running under the OpenStack umbrella. As an operator, Comcast does a lot of things around operations, and they use Ansible to deploy and manage OpenStack at scale.”
  • Cloud Foundry: “They also use Cloud Foundry, but according to Muehl that work is in the very early stages at Comcast.”

May 2015 – Running Cloud Foundry at Comcast talk

  • Neville George, Sam Guerrero, Tim Leong, Sergey Matochkin
  • They wanted to make custom URLs.
  • Used Puppet for stuff.
  • (~8:30) Their requirements for a platform:
  • [Image: comcast platform requirements.png]
  • A lot of emphasis on self-service and the micro services benefits of operating independently, product management wise.
  • They use OpenStack, Docker, and [Pivotal] Cloud Foundry.
  • Pre-provisioning resources for a pool of containers that are ready to go, etc.
  • (~27) a couple applications in production today… we’ll be ramping up quickly.
  • (Either this video or the 2016 one, a few minutes from the end) Q: training model? A, Sergey: “I can’t say we have a really good training model…. We do brown-bags to have people aware. We focus on 12 factor application model… on overall microservices model, not just to shape application, but also data. Developers need to understand how they [do] applications for PaaS instead of traditional.”

DIUx working on streamlining IT projects at the DoD

Since May 2016, DIUx has completed 21 contracts using other transaction (OT) authority and the average time is 78 days, Shah said at the New America Foundation Future of War summit in Washington.

The mission of DIUx, he said, “is to do agile culture change.…We are never going to be the acquisition arm of the Department of Defense, we’re not the R&D arm of the department.”

DIUx has so far comprised $42 million in program funding, which Shah characterized as a “rounding error of a rounding error” of the DOD budget.

Hey, they’re trying over there in the government. It ain’t easy. I’ve met with some of the folks there and they sure seem genuine about fixing things up and curious to work closer with the civilian IT world.

When I meet with military people they use the word “agile” over and over: meaning, they’re incredibly interested in modernizing. It’s just the tiny matter of figuring out how to get from here to there.

Link

The coming billions in updating banks’ COBOL stacks

Commonwealth Bank of Australia, for instance, replaced its core banking platform in 2012 with the help of Accenture and software company SAP SE. The job ultimately took five years and cost more than 1 billion Australian dollars ($749.9 million).

Being conservative, multiply $500m across the top 20 banks and you’ve got $10bn. Using the $749.9m figure directly, you get much closer to $15bn.
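
Here’s that back-of-the-envelope math as a quick Python sketch. The only inputs are the numbers above: a conservative $500m per bank, the $749.9m Commonwealth Bank figure from the quote, and my assumption of 20 banks. The names are just for illustration.

```python
# Rough per-bank replacement costs in US dollars.
CONSERVATIVE_COST_USD = 500_000_000  # deliberately low guess
CBA_COST_USD = 749_900_000           # Commonwealth Bank of Australia, from the quote
TOP_BANKS = 20                       # assumed number of banks facing the same job


def total_cost_bn(per_bank_usd: int, banks: int = TOP_BANKS) -> float:
    """Total spend across all banks, in billions of dollars."""
    return per_bank_usd * banks / 1_000_000_000


conservative_bn = total_cost_bn(CONSERVATIVE_COST_USD)
cba_based_bn = total_cost_bn(CBA_COST_USD)

print(f"Conservative: ${conservative_bn:.1f}bn")
print(f"Using the CBA figure: ${cba_based_bn:.1f}bn")
```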

Better start planning.

Source: Banks scramble to fix old systems as IT ‘cowboys’ ride into sunset

Advice on introducing DevOps from Merrill Corp & SPS Commerce – Highlights

Nicely moderated by Bridget. Some of my notes and highlights:

  • Amy talks about pace of change, sustaining it in the beginning, etc.
    • The amount of time it took us to get going was a surprise – it was longer than expected.
    • If you can start to show results early, it helps build up momentum. “Having enough wins, like that, really helped us to keep the momentum going while we were having a culture change like DevOps.”
    • It takes the right people to keep that energy going, but also be able to go back to the business to show why we are putting these changes in place.
    • You’re going to be able to see the changes to the business right away.
  • Peg – tools, don’t try to fix the old ones, like ITIL service desk tools. Instead we just had Jenkins open tickets and such, automating the toil of dealing with old tools
  • Global/offshore tactics, from Amy:
    • What with all the retrospective stuff, you need to be able to get teams together, physically. The collaboration angles are much better in person.
    • Set up each “shore” as an architectural and management island, make them as independent as possible. They also need their own context, not held up by time zones so they don’t need to wait 24-48 hours for authorizations and collaboration. [To my mind, this means taking advantage of the organizational de-coupling you can get with microservices.]
  • Starting change, even when the company needs it. Amy: You have to start with the business need, what’s the big driver behind a change like DevOps. [Managers often don’t make sure they figure this out, let alone disseminate it to staff.]