Cloud-native at Comcast, working with Pivotal – Highlights

I’m doing a podcast with Comcast in a few weeks, so I’ve been going over all their public talks on their cloud-native efforts. They’ve been working with Pivotal since around 2014 and are one of the more impressive customer cases, with over 1,000 applications now on Pivotal Cloud Foundry.
Here are some highlights from the talks I’ve been watching. As always, things I put in square brackets are my own comments, the rest are quotes or summaries of what people said:

August, 2016 – Empowering Devops with Cloud Foundry – Sergey Matochkin, Neville George; Comcast

  • Sergey Matochkin.
  • Slides.
  • (17:00) Every deployment to production took at least 6 weeks, but most commonly around 2 months end-to-end, which also means you need to plan capacity well in advance.
  • We started to use virtualization and containerization “well, well before Docker existed… it was some success, we had some improvements, but those improvements were marginal.”
  • Traditionally, it’d take at least 4-6 months to set up your dev/test infrastructure. But, luckily, virtualization came along.
  • (9:20) Business drivers… Comcast phone service, set-top boxes get DVRs, VoD, etc. All of these require apps on the backend, so the portfolio of apps starts to grow, and with the way things were before, it meant they had to build a new datacenter every six months. Virtualization helped here, of course.
  • Also, virtualization allowed us to put a service layer [think “platform”] on top of the infrastructure.
  • It used to take 4-6 weeks to get a testing environment, but now it takes 10-15 minutes in a self-service portal.
  • Demo of using Pivotal Cloud Foundry for much of the automation needed to deploy and scale an application.
  • (~32:00) We used to have things like “order servers” and “make load-balancer changes” and somewhere in the bottom of the backlog was “write some code and do some testing.” [That is, they were focusing on items with low business value, below “the value line,” rather than customer features.]
  • “What Cloud Foundry essentially helped us with was to get all those unnecessary user stories out of our backlog so we can focus on the writing code, on testing, and deploying rather than managing infrastructure.”
  • (33:45) momentum/proof-points:
  • 9 PCF instances; 900+ developers; 2,000+ active apps, most of which are in “the critical path of our customer experience”; 4,100 application instances; 2,000 requests per second.
  • Lots of Slack/ChatOps usage for monitoring and such.

August 3rd, 2016 – Transforming the monolith at 20M tph – Nick Beenham, Comcast

  • Slides.
  • Existing state:
    • 250m transactions per day.
    • Would take 3 months to make a server useful, from the moment of purchase to actually using it.
    • “Over a 100 services run by development teams.”
    • Teams in functional, siloed roles.
  • (3:45) “We knew we had that large, rigid infrastructure. [Pivotal] Cloud Foundry and its adoption really enables us to change that to gain the agility, to gain the elasticity at scale.”
  • Taking away siloed roles to reduce finger-pointing and all the negative stuff, and unifying the team, of course.
  • (7:35) Anecdote of Nick going from “ops guy” to writing code and liking coding.
  • (12:18) The ESP router is a small router written in Go that translates SOAP requests as part of a strangler pattern. They had a decades-old SOA layer that they wanted to modernize, but they couldn’t strip it out because that would take too long. So they were going to duck-type as SOA, but do REST and microservices underneath. This is what the ESP router does: it marshals and unmarshals between the microservices and the SOAP layer. New things, though, need to be done in the new style.
  • Also, “de-mingling data,” moving off Oracle RAC/GoldenGate for multi-site. Some simpler CRUD services to front the data.
  • (~15:00) Used to take a week+ to deploy the entire stack, but with Pivotal Cloud Foundry it takes minutes. It gives us a great deal of velocity that we’ve never had before. “Sometimes we’ll deploy multiple times an hour.”
  • (17:00) Went from thousands of lines of bash for deploying out to various WebLogic clusters to, for the most part, deploying on Cloud Foundry.
  • Improving production updates: bringing new node up and shutting old node down slowly; canary updates, with a CI test suite, then switching over to a production install.
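
[To make the ESP router idea concrete, here is a minimal sketch of a strangler-style facade: callers keep sending SOAP, and the router unmarshals each request, hands migrated operations to new-style handlers, and marshals the result back into a SOAP envelope. This is not Comcast's code. Their router is written in Go, while this sketch uses Python for brevity, and the operation name GetAccount, the handler, and the legacy_soa stand-in are all hypothetical.]

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)


def get_account(params):
    # Hypothetical new-style handler; in practice this would call a REST microservice.
    return {"accountId": params["accountId"], "status": "active"}


# Operations already moved behind the facade (hypothetical name).
MIGRATED = {"GetAccount": get_account}


def legacy_soa(operation, params):
    # Stand-in for forwarding the request to the old SOA layer untouched.
    return {"handledBy": "legacy"}


def unmarshal(soap_xml):
    """Pull the operation name and its parameters out of a SOAP envelope."""
    root = ET.fromstring(soap_xml)
    request = root.find(f"{{{SOAP_NS}}}Body")[0]
    operation = request.tag.split("}")[-1]  # drop any namespace prefix
    params = {child.tag.split("}")[-1]: (child.text or "") for child in request}
    return operation, params


def marshal(operation, result):
    """Wrap a plain dict result back into a SOAP response envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    response = ET.SubElement(body, operation + "Response")
    for key, value in result.items():
        ET.SubElement(response, key).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def route(soap_xml):
    """Send migrated operations to new handlers, everything else to legacy."""
    operation, params = unmarshal(soap_xml)
    handler = MIGRATED.get(operation)
    result = handler(params) if handler else legacy_soa(operation, params)
    return marshal(operation, result)
```

[Callers never see which side answered, so operations can be moved into MIGRATED one at a time until nothing is left on the legacy path.]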

August 1st, 2016 – James Taylor – The Power of Partnership & Building a Cloud Native Tier-1 Platform

  • @jctbmwi8
  • “Sparrow, Service Activation Platform.”
  • “Helping someone put a smile on their face is one of the greatest gifts we can give each other.”
  • Their VP provides the feedback loop of things to focus on. Right now: reducing technical debt, reducing incidents, increasing velocity, experimentation.
  • (~6:30) “You can’t move forward – innovate – if you don’t have time to try new things.”
  • (~18:35) “If you’re spending time configuring a Docker container, that’s time you’re not spending coding or solving a problem.”
  • (13:51): “At the end of the day, [business] value is what puts money in everyone’s pocket. If our company, Comcast, can’t create something of value, no one’s gonna pay for us…if we can’t create value. So it’s important for us to understand ‘how can you create value?’”
  • (~22:02, starting epic rant!) “Who is our customer and what value do we bring to our customers…”
  • If you’re spending money on support, that’s cutting into your margins. A call coming in costs $8 right off the bat, then more as it takes longer. So you want to figure out preventing customer support problems… which points to understanding your customers more.
  • [A good overview of thinking about “value” in the context of a specific application, their customer activation center, Sparrow.] “If you have a [support] call rate of 30%, you’re probably cutting out all the value… So we try to figure out, how do we prevent calls?” [Very similar to IRS cloud-native story.]
  • “We’ve been holding technical workshops”: internal training every month with Pivotal people, leveraging Pivotal knowledge. It’s done with our development teams as either a webinar or an on-site visit.
  • Sparrow: 5 junior Java developers… we built it from scratch in parallel while existing teams maintained the platform… we then had to integrate the processes together… figure out decomposing the monolith platforms, etc.… then we had to just cut off stuff when it was too much of a hassle.

August 17th, 2016 – Greg Otto SpringOne Platform keynote

  • Slides.
  • X1 boxes – a new release about once a month.
  • Processing tens of millions of transactions daily on the new platform, Pivotal Cloud Foundry.
  • “About a 75% lift in velocity as well as time to market, and the business is really feeling it.”
  • Developer reactions: (slide image)
  • Momentum Stats:
    • 40 apps to 900 apps, 2015 to 2016
    • 300 AIs to 4,100 AIs, 2015 to 2016
  • All with “zero outbound marketing from my team, this all word of mouth from all those happy developers.”

June 9th, 2016 – Greg Otto CF Summit keynote

  • “Late last year in 2015” – live in production [on Pivotal Cloud Foundry] with business critical systems from our back-office systems on our Cloud Foundry environment.
  • We put Pivotal Cloud Foundry directly in the customer critical path.
  • Applications doing 30,000 events a second on Cloud Foundry.
  • Started in 2014, met with Pivotal.
  • Had sort of thrown people into the Pivotal Cloud Foundry pool; they had to do a lot of research and such.
  • But, people were really interested in the ease of working with the platform [the productivity improvements].
  • Successful prototype app 30 days after platform.
  • Idea to feature, before/after: “several weeks, at least”/“2-3 days”
  • Timeline and summary: (slide image)

June, 2016 – Open source at Comcast story

  • Write-up.
  • “If Comcast has a problem to solve, there are three possible approaches: solve it themselves by making an investment in teams and resources; solve it through a commercial vendor that could build a product for them; or work with the open source community.”
  • OpenStack: “In addition to Linux, Comcast is a heavy user of OpenStack. They use a KVM hypervisor, and then a lot of data center orchestration is done through OpenStack for the coordination of storage and networking resources with compute and memory resources. Muehl said that Comcast has roughly a petabyte of memory and around a million virtual CPU cores that they are running under the OpenStack umbrella. As an operator, Comcast does a lot of things around operations, and they use Ansible to deploy and manage OpenStack at scale.”
  • Cloud Foundry: “They also use Cloud Foundry, but according to Muehl that work is in the very early stages at Comcast.”

May 2015 – Running Cloud Foundry at Comcast talk

  • Neville George, Sam Guerrero, Tim Leong, Sergey Matochkin
  • They wanted to make custom URLs.
  • Used Puppet for stuff.
  • (~8:30) Their requirements for a platform: (slide image)
  • A lot of emphasis on self-service and the micro services benefits of operating independently, product management wise.
  • They use OpenStack, Docker, and [Pivotal] Cloud Foundry.
  • Pre-provisioning resources for a pool of containers that are ready to go, etc.
  • (~27:00) A couple applications in production today… we’ll be ramping up quickly.
  • (Either this video or the 2016 one, a few minutes from the end) Q: what’s the training model? A, Sergey: “I can’t say we have a really good training model…. We do brown-bags to have people aware. We focus on 12 factor application model… on overall microservices model, not just to shape application, but also data. Developers need to understand how they [do] applications for PaaS instead of traditional.”
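
[For a sense of what writing applications “for PaaS instead of traditional” looks like, here is a minimal sketch of one 12 factor practice: reading configuration from the environment rather than from files baked into the build. PORT and VCAP_SERVICES are environment variables Cloud Foundry sets for an app. The service instance name orders-db and the uri credential key are my assumptions for illustration, not anything from the talk.]

```python
import json
import os


def load_config(environ=os.environ, db_instance="orders-db"):
    """Build app config from the environment (12 factor: config)."""
    # Cloud Foundry tells the app which port to listen on via PORT.
    port = int(environ.get("PORT", "8080"))

    # VCAP_SERVICES is a JSON object mapping a service label to a list
    # of bound service instances, each with a name and credentials.
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))

    db_uri = None
    for instances in services.values():
        for instance in instances:
            # "orders-db" and the "uri" key are hypothetical.
            if instance.get("name") == db_instance:
                db_uri = instance.get("credentials", {}).get("uri")

    return {"port": port, "db_uri": db_uri}
```

[The same build can then run in dev, test, and production with nothing changed but the environment it is pushed into.]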
