When it comes to measuring what AI does for developer productivity, we’ll probably land on the same conclusion as always: counting lines of code isn’t as useful as measuring the full cycle time from idea to code to delivery to a person actually using the app - lead time, concept to cash, whatever you want to call it. And as we also rediscover every time, that cycle is very hard to measure, because it crosses so many different groups.
I say this because the current discussion around AI in development always seems to open with “it was never about writing code,” or “writing code is only 10% to 40% of programming.” Which raises the question: then what’s the point of applying AI to coding at all?
Maybe what people are fumbling toward with “writing code faster isn’t the problem” is really “we need to apply AI to the other parts of the SDLC.” (Well, that and the quiet part: “don’t fire the developers and replace them with robots.”)
The lesson labor learns in these cycles is that it can’t dodge management’s urge to measure it. You have to offer up some proof of performance. We can invoke all the Seeing Like a State legibility stuff we want - but the measuring middle can’t allocate budget and priorities, or decide who to lay off and who to reward, without measurements.
So at the very least, management needs to answer a basic question: what should our token budget be? What can we measure to know whether it should be $1,200 a year or $100,000?
Meanwhile, I suppose there’s a second experiment running: what if we just fire middle management - the measurers - instead?
That wouldn’t mean the people making apps stop being measured; it would mean AI starts measuring them. Would that be any better for workers?
We need some kind of metrics. Do we just dust off the DevOps ones again?
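For a sense of what reusing the DevOps metrics might look like, here is a minimal sketch in Python. It assumes "the DevOps ones" means something like the familiar DORA-style numbers (lead time, deployment frequency, change failure rate), and it stretches lead time to end when a person actually uses the change rather than at deploy. The `Change` record, its field names, and the sample data are all hypothetical, made up for illustration; collecting those timestamps across groups is the hard part this code skips.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Change:
    # Hypothetical record; field names are illustrative, not from any real tool.
    idea_logged: datetime   # when the idea was first written down
    deployed: datetime      # when the change reached production
    first_used: datetime    # when a person actually used it
    caused_incident: bool   # whether the deploy led to a failure


def lead_time_days(change: Change) -> float:
    """Days from idea to a person actually using the change."""
    return (change.first_used - change.idea_logged).total_seconds() / 86400


def median_lead_time_days(changes: list[Change]) -> float:
    """Median idea-to-use lead time across all changes, in days."""
    return median(lead_time_days(c) for c in changes)


def deploys_per_week(changes: list[Change], weeks: float) -> float:
    """Deployment frequency over an observation window of `weeks` weeks."""
    return len(changes) / weeks


def change_failure_rate(changes: list[Change]) -> float:
    """Share of changes whose deploy caused an incident."""
    return sum(c.caused_incident for c in changes) / len(changes)


# Made-up sample data, only to show the functions running.
CHANGES = [
    Change(datetime(2024, 1, 1), datetime(2024, 1, 15), datetime(2024, 1, 16), False),
    Change(datetime(2024, 1, 8), datetime(2024, 1, 29), datetime(2024, 2, 7), True),
    Change(datetime(2024, 1, 10), datetime(2024, 1, 20), datetime(2024, 1, 20), False),
]

if __name__ == "__main__":
    print(f"median lead time: {median_lead_time_days(CHANGES):.1f} days")
    print(f"deploys per week: {deploys_per_week(CHANGES, weeks=6):.2f}")
    print(f"change failure rate: {change_failure_rate(CHANGES):.0%}")
```

None of this says whether AI helped. It only gives a baseline you could compare before and after, which is about as much as the old DevOps metrics ever promised.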