[Summary of the metrics to use, from the post:]
Rate — Request rate, in requests/sec
Errors — Error rate, in errors/sec
Latency — Response time, including queue/wait time, in milliseconds.
Saturation — How overloaded something is. It is related to utilization but is measured more directly by things like queue depth (or sometimes concurrency). As a queue measurement, it becomes non-zero once you are saturated, and often not much before. Usually a counter.
Utilization — How busy the resource or system is. Usually expressed as 0–100% and most useful for predictions (for spotting overload as it happens, Saturation is probably more useful). Note that we are not using the Utilization Law to get this (~Rate × Service Time / Workers), but are instead looking for more familiar direct measurements.
Original source: How to Monitor the SRE Golden Signals
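
To make the definitions above concrete, here is a minimal Python sketch that derives all five signals from one window of raw measurements, and separately evaluates the Utilization Law estimate for comparison. It is an illustration, not code from the post: `WindowSample`, its field names, and both functions are hypothetical, and latency is simplified to a mean where a real setup would more likely track percentiles.

```python
from dataclasses import dataclass


@dataclass
class WindowSample:
    """Raw measurements for one window (all field names are hypothetical)."""
    window_seconds: float    # length of the measurement window
    requests: int            # requests completed in the window
    errors: int              # failed requests in the window
    total_latency_ms: float  # summed response time, including queue/wait time
    queue_depth: int         # requests waiting at the end of the window
    busy_seconds: float      # total worker-seconds spent busy in the window
    workers: int             # number of workers serving requests


def signals(s: WindowSample) -> dict:
    """Derive the five signals from directly measured values."""
    capacity_seconds = s.window_seconds * s.workers
    return {
        "rate_rps": s.requests / s.window_seconds,
        "errors_eps": s.errors / s.window_seconds,
        # Mean latency; assumed simplification, guarded against empty windows.
        "latency_ms": s.total_latency_ms / s.requests if s.requests else 0.0,
        # Saturation read straight from the queue: zero until work backs up.
        "saturation": s.queue_depth,
        # Direct utilization: measured busy time over available worker time.
        "utilization_pct": 100.0 * s.busy_seconds / capacity_seconds,
    }


def utilization_law_pct(rate_rps: float, service_time_s: float, workers: int) -> float:
    """Utilization Law estimate: Rate x Service Time / Workers, as a percentage."""
    return 100.0 * rate_rps * service_time_s / workers


# Example window: 60 s, 1200 requests, 6 errors, 4 workers, empty queue.
sample = WindowSample(
    window_seconds=60.0, requests=1200, errors=6, total_latency_ms=54000.0,
    queue_depth=0, busy_seconds=96.0, workers=4,
)
result = signals(sample)
estimate = utilization_law_pct(result["rate_rps"], service_time_s=0.04, workers=4)
```

In this made-up window the directly measured utilization and the Utilization Law estimate differ, because the assumed 0.04 s service time does not match the measured busy time. That kind of gap is one reason to prefer the direct measurement when it is available.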