Today I’m a DevOps/Cloud Architect, and when mentoring juniors, one of the first topics is “never use :latest in production.” But I learned this in practice.
Some Context
We worked with Azure Kubernetes, logging/monitoring team. Built custom OpenSearch images - pipeline built, tagged with version, and moved :latest tag.
After each build, we manually updated the version in Helm chart for deployment. One day my colleague and I watched another StatefulSet restart and thought: “Why bother? Let’s just use :latest? We deploy right after build anyway?”
Changed it. PullPolicy was Always, seemed fine.
Things we realised
Lasted two deploys in dev environment. Then issues emerged:
- Can’t rollback easily
- Can’t test images separately before deployment - they go straight to deploy
- Most important: dev was “dev” for developers, but production for us. They still needed their logs)))
Quick fix, reverted to explicit versions. Just work.
Recent Example
A month ago helped a team with MongoDB crash in dev:
image: mongo # no tag, which means :latest is used
Node went for maintenance, pod moved and image re-pulled. Mongo v8 was released 16 days earlier, the :latest tag moved to it. Pod started with new version, but storage had database from old one - incompatible. Dev was down for hours.
Developers were more surprised than upset - they work locally in docker compose anyway, so for them it was just “huh, interesting, didn’t know this could happen”.
Now I always use explicit versions in manifests. :latest only for local development, and only when I’m too lazy to type new one for every rebuild.
When I want to drive the point home, or explain the nuances, I mention: :latest tag isn’t “latest stable”, it’s “whatever sits under this tag right now”. Google “:latest problems” - tons of “10 ways to shoot yourself in the foot” stories.
What’s your :latest story?